DeepSeek Explodes on the Scene

The release of the new Chinese chatbot DeepSeek-R1, which uses lower-powered processors, sent shockwaves throughout the industry. Some have called it a Sputnik moment for AI.

Jan 28, 2025

Chinese startup DeepSeek saw DeepSeek-R1, the artificial intelligence chatbot it announced last week, jump to the top of Apple App Store downloads on Monday, sending shockwaves throughout the industry. Some have called it a Sputnik moment for AI. The stocks of major chip players, including NVIDIA, Arm, Broadcom, and more, were hit. (NVIDIA’s stock dropped more than 13%.) Additionally, the Nasdaq fell by more than 3% on Monday, with the drop at one point wiping more than $1 trillion off the value of the technology-heavy index, according to industry reports.

What’s behind such a strong reaction? According to Technology Review, “DeepSeek aimed for accurate answers rather than detailing every logical step, significantly reducing computing time while maintaining a high level of effectiveness.” As such, the company’s large language model (LLM) delivers powerful performance at a fraction of competitors’ steep training costs. Perhaps more importantly, the open-source AI assistant accomplishes its results using less advanced (and lower-cost) chips than rival LLMs.

That latter point has significant implications in a number of ways. First, it means that, rather than requiring vast arrays of high-end GPUs, DeepSeek can run on more modest processors, potentially opening up AI to a wide range of organizations previously locked out by cost.

A second implication relates to the ongoing U.S./China trade wars. The U.S. has been trying to limit China’s access to advanced technology for AI. Earlier this month, before President Biden left office, his administration introduced export controls intended to limit China’s access to powerful GPUs, which underpin advanced AI projects. It appears DeepSeek works with processors that are still readily available.

However, NVIDIA said in a statement on Monday, “DeepSeek’s work illustrates how new models can be created using [test time scaling], leveraging widely-available models and compute that is fully export control compliant.” The company stressed that “inference still requires significant numbers of NVIDIA GPUs and high-performance networking.”

See also: 2025 Predictions: Year of the Commoditization of Large Language Models (LLMS)

What makes DeepSeek different?

Scientific American reported that DeepSeek “reportedly had a stockpile of high-performance NVIDIA A100 chips from times prior to the U.S. ban. So, its engineers could have used those to develop the model. But in a key breakthrough, the startup says it instead used much lower-powered NVIDIA H800 chips to train the new model, dubbed DeepSeek-R1.” Additionally, the same article noted that because the solution requires less computational power, the cost of running DeepSeek-R1 is a tenth that of similar competitors.

Other industry reports cite different numbers on cost savings: “DeepSeek claims its V3 large language model cost just $5.6 million to train, a fraction of ChatGPT’s reported training costs of more than $100 million. With comparable performance to OpenAI’s o1 model, a 95% cost cut may be especially attractive to cash-strapped companies looking to leverage generative AI (GenAI).”

A further distinction is that the company has made the code behind the product open source. It is available on GitHub.
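
Because the weights and code are publicly available, experimenting with the model requires only standard open-source tooling. The short Python sketch below shows one way to load an open-weight model with the Hugging Face transformers library; the model identifier is an assumption used for illustration, so check DeepSeek’s GitHub and model pages for the exact repository names and hardware requirements.

# Minimal sketch (not an official example): loading an assumed open-weight
# DeepSeek model with Hugging Face transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed model name; verify before use

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Summarize why lower training costs matter for smaller organizations."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))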

The model differs from others, such as o1, in how it applies reinforcement learning during training. “While many LLMs have an external ‘critic’ model that runs alongside them, correcting errors and nudging the LLM toward verified answers, DeepSeek-R1 uses a set of rules internal to the model to teach it which of the possible answers it generates is best,” noted experts in the article.
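
To make that distinction concrete, the Python sketch below scores candidate answers with simple hard-coded rules (an exact-match check on the final answer plus a format check) instead of a separate learned critic model. It is only a schematic illustration of the rule-based reward idea described above, not DeepSeek’s actual training code, and the <think> tag format is an assumption made for the example.

import re

# Schematic rule-based reward: hard-coded checks stand in for an external
# learned "critic" model when ranking candidate answers.
def rule_based_reward(answer: str, reference: str) -> float:
    reward = 0.0
    # Accuracy rule: the response must end with the known reference answer.
    if answer.strip().endswith(reference):
        reward += 1.0
    # Format rule: reasoning should appear inside <think>...</think> tags (assumed format).
    if re.search(r"<think>.*</think>", answer, flags=re.DOTALL):
        reward += 0.5
    return reward

candidates = [
    "<think>2 + 2 equals 4</think> The answer is 4",
    "The answer is 5",
]
# The highest-scoring candidate is what training would reinforce.
best = max(candidates, key=lambda c: rule_based_reward(c, "4"))
print(best)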

Salvatore Salamone

Salvatore Salamone is a physicist by training who writes about science and information technology. During his career, he has been a senior or executive editor at many industry-leading publications, including High Technology, Network World, Byte Magazine, Data Communications, LAN Times, InternetWeek, Bio-IT World, and Lightwave, The Journal of Fiber Optics. He is also the author of three business technology books.
