DeepSeek: Why is the whole world going crazy? What's going on behind the scenes?
DeepSeek is a name that has rattled America's largest technology companies. The scare hit not only OpenAI, which kicked off the generative artificial intelligence (AI) boom, but also Google, Microsoft, Anthropic, Meta, Amazon and all the others who assumed they would dominate the field for at least a while longer and keep attracting billions in investment.
When the Chinese company DeepSeek unveiled its AI model last month, it shook the American establishment by claiming performance equivalent to, or in some respects better than, the leading US models at a fraction of the cost. Training the DeepSeek-V3 model reportedly required only about $6 million in computing power, a figure that excludes "prior research and ablation experiments on architectures, algorithms, or data."
Immediately after its release, DeepSeek overtook ChatGPT at the top of Apple's App Store rankings, and its download numbers have kept growing steadily.
The consequences have been enormous. In the US, people are asking whether the enormous investments in AI were really necessary when a Chinese rival could achieve a comparable result with a fraction of the money. Shares of companies including Nvidia fell, and the question has resurfaced of whether this is the moment the AI bubble bursts.
On the other hand, many are wondering if DeepSeek is really as revolutionary as the company claims. Are they hiding something? What did they use to train their model?
What is DeepSeek?
DeepSeek is the name of a startup as well as its large language model and chatbot, which works much like ChatGPT, Gemini and Copilot. The look, the way it is operated and the way it converses are almost identical to the American services, so the switch is easy for users and the experience feels instantly familiar.
How powerful is it, and is it really better than ChatGPT and the rest? The company says it matches OpenAI's o1 model, released late last year, on tasks such as math and coding. OpenAI recently introduced a newer model, o3, which reportedly outperforms all existing models in benchmark tests but is not yet available for public testing.
The latest DeepSeek model, R1, is a reasoning language model. Like OpenAI's o1, such models work through an answer step by step, simulating the way humans think a problem over before responding.
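What this looks like in practice: below is a minimal sketch of querying R1 through DeepSeek's OpenAI-compatible API. The endpoint URL, the deepseek-reasoner model name and the reasoning_content field are assumptions based on DeepSeek's public documentation at launch and may change.

```python
from openai import OpenAI  # DeepSeek exposes an OpenAI-compatible chat API

# Endpoint and model name are assumptions taken from DeepSeek's launch documentation.
client = OpenAI(api_key="YOUR_DEEPSEEK_API_KEY", base_url="https://api.deepseek.com")

response = client.chat.completions.create(
    model="deepseek-reasoner",  # the R1 reasoning model
    messages=[{"role": "user", "content": "How many prime numbers are there below 30?"}],
)

message = response.choices[0].message
# Reasoning models return their step-by-step "thinking" separately from the final answer.
print(getattr(message, "reasoning_content", None))  # the chain of reasoning, if exposed
print(message.content)                              # the final answer
```

The visible reasoning trace is the practical difference: the answer only arrives after the model has worked through its intermediate steps.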
The biggest shock was that DeepSeek spent just $6 million to train the V3 model that powers the chatbot. By comparison, OpenAI reportedly spent more than $100 million to develop GPT-4, and Meta spent about $60 million on Llama. DeepSeek managed this despite trade restrictions that have officially cut China off from the latest chips for some time.
DeepSeek's founder is said to have stockpiled Nvidia A100 chips, whose export to China has been banned since September 2022. Some experts believe he combined these chips with cheaper, less sophisticated ones, resulting in a much more efficient training process. DeepSeek's models also use less memory than competitors', which ultimately lowers the cost of serving users' requests.
There are also rumors that the company is actually using the latest Nvidia H100 chips, but there is no concrete evidence, and the company has not commented on the allegations.
An independent analysis by SemiAnalysis estimates that the company has spent around $500 million on hardware. DeepSeek's lightning-fast pace in producing comparable AI models has also drawn scrutiny from OpenAI, which suspects the Chinese company has been "distilling" its models, that is, training its own model on the outputs of OpenAI's.
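"Distillation" normally means training a smaller "student" model to imitate a larger "teacher" model's output distribution. The sketch below shows the classic distillation loss as a generic illustration of the concept; it describes the technique in general, not anything DeepSeek or OpenAI has confirmed.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Classic knowledge-distillation loss: the student learns to match the
    teacher's softened probability distribution over possible next tokens."""
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # KL divergence between the two distributions; T^2 keeps gradients comparable in scale.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * temperature**2
```

OpenAI's accusation uses the term more loosely: training on answers generated by its models, something its terms of service explicitly prohibit.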
Shortly after launching and taking the top spot in the Apple App Store, DeepSeek began experiencing outages. The chatbot was unavailable for extended periods of time, and companies and developers were unable to access its API. The company said it was the target of malicious attacks that slowed operations and temporarily restricted registration.
Who is leading China's AI revolution?
DeepSeek didn't appear overnight, but it attracted little media attention until last month, even though it was known to be developing AI models. The startup is majority-owned by Liang Wenfeng, who is also a co-founder of the investment fund High-Flyer. In March 2023, Liang announced that he was starting a new project and establishing a "new and independent research group to explore the essence of general artificial intelligence." A few months later, DeepSeek was born. It drew young talent above all with the promise of high salaries and the chance to work on distinctive research problems.
It is unclear how much High-Flyer has invested in DeepSeek. High-Flyer has an office in the same building as DeepSeek and, according to Chinese company records, also holds patents related to chips used to train AI models.
How did they do it?
DeepSeek has described in its research papers how it trains its models. Since the company does not officially have access to the same chips as its American competitors, it had to find another way.
Leading AI systems learn their skills by finding patterns in large amounts of data, including text, images, and sounds. DeepSeek has described a way to distribute this data analysis across multiple specialized AI models, minimizing the time wasted moving data from one place to another.
Similar methods have been used by others before, but moving data between models has typically reduced performance. DeepSeek did this in a way that allowed it to use less computing power.
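The approach described here resembles the mixture-of-experts design that DeepSeek's technical reports rely on: a lightweight "router" sends each token to a handful of small specialized sub-networks instead of pushing everything through one giant model. The sketch below is a generic top-k router in PyTorch; the sizes and names are illustrative and do not reproduce DeepSeek's actual architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoELayer(nn.Module):
    """Minimal top-k mixture-of-experts layer: a router picks a few small
    'expert' networks per token, so only a fraction of parameters do work."""
    def __init__(self, dim=64, num_experts=4, top_k=2):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_experts))
        self.top_k = top_k

    def forward(self, x):  # x: (tokens, dim)
        scores = F.softmax(self.router(x), dim=-1)        # routing probabilities per expert
        weights, idx = scores.topk(self.top_k, dim=-1)    # keep only the top-k experts per token
        weights = weights / weights.sum(dim=-1, keepdim=True)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                      # tokens whose k-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

print(TinyMoELayer()(torch.randn(8, 64)).shape)  # torch.Size([8, 64])
```

Each token touches only two of the four experts here, which is where the compute savings come from; the engineering difficulty the article alludes to is keeping the data movement between experts from eating up those savings.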
The cost and training method aren't the only differences compared to other AI models. DeepSeek is also open source, meaning it can be downloaded, used, and upgraded by virtually anyone.
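Because the weights are openly published, running a model locally takes only a few lines. A minimal sketch with Hugging Face transformers follows; the repository name refers to one of the smaller distilled R1 checkpoints and is an assumption (the full-size models need far more hardware).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository name for a small distilled R1 checkpoint; adjust as needed.
model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

prompt = "Explain in one sentence why the sky is blue."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```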
In contrast, Meta's and Google's models, while publicly available, are not considered truly open source: how users may use them is restricted by license terms, and the training datasets are not public, which has led to numerous lawsuits. Meta, for example, is in a legal battle with authors who accuse the company of using pirated copies of their books for training, while The New York Times is suing Microsoft and OpenAI for allegedly using its content for training without permission.
One of the reasons American AI models are not open source is the greater risk of spreading misinformation, hate speech and the like, but the main reason is surely profit and the greater opportunity to monetize the models.
China's open-source models have the potential to democratize artificial intelligence, experts say, which could seriously undermine the strategy of American companies. There are fears that American companies and scientists could also start using DeepSeek to develop and build their own solutions.
In China, the latest model has already been used by telecommunications companies, and Geely is the first among automotive companies to integrate the DeepSeek model into the smart systems of its cars.
Meanwhile, countries around the world are already weighing possible blocks. Italy, Ireland, Belgium, the Netherlands and France are among those that have launched investigations into how DeepSeek uses and stores data and whether it may be violating European data-protection rules. Italy has blocked DeepSeek as a precautionary measure, and its use has also been restricted in South Korea and Australia.
DeepSeek says it has taken every precaution to protect the data it stores in China. But it wasn't long before security researchers discovered that the company had inadvertently left millions of lines of data exposed, including software keys, logs, chat histories, and more.
Cisco analyzed the new AI player and found that "DeepSeek R1 lacks robust guardrails, making it highly susceptible to algorithmic jailbreaking and potential misuse."
A new front has opened
Until January 2025, the US was the only horse in the race, and American companies hardly needed to glance over their shoulders. Now China is breathing down their necks. The trade and technology war between the US and China has been running for some time, and DeepSeek has opened a new front on which the future of artificial intelligence will be decided.