Hangzhou DeepSeek Artificial Intelligence Basic Technology Research Co., Ltd.,[3][4][5] doing business as DeepSeek, is a Chinese artificial intelligence (AI) company that develops large language models (LLMs). Based in Hangzhou, Zhejiang, DeepSeek is owned and funded by High-Flyer, a Chinese hedge fund. DeepSeek was founded in July 2023 by Liang Wenfeng, the co-founder of High-Flyer, who serves as CEO of both companies.[7][8][9] The company launched an eponymous chatbot alongside its DeepSeek-R1 model in January 2025.
DeepSeek-R1 provided responses comparable to those of other contemporary large language models, such as OpenAI's GPT-4 and o1. Its training cost was reported to be significantly lower than that of other LLMs. The company claims that it trained its V3 model for US$6 million, far less than the US$100 million cost of OpenAI's GPT-4 in 2023,[10] and with approximately one-tenth the computing power consumed by Meta's comparable model, Llama 3.1.[10][11][12] DeepSeek's success against larger and more established rivals has been described as "upending AI".[13][14]
DeepSeek's models are described as "open-weight", meaning the exact parameters are openly shared, but the training data is not openly licensed.[15][16] Since the January 2025 debut of DeepSeek-R1, the company has made its new models available under free and open-source software licenses, primarily the MIT License.[17] The company reportedly recruits AI researchers from top Chinese universities[13] and also hires from outside traditional computer science fields to broaden its models' knowledge and capabilities.[11]
DeepSeek significantly reduced training expenses for its R1 model by incorporating techniques such as mixture of experts (MoE) layers.[18] The company also trained its models during ongoing trade restrictions on AI chip exports to China, using weaker AI chips intended for export and employing fewer units overall.[12][19] Observers said the breakthrough sent "shock waves" through the industry and described it as a "Sputnik moment" for the US in the field of artificial intelligence, particularly because of DeepSeek's open-source, cost-effective, and high-performing AI models.[20][21][22] This threatened established AI hardware leaders such as Nvidia; Nvidia's share price dropped sharply, losing US$600 billion in market value, the largest single-company decline in U.S. stock market history.[23][24]
History
Founding and early years (2016–2023)
In February 2016, High-Flyer was co-founded by AI enthusiast Liang Wenfeng, who had been trading since the 2008 financial crisis while attending Zhejiang University.[25] The company began stock trading using a GPU-dependent deep learning model on 21 October 2016; before then, it had used CPU-based linear models. By the end of 2017, most of its trading was driven by AI.[26]
Liang established High-Flyer as a hedge fund focused on developing and using AI trading algorithms, and by 2021 the firm relied on AI exclusively,[27] often running on Nvidia chips.[28]
In 2019, the company began constructing its first computing cluster, Fire-Flyer, at a cost of 200 million yuan; it contained 1,100 GPUs interconnected at 200 Gbit/s and was retired after 1.5 years in operation.[26]
By 2021, Liang had started buying large quantities of Nvidia GPUs for an AI project,[28] reportedly obtaining 10,000 Nvidia A100 GPUs[29] before the United States restricted chip sales to China.[27] Computing cluster Fire-Flyer 2 began construction in 2021 with a budget of 1 billion yuan.[26]
Fire-Flyer 2 reportedly ran at over 96% utilization in 2022, totaling 56.74 million GPU hours, 27% of which supported scientific computing outside the company.[26]
During 2022, Fire-Flyer 2 had 5,000 PCIe A100 GPUs in 625 nodes, each containing 8 GPUs. It exclusively used the PCIe version of the A100 rather than the DGX version, since the models it trained at the time could fit within the 40 GB of VRAM on a single GPU, so the higher bandwidth of DGX was unnecessary (i.e., training required only data parallelism, not model parallelism). Later, it incorporated NVLink and NCCL (Nvidia Collective Communications Library) to train larger models that required model parallelism.
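The data-parallel setup described above can be illustrated with a toy example. The following is a minimal Python sketch, not High-Flyer's or DeepSeek's code: the one-parameter model, the function names `grad` and `data_parallel_grad`, and the sample data are all assumptions made for illustration. Each simulated worker holds a full copy of the model (which is why the model must fit on one GPU), computes a gradient on its own shard of the batch, and the per-worker gradients are then averaged, the role an all-reduce plays on a real cluster.

```python
# Toy illustration of data parallelism (hypothetical names, not real cluster code).
# Model: y = w * x with squared-error loss; every worker holds a full copy of w.

NODES, GPUS_PER_NODE = 625, 8
TOTAL_GPUS = NODES * GPUS_PER_NODE  # node layout reported for Fire-Flyer 2 in 2022


def grad(w, xs, ys):
    """Mean gradient of 0.5 * (w*x - y)**2 with respect to w over one batch."""
    return sum((w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def data_parallel_grad(w, xs, ys, num_workers):
    """Split the batch into equal shards, compute one gradient per worker,
    then average the results (standing in for an all-reduce)."""
    assert len(xs) % num_workers == 0, "sketch assumes equal-sized shards"
    shard = len(xs) // num_workers
    local = [
        grad(w, xs[i * shard:(i + 1) * shard], ys[i * shard:(i + 1) * shard])
        for i in range(num_workers)
    ]
    return sum(local) / num_workers


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0 * x for x in xs]  # data generated with a "true" weight of 2.0

full = grad(0.5, xs, ys)
sharded = data_parallel_grad(0.5, xs, ys, num_workers=GPUS_PER_NODE)
print(TOTAL_GPUS, abs(full - sharded) < 1e-9)
```

With equal-sized shards the averaged gradient matches the full-batch gradient, so the workers only need to exchange gradients. Model parallelism, by contrast, splits the model itself across devices and therefore places heavier demands on interconnect bandwidth.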
On 14 April 2023,[30] High-Flyer announced the launch of an artificial general intelligence (AGI) research lab, stating that the new lab would focus on developing AI tools unrelated to the firm's financial business.[31][32] Three months later, on 17 July 2023,[1] that lab was spun off into an independent company, DeepSeek, with High-Flyer as its principal investor and backer.[27][33][32] Venture capital investors were reluctant to provide funding, as they considered it unlikely that the venture would be able to quickly generate an "exit".[27]
Model releases (2023–present)
DeepSeek released its first model, DeepSeek Coder, on 2 November 2023, followed by the DeepSeek-LLM series on 29 November 2023. In January 2024, it released two DeepSeek-MoE models (Base and Chat), and in April, three DeepSeek-Math models (Base, Instruct, and RL).[34]
DeepSeek-V2 was released in May 2024, followed a month later by the DeepSeek-Coder V2 series. DeepSeek V2.5 was introduced in September 2024 and revised in December.[35] On 20 November 2024, the preview of DeepSeek-R1-Lite became available via chat. In December, DeepSeek-V3-Base and DeepSeek-V3 (chat) were released.
On 20 January 2025, DeepSeek launched the DeepSeek chatbot, based on the DeepSeek-R1 model, free for iOS and Android. By 27 January, DeepSeek had surpassed ChatGPT as the most downloaded freeware app on the iOS App Store in the United States,[13] triggering an 18% drop in Nvidia's share price.[36][37]
On 24 March 2025, DeepSeek released DeepSeek-V3-0324 under the MIT License.[38][39]
On 28 May 2025, DeepSeek released DeepSeek-R1-0528 under the MIT License.[40] The model has been noted for more tightly following official Chinese Communist Party ideology and censorship in its answers to questions than prior models.[41]
On 21 August 2025, DeepSeek released DeepSeek V3.1 under the MIT License.[42] This model features a hybrid architecture with thinking and non-thinking modes. It also surpasses prior models such as V3 and R1 by over 40% on certain benchmarks, including SWE-bench and Terminal-bench.[43] It was updated to V3.1-Terminus on 22 September 2025.[44] V3.2-Exp was released on 29 September 2025. It uses DeepSeek Sparse Attention, a more efficient attention mechanism based on previous research published in February 2025.[45][46] DeepSeek-V3.2 was released on 1 December 2025, alongside a DeepSeek-V3.2-Speciale variant that focused on reasoning.[47][48]
In February 2026, Anthropic accused DeepSeek of using thousands of fraudulent accounts to generate millions of conversations with Claude to train its own large language models.[49]
Company operation
DeepSeek is headquartered in Hangzhou, Zhejiang, and is owned and funded by High-Flyer. High-Flyer's co-founder, Liang Wenfeng, serves as DeepSeek's CEO. As of May 2024, Liang personally held an 84% stake in DeepSeek through two shell corporations.[50]
Strategy
DeepSeek has stated that it focuses on research and does not have immediate plans for commercialization.[51] This posture also means it can skirt certain provisions of China's AI regulations aimed at consumer-facing technologies.[11]
DeepSeek's hiring approach emphasizes skills over lengthy work experience, resulting in many hires fresh out of university.[32][11] The company likewise recruits individuals without computer science backgrounds to expand the range of expertise incorporated into the models, for instance in poetry or advanced mathematics.[13][11] According to The New York Times, dozens of DeepSeek researchers have or have previously had affiliations with People's Liberation Army laboratories and the Seven Sons of National Defence.[52]
Due to the impact of United States restrictions on chips, DeepSeek refined its algorithms to maximise computational efficiency and thereby leveraged older hardware and reduced energy consumption.[53]
DeepSeek has also expanded on the African continent, where it offers more affordable and less power-hungry AI solutions. The company has bolstered African language models and spawned a number of startups, for example in Nairobi. Along with Huawei's storage and cloud computing services, its impact on the tech scene in sub-Saharan Africa is considerable. DeepSeek offers local data sovereignty and more flexibility compared to Western AI platforms.[54]
Training framework
High-Flyer/DeepSeek has operated at least two primary computing clusters: Fire-Flyer (萤火一号) and Fire-Flyer 2 (萤火二号). Fire-Flyer was constructed in 2019 and retired after 1.5 years of operation. Fire-Flyer 2 was still in operation as of 2025. It consists of a co-designed software and hardware architecture. On the hardware side, Nvidia GPUs use 200 Gbit/s interconnects. The cluster is divided into two "zones", and the platform supports cross-zone tasks. The network topology is two fat trees, chosen for high bisection bandwidth. On the software side are:[55][26]
Distilled models were trained by supervised fine-tuning (SFT) on 800K samples synthesized from DeepSeek-R1, in a manner similar to step 3. They were not trained with reinforcement learning (RL).[106]
There were reports that R2, the intended successor to R1, was originally planned for release in early May 2025.[107] However, on 28 May 2025, R1 was instead updated to version R1-0528.[108] As of early July 2025, R2 had not yet been released, as Liang Wenfeng was not yet satisfied with its performance. Most Chinese cloud providers of R1 used Nvidia H20.[109] As of August 2025, R2 had still not been released; sources cited slow data labelling and chip problems. Specifically, DeepSeek was encouraged by authorities to adopt Huawei's Ascend chips for training, but the chips had stability issues, slower inter-chip connectivity, and inferior software. Consequently, the company opted to use Nvidia chips for training and Huawei chips for inference.[110] It was also reported that the Cyberspace Administration of China asked several large corporations to stop buying Nvidia H20 chips and to buy from domestic suppliers instead.[111]
With the release of R1 in January 2025, the DeepSeek team published a preprint on arXiv.[106] Later, an updated version was published in Nature in September 2025.[112]
- , designed to improve model output readability.
- 1) Apply the same GRPO RL process as R1-Zero, adding a "language consistency reward" to encourage it to respond monolingually. This produced an unreleased internal model.
- 2) Synthesize 600K reasoning data from the internal model, with rejection sampling (i.e., generated reasoning with a wrong final answer is removed). Synthesize 200K non-reasoning data (writing, factual QA, self-cognition, translation) using DeepSeek-V3.
- 3) SFT DeepSeek-V3-Base on the 800K synthetic data for 2 epochs.
- 4) Apply the same GRPO RL process as R1-Zero with rule-based reward (for reasoning tasks), but also model-based reward (for non-reasoning tasks, helpfulness, and harmlessness). This produced DeepSeek-R1.
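Two ideas in the steps above lend themselves to a short sketch: the rejection sampling of step 2 and the group-relative scoring that gives GRPO (Group Relative Policy Optimization) its name. The following Python sketch is illustrative only and is not DeepSeek's implementation. The function names, the sample records, and the exact-match answer check are assumptions, and the advantage formula is the commonly described outcome-reward form in which each reward is normalized by the mean and standard deviation of its group; the use of the population standard deviation here is likewise an assumption.

```python
import statistics

# Illustrative sketch only: hypothetical names and data, not DeepSeek's code.


def rejection_sample(samples, reference_answers):
    """Keep generated samples whose final answer matches the reference answer."""
    return [
        s for s in samples
        if s["final_answer"] == reference_answers[s["problem_id"]]
    ]


def group_relative_advantages(rewards):
    """Score each response relative to the other responses to the same prompt:
    (reward - group mean) / group standard deviation."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0:
        # All responses scored the same, so none is preferred over another.
        return [0.0 for _ in rewards]
    return [(r - mean) / std for r in rewards]


# Hypothetical generated data for two problems.
reference_answers = {"p1": "42", "p2": "7"}
samples = [
    {"problem_id": "p1", "reasoning": "...", "final_answer": "42"},
    {"problem_id": "p1", "reasoning": "...", "final_answer": "41"},
    {"problem_id": "p2", "reasoning": "...", "final_answer": "7"},
]

kept = rejection_sample(samples, reference_answers)
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print(len(kept), advantages)
```

In this sketch a rule-based reward is simply 1.0 for a correct final answer and 0.0 otherwise. Because advantages are computed within each group of responses, no separately trained value model is needed to provide a baseline.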
Significance
DeepSeek's success against larger and more established rivals was a surprise to both the industry and the markets,[13][113] and has been compared by investors and pundits to a "Sputnik moment".[13][114][115][22][21][20]
The DeepSeek-R1 model provides responses comparable to those of other contemporary large language models, such as OpenAI's GPT-4o and o1.[16] Its training cost is reported to be significantly lower than that of other LLMs.[116][117]
The company claims that it trained V3, a predecessor of R1, for US$6 million, compared to US$100 million for OpenAI's GPT-4 in 2023,[10] and with approximately one-tenth of the computing power used for Meta's comparable model, Llama 3.1.[10][11][12]
After the January 2025 release of the R1 model, which offered significantly lower costs than competing models, some investors anticipated a price war in the American AI industry.[118] DeepSeek was dubbed the "Pinduoduo of AI", and other Chinese tech giants such as ByteDance, Tencent, Baidu, and Alibaba cut the prices of their AI models. Despite its low prices, DeepSeek was profitable, unlike its money-losing rivals.[51]
See also
- Reasoning model
- List of large language models
- Lists of open-source artificial intelligence software
- Zhejiang University
References
- DeepSeek突传消息 [Sudden news from DeepSeek] Sina Corporation, 1 February 2025, retrieved 1 February 2025
- Zijing Wu. DeepSeek focuses on research over revenue in contrast to Silicon Valley Financial Times, 14 March 2025, retrieved 14 March 2025
- Hangzhou DeepSeek Artificial Intelligence Basic Technology Research Co., Ltd. Bloomberg L.P.
- DeepSeek Coder Model Service Agreement DeepSeek, 19 October 2023, retrieved 11 February 2025
- DeepSeek Coder Privacy Policy DeepSeek, retrieved 19 February 2025
- 全国互联网安全管理平台 [National Internet Security Management Platform] beian.mps.gov.cn, Ministry of Public Security of the People's Republic of China, retrieved 9 February 2025
- Beijing puts spotlight on China's new face of AI, DeepSeek's Liang Wenfeng South China Morning Post, 21 January 2025, retrieved 4 March 2025
- Eduardo Baptista. Who is Liang Wenfeng, the founder of DeepSeek? Reuters, 28 January 2025, retrieved 4 March 2025
- Behind DeepSeek lies a dazzling Chinese university The Economist, retrieved 5 March 2025
- James Vincent. The DeepSeek panic reveals an AI world ready to blow The Guardian, 28 January 2025
- Cade Metz, Meaghan Tobin. How Chinese A.I. Start-Up DeepSeek Is Competing With Silicon Valley Giants The New York Times, 23 January 2025, retrieved 27 January 2025
- Emma Cosgrove. DeepSeek's cheaper models and weaker chips call into question trillions in AI infrastructure spending Business Insider, 27 January 2025, retrieved 27 January 2025
- Cade Metz. What is DeepSeek? And How Is It Upending A.I.? The New York Times, 27 January 2025, retrieved 27 January 2025
- Kevin Roose. Why DeepSeek Could Change What Silicon Valley Believes About A.I. The New York Times, 28 January 2025, retrieved 28 January 2025
- Caroline Delbert. DeepSeek Is Cracking the 'Black Box' of Corporate AI Wide Open Popular Mechanics, 31 January 2025, retrieved 12 February 2025
- Elizabeth Gibney. China's cheap, open AI model DeepSeek thrills scientists Nature, 23 January 2025, retrieved 12 February 2025
- Caiwei Chen. What's next for Chinese open-source AI MIT Technology Review, 12 February 2026, retrieved 12 April 2026
- Cade Metz. How Did DeepSeek Build Its A.I. With Less Money? The New York Times, 12 February 2025, retrieved 21 March 2025
- Gregory C. Allen. DeepSeek, Huawei, Export Controls, and the Future of the U.S.-China AI Race Center for Strategic and International Studies, 7 March 2025
- Amy Hawkins. Who is behind DeepSeek and how did it achieve its AI 'Sputnik moment'? The Guardian, 28 January 2025
- John Cassidy. Is DeepSeek China's Sputnik Moment? The New Yorker, 3 February 2025
- John Ruwitch. DeepSeek: Did a little-known Chinese startup cause a 'Sputnik moment' for AI? NPR, 28 January 2025, retrieved 2 August 2025
- Jasper Saah. DeepSeek sends shock waves across Silicon Valley Liberation News – The Newspaper of the Party for Socialism and Liberation, 13 February 2025, retrieved 13 February 2025
- James Sillars. DeepSeek: Tech firm suffers biggest drop in US stock market history as low-cost Chinese AI company bites Silicon Valley Sky News, 28 January 2025, retrieved 13 February 2025
- Caiwei Chen. How a top Chinese AI model overcame US sanctions MIT Technology Review, 24 January 2025, retrieved 25 January 2025
- 幻方 [High-Flyer] High-Flyer, retrieved 2 February 2025
- Lily Ottinger. Deepseek: From Hedge Fund to Frontier Model Maker ChinaTalk, 9 December 2024, retrieved 28 December 2024
- Eleanor Olcott, Zijing Wu. How small Chinese AI start-up DeepSeek shocked Silicon Valley Financial Times, 24 January 2025, retrieved 31 January 2025
- Kif Leswing. Meet the $10,000 Nvidia chip powering the race for A.I. CNBC, 23 February 2023, retrieved 30 January 2025
- 独家 [Exclusive] Yicai, retrieved 3 February 2025
- Xu Yu. [Exclusive] Chinese Quant Hedge Fund High-Flyer Won't Use AGI to Trade Stocks, MD Says Yicai Global, 17 April 2023, retrieved 28 December 2024
- Ben Jiang, Bien Perez. Meet DeepSeek: the Chinese start-up that is changing how AI models are trained South China Morning Post, 1 January 2025, retrieved 1 January 2025
- Ryan McMorrow, Eleanor Olcott. The Chinese quant fund-turned-AI pioneer Financial Times, 9 June 2024, retrieved 28 December 2024
- Zhihong Shao, Peiyi Wang, Qihao Zhu, Runxin Xu, Junxiao Song, Xiao Bi, Haowei Zhang, Mingchuan Zhang. DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models 27 April 2024
- deepseek-ai/DeepSeek-V2.5 · Hugging Face Hugging Face, 3 January 2025, retrieved 28 January 2025
- Hayden Field. China's DeepSeek AI dethrones ChatGPT on App Store: Here's what you should know CNBC, 27 January 2025, retrieved 27 January 2025
- Aimee Picchi. What is DeepSeek, and why is it causing Nvidia and other stocks to slump? CBS News, 27 January 2025, retrieved 27 January 2025
- Michael Nuñez. DeepSeek-V3 now runs at 20 tokens per second on Mac Studio, and that's a nightmare for OpenAI VentureBeat, 24 March 2025, retrieved 24 March 2025
- deepseek-ai/DeepSeek-V3-0324 · Hugging Face Hugging Face, retrieved 24 March 2025
- deepseek-ai/DeepSeek-R1-0528 · Hugging Face huggingface.co, 28 May 2025, retrieved 28 May 2025
- Alex Colville. China's Global AI Firewall China Media Project, 12 June 2025, retrieved 30 June 2025
- deepseek-ai/DeepSeek-V3.1 · Hugging Face huggingface.co, 21 August 2025, retrieved 25 August 2025
- DeepSeek-V3.1 Release api-docs.deepseek.com, retrieved 25 August 2025
- deepseek-ai/DeepSeek-V3.1-Terminus · Hugging Face huggingface.co, 22 September 2025, retrieved 24 September 2025
- Jingyang Yuan, Huazuo Gao, Damai Dai, Junyu Luo, Liang Zhao, Zhengyan Zhang, Zhenda Xie, Y. X. Wei. Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention 27 February 2025
- deepseek-ai/DeepSeek-V3.2-Exp · Hugging Face huggingface.co, 29 September 2025, retrieved 2 October 2025
- Matt Binder. DeepSeek v3.2: What it is, how it compares to ChatGPT, how to try it Mashable, 3 December 2025, retrieved 12 April 2026
- DeepSeek-V3.2 Release DeepSeek API Docs, 1 December 2025, retrieved 12 April 2026
- Cade Metz. Anthropic Accuses 3 Chinese Companies of Harvesting Its Data The New York Times, 23 February 2026, retrieved 24 February 2026
- 大模型价格又砍一刀 这次"屠夫"竟是量化私募? [Large-model prices cut again: this time the "butcher" is a quant private fund?] www.cls.cn, 10 May 2024, retrieved 3 February 2025
- Jordan Schneider. Deepseek: The Quiet Giant Leading China's AI Race ChinaTalk, 27 November 2024, retrieved 28 December 2024
- Tripp Mickle, Ana Swanson, Meaghan Tobin, Cade Metz. US Officials Target Nvidia and DeepSeek Amid Fears of China's A.I. Progress The New York Times, 16 April 2025, retrieved 17 April 2025
- Anna Greenspan, Bogna Konior. Machine Decision is Not Final: China and the History and Future of Artificial Intelligence Urbanomic, MIT Press, 2025
- Saritha Rai, Loni Prinsloo, Helen Nyambura. China's DeepSeek Is Beating Out OpenAI and Google in Africa Bloomberg Technology, retrieved 27 October 2025
- Wei An, Xiao Bi, Guanting Chen, Shanhuang Chen, Chengqi Deng, Honghui Ding, Kai Dong, Qiushi Du. SC24: International Conference for High Performance Computing, Networking, Storage and Analysis IEEE, 17 November 2024
- 幻方力量 [High-Flyer Power] High-Flyer, 13 June 2019, retrieved 3 February 2025
- deepseek-ai/3FS DeepSeek, 28 February 2025, retrieved 28 February 2025
- hfreduce High-Flyer, 4 March 2020, retrieved 3 February 2025
- HFAiLab/hai-platform High-Flyer, 2 February 2025, retrieved 3 February 2025
- LICENSE · deepseek-ai/deepseek-coder-33b-base Hugging Face, 28 October 2023, retrieved 12 April 2026
- DeepSeek-LLM/LICENSE-MODEL GitHub, 29 November 2023, retrieved 12 April 2026
- DeepSeek-MoE/LICENSE-MODEL 11 January 2024, retrieved 12 April 2026
- LICENSE · deepseek-ai/deepseek-math-7b-base Hugging Face, 6 February 2024, retrieved 12 April 2026
- LICENSE · deepseek-ai/deepseek-math-7b-instruct Hugging Face, 6 February 2024, retrieved 12 April 2026
- LICENSE · deepseek-ai/deepseek-math-7b-rl Hugging Face, 6 February 2024, retrieved 12 April 2026
- LICENSE · deepseek-ai/DeepSeek-V2.5 5 September 2024, retrieved 12 April 2026
- LICENSE-MODEL · deepseek-ai/DeepSeek-V3-Base 26 December 2024, retrieved 12 April 2026
- DeepSeek-Prover-V2/LICENSE-MODEL 30 April 2025, retrieved 12 April 2026
- deepseek-ai/deepseek-vl2 27 November 2025, retrieved 12 April 2026
- LICENSE · deepseek-ai/DeepSeek-R1-0528 28 May 2025, retrieved 12 April 2026
- LICENSE · deepseek-ai/DeepSeek-R1-Distill-Qwen-32B 20 January 2025, retrieved 12 April 2026
- LICENSE · deepseek-ai/DeepSeek-V3.1-Base 19 August 2025, retrieved 12 April 2026
- LICENSE · deepseek-ai/DeepSeek-V3.1-Terminus 22 September 2025, retrieved 12 April 2026
- LICENSE · deepseek-ai/DeepSeek-Math-V2 27 November 2025, retrieved 12 April 2026
- LICENSE · deepseek-ai/DeepSeek-V3.2 1 December 2025, retrieved 12 April 2026
- DeepSeek-Coder/LICENSE-MODEL at main · deepseek-ai/DeepSeek-Coder GitHub, retrieved 24 January 2025
- Daya Guo, Qihao Zhu, Dejian Yang, Zhenda Xie, Kai Dong, Wentao Zhang, Guanting Chen, Xiao Bi. DeepSeek-Coder: When the Large Language Model Meets Programming – The Rise of Code Intelligence 26 January 2024
- DeepSeek Coder deepseekcoder.github.io, retrieved 27 January 2025
- deepseek-ai/DeepSeek-Coder DeepSeek, 27 January 2025, retrieved 27 January 2025
- deepseek-ai/deepseek-coder-5.7bmqa-base · Hugging Face Hugging Face, retrieved 27 January 2025
- Xiao Bi, Deli Chen, Guanting Chen, Shanhuang Chen, Damai Dai, Chengqi Deng, Honghui Ding. DeepSeek LLM: Scaling Open-Source Language Models with Longtermism 5 January 2024
- deepseek-ai/DeepSeek-LLM DeepSeek, 27 January 2025, retrieved 27 January 2025
- Damai Dai, Chengqi Deng, Chenggang Zhao, R. X. Xu, Huazuo Gao, Deli Chen, Jiashi Li, Wangding Zeng. DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models 11 January 2024
- Peiyi Wang, Lei Li, Zhihong Shao, R. X. Xu, Damai Dai, Yifei Li, Deli Chen, Y. Wu. Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations 19 February 2024
- Aixin Liu, Bei Feng, Bin Wang, Bingxuan Wang, Bo Liu, Chenggang Zhao, Chengqi Deng. DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model 19 June 2024
- Bowen Peng, Jeffrey Quesnelle, Honglu Fan, Enrico Shippole. YaRN: Efficient Context Window Extension of Large Language Models 1 November 2023
- config.json · deepseek-ai/DeepSeek-V2-Lite at main Hugging Face, 15 May 2024, retrieved 28 January 2025
- config.json · deepseek-ai/DeepSeek-V2 at main Hugging Face, 6 May 2024, retrieved 28 January 2025
- Qihao Zhu, Daya Guo, Zhihong Shao, Dejian Yang, Peiyi Wang, Runxin Xu, Y. Wu. DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence 17 June 2024
- Aixin Liu, Bei Feng, Bing Xue, Bingxuan Wang, Bochao Wu, Chengda Lu, Chenggang Zhao. DeepSeek-V3 Technical Report 27 December 2024
- Coco Feng. DeepSeek wows coders with more powerful open-source V3 model South China Morning Post, 25 March 2025, retrieved 6 April 2025
- config.json · deepseek-ai/DeepSeek-V3 at main Hugging Face, 26 December 2024, retrieved 28 January 2025
- Dylan Patel, AJ Kourabi, Dylan O'Laughlin, Doug Knuhtsen. DeepSeek Debates: Chinese Leadership On Cost, True Training Cost, Closed Model Margin Impacts SemiAnalysis, 31 January 2025, retrieved 13 February 2025
- Rob Thubron. DeepSeek's AI costs far exceed $5.5 million claim, may have reached $1.6 billion with 50,000 Nvidia GPUs TechSpot, 3 February 2025, retrieved 13 February 2025
- Kapil Kajal. Research exposes DeepSeek's AI training cost is not $6M, it's a staggering $1.3B Yahoo News, 31 January 2025, retrieved 13 February 2025
- Martin Vechev of INSAIT: "DeepSeek $6M Cost Of Training Is Misleading" TheRecursive.com, 28 January 2025, retrieved 13 February 2025
- Ben Jiang. Chinese start-up DeepSeek's new AI model outperforms Meta, OpenAI products South China Morning Post, 27 December 2024, retrieved 28 December 2024
- Shubham Sharma. DeepSeek-V3, ultra-large open-source AI, outperforms Llama and Qwen on launch VentureBeat, 26 December 2024, retrieved 28 December 2024
- Kyle Wiggers. DeepSeek's new AI model appears to be one of the best 'open' challengers yet TechCrunch, 26 December 2024, retrieved 31 December 2024
- Benj Edwards. Cutting-edge Chinese "reasoning" model rivals OpenAI o1—and it's free to download Ars Technica, 21 January 2025, retrieved 16 February 2025
- Deepseek Log in page DeepSeek, retrieved 30 January 2025
- News DeepSeek API Docs, retrieved 28 January 2025
- Carl Franzen. DeepSeek's first reasoning model R1-Lite-Preview turns heads, beating OpenAI o1 performance VentureBeat, 20 November 2024, retrieved 28 December 2024
- Raffaele Huang. Don't Look Now, but China's AI Is Catching Up Fast The Wall Street Journal, 24 December 2024, retrieved 28 December 2024
- Release DeepSeek-R1 · deepseek-ai/DeepSeek-R1@23807ce GitHub, retrieved 21 January 2025
- Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin Xu, Qihao Zhu. DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning Nature, 22 January 2025
- DeepSeek rushes to launch new AI model as China goes all in Reuters, 25 February 2025, retrieved 25 February 2025
- Luz Ding. DeepSeek Says Upgraded Model Reasons Better, Hallucinates Less Bloomberg, 29 May 2025, retrieved 9 June 2025
- DeepSeek R2 launch stalled as CEO balks at progress, The Information reports Reuters, 26 June 2025, retrieved 5 July 2025
- Eleanor Olcott, Zijing Wu. DeepSeek's next AI model delayed by attempt to use Chinese chips Financial Times, 14 August 2025, retrieved 13 November 2025
- China cautions tech firms over Nvidia H20 AI chip purchases, sources say Reuters, 12 August 2025
- Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Peiyi Wang, Qihao Zhu, Runxin Xu, Ruoyu Zhang. DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning Nature, September 2025
- Kevin Roose. Why DeepSeek Could Change What Silicon Valley Believes About A.I. The New York Times, 28 January 2025, retrieved 28 January 2025
- Beyond the Headlines on DeepSeek's Sputnik Moment: A Conversation with Jimmy Goodrich - IGCC UC Institute on Global Conflict and Cooperation (IGCC), 12 February 2025
- Is 'Sputnik Moment' an appropriate analogy for the launch of DeepSeek? - LCFI LCFI - Leverhulme Centre for the Future of Intelligence, 2 February 2025
- Mary Whitfill Roeloffs. What Is DeepSeek? New Chinese Artificial Intelligence Rivals ChatGPT, OpenAI Forbes, retrieved 5 August 2025
- Aixin Liu, Bei Feng, Bing Xue, Bingxuan Wang, Bochao Wu, Chengda Lu, Chenggang Zhao. DeepSeek-V3 Technical Report 2024
- Andrew R. Chow, Billy Perrigo. Is the DeepSeek Panic Overblown? TIME, 30 January 2025, retrieved 17 March 2025