By Raffaele Huang and Tracy Qu
Dec. 24, 2024 The Wall Street Journal
SINGAPORE—Chinese startups show signs of catching up with America’s leading artificial-intelligence models more quickly than many in the industry had expected, despite the restrictions China faces in buying advanced chips.
DeepSeek, a startup funded by one of China’s most successful hedge-fund managers, released a preview version of its latest large language model in November. It said the program’s abilities compared favorably with OpenAI’s reasoning model called o1, which came out in preview form in September.
Other Chinese companies have made similar claims in recent weeks. Moonshot AI, a startup backed by Chinese internet giants Alibaba and Tencent, said it developed a model specializing in math with capabilities close to o1, while Alibaba said one of its own experimental research models outperformed the preview version of the U.S. model on math.
The companies haven’t published papers describing their models, and evaluating the claims is difficult because there isn’t a single agreed-upon test of an AI model’s abilities. Still, some U.S. specialists said they were impressed.
China is “catching up faster,” said Andrew Carr, a former fellow at OpenAI and currently an AI entrepreneur. He said DeepSeek researchers trying to replicate OpenAI’s reasoning model “figured it out within a few months, and frankly many of my colleagues are surprised by that.”
One test used for comparison is the American Invitational Mathematics Examination, which is designed to challenge the brightest high-school math students.
DeepSeek said its model bested OpenAI’s on the AIME. An experiment by The Wall Street Journal using 15 problems from this year’s AIME found that OpenAI’s o1 preview model got to the answers faster than DeepSeek, Moonshot and the experimental Alibaba model. In one word puzzle involving strategy in a hypothetical two-player game, the OpenAI program gave the answer in 10 seconds while DeepSeek took more than two minutes.
Getting the correct answer on the first try is still a feat because word problems often stump AI programs.
Chinese AI developers have faced U.S. restrictions on access to the world’s most advanced AI chips, including those from chip leader Nvidia, since 2022. The Biden administration in December again tightened export control rules.
But the developers have found workarounds.
At Moonshot, the startup backed by Alibaba and Tencent, founder Yang Zhilin has said the company is focusing on reinforcement learning, which mimics humans’ trial and error. The approach might improve performance while using less computing power.
Since late last year, AI developers have increasingly been using a technique called “mixture of experts,” or MoE, in which an initial routing mechanism directs each problem to a specialized expert model, much as a head chef sends a spaghetti order to the kitchen’s Italian cook. Because only the chosen experts run for a given input, the process also eases the demands on chips.
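For readers who want to see the idea in miniature, the routing step can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the design of any company's model: the class name ToyMoE, the random linear "experts," the dot-product router and the choice of two experts out of eight are all hypothetical.

```python
import math
import random


def softmax(scores):
    """Convert raw router scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(seed, dim):
    """Build a toy expert: a fixed random linear map (hypothetical stand-in)."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]

    def expert(x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

    return expert


class ToyMoE:
    """Illustrative mixture-of-experts layer with top-k routing (assumed design)."""

    def __init__(self, num_experts=8, dim=4, top_k=2, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        self.experts = [make_expert(seed + 1 + i, dim) for i in range(num_experts)]
        # One gating vector per expert; the router scores experts by dot product.
        self.gate = [[rng.uniform(-1, 1) for _ in range(dim)]
                     for _ in range(num_experts)]
        self.calls = [0] * num_experts  # how often each expert actually ran

    def route(self, x):
        """Return (expert index, weight) pairs for the top-k experts."""
        scores = [sum(g * xi for g, xi in zip(row, x)) for row in self.gate]
        probs = softmax(scores)
        ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        chosen = ranked[: self.top_k]
        norm = sum(probs[i] for i in chosen)
        return [(i, probs[i] / norm) for i in chosen]

    def forward(self, x):
        """Run only the chosen experts and blend their outputs by weight."""
        out = [0.0] * len(x)
        for idx, weight in self.route(x):
            self.calls[idx] += 1
            y = self.experts[idx](x)
            out = [o + weight * yi for o, yi in zip(out, y)]
        return out


moe = ToyMoE(num_experts=8, dim=4, top_k=2)
output = moe.forward([0.5, -1.0, 0.25, 2.0])
print(f"experts run: {sum(moe.calls)} of {len(moe.experts)}")
```

In this sketch the router scores all eight experts, but only two do any real work for a given input. That selective computation is the source of the savings described above, though production systems differ in many details.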
Tencent said its MoE model, released in November, delivered performance comparable to a Llama 3.1 model introduced in July by Facebook owner Meta Platforms. Researchers who reviewed papers published by the two companies said Tencent’s model was likely trained with around a 10th of the computing power Meta used.
DeepSeek started as the AI research unit of High-Flyer, a quantitative hedge-fund manager with $8 billion in assets that is known for leveraging AI to trade. In 2021, DeepSeek connected around 10,000 of Nvidia’s A100 chips to form a cluster for AI training, which it called Fire-Flyer 2.
In a paper published this August, DeepSeek said Fire-Flyer 2 achieved performance close to an Nvidia system containing similar chips, but the Chinese system cost less and consumed less energy. DeepSeek’s May paper on its MoE model, which incorporated a technique to process data more efficiently, was widely noted in the industry.
“One way China will get around export controls—building extremely good software and hardware training stacks using the hardware it can access,” Jack Clark, co-founder of AI startup Anthropic, wrote in his blog, referring to DeepSeek’s cluster. “Made in China will be a thing for AI models, same as electric cars, drones, and other technologies,” he wrote.
Many Chinese AI developers have found ways to access restricted Nvidia chips, including through trades with middlemen and overseas data centers.
Nonetheless, the lack of cutting-edge chips is painful to the Chinese startups, according to Chinese executives, and the gap is poised to widen. Nvidia customers are preparing to deploy its latest AI data-center chip, called Blackwell, at significant scale.
Elon Musk’s xAI has constructed a data center with 100,000 Nvidia chips and recently raised $5 billion to do more. Amazon Web Services plans to build a massive AI supercomputer with hundreds of thousands of its homegrown chips.
DeepSeek, which focuses on open-source models, emphasizes math and coding. Moonshot has gained popularity among Chinese consumers with its ChatGPT-like chatbot Kimi and is known for its ability to handle long-form text.
Chinese AI startups are currently valued at a fraction of what U.S. companies such as OpenAI are worth, because financiers are unsure about their ability to monetize their advances. OpenAI was recently valued at $157 billion. Fierce competition has led to a price war among AI model vendors.
Beijing-based Zhipu AI, which was valued at around $3 billion in its latest fundraising round this month, has pushed back its plan to go public as soon as the second half of 2025 after investment bankers told the company it was unlikely to get the valuation it wanted, people familiar with the matter said. Zhipu showcased its AI agent in late November and released a video-generating model similar to OpenAI’s Sora in July.
Howard Huang, a former AI-infrastructure executive at a Beijing-based AI-model company, compared the Chinese industry to people trying to dance while wearing shackles. “Focusing on what we have been good at is the only opportunity to survive, and probably to win,” he said.
Write to Raffaele Huang at raffaele.huang@wsj.com and Tracy Qu at tracy.qu@wsj.com