DeepSeek: The Chinese Start-Up Revolutionising AI Model Training
Chinese start-up DeepSeek has taken the tech world by storm, emerging as a significant player in the open-source large language model (LLM) sector in 2025. The company recently made global headlines with the launch of DeepSeek V3, its latest model. Jim Fan, a senior research scientist at Nvidia, hailed DeepSeek as "the biggest dark horse" in this arena, praising the company's innovative approach.
Fan highlighted DeepSeek's achievement in developing the 671-billion-parameter DeepSeek V3 model in just two months at a cost of US$5.58 million. This cost-efficient training run used significantly fewer computing resources than those of industry giants such as Meta Platforms and OpenAI.
The development of DeepSeek V3 shows the company can build a powerful LLM at a fraction of the cost typically invested by larger tech firms. Despite US sanctions limiting its access to advanced semiconductors, DeepSeek used a new architecture to achieve cost-effective training on Nvidia's China-tailored H800 GPUs.
DeepSeek's training run for V3 required only 2.78 million GPU hours, in stark contrast to the 30.8 million GPU hours Meta needed for its Llama 3.1 model. Computer scientist Andrej Karpathy commended DeepSeek's efficiency, noting the model's strength despite a significantly shorter training run.
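For scale, a rough back-of-envelope calculation using the figures above puts the gap at roughly 11 times fewer GPU hours than Llama 3.1. This is a sketch only; the implied per-GPU-hour rate is derived from the quoted cost and hours, not a figure stated in the article.

```python
# Back-of-envelope check of the figures cited above (illustrative only;
# the per-GPU-hour rate is derived arithmetic, not quoted in the source).

DEEPSEEK_V3_GPU_HOURS = 2.78e6   # H800 GPU hours reported for DeepSeek V3
LLAMA_31_GPU_HOURS = 30.8e6      # GPU hours reported for Meta's Llama 3.1
DEEPSEEK_V3_COST_USD = 5.58e6    # reported training cost for DeepSeek V3

# How many times fewer GPU hours DeepSeek V3 used than Llama 3.1
efficiency_ratio = LLAMA_31_GPU_HOURS / DEEPSEEK_V3_GPU_HOURS

# Implied cost per GPU hour, assuming the quoted cost covers only GPU time
implied_cost_per_gpu_hour = DEEPSEEK_V3_COST_USD / DEEPSEEK_V3_GPU_HOURS

print(f"GPU-hour ratio (Llama 3.1 / DeepSeek V3): {efficiency_ratio:.1f}x")   # ~11.1x
print(f"Implied cost per GPU hour: ${implied_cost_per_gpu_hour:.2f}")         # ~$2.01
```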
The tech community has been closely following DeepSeek's progress, with industry experts recognising the company's impact on the AI landscape. Fan and Jia Yangqing, founder of cloud computing start-up Lepton AI, praised DeepSeek's approach, highlighting the pressure it puts on other companies to innovate and move swiftly in the competitive AI market.
DeepSeek's unique recruitment strategy, focusing on ability rather than experience, has attracted fresh graduates and early-career professionals to join its ranks. The company's commitment to smart research and resourceful development has positioned it as a key player in the AI industry.
Founded in 2023 by Liang Wenfeng, DeepSeek was spun off from hedge fund High-Flyer Quant. Liang's background in AI and High-Flyer's strategic GPU investments laid a solid foundation for DeepSeek's success in the LLM domain. The company continues to receive funding from its parent, enabling further growth and innovation.
DeepSeek emerges as a major player in the open-source LLM sector in 2025.
The company develops the powerful DeepSeek V3 model with 671 billion parameters at a fraction of the cost of larger tech firms.
DeepSeek's cost-effective training process utilises fewer computing resources, showcasing innovation in AI model development.
Source: SCMP