
DeepSeek Unveils New AI Reasoning Method Ahead of Anticipated Model Launch

  • Writer: tech360.tv
  • 14 hours ago
  • 2 min read

Chinese AI start-up DeepSeek has introduced a new technique to enhance the reasoning capabilities of large language models, as anticipation builds for the release of its next-generation model.


[Image: glowing blue DeepSeek whale logo and wordmark projected above a lit keyboard. Credit: Andrey Rudakov/Bloomberg]

In collaboration with researchers from Tsinghua University, DeepSeek developed a dual-method approach that combines generative reward modelling (GRM) and self-principled critique tuning. The method aims to guide AI models to deliver faster and more accurate responses aligned with human preferences.


The resulting DeepSeek-GRM models achieved competitive performance compared with existing public reward models, according to a paper published Friday on arXiv, an online scientific paper repository.


Reward modelling is a process used to align AI outputs with human values and expectations. DeepSeek’s new approach integrates this with a self-assessment mechanism to improve model reasoning.
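
In practical terms, a generative reward model does not simply emit a score: it first writes out the principles a good answer should satisfy and a critique of the candidate answer, then derives the score from that written judgement, which is roughly what the article calls a "self-assessment mechanism". The snippet below is a minimal, illustrative sketch of that idea in Python, not DeepSeek's implementation; the llm() function is a hypothetical placeholder with a canned response so the example runs on its own.

```python
import re

def llm(prompt: str) -> str:
    """Hypothetical stand-in for a text-generation call (not DeepSeek's API).

    Returns a canned judgement so the sketch runs on its own; in practice this
    would call an actual language model.
    """
    return ("Principles: accuracy, helpfulness, clarity.\n"
            "Critique: the answer is correct but gives no working.\n"
            "Score: 7/10")

def generative_reward(question: str, answer: str) -> int:
    """Score an answer by first generating principles and a critique, then a number.

    This mirrors the generative reward modelling idea: the judgement is written
    out as text, and the numeric reward is extracted from it afterwards.
    """
    prompt = (
        f"Question: {question}\n"
        f"Answer: {answer}\n"
        "List the principles a good answer should follow, critique the answer "
        "against them, then end with a line 'Score: N/10'."
    )
    judgement = llm(prompt)
    match = re.search(r"Score:\s*(\d+)\s*/\s*10", judgement)
    return int(match.group(1)) if match else 0

if __name__ == "__main__":
    print(generative_reward("What is 2 + 2?", "4"))  # prints 7 with the canned output
```
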


The company plans to open source the GRM models, though no timeline has been provided.


The announcement comes amid growing speculation about the release of DeepSeek-R2, the successor to its R1 reasoning model. Reuters reported last month that the new model could launch as early as this month.


DeepSeek has not confirmed the release date. According to Chinese media, a DeepSeek customer service account denied the rumour in a group chat with business clients.


Founded in 2023 by entrepreneur Liang Wenfeng, DeepSeek has gained global attention for its cost-efficient models. Its R1 model was noted for rivalling leading AI systems in performance.


In March, the company released DeepSeek-V3-0324, an upgraded version of its V3 model with improved reasoning, front-end web development capabilities, and enhanced Chinese writing proficiency.


In February, DeepSeek open-sourced five code repositories and pledged transparency in its development process. Liang also published a technical study on “native sparse attention,” a method to improve data processing efficiency in large language models.


Liang, who also founded DeepSeek’s parent company High-Flyer Quant, participated in a February symposium hosted by Chinese President Xi Jinping. The event highlighted DeepSeek as a symbol of China’s technological resilience amid US efforts to limit its AI development.

 
  • DeepSeek introduced a new AI reasoning method combining GRM and critique tuning

  • The DeepSeek-GRM models performed competitively with existing public reward models

  • The company plans to open source the models but gave no timeline


Source: SCMP

