US Researchers Launch A DeepSeek Competitor
A small team of researchers at Stanford University and the University of Washington has created an advanced AI reasoning model, named s1, for a remarkably low cost of under $50.
This is significant in an industry where developing similar models typically costs many millions of dollars in resources and infrastructure, and it comes at a time of growing competition in the AI reasoning field.
For comparison, Chinese startup DeepSeek recently made a big impact with its own reasoning model, R1, which the company claims was developed for just $6 million.
The s1 model can complete complex reasoning tasks and has performed comparably to OpenAI’s o1 and DeepSeek’s R1 in maths and coding. However, critics have questioned the accuracy of DeepSeek’s claims and have expressed concerns about the safety and security of its models.
Low Cost Of s1’s Development
The low cost was achieved through a process known as distillation, which involves training s1 to mimic the reasoning abilities of an existing AI model, specifically Google’s Gemini 2.0 Flash Thinking experimental model. By using a curated dataset of 1,000 questions and answers, paired with reasoning traces from the Gemini model, s1 learned how to arrive at accurate solutions at a fraction of the time and cost of traditional methods.
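As a rough illustration of what such a dataset might look like, the sketch below packs a question, a teacher model's reasoning trace and the final answer into a single training string. The class name, field names and the "Question / Reasoning / Answer" layout are all hypothetical; the article does not describe the format the researchers actually used.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DistillExample:
    """One curated item: field names are illustrative, not from the s1 project."""
    question: str
    reasoning_trace: str  # step-by-step reasoning produced by the teacher model
    answer: str


def to_training_text(ex: DistillExample) -> str:
    # Hypothetical layout: the student model is trained to reproduce
    # the teacher's reasoning before giving the answer.
    return (
        f"Question: {ex.question}\n"
        f"Reasoning: {ex.reasoning_trace}\n"
        f"Answer: {ex.answer}"
    )


def build_sft_corpus(examples: List[DistillExample], limit: int = 1000) -> List[str]:
    # Keep at most `limit` items, mirroring the 1,000-item dataset in the article.
    return [to_training_text(ex) for ex in examples[:limit]]


sample = DistillExample(
    question="What is 12 * 12?",
    reasoning_trace="12 * 12 = 12 * 10 + 12 * 2 = 120 + 24 = 144.",
    answer="144",
)
corpus = build_sft_corpus([sample])
print(corpus[0])
```

A list of strings like this is the kind of input a fine-tuning job would consume; the important point is that each item carries the teacher's reasoning as well as the answer.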
According to the researchers, training s1 took just 26 minutes using 16 Nvidia H100 GPUs, at a total cost of $20.
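To put those figures in perspective, the short calculation below converts them into GPU-hours and an implied hourly price per GPU. It uses only the numbers quoted above; the function names are illustrative, and the resulting rate is simple arithmetic rather than a price reported by the researchers.

```python
def gpu_hours(minutes: float, num_gpus: int) -> float:
    # Total compute: wall-clock hours multiplied by the number of GPUs.
    return minutes / 60.0 * num_gpus


def implied_hourly_rate(total_cost: float, hours: float) -> float:
    # Cost per GPU-hour implied by the total bill.
    return total_cost / hours


hours = gpu_hours(26, 16)
rate = implied_hourly_rate(20.0, hours)
print(f"{hours:.2f} GPU-hours, about ${rate:.2f} per GPU-hour")
# prints "6.93 GPU-hours, about $2.88 per GPU-hour"
```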
The researchers used supervised fine-tuning (SFT), a method in which the model is trained on explicit examples of the desired behaviour, to accelerate the learning process. One particularly interesting development during s1’s creation was the introduction of a “wait” instruction, which helped improve its accuracy. By incorporating pauses into the model’s reasoning process, the researchers found that s1 was able to double-check its responses, often correcting errors and reaching more accurate conclusions.
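One way to picture the “wait” instruction is as a loop that, each time the model would stop reasoning, appends the word "Wait," and asks it to continue. The sketch below is a simplified illustration under that assumption: `reason_with_wait` and `toy_generate` are hypothetical names, and the toy function merely stands in for a real language model.

```python
from typing import Callable


def reason_with_wait(
    prompt: str,
    generate: Callable[[str], str],
    extra_rounds: int = 1,
) -> str:
    """Extend the model's reasoning by appending "Wait," before it stops.

    `generate` is assumed to return the model's continuation of the text
    up to the point where it would normally finish reasoning.
    """
    transcript = prompt + generate(prompt)
    for _ in range(extra_rounds):
        transcript += "\nWait,"               # nudge the model to re-check its work
        transcript += generate(transcript)    # let it continue from the nudge
    return transcript


def toy_generate(text: str) -> str:
    # Stand-in for a real model: wrong at first, corrected after "Wait,".
    if text.endswith("Wait,"):
        return " 2 + 2 is 4, not 5."
    return " 2 + 2 = 5."


result = reason_with_wait("Q: What is 2 + 2?", toy_generate)
print(result)
```

With a real model the second pass is not guaranteed to fix an error, but it does give the model an extra opportunity to review its earlier steps.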
The researchers behind s1 hope their work will drive open innovation, making powerful reasoning models more accessible to the global community and accelerating advancements in AI technology for the benefit of society.
However, a higher level of investment may still be necessary to push the envelope of AI innovation.
The short-cut methods used by s1 and R1 (sometimes referred to as distillation) are demonstrably a good way of cheaply re-creating an AI model’s capabilities, but they do not create new AI models that are vastly better than what is already available.