model for under $50 in cloud computing credits.
The model, known as s1, performs comparably to cutting-edge reasoning models such as OpenAI's o1 and DeepSeek's R1, particularly on math and coding tasks. The model is available on GitHub, along with the data and code used to train it.
The team behind the project said they started with an off-the-shelf base model and fine-tuned it through distillation, a process that extracts the reasoning capabilities of an existing model by training on its answers.
According to the researchers, s1 was distilled from one of Google's reasoning models, Gemini 2.0 Flash Thinking Experimental. Distillation is the same approach Berkeley researchers used to create an AI reasoning model for roughly $450.
s1 points toward the commoditization of AI models: if strong reasoning can be reproduced this cheaply, researchers without large budgets can still build innovative models. As you would expect, big AI companies are not happy about these new capabilities. OpenAI has already accused DeepSeek of using data from its API for the purpose of model distillation, as TechCrunch reported.
In building s1, the researchers set out to find the simplest approach to achieving strong reasoning performance and "test-time scaling," that is, letting an AI model think for longer before it answers a request.
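To make the idea of test-time scaling concrete, here is a minimal illustrative sketch. The `ask_model` function is a hypothetical stand-in for a reasoning model's API, not part of the s1 release; the only knob being varied is how many "thinking" tokens the model may spend before it must commit to an answer.

```python
# Illustrative sketch of test-time scaling: the same question is answered
# under increasing "thinking" budgets. `ask_model` is a hypothetical stand-in
# for a reasoning model's API, not the actual s1 interface.

def ask_model(question: str, max_thinking_tokens: int) -> str:
    """Hypothetical call to a reasoning model. In a real setup this would
    run the model and cap the length of its chain-of-thought before it
    must emit a final answer."""
    return f"<answer produced after up to {max_thinking_tokens} thinking tokens>"

question = "How many positive divisors does 360 have?"

# Granting a larger thinking budget lets the model work through more steps
# before committing to an answer, which is the essence of test-time scaling.
for budget in (256, 1024, 4096):
    print(budget, "->", ask_model(question, max_thinking_tokens=budget))
```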
The results in the s1 paper suggest that reasoning capabilities can be distilled with a relatively small dataset through supervised fine-tuning (SFT), a process in which the AI model is explicitly instructed to imitate certain behaviors in a dataset. SFT tends to be cheaper than the large-scale reinforcement learning used to train some other reasoning models.
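As a rough sketch of what SFT looks like in practice, the snippet below fine-tunes a small causal language model on one question paired with a reasoning trace and answer. The model name and hyperparameters are illustrative assumptions, not the exact recipe from the s1 paper.

```python
# A minimal sketch of supervised fine-tuning (SFT) with a Hugging Face causal
# language model. Model name and hyperparameters are illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-0.5B"  # small stand-in; not the model s1 actually used
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# One training example: a question plus the reasoning trace and answer the
# model is taught to replicate.
prompt = "Question: What is 17 * 24?\n"
target = "Thinking: 17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408.\nAnswer: 408"

inputs = tokenizer(prompt + target, return_tensors="pt")
# Standard causal-LM objective: predict the next token across the sequence.
outputs = model(**inputs, labels=inputs["input_ids"])
outputs.loss.backward()
optimizer.step()
```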
It is also worth noting that s1 is based on a small, free, off-the-shelf AI model owned by Alibaba. To train it, the researchers built a dataset of 1,000 carefully chosen questions, each paired with an answer and the "thinking" process behind it.
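For illustration, a single record in such a dataset might look like the example below. The field names are assumptions made for clarity, not the exact schema of the released s1 data.

```python
# Illustrative sketch of one question/thinking/answer record of the kind
# described above. Field names are assumptions, not the released schema.
import json

example = {
    "question": "A triangle has sides of length 7, 24, and 25. What is its area?",
    "thinking": (
        "7^2 + 24^2 = 49 + 576 = 625 = 25^2, so the triangle is right-angled "
        "with legs 7 and 24. Area = (1/2) * 7 * 24 = 84."
    ),
    "answer": "84",
}

print(json.dumps(example, indent=2))
```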
According to the researchers, training took only about 30 minutes on 16 Nvidia H100 GPUs and still achieved strong results on performance benchmarks. Even so, while distillation has proven to be a cheap way to reproduce reasoning capabilities, it does not produce AI models that are vastly better than the ones already available.