So-Called Reasoning AI Models Become Easier and Cheaper to Develop
On Friday, NovaSky, a team of researchers based out of UC Berkeley’s Sky Computing Lab, released Sky-T1-32B-Preview, a reasoning model that’s competitive with an earlier version of OpenAI’s o1 on a number of key benchmarks.
A New Era of Affordable Reasoning Models
The NovaSky team says Sky-T1 was trained for less than $450. That might not sound that affordable, but it wasn’t long ago that the price tag for training a model with comparable performance often ranged in the millions of dollars. Synthetic training data, or training data generated by other models, has helped drive costs down.
How Synthetic Training Data Is Revolutionizing AI Development
Palmyra X 004, a model recently released by AI company Writer and trained almost entirely on synthetic data, reportedly cost just $700,000 to develop. This reduction in cost is making it possible for more researchers and developers to build high-level reasoning capabilities affordably and efficiently.
What Sets Reasoning Models Apart
Unlike most AI, reasoning models effectively fact-check themselves, which helps them avoid some of the pitfalls that normally trip up models. Reasoning models take a little longer (usually seconds to minutes longer) to arrive at solutions than a typical non-reasoning model. The upside is that they tend to be more reliable in domains such as physics, science, and mathematics.
The NovaSky Team’s Approach
The NovaSky team says it used another reasoning model, Alibaba’s QwQ-32B-Preview, to generate the initial training data for Sky-T1, then “curated” the data mixture and leveraged OpenAI’s GPT-4o-mini to refactor the data into a more workable format. Training the 32-billion-parameter Sky-T1 took about 19 hours using a rack of 8 Nvidia H100 GPUs.
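To put the training figures in perspective, the short Python sketch below converts the reported run into GPU-hours and an implied hourly rate. It uses only the numbers in this article (about 19 hours, 8 GPUs, and the $450 figure cited earlier). It assumes the $450 covers nothing but GPU rental for this single run, which the article does not confirm, and the function names are illustrative only.

```python
# Figures as reported in the article (all approximate).
TRAINING_HOURS = 19       # "about 19 hours"
NUM_GPUS = 8              # one rack of Nvidia H100 GPUs
REPORTED_COST_USD = 450   # assumption: covers only GPU rental for this run


def gpu_hours(hours: float, gpus: int) -> float:
    """Total GPU-hours consumed by a run on `gpus` devices for `hours` hours."""
    return hours * gpus


def implied_hourly_rate(total_cost: float, hours: float, gpus: int) -> float:
    """Cost per GPU-hour implied by a total cost and a run size."""
    return total_cost / gpu_hours(hours, gpus)


if __name__ == "__main__":
    total = gpu_hours(TRAINING_HOURS, NUM_GPUS)
    rate = implied_hourly_rate(REPORTED_COST_USD, TRAINING_HOURS, NUM_GPUS)
    print(f"{total} GPU-hours, about ${rate:.2f} per GPU-hour")
```

Under these assumptions the run comes to 152 GPU-hours, or roughly $2.96 per GPU-hour. That is an implied rate derived from the article's numbers, not a quoted price.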
Performance and Comparison
According to the NovaSky team, Sky-T1 performs better than an early preview version of o1 on MATH500, a collection of “competition-level” math challenges. The model also beats the preview of o1 on a set of difficult problems from LiveCodeBench, a coding evaluation. However, Sky-T1 falls short of the o1 preview on GPQA-Diamond, which contains physics, biology, and chemistry-related questions a PhD graduate would be expected to know.
Future Plans
The NovaSky team says that Sky-T1 only marks the start of their journey to develop open source models with advanced reasoning capabilities. “Moving forward, we will focus on developing more efficient models that maintain strong reasoning performance and exploring advanced techniques that further enhance the models’ efficiency and accuracy at test time,” the team wrote in the post.
Conclusion
The release of Sky-T1-32B-Preview marks a significant milestone in the development of reasoning AI models. With its competitive performance and affordable price tag, it has the potential to democratize access to advanced reasoning capabilities. As the NovaSky team continues to improve and refine their models, we can expect to see even more exciting developments in the field of AI.
FAQs
Q: What is a reasoning AI model?
A: A reasoning AI model is a type of artificial intelligence that can effectively fact-check itself and arrive at solutions through logical reasoning.
Q: How does synthetic training data help reduce costs?
A: Synthetic training data is generated by other models, which reduces the need for expensive human-generated data. This helps drive costs down and makes it possible for more researchers and developers to create high-level reasoning capabilities affordably and efficiently.
Q: What are the benefits of reasoning AI models?
A: Reasoning AI models tend to be more reliable in domains such as physics, science, and mathematics, and can avoid some of the pitfalls that normally trip up models. The trade-off is that they take a little longer, usually seconds to minutes, to arrive at solutions.

