Boosting AI Research with Test-Time Scaling on Steroids

Google’s AI Co-scientist Ranks Its Own Reasoning to Formulate Scientific Hypotheses in Days Rather than Years

Google on Wednesday said it has adapted its Gemini 2.0 large language model to generate novel scientific hypotheses in a fraction of the time taken by teams of human lab researchers.

The company bills the "AI co-scientist" built upon Gemini as "a promising advance toward AI-assisted technologies for scientists to help accelerate discovery," and a program meant to be run with a human "in the loop" to "act as a helpful assistant and collaborator to scientists and to help accelerate the scientific discovery process."

Hypothesis-formulation machine

Google’s design for AI co-scientist has a person enter a research goal at the prompt, whereupon a series of agents work in parallel to review the literature and to formulate and evaluate hypotheses.

The structure of AI co-scientist is designed to perform multiple agent tasks in parallel, backed up by a memory-management function for storing intermediate results.
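To make the idea of parallel agents backed by a shared memory concrete, here is a minimal sketch in Python. It is not Google's implementation: the agent functions, the `AGENTS` table, and the plain dictionary standing in for the memory-management function are all hypothetical, chosen only to illustrate the pattern.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the agents; a real system would call a model here.
def review_literature(goal: str) -> str:
    return f"literature notes on {goal}"

def formulate_hypothesis(goal: str) -> str:
    return f"candidate hypothesis for {goal}"

def evaluate_hypothesis(goal: str) -> str:
    return f"evaluation criteria for {goal}"

AGENTS = {
    "literature": review_literature,
    "formulate": formulate_hypothesis,
    "evaluate": evaluate_hypothesis,
}

def run_agents(goal: str) -> dict[str, str]:
    """Run every agent on the same goal in parallel and collect the results."""
    memory: dict[str, str] = {}  # assumed stand-in for the memory function
    with ThreadPoolExecutor(max_workers=len(AGENTS)) as pool:
        # Submit all agents first so they run concurrently.
        futures = {name: pool.submit(agent, goal) for name, agent in AGENTS.items()}
        # Store each intermediate result under the agent's name.
        for name, future in futures.items():
            memory[name] = future.result()
    return memory
```

Calling `run_agents("an example research goal")` returns a dictionary with one intermediate result per agent, which later steps could read from instead of recomputing.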

Test-time scaling on steroids

The adaptation of Gemini 2.0 emphasizes the use of "test-time scaling," where AI agents use increasing amounts of computing power to iteratively review and re-formulate their output. Test-time scaling has been seen most dramatically not only in Gemini but also in OpenAI’s o1 model and in models from DeepSeek AI, all examples of so-called reasoning models that spend much more time responding to a prompt, generating intermediate results along the way.
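The loop of reviewing and re-formulating output under a growing compute budget can be sketched as a toy Python example. This is an illustration under strong assumptions, not how Gemini or any reasoning model works internally: a draft is reduced to a single numeric quality score, and `generate`, `refine`, and `best_with_budget` are hypothetical names.

```python
import random

def generate(rng: random.Random) -> float:
    """Hypothetical first draft, represented only by a quality score in [0, 1]."""
    return rng.random()

def refine(draft: float, rng: random.Random) -> float:
    """Hypothetical re-formulation: nudges the score up or down, clipped to [0, 1]."""
    return min(1.0, max(0.0, draft + rng.uniform(-0.1, 0.1)))

def best_with_budget(rounds: int, seed: int = 0) -> float:
    """Spend `rounds` review steps, keeping a revision only when it scores higher."""
    rng = random.Random(seed)
    best = generate(rng)
    for _ in range(rounds):
        candidate = refine(best, rng)
        if candidate > best:  # the "review" step: discard revisions that are worse
            best = candidate
    return best
```

With a fixed seed, a larger `rounds` value can never return a lower score than a smaller one, because the extra rounds only add chances to improve on the best draft so far. That mirrors the basic bet behind test-time scaling: more compute at response time buys more opportunities to review and revise.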

Surpasses models and unassisted human experts

According to fifteen human experts who reviewed the co-scientist’s output, the program gets better as it spends more computing time formulating hypotheses and evaluating them.

Conclusion

Google bills its AI co-scientist as "a promising advance toward AI-assisted technologies for scientists to help accelerate discovery." By combining Gemini 2.0 with test-time scaling, the company says, the AI co-scientist can generate novel scientific hypotheses in a fraction of the time taken by teams of human lab researchers.

FAQs

Q: What is Google’s AI Co-scientist?
A: Google’s AI Co-scientist is a program, built on Gemini 2.0, that uses a combination of agents to review the literature, formulate and evaluate hypotheses, and generate research proposals.

Q: How does the AI Co-scientist work?
A: A person inputs a research goal, and a series of agents then work in parallel to review the literature and to formulate and evaluate hypotheses.

Q: What is test-time scaling?
A: Test-time scaling is a technique used in AI models that allows them to use increasing amounts of computing power to iteratively review and re-formulate their output.

Q: How does the AI Co-scientist surpass models and unassisted human experts?
A: The human experts who reviewed the AI Co-scientist’s output rated it higher for novelty and impact, and preferred it over the output of other models and of unassisted human experts.
