Revelations that OpenAI Secretly Funded and Had Access to FrontierMath Benchmarking Dataset Raise Concerns
OpenAI’s Secret Involvement Raises Questions about the Validity of o3 Model’s High Scores
Revelations that OpenAI secretly funded and had access to the FrontierMath benchmarking dataset are raising concerns about whether the dataset was used to train the company’s o3 reasoning model, and about the validity of that model’s high scores.
Access to the Benchmarking Dataset
In addition to accessing the benchmarking dataset, OpenAI funded its creation, a fact that was withheld from the mathematicians who contributed to developing FrontierMath. Epoch AI disclosed OpenAI’s funding only in the final version of the paper announcing the benchmark, published on arXiv.org. Earlier versions of the paper omitted any mention of OpenAI’s involvement.
Screenshot of FrontierMath Paper
[Image: Screenshot of FrontierMath paper]
Closeup of Acknowledgement
[Image: Closeup of acknowledgement]
Previous Version of Paper that Lacked Acknowledgement
[Image: Previous version of paper that lacked acknowledgement]
OpenAI o3 Model Scored Highly on FrontierMath Benchmark
The news of OpenAI’s secret involvement is raising questions about the high scores achieved by the o3 reasoning AI model and causing disappointment with the FrontierMath project. Epoch AI responded with transparency about what happened and what they’re doing to check if the o3 model was trained with the FrontierMath dataset.
Giving OpenAI Access to the Dataset was Unintended
Giving OpenAI access to the dataset was unexpected because the whole point of the benchmark is to test AI models, and that can’t be done reliably if a model has seen the questions and answers beforehand.
Reddit Discussion
A post on the r/singularity subreddit expressed disappointment and cited a document that claimed that the mathematicians didn’t know about OpenAI’s involvement:
"Frontier Math, the recent cutting-edge math benchmark, is funded by OpenAI. OpenAI allegedly has access to the problems and solutions. This is disappointing because the benchmark was sold to the public as a means to evaluate frontier models, with support from renowned mathematicians. In reality, Epoch AI is building datasets for OpenAI. They never disclosed any ties with OpenAI before."
Epoch AI’s Response
Tamay Besiroglu, associate director at Epoch AI, acknowledged that OpenAI had access to the datasets but also asserted that there was a "holdout" dataset that OpenAI didn’t have access to.
More Facts About OpenAI & FrontierMath Revealed
Elliot Glazer, lead mathematician at Epoch AI, confirmed that OpenAI has the dataset and was allowed to use it to evaluate o3, the company’s next state-of-the-art large language model, which is described as a reasoning AI model.
Conclusion
The question remains open until Epoch AI completes its evaluation, which should indicate whether OpenAI trained its reasoning model on the dataset or used it only for benchmarking.
FAQs
Q: What is the FrontierMath benchmarking dataset?
A: FrontierMath is a mathematics benchmark from Epoch AI, developed with contributions from mathematicians, for evaluating AI models.
Q: Who funded the creation of the FrontierMath dataset?
A: OpenAI funded the creation of the FrontierMath dataset.
Q: Did OpenAI have access to the FrontierMath dataset?
A: Yes, OpenAI had access to the FrontierMath dataset.
Q: Was OpenAI’s involvement disclosed to the mathematicians who contributed to the development of FrontierMath?
A: No, OpenAI’s involvement was not disclosed to the mathematicians who contributed to the development of FrontierMath.
Q: What is the holdout dataset?
A: The holdout dataset is a separate set of problems that, according to Epoch AI, OpenAI cannot access. It is used to verify the performance of AI models.

