Microsoft and OpenAI Investigate Potential Breach of AI System by Chinese Start-up DeepSeek
According to Bloomberg, Microsoft and OpenAI are investigating a potential breach of OpenAI’s system by a group allegedly linked to Chinese AI start-up DeepSeek. The investigation stems from suspicious data extraction activity detected in late 2024 via OpenAI’s application programming interface (API), sparking broader concerns over international AI competition.
Large-scale Data Extraction
Microsoft, OpenAI’s largest financial backer, first identified the large-scale data extraction and informed the ChatGPT maker of the incident. Sources believe the activity may have violated OpenAI’s terms of service or that the group may have exploited loopholes to bypass restrictions limiting how much data they could collect.
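API providers typically enforce the kind of restrictions mentioned above with rate limits. A simplified token-bucket sketch illustrates how a cap on request volume limits bulk data collection; this is a generic illustration of the mechanism, not OpenAI's actual implementation, and all names and numbers are hypothetical:

```python
import time

class TokenBucket:
    """Simplified rate limiter of the kind APIs use to cap request volume."""

    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity          # maximum burst size
        self.tokens = float(capacity)     # start with a full bucket
        self.refill_per_sec = refill_per_sec
        self.last = time.monotonic()

    def allow(self) -> bool:
        # Refill tokens in proportion to elapsed time, up to capacity.
        now = time.monotonic()
        self.tokens = min(
            self.capacity,
            self.tokens + (now - self.last) * self.refill_per_sec,
        )
        self.last = now
        # Spend one token per request; refuse when the bucket is empty.
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=5, refill_per_sec=1.0)
results = [bucket.allow() for _ in range(10)]  # burst of 10 rapid requests
# The first 5 requests pass; the rest are throttled until tokens refill.
```

Under a limiter like this, collecting data at scale requires either waiting for tokens to refill or finding a way around the cap, which is the kind of loophole the sources describe.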
DeepSeek’s Rise to Prominence
DeepSeek has quickly risen to prominence in the competitive AI landscape, particularly with the release of its latest model, R1, on January 20. Billed as rivaling OpenAI’s ChatGPT in performance while developed at a significantly lower cost, R1 has shaken up the tech industry. Its release triggered a sharp decline in tech and AI stocks that wiped billions from US markets in a single week.
Model Distillation
David Sacks, the White House’s newly appointed "crypto and AI czar," alleged that DeepSeek may have employed questionable methods to achieve its AI’s capabilities. In an interview with Fox News, Sacks noted evidence suggesting that DeepSeek had used "distillation" to train its AI models using outputs from OpenAI’s systems.
"There’s substantial evidence that what DeepSeek did here is they distilled knowledge out of OpenAI’s models, and I don’t think OpenAI is very happy about this," Sacks told the network.
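Distillation itself is a standard machine-learning technique: a smaller "student" model is trained to imitate the outputs of a larger "teacher" model rather than learning from original labeled data. A minimal, self-contained sketch of the idea follows, using toy linear models in place of real language models; every name and number here is illustrative and not drawn from the reporting:

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax; higher T yields softer distributions."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)

# Hypothetical "teacher": a fixed linear model whose outputs stand in
# for responses queried from a larger system.
W_teacher = rng.normal(size=(4, 3))
X = rng.normal(size=(200, 4))                  # inputs the student queries with
soft_targets = softmax(X @ W_teacher, T=2.0)   # teacher's soft output distributions

# "Student": trained only on the teacher's outputs, never on original labels.
W_student = np.zeros((4, 3))
lr = 0.5
for _ in range(500):
    probs = softmax(X @ W_student, T=2.0)
    grad = X.T @ (probs - soft_targets) / len(X)  # cross-entropy gradient
    W_student -= lr * grad

# After training, the student's predictions closely track the teacher's.
agreement = np.mean(
    softmax(X @ W_student).argmax(axis=1) == soft_targets.argmax(axis=1)
)
```

The point of the sketch is that the student never sees the teacher's weights or training data, only its outputs, which is why distilling from another provider's API responses can reproduce much of a model's behavior at far lower cost.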
Geopolitical and Security Concerns
The growing competition between the US and China in the AI sector has underscored wider concerns regarding technological ownership, ethical governance, and national security. The US Navy has banned its personnel from using DeepSeek’s products, citing fears that the Chinese government could exploit the platform to access sensitive information.
Conclusion
The case highlights the risks posed by model distillation and the need for stricter measures to protect intellectual property and prevent data breaches. As AI systems advance and become increasingly integral to global economic and strategic planning, disputes over data usage and intellectual property are likely to intensify. The investigation into the alleged breach by DeepSeek could set a precedent for how AI developers police model usage and enforce their terms of service.
FAQs
- What is model distillation?
Model distillation is a technique in which one AI system is trained on outputs generated by another, potentially allowing a competitor to replicate similar capabilities at a fraction of the cost.
- What is the concern about DeepSeek’s data collection practices?
Critics have highlighted DeepSeek’s privacy policy, which permits the collection of data such as IP addresses, device information, and even keystroke patterns, a scope of data collection some experts consider excessive.
- Why has the US Navy banned its personnel from using DeepSeek’s products?
The US Navy has banned its personnel from using DeepSeek’s products, citing "potential security and ethical concerns associated with the model’s origin and usage."
- What are the implications for the AI industry?
Growing US–China competition in AI underscores wider concerns about technological ownership, ethical governance, and national security, and highlights the need for stricter measures to protect intellectual property and prevent data breaches.

