
OpenAI Says DeepSeek May Have Improperly Harvested Its Data

OpenAI Reviews Evidence of Data Harvesting by Chinese Start-up DeepSeek

OpenAI’s Terms of Service Allegedly Breached

OpenAI, a San Francisco-based start-up valued at $157 billion, is reviewing evidence that DeepSeek, a Chinese company, may have harvested large amounts of data from its AI technologies to teach similar skills to its own systems. This process, known as distillation, is common in the AI field, but OpenAI’s terms of service prohibit anyone from using data generated by its systems to build technologies that compete in the same market.

"We know that groups in the P.R.C. are actively working to use methods, including what’s known as distillation, to replicate advanced U.S. AI models," said OpenAI spokeswoman Liz Bourgeois. "We are aware of and reviewing indications that DeepSeek may have inappropriately distilled our models, and will share information as we know more."

DeepSeek’s AI Technologies

DeepSeek has sent shockwaves through Silicon Valley by releasing AI technologies that match the performance of anything else on the market. The prevailing wisdom had been that the most powerful systems could not be built without billions of dollars in specialized computer chips, but DeepSeek claims to have created its technologies using far fewer resources.

AI Companies and Open Sourcing

AI companies like DeepSeek and OpenAI rely heavily on a practice called open sourcing, freely sharing code that underpins their technologies and reusing code shared by others. They see this as a way of accelerating technological development. AI companies also need massive amounts of online data to train their systems, which learn their skills by pinpointing patterns in text, computer programs, images, sounds, and videos.

Distillation and Data Harvesting

Distillation is often used to train new systems. The practice is permitted with open source technologies, but it may be legally problematic when a company takes data from a proprietary technology. OpenAI’s terms of service explicitly prohibit anyone from using data generated by its systems to build technologies that compete in the same market.
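
For readers who want a concrete picture of distillation, here is a minimal sketch in Python. It is a toy illustration only and does not represent how OpenAI, DeepSeek, or any other company builds its systems. The "teacher" is a hypothetical fixed model with made-up parameters, and the "student" is a small model that never sees those parameters; it learns only from the answers the teacher produces.

```python
import math
import random

# Hypothetical teacher parameters, chosen only for illustration.
TEACHER_W = 2.0
TEACHER_B = -1.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def teacher(x):
    """Stand-in for an existing system: returns a probability for input x."""
    return sigmoid(TEACHER_W * x + TEACHER_B)


def distill(num_examples=200, epochs=1000, lr=0.5, seed=0):
    """Train a student using only the teacher's outputs as training data."""
    rng = random.Random(seed)
    inputs = [rng.uniform(-3.0, 3.0) for _ in range(num_examples)]
    # Step 1: query the teacher and record its answers ("soft labels").
    soft_labels = [teacher(x) for x in inputs]

    # Step 2: fit the student to those answers with gradient descent
    # on the cross-entropy between student and teacher outputs.
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, p in zip(inputs, soft_labels):
            q = sigmoid(w * x + b)  # student's current answer
            grad_w += (q - p) * x
            grad_b += q - p
        w -= lr * grad_w / num_examples
        b -= lr * grad_b / num_examples
    return w, b


if __name__ == "__main__":
    w, b = distill()
    print(f"student parameters: w={w:.3f}, b={b:.3f}")
```

In this toy setting the student ends up behaving almost exactly like the teacher, even though it was trained only on the teacher's outputs. Distilling a large language model follows the same idea at far greater scale, with generated text taking the place of the numbers used here.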

Lawsuits Against OpenAI

OpenAI is facing more than a dozen lawsuits accusing it of illegally using copyrighted internet data to train its systems. One of these lawsuits was brought by The New York Times against OpenAI and its partner Microsoft, alleging that millions of articles published by The Times were used to train automated chatbots that now compete with the news outlet as a source of reliable information. Both OpenAI and Microsoft deny the claims.

Conclusion

The situation highlights the complex issues surrounding AI development and data harvesting. OpenAI’s terms of service prohibit using data generated by its systems to build competing technologies, yet the company itself faces lawsuits accusing it of illegally using copyrighted data to train its own systems. OpenAI’s review of the DeepSeek evidence is ongoing, and the outcome could have significant implications for the AI industry.

FAQs

Q: What is distillation in AI?
A: Distillation is a process used to train new AI systems by using data generated by existing systems.

Q: What is open sourcing in AI?
A: Open sourcing is the practice of freely sharing code that underpins AI technologies and reusing code shared by others to accelerate technological development.

Q: What are the implications of the alleged breach of OpenAI’s terms of service?
A: If confirmed, the breach could have significant implications for the AI industry, including the potential for legal action and reputational damage.

Q: What is the current status of The New York Times’s lawsuit against OpenAI and Microsoft?
A: The lawsuit is ongoing, and both OpenAI and Microsoft deny the claims.
