OpenAI has responded to the lawsuit filed by The New York Times, which accuses the ChatGPT-maker of using its stories to train AI models, calling the suit “without merit” and defending the use of publicly available internet materials for AI training as fair use.
The lawsuit came after talks between the two companies, with NYT alleging that ChatGPT reproduced sections of its stories almost verbatim in response to certain prompts.
OpenAI disagreed with this analysis and said such output was a result of old content being available across multiple locations online.
In a statement published on Monday, OpenAI highlighted its tie-ups with news publications, as well as its own efforts to reduce the regurgitation of data when training its AI models.
“Interestingly, the regurgitations The New York Times induced appear to be from years-old articles that have proliferated on multiple third-party websites. It seems they intentionally manipulated prompts, often including lengthy excerpts of articles, in order to get our model to regurgitate,” said OpenAI in its statement.
The AI startup pointed to its partnerships with the Associated Press, Axel Springer, the American Journalism Project and NYU as examples of positive relationships with the media industry, and noted that it also had an opt-out process that NYT adopted in August last year.
“We had explained to The New York Times that, like any single source, their content didn’t meaningfully contribute to the training of our existing models and also wouldn’t be sufficiently impactful for future training. Their lawsuit on December 27—which we learned about by reading The New York Times—came as a surprise and disappointment to us,” noted OpenAI in its statement.
Multiple authors have sued both OpenAI and its backer Microsoft, claiming that the companies scraped their copyrighted data without permission or payment in order to train their AI models.
Google and Meta have also been hit with lawsuits over their alleged harvesting of user data for AI training.
Published - January 09, 2024 04:21 pm IST