Wednesday 28 February 2024 6:00 am | Updated: Monday 20 May 2024 2:02 pm

As The New York Times and OpenAI go to war, smaller publishers are being left behind

TMT Reporter

While the outcome of the landmark legal battle between The New York Times (NYT) and OpenAI remains uncertain, the case, whichever way it goes, will likely have huge implications for smaller publishers around the world.

In December, the NYT filed a lawsuit against OpenAI and its major backer, Microsoft, alleging copyright infringement and seeking billions of dollars in damages.

It has accused the tech companies of seeking to “free-ride” on its “massive investment in its journalism”, threatening its ability to deliver “trustworthy information, news analysis, and commentary”.

OpenAI, which trained its generative AI chatbot by sourcing data from across the web, responded in a blog post, saying that “training AI models using publicly available Internet materials is fair use.”

While copyright laws vary slightly between the US and UK, the outcome of the lawsuit will likely dictate the future direction of laws and regulations on the issue, both here and in other countries.

The British government, like many others worldwide, aims for the UK to become an AI development hub. However, this ambition could face obstacles if AI developers are restricted from training their models in the UK.

Developers train and enhance their AI tools by hoovering up vast amounts of data, often scraping content from providers across the internet.

In the UK, as the law stands, this constitutes copyright infringement, save for specific exceptions, such as text and data mining for non-commercial research purposes.

A licensing deal between publishers and AI firms appears to be the only way out.

Some publishers, such as the Associated Press and Axel Springer, have already pursued individual licensing arrangements with OpenAI.

Similar negotiations between the NYT and OpenAI for a licensing deal reportedly broke down before the lawsuit was filed, likely due to disagreement over the licence’s value.

The companies could still settle the dispute if they are able to reach an agreement on a licensing deal.

While any content provider could demand a licence, it is hard to imagine how an AI developer could secure agreements with every publisher worldwide, let alone every academic, freelancer or author.

Smaller news publications, which lack the legal and financial backing of US media behemoths, may struggle to obtain such favourable licensing terms, especially since it is almost impossible to quantify the value of their content to AI developers.

Cerys Wyn Davies, an IP and technology partner at Pinsent Masons, suggested one model that could solve the problem.

National copyright bodies could collect royalties from AI companies on behalf of publishers.

If some smaller providers fail to reap revenue from their work, they may be forced to shutter, which would leave AI tools with less data to learn from, potentially creating a vicious circle.

“If we stop humans writing, where does that leave society?” said Davies.

So far, non-legislative approaches have failed to protect smaller players, and an attempt at creating a code of conduct in the UK was recently shelved by the government.

A solution that includes, and compensates, smaller publishers is urgently needed.