OpenAI: Creation of AI tools 'impossible' without copyrighted material

As analysis shows cloud giants are failing to protect customers from IP claims

OpenAI has said it would be "impossible" to develop AI tools like its chatbot, ChatGPT, without access to copyrighted material.

Several AI firms are currently facing lawsuits over the content used to train their products.

Chatbots and image generators, such as ChatGPT and Stable Diffusion, rely on vast datasets sourced from the internet, much of which falls under copyright protection.

The New York Times recently filed a lawsuit against OpenAI and its investor Microsoft, accusing them of "unlawful use" of its work in creating their AI products.

The NYT claimed "millions" of its articles were used in the training of ChatGPT, accusing OpenAI of "massive copyright infringement, commercial exploitation, and misappropriation" of its intellectual property.

The newspaper further argued that the AI tool now competes with it as an information source.

OpenAI defended its practices in a submission to the House of Lords communications and digital select committee, arguing that it would be impossible to develop large language models like GPT-4 without access to copyrighted materials.

"Because copyright today covers virtually every sort of human expression... it would be impossible to train today's leading AI models without using copyrighted materials," stated OpenAI in its submission.

The organisation argued that limiting training data to out-of-copyright works would lead to AI systems that could not meet the needs of contemporary society.

The defence presented by AI companies, including OpenAI, often hinges on the legal doctrine of "fair use," allowing the use of copyrighted content in specific circumstances without obtaining the owner's permission.

OpenAI reiterated in its submission that it believes "legally, copyright law does not forbid training."

A new era for copyright law

The New York Times lawsuit is not the only legal challenge launched against OpenAI and its competitors.

Last year, OpenAI faced a federal class-action lawsuit in California accusing it of unlawfully using personal data for training purposes. The lawsuit cited multiple violations, including breaches of the US Computer Fraud and Abuse Act and the Electronic Communications Privacy Act.

Getty Images is suing Stability AI, the creator of Stable Diffusion, for alleged copyright breaches.

Responding to concerns about AI safety, OpenAI expressed support for independent analysis of its security measures. The organisation advocates for "red-teaming," where third-party researchers assess the safety of AI products by simulating the behaviour of rogue actors.

Cloud giants failing to protect AI customers

While cloud giants such as Amazon, Microsoft and Google are eager to promote their new AI tools, they are leaving their business customers exposed to the risk of copyright lawsuits, a new report by The Financial Times has warned.

It says that while the big three cloud companies boast of defending customers from IP claims, analysis of their indemnity clauses shows that these protections only apply to the use of AI models that were developed by or with the oversight of Google, Amazon and Microsoft.

This means that businesses that use AI models developed by other companies are not protected from copyright lawsuits.

So, if you're using an AI tool from, say, Anthropic (backed, but not developed, by Amazon and Google), a copyright lawsuit could land at your doorstep, even though you're using the tool on Amazon's or Google's platform.

This selective protection has made businesses wary.

According to the FT, Amazon only extends coverage to content generated by its proprietary models, such as Titan, and various AI applications it has developed. Likewise, Microsoft offers protection exclusively for tools operating on its internal models and those created by OpenAI.

Despite the limited protection, there are silver linings for users. Legal experts believe claims might be difficult to win.

A US court recently dismissed part of a lawsuit against AI companies, highlighting the "problem" of proving every generated image relies on copyrighted material.

While generative AI technology holds immense potential, the unprecedented claims in copyright law require caution. Businesses that are considering using AI tools should carefully review the terms of service and indemnity clauses before making a decision.
