September 8, 2024


The developer OpenAI has said it would be impossible to create tools like its groundbreaking chatbot ChatGPT without access to copyrighted material, as pressure mounts on artificial intelligence firms over the content used to train their products.

Chatbots like ChatGPT and image generators like Stable Diffusion are “trained” on a large amount of data taken from the Internet, much of it covered by copyright – a legal protection against using someone’s work without permission.

Last month the New York Times sued OpenAI and Microsoft, which is a leading investor in OpenAI and uses its tools in its products, accusing them of “illegal use” of its work to create their products.

In a submission to the House of Lords communications and digital select committee, OpenAI said it could not train large language models such as its GPT-4 model – the technology behind ChatGPT – without access to copyrighted work.

“Because copyright today covers virtually every kind of human expression — including blog posts, photos, forum posts, bits of software code and government documents — it would be impossible to train today’s leading AI models without using copyrighted material,” said OpenAI in its submission, which was first reported by the Telegraph.

It added that limiting training material to out-of-copyright books and drawings would produce inadequate AI systems: “Limiting training data to books and drawings in the public domain created more than a century ago could produce an interesting experiment, but will not provide AI systems that meet the needs of today’s citizens.”

OpenAI responded to the NYT lawsuit last month, saying it respects “the rights of content creators and owners”. AI companies’ defenses of their use of copyrighted material tend to rely on the legal doctrine of “fair use”, which allows the use of content in certain circumstances without asking the owner’s permission.

In its submission, OpenAI said it believed that “legally, copyright law does not prohibit training”.

The NYT lawsuit followed numerous other legal complaints against OpenAI. John Grisham, Jodi Picoult and George RR Martin were among 17 authors who sued OpenAI in September, alleging “systematic theft on a mass scale”.

Getty Images, which owns one of the largest photo libraries in the world, is suing the creator of Stable Diffusion, Stability AI, in the US and in England and Wales for alleged copyright infringements. In the US, a group of music publishers including Universal Music sued Anthropic, the Amazon-backed company behind the Claude chatbot, accusing it of misusing “countless” song lyrics to train its model.

Elsewhere in its House of Lords submission, in response to a question about AI safety, OpenAI said it supported independent analysis of its safety measures. The submission said it supports “red teaming” of AI systems, where third-party researchers test the security of a product by mimicking the behavior of rogue actors.

OpenAI is one of the companies that have agreed to work with governments to safety test their most powerful models before and after their deployment, after a deal was struck at a global security summit in the UK last year.


