Three organisations argue that more transparency is needed after research found AI models have been trained using pirated works by authors such as Zadie Smith and Stephen King.
Three major European publishing trade bodies have urged the EU to “act now” on transparency over artificial intelligence for the sake of “the book chain and democracy”.
As the world’s largest book fair began in Frankfurt, an online statement published on Wednesday called for co-legislators to “seize the opportunity” of the EU’s AI act – which classifies AI systems according to the risk they pose to users and applies regulation accordingly – to “ensure the transparency” of generative AI and “make it safer for European citizens”.
The European Writers’ Council, the Federation of European Publishers and the European and International Booksellers Federation signed the statement. It follows an analysis by The Atlantic, which revealed that pirated works by authors such as Zadie Smith and Stephen King had been used to train artificial intelligence tools run by companies including Meta and Bloomberg.
“Generative AI models have been developed in an opaque and unfair way, illegally making use of millions of copyright-protected books without permission from authors or publishers,” read the statement. “This practice impacts negatively not only rightsholders, but also democracy itself, by facilitating the mass creation of misleading, biased, and even dangerous content which has the potential to undermine European democracy.”
The trio of organisations argued that transparency is essential to the development of a “fair and safe AI ecosystem” and said the AI act is an “ideal opportunity” for the EU to take a “leading role in protecting its citizens”. If passed, the legislation would be the first comprehensive legal framework for AI regulation.
Under the act, due to be finalised by the end of the year, AI systems fall into one of four risk categories: unacceptable, high, limited, and minimal or no risk. For foundation models – which underpin generative AI tools like ChatGPT – companies would be required to disclose a detailed summary of data sources used for training.
The three trade bodies said that this proposed requirement is a “good first step” and called on member states and the Commission to “improve the proposal and finally put an end to the illegal sourcing and data-laundering abuses of generative AI developers”. They added that transparency is “the only way to ensure quality and legitimacy of outputs”.
“Meaningful transparency obligations allowing a rightsholder to assess whether their work was used are easy for the innovative AI operators to comply with,” continued the statement. “They are technologically simple to apply and rely on data that AI developers already collect and organise. And they are needed now, as damage is already done since existing generative text models used works [for] years without consent, credit or compensation to the authors and publishers.”