Norwegian authors are reacting strongly to the fact that their books are being used to train the language models behind several large international AI chatbots.
– I think it’s shameful that companies that know better have made a business model out of stealing intellectual property, Erling Kagge tells VG.
– Robbery in broad daylight, says crime writer Frode Sander Øien, known by the pseudonym Samuel Bjørk.
Books3 contains 183,000 works – and has been used by major tech companies such as Facebook owner Meta, Bloomberg and many others to train the language models behind many large AI bots.
Following VG’s report, the Norwegian Writers’ Association checked its 750 members and found that 32 of them are affected. Eleven of those are new names, so at least 41 Norwegian authors are now confirmed to be represented in the dataset.
VG has asked Facebook owner Meta for comment.
– It is very dangerous and sad that our members’ works are being misused in this way, because this is a misuse of copyrighted material, says association leader Brynjulf Jung Tjønn.
The association has now sent an informational letter to its members.
– Now we have to work out what we can do to protect the interests of authors. The most important thing now is to put regulations in place to prevent this from happening again, and for that, politicians and authorities need to step in, says Jung Tjønn.
The association is closely following the lawsuit filed against OpenAI by several American authors, which Tek.no has previously covered.
According to The Atlantic, Books3 consists primarily of pirated e-books.
– This is daylight robbery. I was furious when I heard about the way they did it, says Frode Sander Øien, who writes under the pseudonym Samuel Bjørk.
– This is not something that was created to “serve humanity”; it is for pure profit. Acting in this way is completely reprehensible, and it makes me absolutely sick.
Sander Øien has tested several tech companies’ AI chatbots himself, and was surprised by how well they knew the characters in his novels.
– I asked them to write an opening chapter about Mia Krüger and Holger Munch from my crime books, and all the character traits from my books came through, down to the smallest detail.
– It’s a bit scary, I must admit.
Author Maja Lunde is also following developments, and is one of the authors with several works in Books3.
– I don’t like this new development, says Lunde, and continues:
– I have never consented to my work being used for machine learning, and I find this very troubling. For example, a language model trained on my books will be able to produce texts in my “style,” which obviously raises a copyright problem, Lunde tells VG.
Erling Kagge agrees:
– My books have been published in 41 languages, and it seems quite unreasonable to let all the publishers and me do the work for them. After all, companies like Meta have made a sport of skirting the line between what is right and what is wrong.
– Someone should pull down Mark Zuckerberg’s pants and give him a spanking, Erling Kagge tells VG.
He is one of several non-fiction authors included in the dataset. The Association of Norwegian Non-Fiction Writers and Translators informed VG that it had not been given an overview of how many of its members were included in the dataset.
– We take very seriously the fact that the works of Norwegian non-fiction writers are being used illegally in this way, says association president Arne Vestbø.
The Atlantic has written several stories about the “Books3” dataset and has now created a search engine to help authors find out whether their work is in the dataset.
The Atlantic sought comment from Meta on, among other things, the use of pirated books to train its AI, but a Meta spokesperson instead pointed to a recent lawsuit in which Meta’s lawyers argued that its AI model (LLaMA) and the model’s “output” are not “very similar” to the authors’ books.