this post was submitted on 08 Jan 2024
335 points (96.1% liked)

Technology

59148 readers
2294 users here now

This is a most excellent place for technology news and articles.


Our Rules


  1. Follow the lemmy.world rules.
  2. Only tech related content.
  3. Be excellent to each another!
  4. Mod approved content bots can post up to 10 articles per day.
  5. Threads asking for personal tech support may be deleted.
  6. Politics threads may be removed.
  7. No memes allowed as posts, OK to post as comments.
  8. Only approved bots from the list below, to ask if your bot can be added please contact us.
  9. Check for duplicates before posting, duplicates may be removed

Approved Bots


founded 1 year ago
MODERATORS
 

Microsoft, OpenAI sued for copyright infringement by nonfiction book authors in class action claim::The new copyright infringement lawsuit against Microsoft and OpenAI comes a week after The New York Times filed a similar complaint in New York.

you are viewing a single comment's thread
view the rest of the comments
[–] [email protected] 18 points 10 months ago (1 children)

Already seeing people come in to defend these suits. I just see it like this: AI is a tool, much like a computer or a pencil are tools. You can use a computer to copyright infringe all day, just like a pencil can. To me, an AI is only going to be plagiarizing or infringing if you tell it to. How often does AI plagiarize without a user purposefully trying to get it to do so? That’s a genuine question.

You are misrepresenting the issue. The issue here is not if a tool just happens to be able to be used for copyright infringement in the hands of a malicious entity. The issue here is whether LLM outputs are just derivative works of their training data. This is something you cannot compare to tools like pencils and pcs which are much more general purpose and which are not built on stole copyright works. Notice also how AI companies bring up "fair use" in their arguments. This means that they are not arguing that they are not using copryighted works without permission nor that the output of the LLM does not contain any copyrighted part of its training data (they can't do that because you can't trace the flow of data through an LLM), but rather that their use of the works is novel enough to be an exception. And that is a really shaky argument when their services are actually not novel at all. In fact they are designing services that are as close as possible to the services provided by the original work creators.