this post was submitted on 28 Jun 2024
581 points (95.0% liked)

Technology

[–] [email protected] 2 points 4 months ago (1 children)

You've probably not infringed the copyright, though only a court can decide that, and only if you were challenged by the rights holder.

I think there are lots of factors in your defence:

  • you're not selling it; your use is an example for education
  • I don't think you're reducing the market value for the original(s) in any way
  • you've not included substantial verbatim sections of the original works, but I think you have used more than just facts and ideas (not sure though).

But add in some more quotes, flesh it out, and then try to sell it . . . each step weakens the 'fair use' defence.

This is the problem for the LLM: it can be used for many things, and if it has no filter or limit, then eventually the collective derived works might add up to commercial, substantial reuse, and might include enough to have copied a substantial portion of the original. Very hard to determine, I'd think. Each individual use might be fair, but did the LLM itself go too far at some point?

A copyright holder would probably struggle to challenge the LLM on the basis of all the things infinite monkeys might use it for in future.

[–] [email protected] 1 points 4 months ago

This is the problem for the LLM: it can be used for many things, and if it has no filter or limit

I agree with pretty much everything before this, but that particular comment was just talking about summaries, which imo are a lot more cut and dried (SparkNotes, for example).

An LLM by itself is unlimited and unfiltered, but it's not impossible to limit one and sell it. For all the shit OpenAI deserves to get, I have to give them one thing: their copyright restriction system seems to be on par with YouTube's. I paid for a month of it when GPT-4 came out and tried my hardest to bypass it, but it won't even give me copyrighted texts when the words are all replaced with synonyms or jumbled around.

I think if someone's offering their LLM as a service and has a system like that in place, they aren't stealing any more than YouTube is stealing. Otherwise I agree that there's a strong argument for copyright infringement.