this post was submitted on 23 Jan 2024
47 points (66.7% liked)


I fucked with the title a bit. What I linked to was actually a Mastodon post linking to the actual thing. But in my defense, I found it because Cory Doctorow boosted it, so, in a way, I am providing the original source here.

Please argue. Please do not remove.

[–] [email protected] 58 points 11 months ago (3 children)

I think we should have a rule that says if an LLM company invokes fair use on the training inputs, then the outputs are public domain.

[–] [email protected] 26 points 11 months ago* (last edited 11 months ago) (1 children)

That's already been ruled on once.

A recent lawsuit challenged the human-authorship requirement in the context of works purportedly “authored” by AI. In June 2022, Stephen Thaler sued the Copyright Office for denying his application to register a visual artwork that he claims was authored “autonomously” by an AI program called the Creativity Machine. Dr. Thaler argued that human authorship is not required by the Copyright Act. On August 18, 2023, a federal district court granted summary judgment in favor of the Copyright Office. The court held that “human authorship is an essential part of a valid copyright claim,” reasoning that only human authors need copyright as an incentive to create works. Dr. Thaler has stated that he plans to appeal the decision.

Why would companies care about copyright of the output? The value is in the tool that creates it. The whole issue, to me, revolves around the AI company profiting on its service, a service built on a massive library of copyrighted works. It seems clear to me that a large portion of their revenue should go equally to the owners of the works in their database.

[–] [email protected] 11 points 11 months ago (1 children)

You can still copyright AI works; you just can't name an AI as the author.

[–] [email protected] 9 points 11 months ago (1 children)

That's just saying you can claim copyright if you lie about authorship. The problem then is that you may step into the realm of fraud.

[–] [email protected] 8 points 11 months ago (1 children)

You don't have to lie about authorship. You should read the guidance.

[–] [email protected] 4 points 11 months ago (1 children)

Well, what you initially said sounded like fraud, but the incredibly long page indeed doesn't talk about fraud. However, it also seems a bit vague. What counts as your contributions to the work? Is it providing part of the input the model was trained on, writing the prompt, or making additional changes based on the result?

[–] [email protected] 4 points 11 months ago

The vagueness surrounding contributions is particularly troubling. Without clearer guidelines, this seems like a recipe for lawsuits.

[–] [email protected] 1 points 11 months ago

Not just the outputs, but the models as well.

[–] [email protected] 1 points 11 months ago

The outputs are not copyrightable.

But something not being copyrightable doesn't necessarily mean it will be openly distributed.

It does mean OpenAI can't really restrict or go after other companies training off of GPT-4 outputs, though, which is occurring broadly.