this post was submitted on 31 Aug 2023
Technology
How is "don't rely on content you have no right to use" literally impossible?
We teach children that Google has a filter to show only CC-licensed images (the ones they should use for their presentations).
Also, it's not like we are talking about small companies here. A new billion-dollar industry is being born, and it could easily afford contracts with big platforms that would allow it to use their content.
This is an article about unlearning data, not about not consuming it in the first place.
LLMs are not storing learned data in its raw, original form. They are ingesting it and building an understanding of language based on it.
Attempting to peel out that knowledge would be incredibly difficult, if not impossible, because there's really no way to identify it.
And we're saying that if peeling out knowledge that someone has a right to have forgotten is difficult or impossible, that knowledge should not have been used to begin with. If enforcement means big tech companies have to throw out models because they used personal information without knowledge or consent, boo fucking hoo, let me find a Lilliputian to build a violin for me to play.
A) this article isn't about a big tech company, it's about an academic researcher. B) he had consent to use the data when he trained the model. The participants later revoked their consent to have their data used.