this post was submitted on 31 Aug 2023
596 points (97.9% liked)
Technology
it's crazy that "it's too hard :(" has become an acceptable justification for just ignoring the law within tech circles
It's more like the law is saying you must draw seven red lines, all of them strictly perpendicular, some with green ink and some with transparent ink.
It's not "virtually" impossible, it's literally impossible. If the law requires that it be possible then it's the law that must change. Otherwise it's simply a more complicated way of banning AI entirely, which means that some other jurisdiction will become the world leader in such things.
How is "don't rely on content you have no right to use" literally impossible?
We teach children that Google has a filter to show only CC-licensed images (which they should use for their presentations).
Also, it's not like we're talking about small companies here: a new billion-dollar industry is being born, and it could totally afford contracts with big platforms that would allow it to use their content.
At the time they used the data, they had a right to use it. The participants later revoked their consent for their data to be used, after the model was already trained at an enormous cost.
I have to admit my comment is not really relevant to the article itself (also, I read only the free part of it).
It was more a reaction to the comment above, which felt more generic. My concern about LLMs is that I could never find an auditable list of websites that were crawled, which would be reasonable to ask for, I think.