I feel like this changes nothing. All they did was apply another algorithm to the copyrighted work before feeding it to the AI.
That algorithm masks the original work so it looks different to the human eye, but the AI still gets what it needs from it, and it still needs the real picture, made by a human who didn't get credit.
Technology
This is the official technology community of Lemmy.ml for all news related to creation and use of technology, and to facilitate civil, meaningful discussion around it.
Right? Like, by this definition, the training algorithm is already "corrupting" the images by vectorizing them. This is just an overly roundabout way of saying "See? The image is cropped, so we good now!"
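For what it's worth, "vectorizing" here just means the usual preprocessing step: the picture gets converted into a tensor of numbers before training ever sees it. A minimal sketch of what that looks like (the function name and the toy 2x2 image are made up for illustration, not from any actual training code):

```python
import numpy as np

def vectorize(image: np.ndarray) -> np.ndarray:
    """Turn an HxWx3 uint8 image into a flat float vector in [0, 1]."""
    scaled = image.astype(np.float32) / 255.0   # normalize pixel values
    return scaled.reshape(-1)                   # flatten into one long vector

# A fake 2x2 RGB "image" standing in for a real picture.
img = np.array([[[255, 0, 0], [0, 255, 0]],
                [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)

vec = vectorize(img)
print(vec.shape)   # (12,) -- 2*2*3 numbers, no longer a viewable image file
```

Point being: the model only ever consumes those numbers, so "it's been transformed so it's fine" proves nothing. The original image still had to be copied to produce them.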
That's not how any of this works. Copyright is a legal concept, not a technological one. You can't strip the copyright off something by deleting part of it; the result is still a derivative work.
That's not what the paper is about at all; seems this is just shit journalism again.
All the paper says about copyright is that this method is more secure because AI can sometimes spit out training examples.
Why... why is it more secure? Does it mean AI training is actively abusing copyright law? And this is more secure because they can hide it better?
No, you have it the other way around. It means copyright owners can share "corrupted" versions of their works and the AI can still use them. Possible AI leaks won't return the original work, since it was never used.
Of course I think this is only one aspect of why artists wouldn't share their works, but it's not the point the paper is trying to make. They're just giving an aspect of how it could be useful.
"I only took a bite out of the bread, therefore I didnt eat any bread."
Did the image get copied onto their servers in a manner they were not provided a legal right to? Then they violated copyright. Whatever they do after that isn't the copyright violation.
And this is obvious because they could easily assemble a dataset with no copyright issues. They could also attempt to get permission from the copyright holders for many other images, but that would be hard and/or costly, and some would refuse. They want to use the extra images but don't want to get permission, so they just take them, just like anyone else who wants an image but doesn't want to pay for it.
All this really does is show how flawed the current concept of copyright is. But at some point, a huge corpus of images owned by other people was assembled to create a derivative work (the training corpus).
Shows how useless those "image poisoning" services some artists boast about really are
Doesn't this just do what gets done through convolution anyway?
What's the point of this?
I love how copyright is such a constant hurdle for any new development and is turning everything it touches into shit.
Good. Then it's working.