this post was submitted on 19 Aug 2024
Technology
Sums up all AI
EDIT: meant all gen AI
Does it? I worked on training a classifier and a generative model on freely available galaxy images taken by Hubble and labelled in a citizen science approach. Where's the theft?
Hard to say. Training a model from scratch is costly, so a lot of generative work starts from pretrained models. Your input may not infringe copyright, but the data that went in before or after it may have.
I trained the generative models all from scratch. Pretrained models are not that helpful when it's important to accurately capture very domain specific features.
One of the classifiers I tried was based on zoobot with a custom head. Assuming the publications around zoobot are truthful, it was trained exclusively on similar data from a multitude of different sky surveys.
I assume you mean all generative AI? Because I don't think AI that autonomously learns to play Super Mario is theft https://youtu.be/qv6UVOQ0F44
Nintendo probably thinks it's theft lol
No, it sums up a very specific type of AI...
Blanket statements are dumb.
Can you explain how you came to that conclusion?
The way I understand it, generative AI training is more like a single person analyzing art at impossibly fast speeds, then using said art as inspiration to create new art at impossibly fast speeds.
The art isn't being made, btw, so much as being copied and pasted in a way that might convince you it was new.
Since the AI cannot create a new style or genre on its own, without source material that already exists to train it, and that source material is often scraped up off of databases, often against the will and intent of the original creators, it is seen as theft.
Especially if the artists were in no way compensated.
To add to your excellent comment:
It does not ask if it can copy the art nor does it attribute its generated art with: "this art was inspired by ..."
I can understand why creators are unhappy with this situation.
Do you go into a gallery and scream "THIS ART WAS INSPIRED BY PICASSO. WHY DOESN'T IT SAY THAT! THIS IS THEFT!"? No, I suspect you don't, because that would be stupid. That's what you sound like here.
This is absolutely wrong about how something like SD generates outputs. Relationships between atomic parts of an image are encoded into the model from across all training inputs. There is no copying and pasting. Now whether you think extracting these relationships from images you can otherwise access constitutes some sort of theft is one thing, but characterizing generative models as copying and pasting scraped image pieces is just utterly incorrect.
While, yes it is not copy and paste in the literal sense, it does still have the capacity to outright copy the style of an artist's work that was used to train it.
If tracing another artist's work is already frowned upon when trying to pass the trace off as one's own work, then there's little difference when a computer does it more convincingly.
Maybe a bit of a tangent here, since I'm not even sure if this is strictly possible, but if a generative system was trained off of, say, only Picasso's work, would you be able to pass the outputs off as Picasso pieces? Or would they be considered the work of the person who wrote the prompt or built the AI? What if the artist wasn't Picasso but someone still alive, would they get a cut of the profits?
The outputs would be considered no one's work, as no copyright is afforded to AI-generated content.
That feels like it's rather beside the point, innit? You've got AI companies showing off AI art and saying "look at what this model can do," you've got entire communities on Lemmy and Reddit dedicated to posting AI art, and they're all going "look at what I made with this AI, I'm so good at prompt engineering" as though they did all the work, and the millions of hours spent actually creating the art used to train the model get no mention at all, much less any compensation or permission for the artists' works to be used in the training. Sure does seem like people are passing AI art off as their own, even if they're not claiming copyright.
I'm not sure how it could be beside the point, though it may not be entirely dispositive. I take ownership to be a question of who has a controlling and exclusionary right to something; in this case that's copyright. Copyright allows you to license these things and extract money for their use. If there is no copyright, there is no secure monetization (something companies using AI-generated materials absolutely keep in mind). The question was "who would own it" and I think it's pretty clear cut who would own it. No one.
With this logic, photography is painting done at an impossibly high speed, but for some reason we make a distinction between what humans make and what machines make.
Amusingly, every argument against AI art was made against photography over a hundred years ago, and I bet you own a camera - possibly even on the device you wrote your stupid comment on!
Sure, I even do photography professionally from time to time - I just don't consider it to be a painting.
But art, right?
(edit for clarity: at least in some cases)
Photography can be art, and AI-generated images can be art as well. AI is a tool and people can create art with it. But what counts as art is also completely subjective to the viewer.
That's a blanket statement. While I understand the sentiment, what about the thousands of "AIs" trained on private, proprietary data for personal or private use by organizations that own said data? It's not the technology but the lack of regulation and misaligned incentives.
Nope. Stop with the Luddite lies, please.