this post was submitted on 04 Dec 2023
887 points (97.9% liked)

[–] [email protected] 312 points 11 months ago (22 children)

How can the training data be sensitive if no one ever agreed to give their sensitive data to OpenAI?

[–] [email protected] 137 points 11 months ago (15 children)

Exactly this. And how can an AI that "doesn't have the source material" in its database recall such information?

[–] [email protected] 70 points 11 months ago (7 children)

"Model" is the right term, not "database".

We learned something about how LLMs work with this. It's like a bunch of paintings were chopped up into pixels to be used to make other paintings. No one knew it was possible to break the model and have it spit out the pixels of a single painting in order.

I wonder if diffusion models have some other weird quirks we have yet to discover.

[–] [email protected] 9 points 11 months ago* (last edited 11 months ago)

The compression technology a diffusion model would need in order to realistically (not too lossily) store “the training data” would be more valuable than the entirety of the machine learning field right now.

They do not “compress” images.
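
For a rough sense of scale, here is a back-of-the-envelope sketch. Every number in it is a hypothetical round figure picked for illustration, not a measurement of any real model or dataset, and `bytes_per_image` is just a helper name made up for this example.

```python
def bytes_per_image(model_bytes: float, num_images: float) -> float:
    """Average storage budget per training image if the model 'stored' them all."""
    return model_bytes / num_images


# All three values below are assumptions for illustration only.
model_bytes = 4e9    # assumed checkpoint size: 4 GB
num_images = 2e9     # assumed training set size: 2 billion images
jpeg_bytes = 100e3   # assumed size of one ordinary compressed image: 100 KB

budget = bytes_per_image(model_bytes, num_images)
ratio = jpeg_bytes / budget  # how much better than the assumed JPEG it would need to be

print(f"budget per image: {budget} bytes")
print(f"required improvement over the assumed JPEG size: {ratio:.0f}x")
```

Under those assumptions the budget works out to a couple of bytes per image, which is the point being made here. It only describes the average over the whole training set, so it doesn't by itself rule out a model memorizing a small number of individual items.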
