this post was submitted on 20 Sep 2023
318 points (96.2% liked)

[–] [email protected] 3 points 1 year ago (1 children)

They'll probably isolate the models from each other, but yeah, if they want to train shared models from private data then that could happen.

[–] [email protected] 2 points 1 year ago (1 children)

But Bard is the public one, right?

[–] [email protected] 1 points 1 year ago (1 children)

Bard is the name of the service. They can create account-specific models trained on your user data that aren't shared with other accounts (as an extension of the base model built on public data). I've already read about companies doing this to avoid cross-contamination. I'm pretty sure Google is aware of this.

[–] [email protected] 1 points 1 year ago (1 children)

But I don't know if Google cares enough about privacy to bother training individual models to avoid cross-contamination. Each model takes years' worth of supercomputer time, so the fewer they need to train, the lower the cost.

[–] [email protected] 1 points 1 year ago (1 children)

Extending an existing model (retraining) doesn't take years; it can be done in far less time.

[–] [email protected] 1 points 1 year ago (1 children)

Hmm, I thought one of the problems with LLMs was that what they learn is pretty much baked in during training. Maybe that only applies to removing information?

[–] [email protected] 1 points 1 year ago

Yeah, it's hard to remove data that's already been trained into a model. But you can retrain an existing model to add capabilities, so if you copy a model based on public data multiple times and then retrain each copy with a different set of private data, you save a lot of work.
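
To make the copy-then-retrain idea concrete, here's a toy sketch. It is not how Google or anyone else actually does it: the "model" is just a two-parameter line fit with plain SGD, and the account names, datasets, and the `train` helper are all made up for illustration. The point is only that the expensive base training happens once, each account gets its own copy, and the much shorter retraining on private data never touches the base or the other copies.

```python
import copy
import random


def train(weights, data, steps, lr=0.05):
    """Plain SGD on a toy model y = w*x + b. Mutates and returns `weights`."""
    for _ in range(steps):
        x, y = random.choice(data)
        err = (weights["w"] * x + weights["b"]) - y
        weights["w"] -= lr * err * x
        weights["b"] -= lr * err
    return weights


random.seed(0)

# "Public" data follows y = 2x. This is the expensive step, done once.
public = [(x / 10, 2 * x / 10) for x in range(-10, 11)]
base = train({"w": 0.0, "b": 0.0}, public, steps=5000)
base_snapshot = dict(base)  # kept to show the base is left untouched

# Hypothetical per-account private data: same slope, different offsets.
private = {
    "alice": [(x / 10, 2 * x / 10 + 1.0) for x in range(-10, 11)],
    "bob": [(x / 10, 2 * x / 10 - 1.0) for x in range(-10, 11)],
}

# Copy the base once per account, then retrain each copy for far fewer steps.
account_models = {
    name: train(copy.deepcopy(base), data, steps=500)
    for name, data in private.items()
}

for name, model in account_models.items():
    print(name, round(model["w"], 2), round(model["b"], 2))
print("base unchanged:", base == base_snapshot)
```

Each account's copy picks up its own offset while keeping the slope learned from the public data, and nothing learned from one account's data ends up in another account's copy.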