Reddit is actually extremely good for AI. It's a vast trove of examples of people talking to each other.
When it comes to factual data, there are better sources, sure, but factual data has never been the key deficiency of AI. We've long had search engines for that kind of thing. What AIs had trouble with was human interaction, which is what Reddit and Facebook are all about. These datasets train the AI to communicate.
If the Fediverse were larger, we'd be a significant source of AI training material too. I'd be surprised if it isn't being collected already.
The "glue on pizza" thing wasn't a result of the AI's training; the AI was working fine. It was the search result that fed it a goofy answer to summarize.
The problem here is that it seems people don't really understand what goes into training an LLM or how the training data is used.
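To make the point above concrete, here is a toy sketch of what "training data" actually becomes for an LLM. This is a deliberately simplified illustration (whitespace tokenization, no real tokenizer or model): the model isn't handed facts or answers, it is handed millions of next-token prediction tasks sliced out of text like forum comments.

```python
def make_training_pairs(text, context_size=3):
    """Slide a window over the token stream: each run of `context_size`
    tokens is an input, and the token that follows it is the target
    the model learns to predict."""
    tokens = text.split()  # toy tokenizer; real ones use subword units
    pairs = []
    for i in range(len(tokens) - context_size):
        context = tokens[i:i + context_size]
        target = tokens[i + context_size]
        pairs.append((context, target))
    return pairs

pairs = make_training_pairs("no do not put glue on your pizza")
# Each (context, next-token) pair is one prediction task, e.g.
# (['no', 'do', 'not'], 'put')
```

The takeaway: conversational data like Reddit shapes *how* the model continues text, which is a different thing from the model "believing" any particular comment it was trained on.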
So Meta's AIs will mainly reflect non-EU cultural values.
EU cultural values include resisting corporations doing whatever they want with our data. Let's see Meta try to reflect those.
So you want Meta's AI to have values that don't include resisting corporations doing whatever they want with your data?
This is a seriously double-edged sword. The training data is what gives these AIs both their capabilities and their biases.
Anyway, no matter which part of the world it's trained on, we're talking about 2024 Facebook content. We've seen what Reddit does to an AI.
Can't wait for Meta's cultured AI to share its wisdom with us.
I think they were referencing the glue-on-pizza thing and stuff like that.