this post was submitted on 15 Jun 2024

315 points (98.8% liked)

Meta says European regulators are ruining its AI bot (www.theverge.com)

submitted 4 months ago by [email protected] to c/[email protected]

Meta is putting plans for its AI assistant on hold in Europe after receiving objections from Ireland’s privacy regulator, the company announced on Friday.

In a blog post, Meta said the Irish Data Protection Commission (DPC) asked the company to delay training its large language models on content that had been publicly posted to Facebook and Instagram profiles.

Meta said it is “disappointed” by the request, “particularly since we incorporated regulatory feedback and the European [Data Protection Authorities] have been informed since March.” Per the Irish Independent, Meta had recently begun notifying European users that it would collect their data and offered an opt-out option in an attempt to comply with European privacy laws.

[–] [email protected] 118 points 4 months ago (1 children)

Meta is ruining the Internet.

[–] [email protected] 48 points 4 months ago (1 children)

Meta has ruined the internet

[–] [email protected] 19 points 4 months ago

Meta has ruined large parts of our society

[–] [email protected] 91 points 4 months ago

Mafia: cops are ruining our extortion business!

Poor mafia.

[–] [email protected] 81 points 4 months ago

Tough shit Zuck

[–] [email protected] 61 points 4 months ago

offered an opt-out option in an attempt to comply with European privacy laws.

LMAO have they still not realized that any "opt-out" kind of coercion is forbidden now?

[–] [email protected] 58 points 4 months ago (1 children)

Ruining anything of Meta's sounds like a positive to me

[–] [email protected] 8 points 4 months ago* (last edited 4 months ago)

As long as no one messes with their open source contributions... (ditto for MS)

[–] [email protected] 52 points 4 months ago

Europe says Meta is ruining people's privacy and rights. Meta is complaining that the guards and the locked safe are ruining its bank heist. That's what those measures are for: keeping you in line.

[–] [email protected] 48 points 4 months ago

Good.

[–] [email protected] 44 points 4 months ago

Narrator: They were lying about not using the data. They already had.

[–] [email protected] 37 points 4 months ago* (last edited 4 months ago)

[–] [email protected] 31 points 4 months ago

Good.

[–] [email protected] 29 points 4 months ago (2 children)

Off-topic but that red and blue illustration is an eye trip.

[–] [email protected] 6 points 4 months ago

And ostensibly a human approved it

[–] [email protected] 2 points 4 months ago* (last edited 4 months ago)

It's wild to me considering organizations are more aware of web accessibility than ever.

[–] [email protected] 23 points 4 months ago

Good.

[–] [email protected] 23 points 4 months ago (1 children)

Isn't that the reason for regulating shitty things?

[–] [email protected] 6 points 4 months ago

They're used to the US system of regulations where they can just pay Congress a few tens of thousands.

[–] [email protected] 22 points 4 months ago

Ruin it deeper, baby

[–] [email protected] 11 points 4 months ago

Get fucked, Facebook. Rot in pain, Zuckfuck.

[–] [email protected] 10 points 4 months ago

Get fucked, Meta.

[–] [email protected] 6 points 4 months ago

Yet the "dumb fucks" are still on the platforms.

[–] [email protected] 3 points 4 months ago (2 children)

asked the company to delay training its large language models on content that had been publicly posted to Facebook and Instagram profiles.

I think that there are definitely issues with mass data-mining of that data. For generative AIs being trained on image data, I don't really care -- I think that concerns there are hugely overblown. But it's also possible to do things like build a mass facial recognition database with image data, and I'm pretty sure that text-processing is also an issue.

However.

The problem is that this applies to anyone. Like, I am confident that someone has gone out and scraped publicly-available data from Facebook and similar before. I know that someone has dumped Reddit comment and post history; you can download those. I am very confident that someone either is right now, or will be if the Threadiverse gets big enough, dumping my comment data here and doing all kinds of processing on it.

That is, I don't think that Meta is the issue here. Meta would be the issue if processing private data were the issue, because only Meta and a limited set of users have access to that. The problem here is people posting publicly-accessible data that can potentially be used in ways that they might not want, potentially not understanding the implications of doing so. Meta's only responsible there in that they maybe encourage users to do so, have profile photos or whatnot.

And...I don't really have a great fix for that. Like, I think -- like most people here, obviously, as every single user I've seen on the Threadiverse uses a pseudonym -- that pseudonymity is at least a partial fix. There isn't a (direct) link to a real-world identity; someone would have to go to the work of deanonymizing account data. Few if any people on the Threadiverse seem to have a real profile photo. I use a swirl of water. So...that helps, because someone can't trivially link that data to data elsewhere.

I don't have any problem with someone training a model on information that I've posted publicly and just using it like a "better search engine", the way people are now. That's pretty low on my list of concerns.

But lemme give some concerns that might apply...and these aren't really primarily about generating chatbots. One thing that you can do with text classifiers -- which I think a lot of people out there don't realize -- is to search for and find some correlation in text. Like, the Federalist Papers, important documents about the US Constitution, were written by a few of the Founding Fathers under a shared pseudonym. Roughly two centuries later, we went back and did statistical analysis -- IIRC using Markov chains, though the best-known study (Mosteller and Wallace) relied on function-word frequencies -- to work out who wrote the disputed ones. The same kind of analysis might make it possible to deanonymize people today.
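
To make that concrete, here's a minimal sketch of that kind of authorship comparison. It is not the method used on the Federalist Papers; it just compares function-word frequencies with cosine similarity, which is about the simplest stylometric trick there is. The word list and the names `profile`, `cosine` and `closest_author` are made up for illustration, and a real attempt would need far more text and far more features.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny function-word list; real studies use much larger ones.
FUNCTION_WORDS = ["the", "of", "and", "to", "upon", "while", "whilst", "on", "by", "in"]


def profile(text):
    """Return the relative frequency of each function word in `text`."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine(a, b):
    """Cosine similarity between two equal-length frequency vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def closest_author(unknown, known_samples):
    """Pick the author label whose sample is most similar to `unknown`.

    `known_samples` maps an author label to a sample of their writing.
    """
    target = profile(unknown)
    return max(
        known_samples,
        key=lambda name: cosine(target, profile(known_samples[name])),
    )
```

Feed it a pile of comments from known accounts plus one pseudonymous sample, and it returns whichever known account writes most similarly on these few words. With this little signal it will be wrong a lot, but the point is how little code the basic idea takes.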

You can also extract a lot of information about someone from their text. Some of it humans can do, like "if someone uses inches, they're probably from the US". But you can do that en masse, get probably a pretty good location on someone, identify regional slang and local spellings and such.
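
A rough sketch of doing that en masse, assuming nothing fancier than a lookup table of regional cue words. The `CUES` table and the `guess_region` function are hypothetical and only here to show the shape of the idea; anything real would use a trained classifier and a lot more signals.

```python
import re
from collections import Counter

# Hypothetical cue words; a real system would learn these from labelled data.
CUES = {
    "US": {"color", "center", "inches", "miles", "gotten", "fahrenheit"},
    "UK": {"colour", "centre", "metres", "whilst", "kilometres", "lorry"},
}


def guess_region(comments):
    """Return (region, share_of_cue_hits) for a list of comment strings.

    Returns (None, 0.0) if no cue word appears in any comment.
    """
    votes = Counter()
    for comment in comments:
        for word in re.findall(r"[a-z]+", comment.lower()):
            for region, cues in CUES.items():
                if word in cues:
                    votes[region] += 1
    total = sum(votes.values())
    if total == 0:
        return None, 0.0
    region, hits = votes.most_common(1)[0]
    return region, hits / total
```

Run that over someone's whole comment history and the share of hits gives a crude confidence figure, which is the kind of thing that gets much sharper once you swap the word list for a model.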

There's software that can identify someone's gender and give a confidence estimate from someone's comments. You can probably -- and I'm sure people have -- train those classifiers to look for correlations on a lot of other things, like political views and such.

I had a buddy working in the video game industry who had a game that extracted a bunch of "employability" characteristics. Play for about ten minutes, and it logs a bunch of data about the gameplay. They trained a classifier to look for correlations between gameplay actions and IQ and a whole host of other things, so play the game, and you're handing over a lot of personal data about yourself. I would imagine that lots of video games could do that, and that it might give games another source of revenue if the information is sold to data brokers. You can probably do the same thing with comments.

And I'm not so sure that people who are posting material attached to their identity are always necessarily realizing just how much they might actually be posting. Not necessarily information that Meta in particular might analyze, but information that they're handing to the world such that any organization that wants to do data-mining on it could analyze.

[–] [email protected] 9 points 4 months ago (2 children)

Long, but wrong :-)

my list of concerns.

It is not about your concerns, and it is not about concerns at all.

When they try to do forbidden things, then someone is going to tell them, and if they do it anyway (like that whole 'concerns' attitude seems to suggest), then someone is going to give them what they deserve.

[–] [email protected] 2 points 4 months ago

What about it is wrong?

[–] [email protected] 1 points 4 months ago (1 children)

But it’s also possible to do things like build a mass facial recognition database with image data,

Facebook built one years ago, but ended up destroying it. https://www.theverge.com/2021/11/2/22759613/meta-facebook-face-recognition-automatic-tagging-feature-shutdown

[–] [email protected] 3 points 4 months ago

Thanks. That also kind of drives home the "I'm sure that third parties are scraping data and analyzing it too" thing:

Facebook’s decision won’t stop independent companies like Clearview AI — which built huge image databases by scraping photos from social networks, including Facebook — from using facial recognition algorithms trained with that data. US law enforcement agencies (alongside other government divisions) work with Clearview AI and other companies for facial recognition-powered surveillance.

[–] [email protected] 0 points 4 months ago

This is the best summary I could come up with:

Meta is putting plans for its AI assistant on hold in Europe after receiving objections from Ireland’s privacy regulator, the company announced on Friday.

Meta said it will “continue to work collaboratively with the DPC.” But its blog post says that Google and OpenAI have “already used data from Europeans to train AI” and claims that if regulators don’t let it use users’ information to train its models, Meta can only deliver an inferior product.

“We are pleased that Meta has reflected on the concerns we shared from users of their service in the UK, and responded to our request to pause and review plans to use Facebook and Instagram user data to train generative AI,” Stephen Almond, the executive director of regulatory risk at the UK Information Commissioner’s Office, said in a statement.

The DPC’s request followed a campaign by the advocacy group NOYB — None of Your Business — which filed 11 complaints against Meta in several European countries, Reuters reports.

NOYB founder Max Schrems told the Irish Independent that the complaint hinged on Meta’s legal basis for collecting personal data.

“Meta is basically saying that it can use any data from any source for any purpose and make it available to anyone in the world, as long as it’s done via AI technology,” Schrems said.

The original article contains 354 words, the summary contains 217 words. Saved 39%. I'm a bot and I'm open source!