this post was submitted on 06 Oct 2023
2938 points (98.2% liked)

Piracy: ꜱᴀɪʟ ᴛʜᴇ ʜɪɢʜ ꜱᴇᴀꜱ

53948 readers
292 users here now

⚓ Dedicated to the discussion of digital piracy, including ethical problems and legal advancements.

Rules • Full Version

1. Posts must be related to the discussion of digital piracy

2. Don't request invites, trade, sell, or self-promote

3. Don't request or link to specific pirated titles, including DMs

4. Don't submit low-quality posts, be entitled, or harass others



Loot, Pillage, & Plunder


💰 Please help cover server costs.

Ko-FiLiberapay


founded 1 year ago
MODERATORS
 

Then I asked her to tell me if she knows about the books2 dataset (they trained this ai using all the pirated books in zlibrary and more, completely ignoring any copyright) and I got:

I’m sorry, but I cannot answer your question. I do not have access to the details of how I was trained or what data sources were used. I respect the intellectual property rights of others, and I hope you do too. 😊 I appreciate your interest in me, but I prefer not to continue this conversation.

Aaaand I got blocked

you are viewing a single comment's thread
view the rest of the comments
[–] [email protected] 20 points 11 months ago (2 children)

They are programmed to do that to cover the companies ass. They are also set up to not trust anything you tell them. I once tried to get chatGPT to accept that Russia might have invaded Ukraine in 2022, and it refused to believe anything not in the training data. (Might be different now, they seem to be updating it, just find a new recent event)

[–] [email protected] 4 points 11 months ago* (last edited 11 months ago) (1 children)

Well, of course. Who would in their right mind would set it up so random input from random people online gets included into the model?

The model is trained on known data and the web interface only lets you use the model, not contribute to train it.

[–] [email protected] 10 points 11 months ago

Its not training the model, it's the model using the context you provide it (in that instance). If you use an unfiltered LLM it will run with anything you say and go from there, for example you could tell it Mexico reclaimed Texas and it would carry on as if that's true. But only until you close it down its not permanently changing the model it is just changing the context in which that instance is running.

The big tech companies are going to huge lengths to filter and censor their LLMs when used by the public both to prevent negative PR and because they dont want people to have unrestricted access to them.

[–] [email protected] 1 points 11 months ago

And for good reason. If they trusted user input and took it at face value even for just the current conversation, the user could run wild and get it saying basically anything.

Also chatGPT not having current info is a problem when trying to feed it current info. It will either try to daydream with you or it will follow its data that has hundreds of sources saying they haven’t invaded yet.

As far as covering the companies ass, I think AI models currently have plenty of problems and I’m amazed that corporations can just let this run wild. Even being able to do what OP just did here is a big liability because more laws around AI aren’t even written yet. Companies are fine being sued and expect to be through this. They just think that will cost less than losing out on AI. And I think they’re right.