Technology

69041 readers

2275 users here now

This is a most excellent place for technology news and articles.

Our Rules

Follow the lemmy.world rules.
Only tech related news or articles.
Be excellent to each other!
Mod approved content bots can post up to 10 articles per day.
Threads asking for personal tech support may be deleted.
Politics threads may be removed.
No memes allowed as posts, OK to post as comments.
Only approved bots from the list below, this includes using AI responses and summaries. To ask if your bot can be added please contact a mod.
Check for duplicates before posting, duplicates may be removed
Accounts 7 days and younger will have their posts automatically removed.

Approved Bots

founded 2 years ago

MODERATORS

[email protected]

264

AI girlfriend bots are already flooding OpenAI’s GPT store (qz.com)

submitted 1 year ago by [email protected] to c/[email protected]

75 comments fedilink hide all child comments

AI girlfriend bots are already flooding OpenAI’s GPT store::OpenAI’s store rules are already being broken, illustrating that regulating GPTs could be hard to control

you are viewing a single comment's thread
view the rest of the comments

[–] [email protected] 2 points 1 year ago (1 children)

The other issue will probably be harder to solve. There is less high quality long context training data. Most datasets were created for small context models.

I never considered that this was a dynamic that was involved. Thats interesting. So each piece of data fed into a model during training also has to fit into a “context window” of a certain size too?

[–] [email protected] 3 points 1 year ago

Yes to your question, but that's not what I was saying.

Here is one of the most popular training datasets : https://pile.eleuther.ai/

If you look at the pdf describing the dataset, you'll find the mean length of these documents to be somewhat short with mean length being less than 20kb (20000 characters) for most documents.

You are asking for a model to retain a memory for the whole duration of a discussion, which can be very long. If I chat for one hour I'll type approximately 8400 words, or around 42KB. Longer than most documents in the training set. If I chat for 20 hours, It'll be longer than almost all the documents in the training set. The model needs to learn how to extract information from a long context and it can't do that well if the documents on which it trained are short.

You are also right that during training the text is cut off. A value I often see is 2k to 8k tokens. This is arbitrary, some models are trained with a cut off of 200k tokens. You can use models on context lengths longer than that what they were trained on (with some caveats) but performance falls of badly.