this post was submitted on 07 Feb 2024
218 points (98.7% liked)
Technology
59374 readers
3463 users here now
This is a most excellent place for technology news and articles.
Our Rules
- Follow the lemmy.world rules.
- Only tech related content.
- Be excellent to each another!
- Mod approved content bots can post up to 10 articles per day.
- Threads asking for personal tech support may be deleted.
- Politics threads may be removed.
- No memes allowed as posts, OK to post as comments.
- Only approved bots from the list below, to ask if your bot can be added please contact us.
- Check for duplicates before posting, duplicates may be removed
Approved Bots
founded 1 year ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
Putting aside the merits of trying to trademark gpt, which like the examiner says is commonly used term for a specific type of AI (there are other open source "gpt" models that have nothing to do with OpenAI), I just wanted to take a moment to appreciate how incredibly bad OpenAI is at naming things. Google has Bard and now Gemini.Microsoft has copilot. Anthropic has Claude (which does sound like the name of an idiot, so not a great example). Voice assistants were Google Assistant, Alexa, seri, and Bixby.
Then openai is like ChatGPT. Rolls right off the tounge, so easy to remember, definitely feels like a personable assistant. And then they follow that up with custom "GPTs", which is not only an unfriendly name, but also confusing. If I try to use ChatGPT to help me make a GPT it gets confused and we end up in a "who's on first" style standoff. I've reported to just forcing ChatGPT to do a websearch for "custom GPT" so I don't have to explain the concept to it each time.
You can't really say any GPT model has nothing to do with OpenAI. They invented the architecture. But the name GPT predates their commercial products using the technology.
I don't know enough to know whether or not that's true. My understanding was that Google's Deep mind invented the transformer architecture with their paper "all you need is attention." A lot, if not most, LLMs use a transformer architecture, though your probably right a lot of them base it on the open source models OpenAI made available. The "generative" part is just descriptive of the model generating outputs (as opposed to classification and the like), and pre trained just refers to the training process.
But again I'm a dummy so you very well may be right.
The attention paper from Google introduced transformers, OpenAI introduced generative pretraining as a technique that allows transformers to achieve very good performance on downstream tasks with very little additional fine tuning. This paper and the subsequent release of the pretrained GPT models directly lead to the LLM boom.
https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf