this post was submitted on 17 Mar 2024
461 points (95.5% liked)

Technology

[–] [email protected] 19 points 8 months ago (5 children)

Maybe this comment will age poorly, but I think AGI is a long way off. LLMs are a dead-end, IMO. They are easy to improve with the tech we have today and they can be very useful, so there's a ton of hype around them. They're also easy to build tools around, so everyone in tech is trying to get their piece of AI now.

However, LLMs are chat interfaces to searching a large dataset, and that's about it. Even the image generators are doing this, the dataset just happens to be visual. All of the results you get from a prompt are just queries into that data, even when you get a result that makes it seem intelligent. The model is finding a best-fit response based on billions of parameters, like a hyperdimensional regression analysis. In other words, it's pattern-matching.
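To make the regression metaphor concrete, here's a toy sketch in plain Python. It only illustrates the analogy (a best-fit model looks fine near its training data and falls apart far from it); it is not a claim about how an LLM is actually implemented, and the data and function names are made up for the example.

```python
# Toy illustration of the "best-fit" metaphor only; not how an LLM is implemented.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Evaluate the fitted line at x."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical "training set": y = x^2, sampled only on the interval [0, 1].
train_x = [i / 10 for i in range(11)]
train_y = [x * x for x in train_x]
model = fit_line(train_x, train_y)

# Inside the training range, the best-fit line is a decent approximation...
in_range_error = abs(predict(model, 0.5) - 0.5 ** 2)
# ...but far outside that range, the same line is badly wrong.
out_of_range_error = abs(predict(model, 5.0) - 5.0 ** 2)

print(f"error at x=0.5: {in_range_error:.3f}")
print(f"error at x=5.0: {out_of_range_error:.3f}")
```

The fit doesn't "know" the curve is a parabola; it only has the best line through the points it saw, which is the sense in which I mean pattern-matching.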

A lot of people will say that's intelligence, but it's different; the LLM isn't capable of understanding anything new, it can only generate a response from something in its training set. More parameters, better training, and larger context windows just refine the search results, they don't make the LLM smarter.

AGI needs something new, we aren't going to get there with any of the approaches used today. RemindMe! 5 years to see if this aged like wine or milk.

[–] [email protected] 0 points 8 months ago (3 children)

How does this amazing prediction engine, a discovery that basically works like our brain does, not fit into a larger solution?

The way emergent world simulation can be found in the larger models definitely points to this being a cornerstone, as it provides functional value in both image and text recall.

Never mind that tools like MemGPT don't satisfy long-term memory and context windows don't properly satisfy attention functions; I need a much harder sell on LLM technology not proving to be an important piece of AGI.

[–] [email protected] 1 points 7 months ago* (last edited 7 months ago) (2 children)

I didn't say it wasn't amazing, nor that it couldn't be a component in a larger solution, but I don't think LLMs work like our brains, and I think the current trend of more tokens/parameters/training for LLMs is a dead-end. They're simulating the language area of human brains, sure, but there's no reasoning or understanding in an LLM.

In most cases, the responses from well-trained models are great, but you can pretty easily see the cracks when you spend extended time with them on a topic. You'll start to get oddly inconsistent answers the longer the conversation goes and the more branches you take. The best fit line (it's a crude metaphor, but I don't think it's wrong) starts fitting less and less well until the conversation completely falls apart. That's generally called "hallucination" but I'm not a fan of that because it implies a lot about the model that isn't really true.

You may have already read this, but if you haven't: Stephen Wolfram wrote a great overview of how GPT works that isn't too technical. There's also a great sci-fi novel from 2006 called Blindsight that explores the way facsimiles of intelligence can be had without consciousness or even understanding, and I've found it to be a really interesting way to think about LLMs.

It's possible to build a really good Chinese room that can pass the Turing test, and I think LLMs are exactly that. More tokens/parameters/training aren't going to change that, they'll just make them better Chinese rooms.

[–] [email protected] 1 points 7 months ago (1 children)

Thanks, I'll check those out. The entire point of your comment was that LLMs are a dead end. The branching, as you call it, is just more parameters, which in lower-token models approach a collapse. That is why more tokens and larger context do improve accuracy, and why it does make sense to increase them. LLMs have also proven, in some cases, to have what you call reason, and what many call reason, though it is not a good word for the error. Larger models provide a way to stimulate the world, which in turn gives us access to the sensing mechanism of our brain, which is to stimulate and then attend to disparities between the simulation and the actual. This in turn gives access to action, which unfortunately is not very well understood. Simulation, or prediction, is what our brains constantly do to be able to react and adapt to the world without massive timing failure and massive energy cost. For instance, consider driving: you focus on unusual sensing and let action be an extension of purpose by just allowing constant prediction to happen, where your muscles have already prepared to commit even precise movements due to enough practice with your "model" of how wheel and foot apply to the vehicle.

[–] [email protected] 1 points 7 months ago

*Simulate, not stimulate lol
