this post was submitted on 20 Mar 2024
22 points (65.3% liked)

Technology

[–] [email protected] 3 points 8 months ago (1 children)

Other than maybe pattern recognition, they literally have no mechanism to do any of those things. People say that it recursively spits out the next word, because that is literally how it works on a coding level. It's called an LLM for a reason.

[–] [email protected] 8 points 8 months ago* (last edited 8 months ago)

> they literally have no mechanism to do any of those things.

What mechanism does it have for pattern recognition?

> that is literally how it works on a coding level.

Neural networks aren't "coded".
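To make the distinction concrete, here's a toy sketch (my own illustration, nowhere near a real LLM): the "spit out the next word" loop is a few lines of ordinary code, but what it actually says comes from counts learned from the training text. The tiny corpus and the function names `train_bigram` and `generate` are made up for the example.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count which word follows which; the counts stand in for learned weights."""
    words = text.split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, start, max_words=5):
    """Autoregressive loop: keep appending the most frequent follower of the last word."""
    out = [start]
    for _ in range(max_words):
        followers = counts.get(out[-1])
        if not followers:
            break  # the last word never had a follower in the training text
        out.append(followers.most_common(1)[0][0])
    return " ".join(out)

# Hypothetical toy corpus; change it and the same loop produces different text.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)
print(generate(model, "the"))
```

Nobody wrote a rule saying "cat" comes after "the". Swap the corpus and the output changes while the code stays the same, which is the sense in which the behavior isn't coded.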

> It's called an LLM for a reason.

That doesn't mean what you think it does. Another word for language is communication. So you could just as easily call it a Large Communication Model.

Neural networks have hundreds of thousands (at the minimum) of interconnected ~~layers~~ neurons, and the connections between them are the parameters. The largest Llama 2 model has 70 billion parameters. The newly released Grok has over 300 billion. And though we don't have official numbers, GPT-4 is said to be on the order of a trillion.
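Since neurons and parameters are easy to mix up, here's a rough sketch of how the two are counted for a plain fully connected network. The layer sizes are made up for illustration, and real LLMs are transformers with a more involved layout, so treat this as the general idea only.

```python
def count_mlp(layer_sizes):
    """Return (neurons, parameters) for a fully connected network.

    layer_sizes lists the width of each layer, input first.
    Neurons: every unit after the input layer.
    Parameters: one weight per connection plus one bias per neuron.
    """
    neurons = sum(layer_sizes[1:])
    params = sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )
    return neurons, params

# Hypothetical tiny network: 4 inputs, 8 hidden units, 2 outputs.
neurons, params = count_mlp([4, 8, 2])
print(neurons, params)
```

The parameter count grows much faster than the neuron count, because every neuron connects to every unit in the previous layer.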

The interesting thing is that when you have neural networks of that size and you feed large amounts of data into them, emergent properties start to show up. More than just "predicting the next word", the model starts to develop a relational understanding of certain words that you wouldn't expect. It's been shown that LLMs understand things like the fact that Miami and Houston are closer together than New York and Paris.

Those kinds of things aren't programmed, they are emergent from the dataset.
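For reference, here's the ground truth that kind of result gets compared against, as a quick sketch. It only computes real great-circle distances with the haversine formula; it says nothing about how a model stores them internally. The coordinates are approximate city-center values that I filled in myself.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))  # 6371 km = mean Earth radius

# Approximate coordinates (assumed, not from the thread).
miami, houston = (25.7617, -80.1918), (29.7604, -95.3698)
new_york, paris = (40.7128, -74.0060), (48.8566, 2.3522)

print(round(haversine_km(miami, houston)), "km Miami to Houston")
print(round(haversine_km(new_york, paris)), "km New York to Paris")
```

Nobody hands the model a table like this. It only sees text, so any sense of which pair is closer has to come out of the data.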

As for things like creativity, they are absolutely creative. I have made seemingly impossible requests (like a Harlequin story about the Terminator and Rambo) and what it came up with was actually astounding.

They regularly use tools. LangChain is a thing. There's a new LLM-based agent called Devin that can program, look up docs online, and use a command line terminal. That's using a tool.
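Roughly, tool use looks like this: the model emits a structured request, and ordinary code around it runs the tool and hands the result back. This is a bare sketch with a stubbed model. `fake_model`, `TOOLS`, and `run_with_tools` are hypothetical names, and none of it is LangChain's or Devin's actual API.

```python
import json

def fake_model(prompt):
    """Stand-in for an LLM (hypothetical): always requests the 'add' tool."""
    return json.dumps({"tool": "add", "args": [2, 3]})

# Registry of tools the surrounding code is willing to run for the model.
TOOLS = {"add": lambda a, b: a + b}

def run_with_tools(prompt):
    """Parse the model's tool request, run the matching tool, and report the result."""
    request = json.loads(fake_model(prompt))
    tool = TOOLS[request["tool"]]
    result = tool(*request["args"])
    shown_args = ", ".join(map(str, request["args"]))
    return f"{request['tool']}({shown_args}) = {result}"

print(run_with_tools("what is 2 + 3?"))
```

In a real setup the result would be fed back to the model as more context so it can decide what to do next.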

That also ties in with problem solving. Problem solving is actually one of the benchmarks that researchers use to evaluate LLMs. So they do problem solving.

Problem solving requires the ability to do analysis. So that box is ticked too.

Just about anything that's a neural network can be called an AI, because the whole is usually greater than the sum of its parts.

Edit: I wrote interconnected layers when I meant neurons