Technology

Anthropic has developed an AI 'brain scanner' to understand how LLMs work and it turns out the reason why chatbots are terrible at simple math and hallucinate is weirder than you thought (www.pcgamer.com)

submitted 1 day ago by [email protected] to c/[email protected]

[–] [email protected] 36 points 1 day ago (26 children)

This is one of the most interesting things about LLMs that I have ever read.

[–] [email protected] 13 points 1 day ago (25 children)

That bit about how it turns out they aren't actually just predicting the next word is crazy, and it kinda blows the whole "It's just a fancy text auto-complete" argument out of the water IMO.

[–] [email protected] 36 points 23 hours ago (6 children)

It really doesn't. You're just describing the "fancy" part of "fancy autocomplete." No one was ever really suggesting that they only predict the next word. If that were the case, they would just be autocomplete, with nothing fancy about it.

What's being conveyed by "fancy autocomplete" is that these models ultimately operate by combining the most statistically likely elements of their dataset, with some random noise applied. More noise creates more "creative" (meaning more random, less probable) outputs. They do not actually "think" as we understand thought. This can clearly be seen in the examples given in the article, especially those to do with math. The model is throwing together elements that are statistically proximate to the prompt. It's not actually applying a structured, logical method the way humans can be taught to.
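To make the "more noise, less probable outputs" point concrete, here is a rough sketch of temperature sampling, which is one common way that randomness gets applied. I'm assuming that's what "noise" refers to here. The candidate tokens, the scores, and the function names are all made up for illustration; this isn't how any particular model is implemented.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(tokens, logits, temperature, rng):
    """Pick one token at random, weighted by its temperature-adjusted probability."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]


# Hypothetical candidates and scores for the next token after "2 + 2 =".
tokens = ["4", "5", "fish"]
logits = [5.0, 2.0, 0.5]

cold = softmax_with_temperature(logits, 0.5)  # low temperature: top choice dominates
hot = softmax_with_temperature(logits, 2.0)   # high temperature: unlikely tokens gain weight

rng = random.Random(0)  # seeded only so repeated runs behave the same
print("cold:", [round(p, 3) for p in cold])
print("hot: ", [round(p, 3) for p in hot])
print("sampled:", sample_token(tokens, logits, 1.0, rng))
```

At the low temperature nearly all of the probability sits on the top-scoring token; at the high one the unlikely candidates get picked more often, which is the "creative" behaviour described above.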

[–] [email protected] 15 points 20 hours ago (1 children)

Unfortunately, these articles are often written by people who don't know enough to realize they're missing important nuances.

[–] [email protected] 6 points 9 hours ago

It also doesn't help that the AI companies deliberately use language that makes their models seem more human-like and cogent. Saying, for example, that the model "thinks" in "conceptual spaces" is misleading imo. It abuses our innate tendency to anthropomorphize, which I guess is very fitting for a company with that name.

On this point I can highly recommend this open-access and accessibly written article: https://link.springer.com/article/10.1007/s10676-024-09775-5 (the authors also appear on an episode of the Better Offline podcast)
