The Anthropic one is saying they think they have a way to figure it out, but it hasn’t been tested on large models. This is their last paragraph:
Again, all your quotes indicate that what they've figured out is a way to inspect the interior state of models and transform the vector space into something humans can understand without analyzing the output.
I think your confusion is that you believe that because we don't know what the vector space is on the inside, we don't know how AI works. But we actually do know how it accomplishes what it accomplishes. The fact that its interior is a black box doesn't mean we don't understand how we built that black box, or how it operates and functions.
For an overview of how various kinds of LLMs function, here's a good paper: https://arxiv.org/pdf/2307.06435.pdf You'll note that nowhere is there any confusion about how they process input or generate output. It is all extremely well understood. You are correct that we cannot interrogate their internals, but that is not what I mean, at least, when I say that we understand them and how they work.
I also can't inspect the electrons moving through my computer's CPU. Does that mean we don't understand how computers work? Is there intelligence in there?
I think you’re maybe having a hard time with using numbers to represent concepts. While a lot less abstract, we do this all the time in geometry. ((0, 0), (10, 0), (10, 10), (0, 10), (0, 0)) What’s that? It’s a square. Word vectors work differently but have the same outcome (albeit in a more abstract way).
No, that is not my main objection. It is your anthropomorphization of data and LLMs -- your claim that they "have intelligence." From your initial post:
But also, can you define what intelligence is? Are you sure it isn’t whatever LLMs are doing under the hood, deep in hidden layers?
I think you're getting caught up in trying to define what intelligence is, but I am simply stating what it is not. It is not a complex statistical model with no self-awareness, no semantic understanding, no ability to learn, no emotional or ethical dimensionality, no qualia...
((0, 0), (10, 0), (10, 10), (0, 10), (0, 0))
is a square to humans. This is the crux of the problem: it is not a "square" to a computer because a "square" is a human classification. Your thoughts about squares are not just more robust than GPT's, they are a different kind of thing altogether. For GPT, a square is a token that it has been trained to use in a context-appropriate manner with no idea of what it represents. It lacks semantic understanding of squares. As do all computers.
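As a rough illustration of that point (my own sketch, not anything GPT does internally): the coordinates are just numbers, and the label "square" only shows up because a human-written rule assigns it. The `is_square` helper and the three-dimensional word vectors below are made up for the example; real embeddings are learned from data and have far more dimensions.

```python
import math

# The closed path from the quote: four corners, with the first point repeated at the end.
path = [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)]


def is_square(points):
    """Hypothetical helper: True if a closed 5-point path traces a square."""
    if len(points) != 5 or points[0] != points[-1]:
        return False
    corners = points[:4]
    # Four side lengths, walking the corners in order.
    sides = [math.dist(corners[i], corners[(i + 1) % 4]) for i in range(4)]
    # Two diagonal lengths.
    diagonals = [math.dist(corners[0], corners[2]),
                 math.dist(corners[1], corners[3])]
    # Equal non-zero sides make a rhombus; equal diagonals make it a square.
    return (sides[0] > 0
            and all(math.isclose(s, sides[0]) for s in sides)
            and math.isclose(diagonals[0], diagonals[1]))


# Toy word vectors, invented for illustration only (not taken from any real model).
vectors = {
    "square": [0.9, 0.1, 0.0],
    "rectangle": [0.8, 0.3, 0.0],
    "banana": [0.0, 0.1, 0.9],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


print(is_square(path))
print(cosine(vectors["square"], vectors["rectangle"]))
print(cosine(vectors["square"], vectors["banana"]))
```

Nothing in either function knows what a square is. The first applies a definition a person wrote down, and the second only measures how close two lists of numbers are.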
If you’re saying that intelligence and understanding is limited to the human mind, then please point to some non-religious literature that backs up your assertion.
I'm disappointed that you're asking me to prove a negative. The burden of proof is on you to show that GPT-4 is actually intelligent. I don't believe intelligence and understanding are for humans only; animals clearly show them too. But GPT-4 does not.
I mean, your argument is still basically that it's thinking inside there; everything I've said is germane to that point, including what GPT-4 itself has said.