I don't really follow your logic. How else would you propose to shape the audio in a way that is not "just an effect"?
Your real-life analogy doesn't take into account that the audio source itself is moving, so there is an extra variable beyond the stereo signal, which is what spatial audio is modelling.
And your muffling example sounds a bit oversimplified, maybe? My understanding is that the spatial effect is produced by phase-shifting the L/R signals slightly.
Finally, why not go further? "I don't listen to speaker audio because it's all just effects and mirages made to sound like a real sound. What, only 2^16 discrete positions the diaphragm can be in?" :p
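To make the phase-shifting point concrete, here is a toy sketch of the simplest version of the idea: delay one channel by a fraction of a millisecond so the sound seems to arrive at one ear first. The 0.5 ms delay, the 44.1 kHz sample rate, and the function names are all assumptions picked for illustration, and this is not a claim about how any particular spatial audio product works.

```python
import math

SAMPLE_RATE = 44_100  # Hz; assumed sample rate for this sketch


def delay_samples(itd_seconds, sample_rate=SAMPLE_RATE):
    """Convert a left/right time difference into a whole number of samples."""
    return round(itd_seconds * sample_rate)


def pan_by_delay(mono, itd_seconds, sample_rate=SAMPLE_RATE):
    """Return (left, right) where the right channel lags the left.

    Both channels carry the same signal; only the timing differs.
    Each channel is zero-padded so the two stay the same length.
    """
    n = delay_samples(itd_seconds, sample_rate)
    left = list(mono) + [0.0] * n    # plays immediately, padded at the end
    right = [0.0] * n + list(mono)   # same signal, starting n samples later
    return left, right


# A short 440 Hz test tone (hypothetical input signal).
tone = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(1000)]

# Delay the right channel by an assumed 0.5 ms.
left, right = pan_by_delay(tone, 0.0005)
```

Nothing about the signal itself changes here, only when each ear gets it, which is the sense in which it is more than a muffling filter.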
Relative point to point
You could say the Linux kernel is an astronomically terrible idea because it doesn't do anything... but it is just the platform. The good comes from what people build on top of it, which adds all the quality-of-life features you miss.