this post was submitted on 20 Oct 2024
93 points (100.0% liked)

[–] [email protected] 11 points 3 weeks ago (3 children)

CPUs have so many cores these days that this seems like a perfectly reasonable option. Declare a process 'security sensitive,' give it its own core and memory, then wipe them when done.
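
For what it's worth, the "give it its own core" half of this can be roughed out on Linux today. This is only a minimal sketch: `pin_to_one_core` is a made-up helper name, and it assumes a platform where Python exposes `os.sched_setaffinity` (Linux does, macOS and Windows do not). It restricts one process to a single core; it does not keep other processes off that core, and it does nothing about caches shared with other cores.

```python
import os
from typing import Optional, Set


def pin_to_one_core(pid: int = 0) -> Optional[Set[int]]:
    """Restrict `pid` (0 means the calling process) to one allowed CPU.

    Returns the new affinity set, or None if the platform lacks the call.
    `pin_to_one_core` is a hypothetical helper, not an existing API.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # affinity calls are not available on this platform

    allowed = os.sched_getaffinity(pid)  # CPUs this process may run on now
    core = min(allowed)                  # arbitrary pick for this sketch
    os.sched_setaffinity(pid, {core})    # from here on, only `core` is used
    return set(os.sched_getaffinity(pid))


if __name__ == "__main__":
    print("now restricted to:", pin_to_one_core())
```

Actually reserving the core so nothing else gets scheduled on it needs extra system-level setup (cpusets or similar), which this sketch does not attempt.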

[–] [email protected] 4 points 3 weeks ago

The trouble is that a "core" is just that: the heart of the processor. There's a lot of shared state in the caches and the TLBs, which is common to multiple cores.

[–] [email protected] 4 points 3 weeks ago

I wish it were that easy, but there's a lot of shared architecture in CPU design. If some cache lines are shared, for example, those would have to be disabled.

Architecturally, maybe you could add memory tagging for cache lines that, in addition to looking at the TLB and physical addresses, also looks at memory spaces. So if you're addressing something that's in the cache for a different memory space, even on another complete processor, you have to take the full hit of going out to main memory.

But even then it's not perfect, because if you're invalidating the cache of another core there is going to be some memory penalty, probably infinitesimal compared to going to main memory, but it might be measurable. I'm almost certain it would be measurable. So it's still a side-channel attack.
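
To make the "measurable penalty becomes a side channel" point concrete, here is a toy prime-and-probe simulation. Everything in it is an assumption for illustration: the cache is a tiny direct-mapped model, the latencies are made-up numbers in arbitrary units, and `SharedCache`, `victim` and `attacker_recovers` are hypothetical names. It is a sketch of the idea, not a real attack and not how any particular CPU behaves.

```python
HIT_COST, MISS_COST = 1, 100  # made-up latencies, arbitrary units
NUM_SETS = 8                  # toy direct-mapped cache with 8 sets


class SharedCache:
    """Toy cache shared by two parties; each set holds one line."""

    def __init__(self):
        self.lines = [None] * NUM_SETS  # (owner, address) stored per set

    def access(self, who, address):
        """Return the simulated latency of one access."""
        index = address % NUM_SETS
        tag = (who, address)
        if self.lines[index] == tag:
            return HIT_COST
        self.lines[index] = tag  # miss: evict whatever was in this set
        return MISS_COST


def victim(cache, secret):
    # The set the victim touches depends on its secret value.
    cache.access("victim", secret)


def attacker_recovers(cache, secret):
    # Prime: fill every set with the attacker's own lines.
    for s in range(NUM_SETS):
        cache.access("attacker", s)
    # The victim runs and evicts exactly one attacker line.
    victim(cache, secret)
    # Probe: time each set again; the slow one is the set the victim used.
    timings = [cache.access("attacker", s) for s in range(NUM_SETS)]
    return timings.index(max(timings))


if __name__ == "__main__":
    print(attacker_recovers(SharedCache(), secret=5))
```

The attacker never reads the victim's data. It only sees which of its own accesses got slower, which is why even a small timing difference is enough.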

One mitigation that does come to mind is running each program in a virtual machine, so that it's guaranteed to have a completely different physical address space. This is really heavy-handed, and I have seen some papers about side-channel attacks leaking information from co-resident guest VMs in AWS. But it certainly reduces the risk surface.

[–] [email protected] 2 points 3 weeks ago* (last edited 3 weeks ago)

The only way to do that is to completely disable out-of-order execution to begin with and disable any shared caches, which would completely neuter modern CPUs. And not by a little bit: you'd be left with around 30% of the prior performance. That's not a 30% loss, that's a 70% loss...


From ChatGPT- (query: How much performance would a modern Zen 5 or Intel Alder Lake CPU lose if you completely stripped out/disabled SMT, Out of Order Execution and shared caches - operating in-order and only using dedicated (non-shared) caches?)

Stripping out or disabling key performance-enhancing features like Simultaneous Multithreading (SMT), Out-of-Order Execution (OoOE), and shared caches from a modern CPU based on architectures like AMD's Zen 5 or Intel's Alder Lake would result in a significant performance loss. Here's an overview of the potential impact from disabling each feature:

  1. Simultaneous Multithreading (SMT)

    Impact: SMT allows a single core to execute multiple threads simultaneously, improving CPU throughput, especially in multi-threaded applications. Disabling SMT would reduce the ability to handle multiple threads per core, decreasing performance for multi-threaded workloads.

    Expected Loss: Performance drop can be around 20-30% in workloads like video encoding, rendering, and heavily threaded applications. However, single-threaded performance would remain relatively unaffected.

  2. Out-of-Order Execution (OoOE)

    Impact: OoOE allows the CPU to execute instructions as resources become available, rather than in strict program order, maximizing utilization of execution units. Disabling OoOE forces the CPU to operate in-order, meaning that it would stall frequently when waiting for data dependencies or slower operations, like memory access.

    Expected Loss: This could lead to performance drops of 50% or more in general-purpose workloads because modern software is optimized for OoOE processors. Tasks like complex branching, memory latency hiding, and speculative execution would suffer greatly.

  3. Shared Caches (L2, L3)

    Impact: Shared caches (particularly L3 caches) help reduce memory latency by sharing frequently accessed data among multiple cores. Disabling shared caches would increase memory access latency, causing more frequent trips to slower main memory.

    Expected Loss: Performance could drop by 15-30% depending on the workload, especially for applications that benefit from high cache locality, such as database operations, scientific simulations, and gaming.

  4. Operating In-Order Only with Dedicated Caches

    Overall Impact: Without OoOE and SMT, and with only in-order execution and dedicated caches, the CPU would be much less efficient at handling multiple tasks and hiding latency. Modern CPUs rely heavily on OoOE to keep execution units busy while waiting for slow memory operations, so forcing in-order execution would significantly stall the CPU.

    Expected Loss: Depending on the workload, the overall performance degradation could be upwards of 70-80%. Some specialized applications that rely on high parallelism and efficient cache usage might perform even worse.

Summary of Overall Performance Impact:

  • Single-threaded tasks: May see performance drop by 50-70% depending on reliance on OoOE and cache efficiency.

  • Multi-threaded tasks: Could experience a combined drop of 70-80%, as the lack of SMT, OoOE, and shared caches compounds the inefficiencies.

This hypothetical CPU configuration would essentially mimic designs seen in early microprocessors or microcontrollers, sacrificing the massive parallelism, latency hiding, and overall efficiency that modern architectures provide. The performance would be more in line with processors from a couple of decades ago, despite the higher clock speeds and core counts.
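
As a rough sanity check on how the per-feature figures above relate to the combined one, here is a small arithmetic sketch. It assumes, purely for illustration, that the three losses are independent and compound multiplicatively, and it takes 50% as the OoOE loss at both ends because the quoted answer only says "50% or more". The `retained` helper is a made-up name. Real workloads will not combine this cleanly.

```python
def retained(losses):
    """Fraction of performance left after applying each loss in turn.

    Assumes the losses are independent and multiply, which is only a
    simplification for this sketch.
    """
    fraction = 1.0
    for loss in losses:
        fraction *= 1.0 - loss
    return fraction


# Low and high ends of the quoted ranges: SMT, OoOE, shared caches.
low_end = retained([0.20, 0.50, 0.15])
high_end = retained([0.30, 0.50, 0.30])

if __name__ == "__main__":
    print(f"low end:  {low_end:.1%} retained, {1 - low_end:.1%} lost")
    print(f"high end: {high_end:.1%} retained, {1 - high_end:.1%} lost")
```

Under those assumptions the combined loss comes out at roughly 66% to 75.5%, which is in the same neighbourhood as "around 30% of the prior performance" and the 70-80% figure quoted above.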


Case in point: it's not feasible. If you're looking for that on your own computer, you can do it already. I doubt anyone will follow you, though.