Trust Boundaries in Kernel Anti-Cheat Systems

Disclosure note: I completed responsible disclosure before publishing this writeup. The goal here is to discuss trust boundaries and defensive architecture, not to provide a cheating guide.

Abstract

This post is about a question I kept running into while working on Hv2, a small AMD-V/SVM hypervisor I built as an undergraduate systems project:

How much can a kernel anti-cheat actually prove if the thing it is trying to reason about may be below the kernel?

That question is more interesting than "can you hide X from Y?" I do not want this writeup to be a playbook for cheating in games. The useful part, at least to me, is the trust-boundary problem. Kernel anti-cheat is powerful, and systems like Easy Anti-Cheat and Riot Vanguard do a lot of real work. But ring 0 is not the bottom of the machine. If a hypervisor sits underneath the OS, then the OS is a guest, and guest-visible evidence has limits.

My conclusion after building and testing this is pretty simple: software checks inside the guest can raise the cost of abuse, but they cannot be the final source of truth. For strong platform integrity, anti-cheat vendors need hardware-backed measured boot and remote attestation.

All findings were responsibly disclosed to EAC and Vanguard before publication.

Background

What is a Type-1 Hypervisor?

A type-1, or bare-metal, hypervisor runs below the operating system. The OS still thinks it is running normally, but memory translation, CPU state, and some execution events are now mediated by a lower layer.

That placement is the whole point. A guest can inspect what it can see. It can ask the CPU questions, walk kernel lists, time operations, and scan memory. What it cannot always do is prove that its view is the raw hardware view. Without some hardware-rooted measurement, the guest is partly measuring a world that can be shaped from below it.

Hv2 virtualizes a running Windows system using AMD Secure Virtual Machine (SVM) extensions. It uses Nested Page Tables (NPT) for second-level address translation and includes some memory introspection primitives for lab testing. I built it to understand the mechanics, not to make a production tool.

The Threat Model

Kernel anti-cheat systems usually assume they are one of the most privileged pieces of software on the client. That is a reasonable assumption for a lot of commodity cheating, and it is why kernel drivers are useful in the first place.

Typical signals include:

Loaded kernel modules
Driver signing state
Hypervisor-related CPU features
Timing artifacts
Unknown executable memory
Odd physical allocation patterns

All of these are useful. None of them are magic.

The uncomfortable part is that a hypervisor changes the observer relationship. The anti-cheat is still privileged inside Windows, but Windows is now the guest. So the question becomes less "can the kernel inspect the machine?" and more "which parts of the machine is the kernel actually in a position to inspect authoritatively?"

Hv2 Architecture

Hv2 is small and very much a research project. At a high level, it:

Starts virtualization across all logical processors.
Maintains per-core SVM state.
Uses nested paging to control the guest's memory view.
Records timing and memory-observation behavior from below the Windows kernel.

I am not going to turn this into source-level documentation, but the implementation details do matter. The trust-boundary argument is only convincing if it is grounded in real mechanisms: VMEXIT behavior, nested paging, process CR3 translation, CPU feature reporting, and timing.

The important fact is this: after launch, Windows keeps running, but it is running as a guest. From inside Windows, things still look mostly normal. From the hypervisor layer, the kernel is no longer the deepest observer.

What I Actually Validated

The main technical work was building enough of the hypervisor to test real trust-boundary questions instead of only reasoning about them abstractly. In the lab, I validated:

Live virtualization of an already-running Windows 11 system on AMD hardware.
Per-core SVM initialization and guest-state setup across 24 logical processors.
Nested page table management for controlling guest memory visibility.
CR3-based guest page-table walking for process memory read/write tests without relying on Windows memory-copy APIs.
VMEXIT handling for selected CPU events and memory-translation faults.
Timing behavior around virtualization events and guest-visible measurements.
The difference between module-level visibility and physical memory attribution.
The practical limits of trying to make a guest-local scanner authoritative.

That last point was the most important one. The interesting result was not a single detector or countermeasure. It was seeing how quickly every software signal becomes a question of observer placement: who is measuring, from what layer, and what can that layer actually prove?

Technical Mechanisms Used

The core of the project was not one trick. It was a set of low-level mechanisms working together:

Live SVM launch: Hv2 virtualizes the already-running Windows system rather than booting a separate guest. That means it has to capture enough live CPU state for Windows to keep executing after VMRUN.
Per-core VMCB state: Each logical processor needs its own virtualization state. Segment registers, control registers, syscall MSRs, interrupt state, and guest RIP/RSP all have to be reconstructed accurately.
Nested Page Tables: NPT adds a second translation layer that maps guest physical addresses to system physical addresses. This is the basis for observing memory from outside the guest's normal page-table model.
CR3-based page-table walking: For process memory introspection, Hv2 resolves virtual addresses by walking the target process's page tables from its CR3. That avoids relying on Windows kernel memory-copy functions for the actual translation.
Read/write memory primitives: Once a guest virtual address is resolved to a physical page, Hv2 can perform controlled reads and writes through its own translation path. In this project, that was used only for lab testing and validation.
VMEXIT handling: Selected CPU events trap into the VMM, where the handler can inspect guest state, update virtualized state, and resume execution.
MSR virtualization: Some MSR reads and writes need to be virtualized so the guest sees consistent architectural state while the host keeps the state required for SVM operation.
Timing observation: VMEXITs, nested page faults, and intercepted operations all have measurable cost. A lot of the project came down to understanding which timing sources can see that cost.
Physical allocation analysis: VMCBs, host save areas, MSR bitmaps, and paging structures leave physical memory footprints. These are useful to think about because they are below ordinary module enumeration.

That combination is what made the project interesting. It was not just "kernel driver does weird thing." It was CPU virtualization, Windows state capture, page-table translation, timing, and memory attribution all meeting at the same boundary.

Memory Introspection

Hv2 includes memory introspection code for resolving and inspecting guest memory from the hypervisor layer. That sounds spooky in an anti-cheat context, but the same broad class of technique shows up in normal security work too: VM monitors, sandboxes, malware analysis, forensics, and endpoint research.

The important technical piece is the address translation path. A normal kernel component usually asks Windows to help with process memory: attach to a process, copy memory, or use OS-provided helpers. Hv2 does not need that for the core translation. It starts from the target process CR3, walks the guest page tables, resolves the backing physical page, and then accesses that page from the hypervisor layer.
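The translation path described above can be illustrated with a toy model. This is not Hv2's code. It is a minimal simulation of a standard x86-64 four-level walk, assuming 4 KiB pages only (large pages are ignored) and a hypothetical `read_qword` callback standing in for physical memory access. In a real nested-paging setup, the addresses stored in guest page tables are guest physical addresses and need their own translation, which this model skips.

```python
PRESENT = 1 << 0                      # bit 0 of a page-table entry
ADDR_MASK = 0x000FFFFFFFFFF000        # bits 12..51 hold the next table/page address


def split_virtual_address(va):
    """Split a 48-bit virtual address into (pml4, pdpt, pd, pt, offset)."""
    offset = va & 0xFFF
    pt = (va >> 12) & 0x1FF
    pd = (va >> 21) & 0x1FF
    pdpt = (va >> 30) & 0x1FF
    pml4 = (va >> 39) & 0x1FF
    return pml4, pdpt, pd, pt, offset


def translate(read_qword, cr3, va):
    """Walk four levels from cr3; return the physical address or None.

    read_qword(pa) is a hypothetical helper returning the 64-bit entry at pa.
    """
    table = cr3 & ADDR_MASK
    *indices, offset = split_virtual_address(va)
    for index in indices:
        entry = read_qword(table + index * 8)   # each entry is 8 bytes
        if not entry & PRESENT:
            return None                         # not mapped at this level
        table = entry & ADDR_MASK
    return table + offset
```

The point of the sketch is the authority chain: the only inputs are a CR3 value and a way to read physical memory, so nothing in the walk depends on an OS-provided copy routine.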

That matters because it changes who is doing the observation. A guest scanner and a lower-level monitor can both talk about "the same" process address, but they do not necessarily reach it through the same authority chain.

Execute-Only Visibility and Timing

Nested paging lets a hypervisor reason about instruction fetches and data reads differently. That is useful for research because it gives you a way to study what happens when code visibility is mediated below the guest kernel.

There is a catch, though: changing what the guest sees usually costs something. Even if one timer looks clean, another measurement source may still expose the work being done underneath. That became one of the more important results of the project. A lower layer can shape visibility, but it does not get to delete physics.

Detection Surfaces

1. Hypervisor Presence Checks

Anti-cheat systems and diagnostic tools can look for signs that Windows is running under a hypervisor: CPU feature leaves, vendor identifiers, strange instruction timing, and so on.

The common example is CPUID. A lot of VM detection code checks the hypervisor-present bit or looks for a hypervisor vendor leaf. That catches many normal virtualized environments, but it is not a strong signal against a hypervisor that is intentionally trying to look ordinary. On AMD, the hardware does not force a useful "I am virtualized" answer for every possible setup: the hypervisor-present bit is a software convention that the lower layer chooses to report. Much of what a guest sees through CPUID is a policy decision made by the layer below it.
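To make the limits of this check concrete, here is a minimal sketch of how the two common CPUID signals are usually decoded. Python cannot execute CPUID itself, so the functions take raw register values that are assumed to have been collected elsewhere. The function names are hypothetical. The bit position (leaf 1, ECX bit 31) and the vendor leaf register order (EBX, ECX, EDX at leaf 0x40000000) follow the usual convention.

```python
import struct

HYPERVISOR_PRESENT = 1 << 31  # CPUID leaf 1, ECX bit 31


def hypervisor_bit_set(leaf1_ecx):
    """True if the guest-visible hypervisor-present bit is set."""
    return bool(leaf1_ecx & HYPERVISOR_PRESENT)


def vendor_string(ebx, ecx, edx):
    """Decode the 12-byte vendor ID from leaf 0x40000000 (EBX, ECX, EDX order)."""
    raw = struct.pack("<III", ebx, ecx, edx)
    return raw.rstrip(b"\x00").decode("ascii", errors="replace")


def interpret(leaf1_ecx, vendor_regs=None):
    """Summarize what the CPUID interface reported, and nothing more."""
    if hypervisor_bit_set(leaf1_ecx):
        vendor = vendor_string(*vendor_regs) if vendor_regs else "unknown"
        return f"hypervisor reported (vendor: {vendor})"
    # A clear bit only means this interface did not report virtualization.
    return "no hypervisor reported; this is not proof of bare metal"
```

Note that the negative branch deliberately avoids the word "clean". The check reads an interface, not the platform.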

There is also a timing problem. If a hypervisor intercepts CPUID to edit the result, that interception itself can become measurable because the instruction now causes a VMEXIT. A more careful hypervisor may choose not to intercept CPUID at all, or may minimize interception enough that CPUID-based checks stop telling you much. In that case the absence of a CPUID signal does not prove bare metal. It only proves that this particular interface did not report virtualization.

This is the pattern that shows up over and over: hypervisor presence is not one bit. CPUID, MSR behavior, control register state, descriptor tables, loaded modules, pool allocation patterns, physical memory layout, nested paging side effects, and timing behavior can all provide evidence. But none of them, alone, gives the guest a complete view of the platform.

In my testing, CPUID and simple timing checks were only the starting point. I also looked at signals around intercepted MSRs, guest-visible CPU state, module attribution, pool and contiguous allocation fingerprints, and nested paging behavior. Some of those signals are much better than others. The common weakness is that they are still observations made from inside the guest.

The guest can observe the CPU interface it was given. It cannot automatically prove that the interface is the full platform state.

2. MSR and CPU-State Artifacts

MSRs are a useful place to look because they sit right at the boundary between architectural CPU state and OS expectations. A hypervisor may need to virtualize selected MSRs so the guest sees one value while the host keeps another value required for virtualization.

That sounds like an obvious detection angle, and sometimes it is. If the guest can observe inconsistent MSR behavior, unexpected intercept latency, or state that does not match the rest of the machine, that is useful evidence. But it is also easy to overstate. A careful VMM can keep guest-visible state coherent for common reads and writes, and some MSR paths are not hot on every Windows configuration.

The engineering lesson for me was that "intercept everything" is the wrong default. Every intercept is both a correctness burden and a timing burden. The cleaner design is to intercept only what the hypervisor actually needs to control, keep guest-visible state internally consistent, and understand which events are frequent enough to become timing signals.

3. Kernel Module Enumeration

Module enumeration is still one of the first things I would expect a kernel anti-cheat to do. If an unknown driver was loaded through normal Windows paths, the module list is a good place to find it.

The limitation is that the loader's view is not the same thing as a complete map of privileged state. The list tells you what Windows knows it loaded. It does not prove that nothing else has influenced the system from outside that accounting model.

So module enumeration is valuable, but it is evidence, not proof of absence.

4. Timing-Based Detection

Virtualization tends to leave timing artifacts. VM exits, nested translation behavior, intercepted operations, and page-table effects can all add latency.

Timing is appealing because it can reveal things that do not show up in a clean module list. For example, an instruction that normally runs mostly in hardware may become much slower if the hypervisor intercepts it. The same applies to memory accesses that trigger nested page faults or permission changes. If a scanner repeatedly measures an operation and sees a consistent latency cliff, that can be evidence that something below the guest is handling the event.
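A minimal sketch of the "latency cliff" idea, assuming the scanner already has two lists of positive cycle counts: one for a reference operation and one for the operation under suspicion. The function names and thresholds (`min_ratio=5.0`, three times the median absolute deviation) are illustrative assumptions, not values from my testing. Medians are used so that a single interrupt or scheduler hiccup does not dominate the result.

```python
from statistics import median


def robust_summary(samples):
    """Return (median, median absolute deviation) for a list of samples."""
    center = median(samples)
    spread = median([abs(s - center) for s in samples])
    return center, spread


def latency_cliff(baseline, candidate, min_ratio=5.0):
    """Compare candidate latency to baseline; return (ratio, flagged).

    Assumes positive samples. The thresholds are arbitrary illustration values.
    """
    base_center, base_spread = robust_summary(baseline)
    cand_center, _ = robust_summary(candidate)
    ratio = cand_center / base_center
    # Require the gap to clear the baseline's own noise floor as well.
    clear_of_noise = (cand_center - base_center) > 3 * base_spread
    return ratio, ratio >= min_ratio and clear_of_noise
```

A flag from a function like this is evidence that something is handling the event, not an identification of what that something is.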

The hard part is choosing the clock. RDTSC and RDTSCP are easy to use from the guest, but they are also exactly the kind of source a virtualization layer may be able to virtualize or offset. Hardware performance counters are more interesting because, depending on configuration, they can count actual core cycles, retired instructions, or other core events. Server-side timing is another category entirely: it does not need to see the client internals if it can collect enough samples and notice a statistical pattern.

The most interesting timing cases in my testing were not generic "VMs are slower" checks. They were specific sources of exit or translation cost:

Instruction interception: CPUID, selected MSRs, or other trapped instructions can become visibly slower if they cause VMEXITs.
Nested page faults: NPT permission changes or unmapped regions can introduce latency when the CPU has to leave the guest to resolve the access.
Timer normalization: A VMM may be able to keep guest-visible TSC deltas looking reasonable, but that does not mean every timing source agrees.
External measurement: Server-side timing and hardware counters are harder for the guest to explain away because they are not just self-reported wall-clock measurements.

This is why I think timing is one of the better software signals, but not a clean proof. CPU model, firmware, power management, topology, background load, and the exact measurement source all matter. A small timing difference can be a signal, or it can be noise wearing a convincing hat. The more independent the measurement source is from the guest, the more useful it becomes.

The strongest timing evidence comes from outside the guest's influence: hardware counters that are not virtualized in the same way, or server-side statistics gathered over many samples. Guest-local timing is useful, but I would not want it to carry the whole case by itself.

5. Pool and Allocation Fingerprints

Low-level systems code often leaves fingerprints in memory. Tags, allocation sizes, alignment requirements, and physical layout can all be suspicious if they do not match normal kernel behavior.

This is one of the better software-level angles because it can catch things that do not appear in normal module enumeration. But again, the hard part is attribution. Some virtualization structures have to exist because the hardware requires them: VMCBs, host save areas, MSR bitmaps, NPT tables, and per-core state all leave memory somewhere. Even if names and module metadata are clean, physical memory layout can still look unusual.

A guest scanner might find weird-looking physical memory, but deciding exactly what it means from inside the guest is harder. Is it a driver allocation? Firmware state? Hypervisor state? Device memory? A tool can build suspicion from allocation shape and ownership gaps, but it is still reasoning from the guest side of the boundary.

This was one of the places where the research felt less like "one clever detector wins" and more like accumulating evidence. You can build a case. You still need to know where your authority stops.

6. Nested Paging Side Channels

Nested paging is powerful, but it also creates side channels. If memory visibility depends on how a page is accessed, transitions between those views can leave timing evidence.

That is the tradeoff I kept coming back to. You can move a signal around. You can reduce one artifact and accidentally make another one more important. But if a lower layer is doing work on behalf of the guest, some measurement source may still notice.

The broader lesson is that nested paging does not make the trust-boundary problem disappear. It just makes the boundary more obvious.

What Can Software Actually Prove?

After going through the detection surfaces, I think the picture looks roughly like this:

CPU feature checks: useful, but not authoritative from inside the guest.
Kernel module lists: useful, but only describe what the guest loader knows about.
Driver signing state: useful, but not enough to prove there is no lower layer.
Guest-local timing: useful, but only partially authoritative.
Pool and allocation patterns: useful, especially for suspicious memory shape, but still attribution-limited.
Physical memory attribution: useful, but difficult to interpret completely from the guest.
Hardware-backed attestation: authoritative when verified externally.
Server-side statistical analysis: authoritative for behavior observed outside the client.
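One way to keep this distinction honest in code is to tag each signal with where it is observed from, and to refuse to report "clean" when only guest-local signals were consulted. The sketch below is an illustration of that idea only. The enum, the signal names, and the result strings are all hypothetical.

```python
from enum import Enum


class Authority(Enum):
    GUEST_LOCAL = "guest-local"   # observed from inside the guest
    EXTERNAL = "external"         # verified outside the guest


# Hypothetical signal names mirroring the list above.
SIGNALS = {
    "cpu_feature_checks": Authority.GUEST_LOCAL,
    "kernel_module_lists": Authority.GUEST_LOCAL,
    "driver_signing_state": Authority.GUEST_LOCAL,
    "guest_local_timing": Authority.GUEST_LOCAL,
    "pool_allocation_patterns": Authority.GUEST_LOCAL,
    "physical_memory_attribution": Authority.GUEST_LOCAL,
    "hardware_backed_attestation": Authority.EXTERNAL,
    "server_side_statistics": Authority.EXTERNAL,
}


def assess(observations):
    """observations maps a signal name to True (suspicious) or False (nothing found)."""
    if any(observations.values()):
        return "suspicious"
    if any(SIGNALS[name] is Authority.EXTERNAL for name in observations):
        return "no evidence found, including external sources"
    return "no guest-visible evidence (not proof of absence)"
```

Any positive signal is worth acting on, but a run of negatives from guest-local sources never upgrades itself into a platform-level claim.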

I do not read this as "kernel anti-cheat is useless." That would be wrong. Kernel anti-cheat raises the bar a lot.

The real point is narrower: if the thing you care about happened before the OS booted, or below the OS after boot, then the OS should not be the only witness. You need some evidence rooted outside the guest.

Recommendations for Anti-Cheat Vendors

Primary: TPM 2.0 Remote Attestation

Measured boot is the cleanest answer I found. Components in the boot path are measured into TPM Platform Configuration Registers (PCRs), and those measurements can be quoted to a remote verifier.

That gives the server something stronger than "the client says it looks clean from inside Windows." It gives the server hardware-backed evidence about how the platform booted.
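The reason a PCR value is useful evidence is the extend operation: a PCR is never written directly, only replaced by the hash of its old value concatenated with the new measurement. The sketch below simulates that for a SHA-256 bank, assuming a PCR that starts as all zeros. The function names are my own, and a real verifier would parse an actual event log rather than take a plain list of digests.

```python
import hashlib


def pcr_extend(pcr, measurement_digest):
    """Return the new PCR value: SHA-256(old PCR || measurement digest)."""
    return hashlib.sha256(pcr + measurement_digest).digest()


def replay(event_digests):
    """Replay a sequence of measurement digests from a zeroed 32-byte PCR."""
    pcr = bytes(32)
    for digest in event_digests:
        pcr = pcr_extend(pcr, digest)
    return pcr


def measure(component):
    """Hash a boot component's bytes, standing in for a firmware measurement."""
    return hashlib.sha256(component).digest()
```

Because each value depends on everything extended before it, adding, removing, or reordering a boot component changes the final PCR. A server that replays the reported event log and compares the result to the quoted PCR can tell whether the log and the hardware agree.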

At a high level, the server can check:

The quote came from a real TPM.
Secure Boot policy is in the expected state.
Boot measurements match a known-good baseline for that machine and OS configuration.
The client is not relying only on guest-local self-reporting.
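The checks above can be sketched as a single server-side decision. Everything here is a hypothetical shape, not a real TPM library API: `QuoteEvidence` stands for fields the server has already extracted from a quote and event log, and `signature_valid` stands for the outcome of verifying the quote against a trusted attestation key. The nonce check is my own addition, on the assumption that the server wants to rule out a replayed quote.

```python
import hmac
from dataclasses import dataclass


@dataclass
class QuoteEvidence:
    """Hypothetical, already-parsed attestation evidence."""
    signature_valid: bool       # quote verified against a trusted attestation key
    nonce: bytes                # challenge value echoed back inside the quote
    secure_boot_enabled: bool   # derived from the measured boot log
    pcr_digest: bytes           # digest over the selected PCRs


def verify(evidence, expected_nonce, baseline_digests):
    """Return a list of failed checks; an empty list means all checks passed."""
    failures = []
    if not evidence.signature_valid:
        failures.append("quote signature not trusted")
    if not hmac.compare_digest(evidence.nonce, expected_nonce):
        failures.append("stale or replayed quote")
    if not evidence.secure_boot_enabled:
        failures.append("Secure Boot not in expected state")
    if not any(hmac.compare_digest(evidence.pcr_digest, b) for b in baseline_digests):
        failures.append("boot measurements do not match a known-good baseline")
    return failures
```

Returning the full list of failures, rather than stopping at the first one, lets the server distinguish "unknown but plausible configuration" from "evidence that cannot be trusted at all".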

This does not make cheating impossible. Nothing client-side does. But it moves the argument to a much better place.

Secondary: Secure Boot Enforcement

Secure Boot helps prevent unsigned or untrusted pre-OS components from loading. By itself, it is not enough, because the server still needs to know what policy was active and what actually got measured.

Combined with TPM attestation, though, Secure Boot becomes part of a useful chain. Vanguard's Secure Boot requirement on Windows 11 is pointing in that direction. The missing piece is using measured boot evidence as part of the server-side trust decision.

Tertiary: HVCI and Expected Hypervisor Policy

HVCI builds on virtualization-based security (VBS), which puts Microsoft's hypervisor in the privileged virtualization role and strengthens kernel code integrity. If the platform expects that hypervisor and verifies the boot chain, then unexpected lower layers become much harder to introduce quietly.

The model is basically:

TPM attestation -> Secure Boot state
Secure Boot     -> trusted boot path
Trusted boot    -> expected hypervisor and kernel integrity policy

If one of those links changes, the server should be able to see it.

Conclusion

Hv2 made one thing very clear to me: kernel-level visibility is not the same as platform-level authority.

A kernel scanner can collect a lot of useful signals. It can make cheating harder, catch sloppy components, and force attackers into narrower paths. But it cannot fully prove the absence of a lower-level monitor from inside the environment being monitored.

That is not a failure of EAC, Vanguard, or any specific detector. It is just the shape of the trust boundary. If the assurance you want is "this machine booted into the expected state and is not running under an unexpected lower layer," then the answer has to involve measured boot, TPM-backed quotes, and server-side verification.

Guest-level scanning still matters. It just should not be asked to prove something it is not positioned to prove.

This research was conducted on Windows 11 Pro (build 26100) on AMD hardware with 24 logical processors. All techniques were developed and tested in isolated lab environments. Responsible disclosure was completed before publication.