RtlLookupFunctionEntry - Overshooting Function Size on ARM64 Windows

I’m working on a project in C++ on Windows ARM64, and I’ve run into an issue when trying to retrieve function boundaries using RtlLookupFunctionEntry.

Specifically:

  • The BeginAddress returned is always correct.
  • However, the calculated function size (runtimeFunction->FunctionLength) seems to overshoot by hundreds of bytes.
  • This only happens on ARM64; the same logic works fine on x64.
  • Sometimes, depending on the build, FunctionLength is correct, but after I rebuild the project it overshoots again. Checking a function in ntdll.dll, for example, always returns a function size that is way too big.

For reference, I’m calling RtlLookupFunctionEntry like this:

PRUNTIME_FUNCTION pRuntimeFunction = RtlLookupFunctionEntry(RIP, &ImageBase, nullptr);

Here is the PRUNTIME_FUNCTION struct for ARM64:

typedef struct _IMAGE_ARM64_RUNTIME_FUNCTION_ENTRY { 
    DWORD BeginAddress;
    union {
        DWORD UnwindData;
        struct {
            DWORD Flag : 2;
            DWORD FunctionLength : 11;
            DWORD RegF : 3;
            DWORD RegI : 4;
            DWORD H : 1;
            DWORD CR : 2;
            DWORD FrameSize : 9;
        } DUMMYSTRUCTNAME;
    } DUMMYUNIONNAME;
} IMAGE_ARM64_RUNTIME_FUNCTION_ENTRY, * PIMAGE_ARM64_RUNTIME_FUNCTION_ENTRY;

Here is some example code that overshoots:

ULONG_PTR imageBase = 0;
PRUNTIME_FUNCTION fn = RtlLookupFunctionEntry((ULONG_PTR)GetProcAddress(GetModuleHandleA("ntdll.dll"), "NtClose"), &imageBase, NULL); //NtClose is only 8 bytes long -> svc 0x0F, ret;
if (fn != NULL)
{
  printf("Begin Address: 0x%p\n", (void*)(imageBase + fn->BeginAddress)); //%p expects a pointer, so cast the computed address
  printf("Size: %d\n", fn->FunctionLength); //ARM64 instructions are always 4 bytes, so true length is "(fn->FunctionLength * 4)", but it returns something like 1000+ for some reason?
}
else
{
  printf("fn was NULL!\n");
}

Some extra context:

  • I’m working in usermode (not kernel).
  • The modules are standard Windows binaries (no custom-built ones).
  • I’ve verified that the function’s unwind info looks valid.
  • Manually disassembling confirms the function is smaller than what Rtl reports.

My Questions:

  • Is this a known quirk on ARM64 Windows, or could I be missing something about how ARM64’s unwind information or function tables work?

Any help or insight would be hugely appreciated, been banging my head against this for a bit now.

This is a common stumbling block when using RtlLookupFunctionEntry on ARM64 Windows. Here is a breakdown of why the FunctionLength values you read can look far too large:

TL;DR (Short Answer)

This is most likely a consequence of the ARM64 .pdata format rather than a quirk of ntdll.dll. The FunctionLength bitfield is only meaningful when Flag is 1 or 2 (packed unwind data). When Flag is 0, the whole UnwindData DWORD is the RVA of an .xdata record, so the bits you read as FunctionLength are really part of that RVA. In that case the real length is stored in the low 18 bits of the first .xdata word, again in units of 4 bytes.

Deeper Explanation

FunctionLength Field on ARM64

In the ARM64 unwind data, the FunctionLength field is encoded in units of 4-byte instructions, meaning:

function_size_bytes = FunctionLength * 4;

So if FunctionLength = 1000, the function appears to be 4000 bytes long — clearly too large for something like NtClose.

But here’s the catch:

The Bitfield Is Only Valid for Packed Unwind Data

On Windows ARM64, the second DWORD of a .pdata entry is a union, and the Flag field (its low 2 bits) says how to interpret it:

  • Flag = 1 or 2: the entry uses packed unwind data, and FunctionLength, RegF, RegI, H, CR and FrameSize are valid.
  • Flag = 0: UnwindData is the RVA of an .xdata record. The bits that overlap FunctionLength are part of that address, so reading the bitfield gives an essentially arbitrary number. The function length is in the low 18 bits of the first .xdata word, in units of 4 bytes.
  • Flag = 3 is reserved.

This is why the same logic works on x64: there, RUNTIME_FUNCTION stores BeginAddress and EndAddress explicitly, so no decoding is needed to get the size.
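Before trusting the bitfield, it is worth checking the Flag field (the low 2 bits of UnwindData). The sketch below decodes the length for both encodings from raw values, so it does not depend on the Windows headers. The function name and the imageBase parameter are hypothetical, and it assumes a little-endian host and a single-fragment function.

```cpp
#include <cstdint>
#include <cstring>

// Length in bytes of the code range described by one ARM64 .pdata entry.
// imageBase:  start of the mapped image (hypothetical parameter name)
// unwindData: second DWORD of the entry (the UnwindData union member)
// Returns 0 for the reserved Flag value 3.
inline uint32_t Arm64FunctionLengthBytes(const uint8_t* imageBase, uint32_t unwindData)
{
    const uint32_t flag = unwindData & 0x3;
    if (flag == 1 || flag == 2) {
        // Packed unwind data: bits 2..12 hold the length in 4-byte units.
        return ((unwindData >> 2) & 0x7FF) * 4;
    }
    if (flag == 0) {
        // UnwindData is the RVA of an .xdata record. The low 18 bits of
        // its first word hold the length in 4-byte units.
        uint32_t header = 0;
        std::memcpy(&header, imageBase + unwindData, sizeof(header));
        return (header & 0x3FFFF) * 4;
    }
    return 0; // Flag == 3 is reserved
}
```

With the values from the question, this would be called as Arm64FunctionLengthBytes((const uint8_t*)imageBase, fn->UnwindData).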

When It Works vs When It Doesn’t

“Sometimes FunctionLength is correct, but after rebuilding it becomes oversized again.”

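That is consistent with reading the bitfield from an entry whose Flag is 0. In that case the value you see is (RVA >> 2) & 0x7FF, where RVA is the location of the function's .xdata record, so it changes whenever the linker moves that record and can occasionally look plausible by accident. The compiler may also switch a function between the packed form and an .xdata record from one build to the next as its prolog and epilog change, since the packed form only covers functions with a simple, canonical layout and a length that fits in 11 bits.

For system libraries such as ntdll.dll the layout is fixed, which would explain why the value there is wrong consistently rather than intermittently.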
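To see why the number can change between builds, it may help to compute what the FunctionLength bitfield yields when the entry actually holds an .xdata RVA (Flag = 0). This is a small worked example; the helper name and the two RVAs are made up for illustration.

```cpp
#include <cstdint>

// Value the 11-bit FunctionLength bitfield (bits 2..12) produces when the
// DWORD is really the RVA of an .xdata record.
constexpr uint32_t MisreadFunctionLength(uint32_t xdataRva)
{
    return (xdataRva >> 2) & 0x7FF;
}

// Hypothetical RVAs: the same record before and after the linker moved it
// by 0x1000 bytes.
static_assert(MisreadFunctionLength(0x0018A0F4) == 61, "looks plausible");
static_assert(MisreadFunctionLength(0x0018B0F4) == 1085, "looks far too large");
```

Neither value has anything to do with the function's real size; both are just slices of an address.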

Recommendations / Workarounds

1. Disassemble to Verify Manually (as you’re doing)

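This is good practice. Tools like IDA Pro, Ghidra, or dumpbin /disasm will show you the actual function size, and dumpbin /unwindinfo prints the decoded unwind entries so you can compare them with what you read at runtime.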

2. Use Symbols (PDB) for Accurate Sizes

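As an alternative to unwind info for system DLLs, you can use PDBs and the DbgHelp APIs (e.g. SymFromAddr, SymGetSymFromAddr64). SymFromAddr fills a SYMBOL_INFO whose Size field gives the symbol size in bytes. That field is only meaningful when the symbols come from a PDB and is typically zero otherwise.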

3. Don’t Read FunctionLength Without Checking Flag

RtlLookupFunctionEntry is still the right way to find the entry that covers an address. Read the FunctionLength bitfield only when Flag is 1 or 2. When Flag is 0, follow UnwindData (as an RVA from the image base) to the .xdata record and take the length from its header instead.

4. Use Runtime Disassembly for Leaf-Function Size Estimation

If you’re analyzing leaf functions (like NtClose) which are short and don’t unwind, decoding instructions from the start until the first ret/br/retab is often good enough. Keep in mind that this stops early if a function has more than one return.
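A minimal sketch of that scan, assuming the code is readable as an array of 32-bit little-endian instruction words. The function names are hypothetical, and the check only recognizes ret Xn, br Xn, retaa and retab, so treat the result as an estimate.

```cpp
#include <cstddef>
#include <cstdint>

// True for RET Xn, BR Xn, RETAA and RETAB.
inline bool IsReturnOrIndirectBranch(uint32_t insn)
{
    const uint32_t noReg = insn & 0xFFFFFC1F; // clear the Rn field (bits 5..9)
    return noReg == 0xD65F0000   // ret Xn
        || noReg == 0xD61F0000   // br Xn
        || insn == 0xD65F0BFF    // retaa
        || insn == 0xD65F0FFF;   // retab
}

// Bytes from code[0] through the first terminating instruction,
// or 0 if none is found within maxInsns instructions.
inline size_t EstimateLeafSizeBytes(const uint32_t* code, size_t maxInsns)
{
    for (size_t i = 0; i < maxInsns; ++i) {
        if (IsReturnOrIndirectBranch(code[i])) {
            return (i + 1) * 4;
        }
    }
    return 0;
}
```

For the stub described in the question (svc 0x0F followed by ret, i.e. the words 0xD40001E1 and 0xD65F03C0), this returns 8. The maxInsns cap keeps the scan from running off into unrelated memory.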

Final Thoughts

What you’re seeing is most likely not a bug in Windows and not imprecise unwind data, but the result of reading a union member that is not active for that entry. Your call to RtlLookupFunctionEntry is correct. The entry it returns just has to be decoded according to its Flag field before FunctionLength means anything.