Wednesday, November 23, 2016

Using APCs to inject your DLL, reloaded

In a previous post, "Using APCs to inject your DLL", I talked about injecting a DLL from the Windows kernel.

Since that post was lacking some details, I decided to add a couple of things and also talk about how to inject into a 32bit process from a 64bit kernel.

When to inject our DLL

For a 64bit process, I usually do it right after ntdll.dll has been loaded. You can do this easily when you get a module load notification. You can get the notification by making a call to PsSetLoadImageNotifyRoutine().
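The load-image callback receives the full path of the image as a counted UTF-16 string, so spotting ntdll.dll comes down to a case-insensitive suffix match. Below is a minimal user-mode sketch of that check. The helper name `image_name_ends_with` is hypothetical, and `uint16_t` stands in for `WCHAR` so the code compiles outside a driver; it only folds ASCII case, which is an assumption that holds for system DLL names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Lower-case a UTF-16 code unit, ASCII range only. */
static uint16_t ascii_lower16(uint16_t c)
{
    return (c >= 'A' && c <= 'Z') ? (uint16_t)(c + ('a' - 'A')) : c;
}

/*
 * Hypothetical helper: returns 1 if the counted UTF-16 string `name`
 * (name_len is in code units, not bytes) ends with the ASCII string
 * `suffix`, ignoring ASCII case. Returns 0 otherwise.
 */
int image_name_ends_with(const uint16_t *name, size_t name_len,
                         const char *suffix)
{
    size_t suffix_len = 0;
    size_t i;
    const uint16_t *tail;

    assert(name != NULL && suffix != NULL);

    while (suffix[suffix_len] != '\0')
        suffix_len++;
    if (suffix_len > name_len)
        return 0;

    /* Compare the last suffix_len code units of the path. */
    tail = name + (name_len - suffix_len);
    for (i = 0; i < suffix_len; i++) {
        uint16_t expected = (uint16_t)(unsigned char)suffix[i];
        if (ascii_lower16(tail[i]) != ascii_lower16(expected))
            return 0;
    }
    return 1;
}
```

In a driver, `name` and `name_len` would come from the callback's `FullImageName` (`Buffer` and `Length / sizeof(WCHAR)`). Including the leading backslash in the suffix, as in `"\\ntdll.dll"`, avoids matching a file such as `myntdll.dll`.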

The reason to wait for ntdll.dll is that once it is loaded, we can get the address of LdrLoadDll():

NTSYSAPI
NTSTATUS
NTAPI
LdrLoadDll(
  IN PWCHAR               PathToFile OPTIONAL,
  IN ULONG                Flags OPTIONAL,
  IN PUNICODE_STRING      ModuleFileName,
  OUT PHANDLE             ModuleHandle );

When using LdrLoadDll() you should end up with code like this:

void NTAPI ApcLoadDLL(LPLDR_CONTEXT ctx, PVOID SystemArgument1,
                      PVOID SystemArgument2) {
    UNREFERENCED_PARAMETER(SystemArgument1);
    UNREFERENCED_PARAMETER(SystemArgument2);
    HANDLE Module = NULL;

    // ctx points into the target process; LdrLoadDll was resolved from ntdll.dll.
    ctx->LdrLoadDll(NULL, 0, &ctx->dllPath, &Module);
}

The context being defined as such:

typedef NTSTATUS(*LDR_LOAD_DLL_FN)(
    IN PWCHAR               PathToFile OPTIONAL,
    IN ULONG                Flags OPTIONAL,
    IN PUNICODE_STRING      ModuleFileName,
    OUT PHANDLE             ModuleHandle);

typedef struct ldrContext {
    PVOID ShellCode;
    UNICODE_STRING dllPath;
    HANDLE Process;
    LDR_LOAD_DLL_FN LdrLoadDll;
} LDR_CONTEXT, *LPLDR_CONTEXT;

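The dllPath member of the context is a UNICODE_STRING that contains the path of the DLL we want to inject. Because the APC runs in user mode, both the context and the string's Buffer must be in memory that is valid in the target process.
That function will work as 64bit shellcode, but for a 32bit process we'll need something different: 32bit assembly.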
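As a reminder of how a counted string such as dllPath is filled in, here is a small user-mode sketch. `COUNTED_STRING_MODEL` and `init_counted_string` are hypothetical stand-ins (the real structure uses `USHORT` and `PWSTR`); the point is only that both length fields are in bytes and that `Length` excludes the terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for UNICODE_STRING, using fixed-width types. */
typedef struct {
    uint16_t  Length;         /* bytes in use, terminator not counted */
    uint16_t  MaximumLength;  /* bytes available in Buffer */
    uint16_t *Buffer;         /* UTF-16 characters */
} COUNTED_STRING_MODEL;

/* Hypothetical helper: describe a NUL-terminated UTF-16 buffer. */
void init_counted_string(COUNTED_STRING_MODEL *s, uint16_t *nul_terminated)
{
    size_t chars = 0;

    assert(s != NULL && nul_terminated != NULL);

    while (nul_terminated[chars] != 0)
        chars++;

    /* Both fields are 16 bits wide, so the path has to fit. */
    assert((chars + 1) * sizeof(uint16_t) <= UINT16_MAX);

    s->Length        = (uint16_t)(chars * sizeof(uint16_t));
    s->MaximumLength = (uint16_t)(s->Length + sizeof(uint16_t));
    s->Buffer        = nul_terminated;
}
```

For injection, `Buffer` has to hold an address that is valid inside the target process, not the address of the kernel-side copy used to build the string.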

Injection of a 32bit DLL from a 64bit Kernel


The first thing that changes is when we inject the DLL. For 64bit we wait for ntdll.dll to be loaded (conveniently, it just happens to be the very first library loaded), but for a 32bit APC we need to wait for a different library: wow64.dll.

The reason for this is that to use an APC in a 32bit process, you can't just give the address of the routine that you want to execute. You need to give the address of a specific function that is inside wow64.dll. Basically, it's a thunking mechanism.

The function that will do the work is: Wow64ApcRoutine

That function will be your APC routine, which in turn will call your actual shellcode.
The Wow64ApcRoutine routine is an APC normal routine. 

The parameters given to it are very specific though. When you create your APC you should have something like this:

LPLDR_CONTEXT32 context = (LPLDR_CONTEXT32)ctx;
// High 32 bits: address of the 32bit routine; low 32 bits: its context.
PVOID ApcContext = (PVOID)(((ULONG_PTR)Apc32BitRoutine << 32) + (ULONG_PTR)Apc32BitContext);
KeInitializeApc(apc, tThread,
                OriginalApcEnvironment,
                (PKKERNEL_ROUTINE)&KernelApcRoutine,
                NULL,                      // RundownRoutine
                context->Wow64ApcRoutine,  // NormalRoutine
                UserMode,
                ApcContext);               // NormalContext

KeInsertQueueApc(apc, 0, NULL, 0);

The context and other structures being defined as such:

typedef union
{
    struct
    {
        ULONG Apc32BitContext;
        ULONG Apc32BitRoutine;
    };
    PVOID Apc64BitContext;
} wow64ApcContext;

typedef wow64ApcContext  WOW64_CONTEXT;
typedef wow64ApcContext* LPWOW64_CONTEXT;

typedef struct ldrContext32 {
    ULONG ShellCode;
    UNICODE_STRING32 dllPath;
    DWORD Process;
    DWORD LdrLoadDll;
    PKNORMAL_ROUTINE Wow64ApcRoutine;
    WOW64_CONTEXT wow64Context;
} LDR_CONTEXT32, *LPLDR_CONTEXT32;

The Apc32BitRoutine is the address of your shellcode (you will have allocated memory for it in the target process earlier with ZwAllocateVirtualMemory()).
The Apc32BitContext is the address of your context structure (basically it just needs the UNICODE_STRING32 that specifies the path of your DLL and the address of LdrLoadDll).

The APC will therefore call Wow64ApcRoutine (you got the address of that after wow64.dll got loaded), and in turn, it will call your shellcode with the given context as a parameter.
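The packing of the routine and context into one 64bit value can be checked in user mode. The sketch below models the union from this post with fixed-width types (`WOW64_CONTEXT_MODEL` and both function names are hypothetical) and shows that shifting the routine into the high half gives the same bits as writing the two union members, assuming a little-endian machine, which x64 is.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the union shown earlier, using fixed-width types. */
typedef union {
    struct {
        uint32_t Apc32BitContext;  /* low 32 bits on little-endian */
        uint32_t Apc32BitRoutine;  /* high 32 bits on little-endian */
    };
    uint64_t Apc64BitContext;
} WOW64_CONTEXT_MODEL;

static_assert(sizeof(WOW64_CONTEXT_MODEL) == 8,
              "the two 32bit halves must overlay one 64bit value");

/* Same arithmetic as the ApcContext expression in the post. */
uint64_t pack_wow64_apc_context(uint32_t routine32, uint32_t context32)
{
    return ((uint64_t)routine32 << 32) | (uint64_t)context32;
}

/* Equivalent packing done through the union members. */
uint64_t pack_via_union(uint32_t routine32, uint32_t context32)
{
    WOW64_CONTEXT_MODEL c;

    c.Apc64BitContext = 0;
    c.Apc32BitContext = context32;
    c.Apc32BitRoutine = routine32;
    return c.Apc64BitContext;
}
```

Both addresses must fit in 32 bits, which is another way of saying that the shellcode and its context have to be allocated in the low 4 GB of the target's address space, where a 32bit process can reach them.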

32bit shellcode


In an earlier iteration, I used the following code:

UCHAR x86shellCode[] = {
//"\xcc" // Break Point
"\x55" // push ebp
"\x8b\xec" // mov ebp, esp
"\x8b\x45\x08" // mov eax, dword ptr [ebp+8]
"\x83\xc0\x0c" // add eax,0Ch
"\x8b\xf4" // mov esi,esp
"\x50" // push eax
"\x8b\x4d\x08" // mov ecx, dword ptr[ebp+08]
"\x83\xc1\x04" // add ecx, 4
"\x51" // push ecx
"\x6a\x00" // push 0
"\x6a\x00" // push 0
"\x8b\x55\x08" // mov edx, dword ptr [ebp+8]
"\x8b\x42\x10" // mov eax, dword ptr [edx+10h] (ctx->LdrLoadDll)
"\xff\xd0" // call eax Note: LdrLoadDll is stdcall, so no need to clean the stack after the call
"\x5d" // pop ebp
"\xc3" // ret
"\x90\x90\x90" // NOP
};

This will get the address of LdrLoadDll from the context as well as the UNICODE_STRING with the path. Essentially it does the exact same thing as the 64bit code mentioned at the beginning.
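The displacements hard-coded in the shellcode (+4, +0Ch and +10h from the context pointer) only work if they match the layout of LDR_CONTEXT32. A sketch of a compile-time check follows; the `_MODEL` types are hypothetical mirrors built from fixed-width integers, and only the leading members that the shellcode reads are modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirror of UNICODE_STRING32: two 16bit lengths and a 32bit pointer. */
typedef struct {
    uint16_t Length;
    uint16_t MaximumLength;
    uint32_t Buffer;        /* 32bit address in the target process */
} UNICODE_STRING32_MODEL;

/* Leading part of LDR_CONTEXT32, i.e. what the shellcode touches. */
typedef struct {
    uint32_t               ShellCode;   /* +00h */
    UNICODE_STRING32_MODEL dllPath;     /* +04h, pushed as ModuleFileName */
    uint32_t               Process;     /* +0Ch, pushed as ModuleHandle slot */
    uint32_t               LdrLoadDll;  /* +10h, loaded into eax and called */
} LDR_CONTEXT32_PREFIX_MODEL;

static_assert(offsetof(LDR_CONTEXT32_PREFIX_MODEL, dllPath) == 0x04,
              "shellcode uses add ecx, 4");
static_assert(offsetof(LDR_CONTEXT32_PREFIX_MODEL, Process) == 0x0C,
              "shellcode uses add eax, 0Ch");
static_assert(offsetof(LDR_CONTEXT32_PREFIX_MODEL, LdrLoadDll) == 0x10,
              "shellcode uses mov eax, [edx+10h]");
```

Read this way, the shellcode passes the address of the Process member as the ModuleHandle output of LdrLoadDll, so that field is overwritten with the loaded module's handle. If a member is added or reordered in the context structure, the displacements in the byte array need to change with it.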