Friday, September 16, 2011

Using APCs to inject your DLL

A few posts ago I discussed remote threads on Windows 7. That post was aimed at injecting one's DLL into other processes.

There are several ways of injecting your code into another process:

  • Via SetWindowsHookEx
  • By using the AppInit_DLLs value in the registry
  • With CreateRemoteThread & NtCreateThreadEx
  • You could also do it from a driver by replacing the main thread's entry point with your shellcode.

These techniques are increasingly difficult to put into practice. They do work, but sometimes they are not enough, or they fail on some versions of Windows. For instance, CreateRemoteThread often fails on Windows 7 because applications and services run in different sessions.

Hooking NtCreateThread in a driver and injecting your DLL that way won't work on Windows 7 either, since NtCreateThread is no longer used by NtCreateProcess. Not to mention that some of this won't work on 64-bit Windows. For instance, NtCreateThreadEx() doesn't use the same structure in 32-bit and 64-bit builds. It took me half a day to figure out the proper way of injecting a DLL on Win64.

That time could have been saved if I had used a proper/cleaner way of injecting my DLL.

Enter APCs.

In this article you can see how APCs are used in Windows (2000, XP, 7) so I won't go over it again.

The basic idea is that, in order to inject our DLL, we create an APC and queue it to a thread in the target process. Quite obviously, this has to be done in a driver.

When the target program starts, our driver can be notified via a callback (see PsSetCreateProcessNotifyRoutine), and it can also be notified whenever a module loads (see PsSetLoadImageNotifyRoutine).

Each time the module-load callback is called, we check whether the module is NTDLL.DLL, since it is the first DLL that is automatically loaded into every process on the system.
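
The "is this NTDLL?" check boils down to a case-insensitive suffix match on the image path the callback receives. Here is a minimal, portable sketch of that comparison. The counted_wstr type and the path_ends_with() helper are hypothetical names made up for this example; in a driver you would apply the same logic to the UNICODE_STRING passed to the load-image callback.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counted UTF-16 string, modeled on UNICODE_STRING:
   the length is in bytes and the buffer need not be NUL-terminated. */
typedef struct {
    uint16_t        length_bytes;
    const uint16_t *buffer;
} counted_wstr;

/* Lower-cases ASCII letters only; other code units pass through unchanged. */
static uint16_t ascii_lower(uint16_t c)
{
    return (c >= 'A' && c <= 'Z') ? (uint16_t)(c + ('a' - 'A')) : c;
}

/* Returns true when path ends with the ASCII suffix, ignoring ASCII case. */
bool path_ends_with(const counted_wstr *path, const char *suffix)
{
    size_t path_chars;
    size_t suffix_chars = 0;
    size_t i;
    const uint16_t *tail;

    if (path == NULL || path->buffer == NULL || suffix == NULL)
        return false;

    path_chars = path->length_bytes / sizeof(uint16_t);
    while (suffix[suffix_chars] != '\0')
        suffix_chars++;

    if (suffix_chars > path_chars)
        return false;

    /* Compare the last suffix_chars code units of the path. */
    tail = path->buffer + (path_chars - suffix_chars);
    for (i = 0; i < suffix_chars; i++) {
        if (ascii_lower(tail[i]) != ascii_lower((uint16_t)(unsigned char)suffix[i]))
            return false;
    }
    return true;
}
```

Matching on "\\ntdll.dll" with the leading backslash, rather than on "ntdll.dll" alone, keeps a file such as "xntdll.dll" from being mistaken for the real thing.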

Another reason to wait for NTDLL to be loaded is that we can then parse its PE headers and find the user-mode address of LdrLoadDll. You could instead call GetProcAddress(GetModuleHandle("kernel32.dll"), "LoadLibraryA") in a user-mode program and pass the result down to your driver, but with ASLR this could potentially cause problems.

So, in the callback, we wait for NTDLL to load and then we obtain the address of LdrLoadDll.

Here's the code to find out the address of a function in a given DLL.

/**
 * This function is like GetProcAddress().
 * ImageBase is the address of the mapped DLL
 * ImageSize is the size of the DLL
 * FunctionName is the API we are looking for
 *
 * Before looking through the PE headers, we make sure the image pages are
 * resident, because the DLL may not be fully mapped yet. Building an MDL and
 * locking its pages takes care of this.
 */

// PtrFromRva is not shown in the original listing; this definition is assumed.
#define PtrFromRva(base, rva) ((PVOID)((PUCHAR)(base) + (rva)))

PVOID GetProcAddress(PVOID ImageBase, DWORD ImageSize, const char* FunctionName)
{
    PVOID pFunc = NULL;
    PIMAGE_DOS_HEADER DosHeader = (PIMAGE_DOS_HEADER) ImageBase;
    PIMAGE_NT_HEADERS NtHeader;
    PIMAGE_DATA_DIRECTORY ExportDataDir;
    PIMAGE_EXPORT_DIRECTORY ExportDirectory;
    PULONG FunctionRvaArray;
    PUSHORT OrdinalsArray;
    PULONG NamesArray;
    ULONG Index;
    PMDL vMem;

    vMem = IoAllocateMdl(ImageBase, ImageSize, FALSE, FALSE, NULL);
    if (vMem == NULL)
        return NULL;

    __try {
        MmProbeAndLockPages(vMem, UserMode, IoReadAccess);
    }
    __except(EXCEPTION_EXECUTE_HANDLER)
    {
        // The pages were never locked, so only the MDL itself is released.
        IoFreeMdl(vMem);
        DbgPrint("Unable to read memory\n");
        return NULL;
    }

    __try {
        //
        // Peek into PE image to obtain exports.
        //
        if (DosHeader->e_magic != IMAGE_DOS_SIGNATURE)
            __leave;    // Not a PE image.

        NtHeader = (PIMAGE_NT_HEADERS) PtrFromRva(DosHeader, DosHeader->e_lfanew);
        if (NtHeader->Signature != IMAGE_NT_SIGNATURE)
            __leave;    // Unrecognized image format.

        ExportDataDir = &NtHeader->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT];
        if (ExportDataDir->VirtualAddress == 0 || ExportDataDir->Size == 0)
            __leave;    // This module does not have an export directory.

        ExportDirectory = (PIMAGE_EXPORT_DIRECTORY) PtrFromRva(ImageBase, ExportDataDir->VirtualAddress);

        if (ExportDirectory->AddressOfNames == 0 ||
            ExportDirectory->AddressOfFunctions == 0 ||
            ExportDirectory->AddressOfNameOrdinals == 0)
            __leave;    // This module does not have any named exports.

        FunctionRvaArray = (PULONG) PtrFromRva(ImageBase, ExportDirectory->AddressOfFunctions);
        OrdinalsArray = (PUSHORT) PtrFromRva(ImageBase, ExportDirectory->AddressOfNameOrdinals);
        NamesArray = (PULONG) PtrFromRva(ImageBase, ExportDirectory->AddressOfNames);

        for (Index = 0; Index < ExportDirectory->NumberOfNames; Index++)
        {
            //
            // The ordinals array holds indexes into the function RVA array.
            //
            ULONG FuncRva = FunctionRvaArray[OrdinalsArray[Index]];

            if (FuncRva >= ExportDataDir->VirtualAddress &&
                FuncRva < ExportDataDir->VirtualAddress + ExportDataDir->Size)
            {
                //
                // It is a forwarder, not code. Skip it.
                //
                continue;
            }

            //
            // It is an export. PtrFromRva keeps the pointer math 64-bit safe.
            //
            if (strcmp((const char*) PtrFromRva(ImageBase, NamesArray[Index]), FunctionName) == 0)
            {
                pFunc = PtrFromRva(ImageBase, FuncRva);
                break;
            }
        }
    }
    __except(EXCEPTION_EXECUTE_HANDLER)
    {
        DbgPrint("Unable to parse image\n");
        pFunc = NULL;
    }

    //
    // Every path that locked the pages ends up here, so nothing leaks.
    //
    MmUnlockPages(vMem);
    IoFreeMdl(vMem);

    return pFunc;
}

By now, your callback knows that NTDLL has been loaded, and with the help of the code above, we have obtained the address of LdrLoadDll.

Notice that the beginning of the function above consists of mapping the DLL into memory. The DLL is loaded but may or may not be fully mapped in memory. Skipping this step can result in not finding the API we want, since the PE headers may not be in memory yet.
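
The export walk above trusts every RVA it reads from the headers. If you want to be more defensive, one option is to check each RVA against ImageSize before dereferencing it. The sketch below shows that idea in portable C; rva_to_ptr() and is_forwarder_rva() are hypothetical helpers invented for this example, not part of any Windows API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: translate an RVA into a pointer inside a mapped image.
   Returns NULL unless the whole range [rva, rva + need) lies within
   image_size bytes of image_base. */
const void *rva_to_ptr(const void *image_base, uint32_t image_size,
                       uint32_t rva, uint32_t need)
{
    if (image_base == NULL)
        return NULL;
    /* Written this way so that rva + need can never overflow. */
    if (rva > image_size || need > image_size - rva)
        return NULL;
    return (const uint8_t *)image_base + rva;
}

/* An export RVA that points back inside the export directory is a
   forwarder string, not code. Returns 1 for a forwarder, 0 otherwise. */
int is_forwarder_rva(uint32_t func_rva, uint32_t dir_rva, uint32_t dir_size)
{
    return func_rva >= dir_rva && func_rva - dir_rva < dir_size;
}
```

Each PtrFromRva() call in the lookup could be routed through a check like this, with the function bailing out as soon as one of them returns NULL.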

Now, we need to prepare for the APC.
First, we need a dummy Kernel APC routine since it is required by the queuing call.

/**
 * This is a dummy kernel APC routine.
 */
VOID KernelApcRoutine (
    IN PKAPC Apc,
    IN PKNORMAL_ROUTINE *NormalRoutine,
    IN PVOID *NormalContext,
    IN PVOID *SystemArgument1,
    IN PVOID *SystemArgument2)
{
    UNREFERENCED_PARAMETER( NormalContext );
    UNREFERENCED_PARAMETER( SystemArgument1 );
    UNREFERENCED_PARAMETER( SystemArgument2 );

    DbgPrint("User APC is being delivered - Apc: %p\n", Apc);
    if (PsIsThreadTerminating( PsGetCurrentThread() ))
    {
        *NormalRoutine = NULL;
    }

    ExFreePoolWithTag(Apc, DIRECT_KERNEL_ALLOC_TAG);
}


That code will be called when the APC is processed and it will delete the memory allocated for the APC object.

Next, we need a function to create the APC and queue it. First our new function needs to allocate some memory in the target process. Since the callback is called in the context of the target, we can simply use NtCurrentProcess() to specify what process the memory will be allocated into.

...
    ZwAllocateVirtualMemory(NtCurrentProcess(),
                            &context,
                            0,
                            &contextSize,
                            MEM_COMMIT,
                            PAGE_READWRITE);

...

The {context} is a structure that you will define. You can store the address of LdrLoadDll inside of it, as well as the name of the DLL you want to inject.
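
What goes into that structure is up to you. As an illustration only, here is one possible layout with a small helper that fills it. Everything in this sketch (inject_context, inject_context_init, INJECT_MAX_PATH and the 260-character limit) is a hypothetical choice made for the example, not something Windows defines.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INJECT_MAX_PATH 260   /* assumed limit, in UTF-16 code units */

/* Hypothetical layout for the APC context. */
typedef struct {
    void    *ldr_load_dll;               /* user-mode address of LdrLoadDll */
    uint16_t dll_path_bytes;             /* path length in bytes, no terminator */
    uint16_t dll_path[INJECT_MAX_PATH];  /* UTF-16 path, NUL-terminated */
} inject_context;

/* Fills ctx. Returns 0 on success, or -1 when an argument is missing or the
   path does not fit together with its terminator. */
int inject_context_init(inject_context *ctx, void *ldr_load_dll,
                        const uint16_t *path, size_t path_chars)
{
    if (ctx == NULL || path == NULL ||
        path_chars == 0 || path_chars >= INJECT_MAX_PATH)
        return -1;

    /* Zeroing first guarantees the copied path ends up NUL-terminated. */
    memset(ctx, 0, sizeof *ctx);
    ctx->ldr_load_dll = ldr_load_dll;
    ctx->dll_path_bytes = (uint16_t)(path_chars * sizeof(uint16_t));
    memcpy(ctx->dll_path, path, path_chars * sizeof(uint16_t));
    return 0;
}
```

Keeping the path inline, rather than behind a pointer, means one allocation holds everything the user-mode routine needs.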

Then you need to allocate memory for the APC object, which you can do using:

    apc = (PKAPC) ExAllocatePoolWithTag(NonPagedPool, sizeof(KAPC), DIRECT_KERNEL_ALLOC_TAG);

Now, you must initialize the APC, which you do with the following call:

...
     KeInitializeApc(apc,
                     KeGetCurrentThread(),
                     OriginalApcEnvironment,
                     (PKERNEL_ROUTINE) KernelApcRoutine,
                     NULL,
                     (PKNORMAL_ROUTINE) InjectionShellCode,
                     UserMode,
                     context);
...

The context is the one that we just allocated and which will be passed as a parameter to the user mode routine.

For the actual routine, you can go two ways. You can write some shellcode in assembly and store the opcodes in a byte array, or you can write a function in your own code. If you write a function in your source code, you will have to make sure that it does not call any Windows APIs directly.

The only thing your function should do is build a UNICODE_STRING by hand (meaning no call to RtlUnicodeStringInit()) and then call LdrLoadDll through the address stored in the context.
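
Building the string by hand just means counting characters and filling in three fields. Here is a minimal sketch in portable C, assuming a NUL-terminated UTF-16 buffer. The my_unicode_string type is a stand-in that mirrors the documented UNICODE_STRING layout, and init_unicode_string_manually() is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in that mirrors the documented UNICODE_STRING layout. */
typedef struct {
    uint16_t  Length;         /* size in bytes, excluding the terminator */
    uint16_t  MaximumLength;  /* size in bytes, including the terminator */
    uint16_t *Buffer;
} my_unicode_string;

/* Fills out from a NUL-terminated UTF-16 buffer without calling any API.
   Counting stops at 32766 code units so both byte counts fit in 16 bits. */
void init_unicode_string_manually(my_unicode_string *out, uint16_t *text)
{
    uint16_t chars = 0;

    while (chars < 32766 && text[chars] != 0)
        chars++;

    out->Length = (uint16_t)(chars * sizeof(uint16_t));
    out->MaximumLength = (uint16_t)(out->Length + sizeof(uint16_t));
    out->Buffer = text;
}
```

Note that both Length and MaximumLength are byte counts, not character counts, which is an easy thing to get wrong when doing this by hand.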

Once the APC is initialized, you can queue it using:

...
     KeInsertQueueApc( apc, NULL, NULL, 0);
...

Here's how LdrLoadDll is called:

...
     pfnLdrLoadDll(NULL,       // no search path
                   0,
                   &pDLL,      // full path here as a unicode string
                   &handle);
...

Basically, what will happen is the following:

  1. Process is created
  2. Module Load Callback is called
  3. Callback checks if the module is NTDLL.DLL
  4. Callback retrieves the address of LdrLoadDll() from NTDLL.DLL
  5. Allocate the APC user mode routine context
  6. Allocate memory in the target process to hold the shellcode
  7. Allocate memory for the APC
  8. Initialize the APC (user mode routine points to the shell code)
  9. Queue the APC
  10. The rest of the modules get loaded and your APC gets processed
  11. The user mode routine loads the DLL using LdrLoadDll
  12. Main thread is created
  13. Process starts

Hopefully, all this should get you going.

Happy coding!