Bulk ASLR Data Analysis

Hello from the Lotan team at Leviathan!

We recently looked at a sample set of 80,000 crashdumps from a production environment and decided it was time to examine some of that data in aggregate. Lotan's core focus is detecting stage one attacks (shellcode) in crashed processes. To achieve this goal, Lotan has to process the bulk of the data contained within a memory image. One of the most interesting components of these process images is the information about loaded modules from Windows processes. Modules are loadable libraries of executable code, such as .dll's and .ocx's (for ActiveX plugins such as Flash).

Address Space Layout Randomization (ASLR) was developed to counter a host of exploitation techniques by randomizing the base address of modules when they are loaded into memory. Modules built with the '/DYNAMICBASE' flag in Visual Studio can be loaded at an arbitrary base address in the process virtual memory space. Windows first introduced this security enhancement in Vista, and it has been continually updated and improved. One major issue with ASLR on Windows is that it is 'opt in' on a per-module basis. Tools like EMET can force modules to relocate in memory by essentially marking the requested base address as unavailable, but these are not standard. As a result, a number of legacy libraries and poorly maintained codebases still do not have dynamic base addresses.
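For readers who want to check whether a given module has opted in, the '/DYNAMICBASE' flag ends up as the IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE bit (0x0040) in the DllCharacteristics field of the PE optional header. The following is a minimal sketch of reading that bit from the raw file bytes; the function name `has_dynamic_base` is ours for illustration and is not part of Lotan.

```python
import struct

IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE = 0x0040  # set when linked with /DYNAMICBASE


def has_dynamic_base(image: bytes) -> bool:
    """Return True if the PE image's header opts in to ASLR."""
    if image[:2] != b"MZ":
        raise ValueError("not a PE image: missing MZ signature")
    # e_lfanew (file offset of the PE signature) is stored at offset 0x3C.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("not a PE image: missing PE signature")
    # The optional header follows the 4-byte signature and 20-byte COFF header.
    optional_header = pe_offset + 4 + 20
    # DllCharacteristics sits 70 bytes into the optional header for both
    # PE32 and PE32+ images.
    (dll_characteristics,) = struct.unpack_from("<H", image, optional_header + 70)
    return bool(dll_characteristics & IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE)
```

A typical call would be `has_dynamic_base(open(path, "rb").read())` for each module path of interest. Note that this only reports what the header requests; it says nothing about whether a tool such as EMET forces relocation at load time.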

ASLR is an older anti-exploitation technique, so the Windows implementation has been reviewed, broken, and studied for more than ten years. We thought it would be interesting to analyze data about loaded Windows modules that Lotan collects to examine the effects of ASLR in the wild. After all this time, are the majority of modules opting in to ASLR? We gathered our information from a sample set of 2,000 hosts over the span of the last year. We processed 80,000 crashdumps in that time frame, which contained a total of 50,777 unique modules (uniqueness was determined by load path). The majority of these machines were Windows 7 or Windows Server 2008.

The goal was to determine if there were any modules that never randomize their base addresses, find out which modules are doing the best at randomization, and identify any hotspots for load addresses. If an attacker can reliably predict the base address of a loaded module, it gives the attacker an advantage in building ROP chains or using other bypass techniques. If an attacker chooses to guess at a possible base address and risk having an exploit crash, the more predictable a base address is, the more often that exploit will succeed, thus improving the ROI for the attacker. Lotan analyzes crashdumps for this very reason: most exploits fail a certain percentage of the time, due to guessing a bad address, errors in shellcode, failure to perform continuation of execution, or mis-targeting.

Top 21 Least Unique Modules

The first thing we did was sort the modules by the rate of unique addresses per dump containing the module, stripping out all .exe's. In the chart below, the 'unique_rate' field is "unique_addrs / addr_cnt": the number of distinct base addresses seen for a module divided by the number of dumps in which it was loaded. This makes it easy to see the worst offenders. We strip out .exe's because, while there are a LARGE number of executables missing ASLR (some of them very interesting), we wanted to study shared code (.dll's), where the lack of ASLR has a larger impact. Studying modules also gives us insight into how Windows is doing its loading / linking of modules.
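The ranking described above can be sketched in a few lines of Python. This is an illustration rather than Lotan's actual pipeline: the input shape (one `(module_path, base_address)` pair per module per crashdump), the function name `unique_rates`, and the choice to lowercase paths before grouping are all our assumptions.

```python
from collections import defaultdict


def unique_rates(observations):
    """Rank modules by unique_rate = unique_addrs / addr_cnt, lowest first.

    observations: iterable of (module_path, base_address) pairs, one per
    module per crashdump (a hypothetical input shape).
    """
    bases_by_module = defaultdict(list)
    for path, base in observations:
        key = path.lower()  # assumption: treat Windows paths case-insensitively
        if key.endswith(".exe"):
            continue  # executables are excluded; only shared modules are kept
        bases_by_module[key].append(base)

    rows = []
    for path, bases in bases_by_module.items():
        addr_cnt = len(bases)          # dumps in which the module was seen
        unique_addrs = len(set(bases)) # distinct base addresses observed
        rows.append({
            "module": path,
            "addr_cnt": addr_cnt,
            "unique_addrs": unique_addrs,
            "unique_rate": unique_addrs / addr_cnt,
        })
    # Lowest unique_rate first; ties go to the module seen in more dumps.
    rows.sort(key=lambda r: (r["unique_rate"], -r["addr_cnt"]))
    return rows
```

A module that always loads at the same address in 1,000 dumps would score 1 / 1000 = 0.001, while a module seen only once trivially scores 1.0, so a low rate is only meaningful alongside a reasonably large addr_cnt.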

At first glance we can see a few very interesting modules pop out of this data: Lenovo, VMware, and Airtel are all major offenders. Most concerning is Lenovo's representation within this list, considering the recent Superfish snafu and that these modules are loaded as part of the OEM image of modern ThinkPads.

Perhaps the most striking finding came from C:\Windows\SysWOW64\ntdll.dll. NTDLL is the primary module for interfacing between userland and kernel space in Windows applications. In all Windows On Windows64 (WoW64) applications this module is loaded to provide an interface for 64 bit code to interact with the kernel. The WoW64 subsystem was most recently explored by Duo Labs in their paper "WoW64 and So Can You: Bypassing EMET With a Single Instruction." Before Windows 8, the 64 bit version of NTDLL is always loaded at a hardcoded offset from the 32 bit NTDLL. What is interesting here is that out of ~48,331 crashdumps that were WoW64 (~60% of all crashes), there were only ever 256 unique addresses for the 64 bit version of NTDLL. Luckily this is well understood and was an established 'good enough' baseline for Windows Vista and 7. After Windows 8 the number of slots increased for WoW64 modules, especially NTDLL. The loaded base address of these system-provided .dll's randomizes on each boot. After seeing this I decided to narrow my scope, remove some of the obvious offenders, and look at all modules with more than ten unique addresses, again sorted by unique_rate.
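To put the 256-slot figure in perspective, here is a rough back-of-the-envelope sketch. It assumes every slot is equally likely and that the base address stays fixed between attempts (true until the next reboot, since these addresses are randomized per boot). The function names are ours, and a real attacker's odds depend on factors this ignores.

```python
import math


def slot_entropy_bits(slots: int) -> float:
    """Bits of entropy if the base address is uniform over `slots` positions."""
    return math.log2(slots)


def chance_of_hit(slots: int, attempts: int) -> float:
    """Chance that one of `attempts` distinct guesses matches the real base,
    assuming a uniform choice among `slots` that does not change in between."""
    return min(attempts, slots) / slots


print(slot_entropy_bits(256))   # 8.0
print(chance_of_hit(256, 1))    # 0.00390625
```

Under these assumptions 256 slots amounts to 8 bits of entropy, so a single blind guess succeeds about 0.4% of the time, and every failed guess risks producing exactly the kind of crash Lotan collects.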

All Modules With More Than 10 Unique Addresses, Sorted by unique_rate

SysWOW64 takes up a large portion of the data here, interestingly enough with more than 256 possible slots but fewer than would be expected from a full high entropy ASLR implementation. We spent some time on this but were unable to come up with a reasonable explanation of why other SysWOW64 modules had such a low unique address count and did not have the same 256 slots as NTDLL. Looking at the unique address column, we can see that they are reasonably clustered around ~800 addresses; if anyone knows why this might be, we would love to hear from you. Also interesting to note is a Flash-specific WoW64 extension as well as some C# assemblies.

Histogram of ntdll.dll Base Addresses

In order to study one module a bit closer and look for possible clustering / hotspot addresses let's return to ntdll.dll. The first step was to see if there is any specific clustering among possible addresses.

Bar Graph of Addresses and Their Counts

Interestingly, there appear to be two major spikes, but you can see there are a large number of little blue lines at the bottom of the graph, so let's try a barplot and see if there are specific addresses that pop out.

The spikes in the graph are individual addresses with an abnormally high count.

Table of the Top Most Common Addresses for ntdll.dll

It appears that 0x77280000 is home-sweet-home to NTDLL for our dataset. At this point it is important to consider the possible reasons for such a large amount of clustering. After revisiting the Lotan dashboard we realized the most likely cause was a simple application that takes up 1/8 of all crashes in the instance. This application was crashing repeatedly on one or two hosts, and since base addresses of these modules will only be randomized on reboot, those hot spots are most likely a single 'crashy' application.
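One way to keep a single 'crashy' application from dominating a table like this is to count each address once per host instead of once per crashdump. The sketch below illustrates that idea and is not something we ran on the dataset above; the `(host_id, base_address)` record shape and the function name `address_counts` are hypothetical.

```python
from collections import Counter


def address_counts(records):
    """Count base addresses per crashdump and per distinct host.

    records: iterable of (host_id, base_address) pairs, one per crashdump
    (a hypothetical input shape).
    Returns (per_crash, per_host) as two Counters keyed by base address.
    """
    records = list(records)
    # Raw view: every crashdump counts, so repeat crashes inflate an address.
    per_crash = Counter(base for _, base in records)
    # Deduplicated view: each (host, address) pair counts only once.
    per_host = Counter(base for _, base in set(records))
    return per_crash, per_host
```

Calling `per_crash.most_common(10)` and `per_host.most_common(10)` side by side would show whether a hot address is shared by many machines or is simply one host crashing over and over between reboots.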