Understanding the Windows Allocator: a Redux

If you are interested in Windows heap allocator behaviour, and you haven't already read Chris Valasek's papers Understanding the LFH and Windows 8 Heap Internals (credit also due to Tarjei Mandt), I strongly suggest you do so.  There are few more detailed and accurate accounts of the modern (post-XP) allocators' behaviour.  However, as I was working on our crashdump analytic software (codenamed Major Myer), I found that there is some additional depth of functionality not addressed in these two papers; I present it here.

First, some prerequisites.  If you haven't skimmed the papers above, go do so now.  You may also be wondering about a few fine points regarding how it all fits together, and perhaps some definitions.  After that, I will break apart the illusion of 1-to-1 hierarchical linkage in the LFH structures, and point out a couple of unmentioned semantics (after the break; this is going to be long).  Any exploits which may exist with regard to these structures I will leave as an exercise.

Lists

_LIST_ENTRY is a particularly opaque struct, but it is foundational to understanding how most Windows kernel data fits together.  It is effectively what its definition implies: a forward link pointer, and a backward link pointer, with no data.  This struct is designed for inclusion in other structs, which is why it has no data members of its own.  You have likely noticed that, in the heap structs, it appears at the beginning occasionally, but sometimes in the middle.  This is because _LIST_ENTRY structures are included in a class to make them part of circular fixed-sentinel doubly-linked lists.

By convention, when a _LIST_ENTRY appears at the top of a struct, that struct is the data type of the linked list (with one notable exception which will be covered later on).  This is so that the pointers in the list are actually pointers to the appropriate struct, theoretically minimizing offset computation requirements.  However, when a _LIST_ENTRY appears in the middle of a struct, it is best understood as a pointer to the beginning and end nodes of a linked list.  Unfortunately this idiom of Microsoft's results in the data type of the linked list being redacted from the symbols (and it not being a syntactic requirement), but it's usually fairly easy to guess what the data type might be based on the name of the _LIST_ENTRY member.

This is complicated slightly more, however: these pointer-style _LIST_ENTRY members are included in the circular linked list, but are not validly dereferenceable as the list's data type.  Thus, they act in effect as sentinel nodes as well as data pointers, with the drawback that they cannot be dereferenced, and nothing besides context identifies them as sentinels.  The correct way to parse such a list, then, is to remember the offset of the _LIST_ENTRY in the "parent" structure, and halt traversal when it is encountered.  Generically, it looks like this:

Generic diagram showing doubly-linked loose-typed sentinelled lists common to MS kernel code
Generic diagram showing doubly-linked loose-typed sentinelled lists common to MS kernel code

There is also a second type of pointer-style nodes: hints.  These are nodes which are not included in the list, but have forward and back links pointing to some node in the list, and are used in the ListHints array among other places.  It is important to be aware of these when validating the integrity of a list, and when traversing it (they are not sentinel nodes).

That's basically the same as any other linked list, but heed the typing and the positions of the _LIST_ENTRY member.  The sentinel address is that of the SmallThings member of _BIG_STRUCT.  With that out of the way, it should be considerably easier to read and understand Valasek's diagrams and structures, as well as the public symbols.

Backend Freelists

 The FreeList is one instance of atypical _LIST_ENTRY use.  It is quite simply a list of nodes which refer to all the free chunks in the backend allocator.  There is one per heap (and not one per segment, or one per size).  A ListHint is a pointer into the freelist (actually a special _LIST_ENTRY which does not form part of the linked list but only points into it), and can essentially be understood as a pointer to a sublist of the heap's FreeList for a particular chunk size; the sublist is terminated either by the start of the next sublist or the end of the list.  ListHint structures are stored in an array pointed to by the ListHints member of the _HEAP_LIST_LOOKUP ("lookup") structure (this element is more properly a pointer to an array than a double-pointer); though the array's size is stored in ArraySize (with a caveat, which we will come to), it is always 0x80 for the first lookup.  If there is no chunk in the FreeList of an appropriate size for the ListHint, the ListHint's forward link will be null.  Note that the type of a ListHint is _LIST_ENTRY, but the Blink member is not a backlink for a circular linked list; it is (as Valasek identifies) a triple-use member indicating whether the LFH is enabled (the least significant bit), and depending upon that either a counter used internally, or a pointer to the appropriate LFH _HEAP_BUCKET.

It is worth noting that the FreeList node used to reference a free chunk is stored at what would otherwise be the first user data address of the free chunk it refers to, similar to how ptmalloc stores the location of the previous and next free chunks.  The header of the free chunk can therefore be determined using basic arithmetic.

In actuality, it is not necessarily the case that chunks of sizes greater than 0x7E are all stored in the 0x7Fth ListHint.  Rather, they are stored in the final ListHint of the final _HEAP_LIST_LOOKUP; you can get there by traversing each successive ExtendedLookup pointer.  This is where it gets strange; ArraySize does not actually refer to the size of the ListHint array in the current _HEAP_LIST_LOOKUP, but rather to the total number of ListHints in the current and all previous lookups.  The number of FreeList structures in the array is actually ArraySize - BaseIndex.  ItemCount is the number of chunks referenced in the current _HEAP_LIST_LOOKUP (including oversized ones); OutOfRangeItems also refers strictly to the current lookup (and will be 0 for all but the final lookup).  The purpose of ExtraItem is not immediately clear, but it appears to always be 1.

A typical value for the second _HEAP_LIST_LOOKUP ArraySize is 0x400 (the value does not appear constrained to 0x80 and 0x800 as Valasek identifies).  It will often look like this:

Freelists
Freelists

The diagram doesn't depict clearly that the _LIST_ENTRY in each _HEAP_FREE_ENTRY points to the _LIST_ENTRY member, not to the _HEAP_FREE_ENTRY start address.  _HEAP_FREE_ENTRYis a class derived from _HEAP_ENTRY; it simply has a _LIST_ENTRY for the freelist where the user blocks would otherwise start were it busy.  This is similar to the FreeEntryOffset member in free LFH chunks.  Also, for clarification, the ListHints member of _HEAP_LIST_LOOKUP is pointing to the head element of an array of _LIST_ENTRY of size ArraySize-BaseIndex, and the element pointed to by the ListHead members is within the _HEAP (and is the sentinel node).

The Backend

The backend chunks themselves are contained in an array (I use the term loosely here, as each array element is of a different size, but the memory structure is array-like) whose bounds and parameters are defined by _HEAP_SEGMENT structures.  Valasek identifies those members relevant to finding and decoding the LFH.  Other members of interest are FirstEntry and LastValidEntry, which bound the range of addresses at which chunks exist.

_HEAP_SEGMENT structures start with a member Entry.  This records the chunk used to store the segment header (of type _HEAP or _HEAP_SEGMENT), and is essentially a normal chunk, so that everything in a heap segment is a valid chunk (the size of which could be used to identify whether it contains a _HEAP or merely a _HEAP_SUBSEGMENT).  However, the FirstEntry member points to the second chunk in the heap, being the one after the header; it therefore doesn't point to the first entry.  LastValidEntry is also a misnomer; it actually points to the first address not in the segment.  This will generally either be uncommitted memory or the start of another heap segment's header (so the typing of the pointer is technically correct, but probably exists only for syntactic comparability in C).  Interestingly, the PreviousSize field of the initial chunk (the one containing the segment header) when decoded is always 0.

Another interesting point about the heap is that each heap can have multiple segments.  Also, _HEAP is directly castable to _HEAP_SEGMENT (the latter is simply a truncation of the former).  A linked list of segments for a particular heap is provided in SegmentListEntry; in yet another list parsing oddity, the sentinel of the linked list is _HEAP's SegmentList member, which always points to the _HEAP's own SegmentListEntry member.  To parse this list, offset computation is required, as there are members in _HEAP_SEGMENT which precede the _LIST_ENTRY.  The _HEAP_SEGMENT address referred to in the list is two DWORDs and a _HEAP_ENTRY header prior to the address actually pointed to, because the _LIST_ENTRY members point to each other and not to the parent struct.  Each segment refers to a range of addresses containing chunks, and has its own sublist of uncommitted ranges.  They do not have their own Encoding, inheriting that of the _HEAP.

Within each segment, there are ranges of memory which have been designated as part of the heap in virtual memory, but have not yet been committed.  These are the uncommitted ranges in the UCRSegmentList.  Walking the chunks in a heap by their size will imply that there is a chunk header at the start of an uncommitted range; this is not the case, and there is no sentinel.  Instead, a list of uncommitted ranges (in _HEAP_UCR_DESCRIPTOR structures) is stored in each heap.  As mentioned before, this list is exceptional; it has two _LIST_ENTRY elements at its start.  The first is a node in the heap-scoped list of uncommitted ranges, _HEAP's UCRList member.  The second is a node in the segment-scoped list UCRSegmentList; UCRSegmentList is therefore not a hint-style _LIST_ENTRY, but an actual list in its own right.  Both of these members act as the sentinel node for their respective lists; the segment-scoped list requires arithmetic to dereference (subtract the size of _LIST_ENTRY).  I will spare a diagram, because it would only make it look more complex than it is.  The descriptor contains the first Address not committed, and the Size of the range in bytes.  The descriptor entry will always be stored in the last chunk before the uncommitted range, so a backend chunk flagged last will likely contain only a _HEAP_UCR_DESCRIPTOR and be of that size (0x40 on 64-bit windows).  These ranges can be skipped as though they were a chunk, when walking the backend heap.

Virtual allocations

The VirtualAllocdBlocks (a _LIST_ENTRY in _HEAP) are chunks allocated by the backend which are too large to be stored in the normal heap manager (frontend or backend).  Instead, they are allocated by requesting a new virtual memory allocation from the kernel, and handing this off to the user.  These blocks are headed up by a _HEAP_VIRTUAL_ALLOC_ENTRY.  You won't find this in the public symbols for most OSes, so I'll present the structure here in 64 bits (with a nod to Nir Sofer of NirSoft, who extracted this structure from windows vista 32-bit):

_HEAP_VIRTUAL_ALLOC_ENTRY
_HEAP_VIRTUAL_ALLOC_ENTRY

As a result of this, VMM allocations will predictably have user blocks starting at offset 0x40 from an aligned address (on 64-bit, aligned to 0x1000, or 4096-byte aligned), and so it is possible to infer with a degree of certainty that a block has been allocated from the VMM from its user address.  Additionally, VMM block addresses will be allocated sequentially at the next appropriately-aligned virtual address, and it is the VMM which services requests for additional heaps and segments from the backend heap manager.  The structure which keeps track of which of these structures are free is not evident and is probably within the kernel.  However, a _LIST_ENTRY Entry gives a circular linked list of VMM-allocated chunks associated with each heap.  ReserveSizeis the requested size of the allocation; CommitSize is the actually committed size and will under most circumstances be the same as the ReserveSize.

The list of VMM-allocated chunks will often contain LFH subsegments, because these are large enough that they would typically be serviced by the VMM instead of the normal backend heap.  On 64-bit windows 7 it appears that the threshold size after which the VMM becomes responsible for allocations, which is stored in the appropriate _HEAP's VirtualMemoryThreshold, is always 0xFF00 blocks.

The Entry member of this struct is a normal chunk header, encoded against the _HEAP's Encoding member.  Interestingly, WinDbg's !heap command will not identify these chunks.  Additionally, Size will be set appropriately, but PreviousSize will not (it will be 0, regardless of encoding, probably to avoid key disclosure).

The Flags member of the Entry substruct will typically be 0x3, or busy | extra present.  Though the _HEAP_ENTRY_EXTRA subclass is present, it is included in the _HEAP_VIRTUAL_ALLOC_ENTRY and so the flag does not require any additional math to be done when computing the start of the user blocks.

It bears mentioning that the backend chunk flag 0x08, commonly called HEAP_ENTRY_VIRTUAL_ALLOC, is actually an indicator that the chunk is used internally by the heap manager, and does not indicate that the entry is a virtual memory allocation.  It will not be found on these chunks.  WinDbg identifies chunks with this flag as "busy internal" (the flag is almost always 0x09).

The LFH

Now, moving on to the low fragmentation heap.  There is a lot of depth to the LFH which isn't immediately obvious from reading the work that is out there.  In particular, I will shed new light on the LFH flags (UnusedBytes), the meaning of the CRTZone structures, and how to find all the subsegments.

The LFH flags are somewhat of a peculiarity, because whether a chunk is busy is inferred from the UnusedBytes member of the heap entry.  Commonly, it is said that the flags 0x18 interchangeably indicate that a chunk is busy.  However, in many cases (possibly generated by reallocation), the flag 0x20 also indicates a busy chunk.  The flags 0x03 allegedly indicate "top" and 0x04 is usually unidentified.  However, this is incorrect.

The LFH flags should be read with a mask of 0x3F.  The upper two bits are always 0b10, a fact which appears to be validated only as of windows 8.  The remaining bits are the number of "wasted" bytes in the allocation.  This includes 8 bytes for the chunk header (which is effectively 8 bytes on both 32 and 64 bit builds of windows, for reasons I will get into later).  Because of the added 0x08, it happens that if the chunk is busy it is guaranteed to have at least one of the flag bits 0x38 set, so it is provably accurate to use these as interchangeable flag bits to indicate the busy state.  For free chunks the UnusedBytes member is always 0x80, which corresponds to 0 wasted bytes.  For an allocation which corresponds exactly to the maximum size serviceable by the LFH subsegment in question, it will be 0x88.  Thusly, the requested size of the allocation can be inferred from the UnusedBytes field by subtracting 0x88 (or, more pedantically, masking to 0x3F and then subtracting 0x08), and then subtracting the resultant value from the size of the chunk in bytes.  However, on 64-bit systems, there is a caveat to that method.

In the 64-bit implementation of the heap, each _HEAP_ENTRY begins with a new qword member PreviousBlockPrivateData.  This member is space which is reserved for user data from the previous chunk, and allocations (may) include the size of this space.  Accordingly, single-block allocations (effectively an allocation of enough space for only the chunk header) make sense in 64-bit windows because they can service requests for 8 bytes or less.  The LFH therefore includes "0-block" allocations; when this happens, the subsegment is effectively an array of _HEAP_ENTRY, and all the data is stored in the PreviousBlockPrivateData member of the succeeding block.  Subsegments (and heap segments) are allocated with spacing which appears to be intended to accommodate this.  In this way, it is possible for a new user allocation to have the same address as a _HEAP_ENTRY struct, or alternatively an address 0x10 bytes prior to another user allocation.  In the first chunk in a segment or subsegment, or special uses of _HEAP_ENTRY (such as Encoding), PreviousBlockPrivateData is unused, but tends to be initialized to 0.

The SubsegmentZones member of the _HEAP_LOCAL_DATA is a list of _LFH_BLOCK_ZONE structures.  These are (as identified by Valasek) for keeping track of subsegments.  That is to say, CRTZones are arrays of _HEAP_SUBSEGMENT structures.  FreePointer is a pointer to the first address not in that array (eg. an index of size+1 into the array).  The actual subsegments, containing the user blocks from which allocations are drawn, are pointed to from these, and are kept in other chunks.  Those chunks are allocated by the backend as well, and are often contiguous.

The important thing here is that while looking at the appropriate _HEAP_LOCAL_SEGMENT_INFOstructure will in fact give you the address of the subsegment that will be used for allocations, there are other subsegments associated with the segment that aren't referred to there at all.  Subegments become the ActiveSubsegment when they are new, and when chunks are freed back into them, they become the Hint and sometimes the CachedItems (which is an array of pointers to 16 _HEAP_SUBSEGMENTs, not necessarily without duplication).  However, when they are filled up and nothing is freed into them, they simply pass from memory as a new ActiveSubsegment is allocated; the _HEAP_LOCAL_SEGMENT_INFO is only reminded of them once a chunk in them becomes free.

The user allocations do not actually come from the memory allocated for the CRTZone that refers to it.  Instead, a block zone merely contains an array of _HEAP_SUBSEGMENTs, which are references to the appropriate subsegment (more specifically to its _HEAP_USERDATA_HEADER, which is immediately followed by the user chunk array).  The allocations for the user data itself ultimately tend to come from the VMM.  Such blocks (as contain LFH userdata) look like normal VMM-allocated blocks in that they start with a structure the same length as _HEAP_VIRTUAL_ALLOC_ENTRY and have a valid chunk header at the end which is encoded against the parent heap's encoding, but all members besides the chunk header are often null; the flags are "busy internal" as expected and the requested size is at least that of the subsegment.  However, this is not the case for all subsegments, and so it is probable that occasionally more than one subsegment might be stored inside one VMM allocation.  Additionally, subsegments small enough to be allocated without using the VMM appear to be allocated normally from the backend.  In the majority of cases this means that the first chunk in an LFH subsegment (of whatever size) will be immediately preceded by a _HEAP_ENTRY structure.

The _HEAP_USERDATA_HEADER has only a couple members that appear meaningful.  SubSegment is a backlink to the _HEAP_SUBSEGMENT where this _HEAP_USERDATA_HEADER is the UserBlocks member.  SizeIndex is unrelated to the size of the chunk and is not the same as the SizeIndex in the parent _HEAP_SUBSEGMENT.  Reserved appears uninitialized.  Signature is always 0xF0E0D0C0.

Hopefully this serves to demystify the structure of the windows heap to a greater extent than before.  Though this is by no means completes the available description of the windows heap and there is lots of stuff in it that remains unclear, it should make working within it substantially easier.

PEAP at DEF CON 21

History

Last year Moxie Marlinspike and David Hulton presented "Defeating PPTP VPNs and WPA2 Enterprise with MS-CHAPv2" at Defcon 20. Moxie's blog post [1] describes how DES is still relevant in 2013. Their cracker [2] turns captured WPA2 handshakes that use MS-CHAPv2 into NT hashes. These hashes can get you onto the network. This wasn't the first attack on MS-CHAPv2. Asleap [3], the MS-CHAPv2 cracker that Joshua Wright wrote in 2003-2008, uses a weakness in MS-CHAPv2 to crack LEAP and PPTP handshakes much faster than brute force. LEAP was abandoned after Asleap was released.

PEAP at Defcon 21

Which brings us to PEAP. Defcon 21 featured 2 talks about PEAP, both with functional demos. Josh Yavor's "BYOD PEAP Show" showed the default settings for Android, iPhone, Blackberry, and Windows Phone, all of which include PEAP with insecure settings. His demo showed blobs flying by which he promised were NT hashes of passwords. He also gave advice on how to exploit networks. Public transit, choke points, and lunch seems to offer the best opportunity to find insecure phones. James Snodgrass (PuNk1nPo0p) and Josh Hoover (wishbone) demoed their tool LootBooty [4] which capitalizes on iPhone and Android weaknesses in PEAP. Their talk "BYO-Disaster and Why Corporate Wireless Security Still Sucks" utilized flaws in implementation of PEAP to get plaintext passwords which are sent over WPA2-Enterprise. iPwner exploits a man-in-the-middle attack against IOS/OSX devices using PEAP-MSCHAPv2. After connecting to the rogue access point, the user is prompted for username and password to access any website. These credentials are sent in cleartext over WPA2 and the attacker logs them. This attack requires the user to be sufficiently convinced to type in their username and password and not a fake password but is an interesting attack. PEAPing Tom is a second attack which uses EAP-GTC to capture users credentials in plaintext using a rogue RADIUS server on a rogue access point.

From the author of LootBooty, PuNk1nPo0p:

PEAPv0 only supports MSChapV2 as its inner authentication mechanism and is the only PEAP version natively supported by Microsoft. The problem is IOS, OSX, Android, etc all support PEAPv0 too, which makes them all vulnerable to Josh Wright's and Moxie's offline dictionary attack of the captured challenge / response or HASH as we nerds call it.

PEAPv1 continues to support EAP-MSChapV2, but also adds support for EAP-GTC as an inner authentication alternative to EAP-MSChapV2. Microsoft does not natively support EAP-GTC, but IOS, OSX, Android, and pretty much everyone but Microsoft natively supports it. So.... EAP-GTC was designed to be used (but not LIMITED too) for one time passwords like secure Id tokens, etc. So our PEAPING-TOM attack just tells all connecting clients to use EAP-GTC in place of EAP-MSChapV2 thus delivering their domain credentials in clear text instead of your typical HASH that requires offline cracking.

Conclusion

Both talks focused on attacking enterprise wireless weaknesses that present themselves in a specific configuration that has gained popularity over time: bring your own device (BYOD). Since the wireless setup requires only an employee's domain username and password an unintended device can be added by an employee. Little do they know that an attacker can leverage weaknesses in the protocol to compromise the wireless network as well as the rest of the network. "BYOD PEAP Show" gave good advice which says that EAP-TLS is easier to setup than securing PEAP is. This is solid advice that we can echo. Using client certificates and server certificates eliminates the weakness of passwords and gives the attacker no chance to harvest credentials or man-in-the-middle a user. This eliminates the ease of use that EAP-MSCHAPv2 and BYOD offer, but it also eliminates the vulnerabilities described above.

After writing this, I found that Microsoft released a security advisory and Slashdot posted an article to the front page, MS: Windows Phone 8 Wi-Fi Vulnerable, Cannot Be Patched. The title is misleading, but the article received over 100 comments many of which were insightful.

For those who are looking for a solution: please implement EAP-TLS with certificate verification and avoid PEAP.

Many thanks to the authors, especially PuNk1nPo0p for their contribution to security without which this post would not be possible.

Works Cited

1) Marlinspike, Moxie. URL: https://www.cloudcracker.com/blog/2012/07/29/cracking-ms-chap-v2/
2) CloudCracker. URL: https://www.cloudcracker.com/
3) Wright, Joshua. "Asleap - exploiting cisco leap". URL: http://www.willhackforsushi.com/Asleap.html
4) PuNk1nPo0p. "Lo0tBo0ty Wifi-To0lz". URL: https://github.com/punk1npo0p/lootbooty

The Digital Pickpocket

I am sure that everyone has seen the commercial where users of a specific brand of smartphone are passing a video back and forth by simply touching the devices together.  It is a very slick feature that obviously makes moving files between mobile devices an easy task to accomplish.

The technology being used to provide this feature is known as Near Field Communications (NFC).  This same technology, which is an extension to older Radio Frequency Identification (RFID) technology, is also being integrated in other facets of our lives under the banner of convenience.  Unfortunately, like anything where convenience is the priority, there are some potential security issues that the security community has been pointing out for years.  In this case we are talking about “Tap to Pay” credit cards, transit cards, and other cards that use NFC to broadcast payment information to payment terminals.

As previously mentioned, NFC is an extension to RFID technology.  RFID technology, typically used to track inventory, is (I am over simplifying here) essentially a small radio transmitter that requires little to no power.  The main difference, which according to many NFC vendors is a security feature, is that RFID allows for a longer range transmission than NFC.  Essentially NFC will work when the devices are inches apart while RFID can be meters apart.  If you want the real geeky details on exactly how NFC works I suggest that you give the ISO standard (ISO 18092) a read. 

To read a NFC transmission or even an RFID one for that matter one simply needs to have a receiver that is within range of the transmitting device.  I would like to tell you that this transmission is performed over cryptographically secured channels or that only an authorized receiver may pick up the transmission but unfortunately, this is not always the case.

This week we had an opportunity to talk with KOMO TV News Reporter Matt Markovich about NFC technology and some of the risks it presents when used as a payment mechanism.  I would say that my impression of Matt was that he is more technical that most reporters I have worked with in the past as when he approached Leviathan for assistance on his story, he already had a working test case that helps prove the threat.

What Matt was proving (video below) was that this technology of convenience is not secure from an eavesdropper or interception.  Essentially, a “bad guy” can build his own receiver and as long as he is within the necessary range read the transmission coming from the NFC enabled card.  In Matt’s test case, he uses Visa credit cards however, with a bit of customization work this can be extended to read other types of NFC enabled cards such as transit passes, and door locks.

When watching the video remember no vulnerability is being exploited this is simply leveraging a feature of the technology, not a bug.  NFC is after all simply a radio transmitter, there is no access control or authorization required to accept that radio transmission. 

It is also important to understand that this is different than some of the ways we have seen RFID technology leveraged by attackers.  In the past attackers have built low cost devices like this Proxmark one pictured below to read RFID enabled devices;

Proxmark Proxcard reader
Proxmark Proxcard reader

While a setup like this could easily and just as cheaply be built (less than 100$) to read NFC this is not exactly portable or discrete which are two things that an attacker will require due to the fact that in order to read the NFC chip the attacker must be within range which for NFC is no more than 4 inches.

In addition, the above setup assumes that, even if you follow one of the many online tutorials, you must have a level of competence when it comes to building your own electronics.  So, instead of going to all of this trouble and to insure a more stealthy mechanism for attack we go back to the beginning of this post and the wiz bang smartphone feature found on most Android smartphones that allows you to transfer data simply by touching the devices together.

Security researchers were quick to leverage   the native NFC support found in most Android phones plus the powerful features of the smartphone itself to make this attack stealthy and practical.  By simply running a custom, community supported version of the Android Operating system as well as publically available apps one can turn their smartphone in to a NFC receiver and accept a transmission.  Not only can one receive the transmission, which by the way contains all the data needed to “borrow” the target’s credit card details, but it can be saved and then replayed at a later date or relayed in real-time from the smartphone to make a purchase at any standard “tap to pay” terminal.

This is exactly what Matt did and demonstrates in the video and explains in this article.

The most common response to this sort of attack is typically something along the lines of; “yes, but you need to be within 4 inches to make this work.”  In fact, this is exactly what MasterCard said in response to the KOMO News inquires; "The circumstances under which it can occur in the real world are extremely rare."

This is absolutely true however thieves already have no problem performing a traditional pick pocket theft, so why not instead of actually taking the wallet simply bump in to your target and scan it. 

In cases like this, it is human nature to want to find someone or something to blame.  Before you assume that it is once again those “evil hackers” or organized criminal rings that are responsible remember – this is a feature not a bug.  The demonstration as done by Matt in the video is simply leveraging an existing feature.  This means that if you absolutely must find someone to blame, then you must pick whomever is responsible for implementing such an insecure way to transmit payment data.  Yes you guessed, the Payment Card Industry (PCI).  Let’s be clear, this “vulnerability” exists due to the fact that convenience has outweighed security.  The PCI wanted a way to insure that consumers can quickly pay for items without spending the extra few seconds fumbling with your wallet and counting that dirty paper cash stuff that no one seems to use anymore.

Could the Payment Industry implement NFC technology in a secure way?  The ISO standard does outline various provisions that may add security to the implementation however, considering the scale of this implementation there may be real world operational and technical hurdles that prevent this from happening.

Unfortunately, we will see more and more of this technology however, one of the best defenses today is to simply call your credit card company and ask them to issue you a card that does not support “tap to pay” or any other NFC technology.  Today, card issuers are honoring this request, of course eventually they may stop however, if enough consumers reject the technology perhaps change can be forced.

2013 Verizon DBIR Thoughts

It's another year - and time for the 2013 Verizon Data Breach Investigations Report. Despite the name, the report references the previous year - 2012. The most notable part of this year's report is that the list of contributors continues to grow up to a total of 19 for this report. This fact becomes really important as you dig into the report which contains a fairly large number of year-over-year comparisons. When receiving the briefing, it was noted that despite there being 5 years worth of data, it's like comparing apples to not-apples.

Thinking about the reports over the years and the industry response, someone with a more recent memory of STATS 101 can probably use better words to describe the change in how the report is consumed, by which groups within organizations and with what focus. It is really important to understand that the dataset represents voluntary disclosure of breaches by the 19 participating organizations. It is NOT clean data, there is as much scientific rigor as possible but without the kind of assurance that one would expect from a peer-reviewed journal article. The information is more useful for decision making than the usual peanut-butter and jet engines, but not by much. If the utility of the information works for you based on your understanding of your world - great! If not, think carefully about the conclusions that you will inevitably draw as you read.

If you read no other part of the report, read the 4 page executive summary. It's on pages 4 through 7 of the PDF. Go. Read. NOW.

Welcome back! Now that you've read the executive summary, think about what resonated for you. I was rather interested in the following:

  • Who are the victims? -- again, important to note, these are the reported breaches. Looking deeper in the report, you'll see a couple of focus areas - food services and professionals. I'm eagerly awaiting some analysis of the "professionals" segment. The Verizon team will be putting out additional analysis in the form of blog posts. You should watch for them.
  • Who's perpetrating breaches? -- consistently over the years, reported breaches have an external bad actor. We have no idea if the unreported breaches involve insiders or not. And of course, "State-affiliated actors tied to China" makes it into the report right at the beginning.
  • How do breaches occur? -- later on in the report you'll see an interesting graph associated with spear phishing - just keep sending links, they'll eventually click on them.
  • What commonalities exist? -- interesting for sure, but the one that stands out is the number of breaches which are discovered by external parties. Perhaps this is connected to the bias of "reported breaches", when a third party discovers the breach it is much harder to un-report it and hide the results from regulators and legislators (and I suppose customers and clients.)
  • What can we do about it? -- this is a great list. If you're building an information security program you could do worse than follow these 8 points. In the body of the report you'll come to a table that would seem to indicate that doing the basics of infosec well (technical preventative and detective controls) really doesn't help but the reality is that to understand your information and your environment those things are a requirement.

Dig into the body of the report and you'll find much more to think about. I'm certain that the discussions around this report this week will be entertaining.

Some closing thoughts for right now: It doesn't matter how big or small your organization is, you can learn something from the DBIR. And invest in a color printer if you don't read off of the screen, the graphs don't look very good in grayscale.

This is the first in many blog postings you can expect from Leviathan Security Group's Risk Advisory Services Team.

James Arlen is a member of Leviathan’s Risk Advisory Services Team. As a member of the team, James is primarily responsible for multi-year client engagements using the V-ISO (Virtual Information Security Office) model – providing substitute or backup coverage for client information security departments. In addition to these duties, James often serves in a senior role on complex Technical Services projects where close integration with client technical and business teams is necessary.

Zero-Permission Android Applications Part 2

Since my previous post about zero-permission Android applications, I have continued researching along the same lines. This work is similar to research presented by Lookout at Defcon. As that work is a few years old, I thought this topic should be reviewed with fresh eyes.

At Defcon 18, three researchers from Lookout Mobile Security presented on several permission-related security issues. My observation is that most of the research presented was targeted at Android 2.2 and below. They demonstrated that an app with no permissions can reboot a device, start on boot, block user input, and obnoxiously play noise. More interestingly, they showed that they could launch the browser and perform two-way communication with a remote server, and that they could do so when the screen was off.

Building on this work, I wanted to test out their approach to two-way communication using the browser while the screen is off and the device is locked. I found that, indeed, registering a URI handler and calling that URI from within the browser worked well. I was able to have the app query my server, causing the server to call a URI to instruct the app on what to send next. This would cause the app to send that data and query the next instruction and then close the browser window when there was nothing left to do. Hiding this activity was another matter.

I could not launch the browser while the screen was off, at least not in 2.3 or 4.0: the intent to launch the browser would begin, but it would block until the screen turned back on. On 2.3, this meant that when the screen was off, my app would launch and then wait until the screen was on but still locked. My app could then freely perform two-way communication. Of course, if the device was quickly unlocked once the screen was on, it would be evident that the browser was open. On 4.0, things were a little trickier: I found that when the screen was on but the device was locked, the browser would launch but fail to navigate to the requested page. I'm not sure what changed in 4.0, but it definitely makes sneaky browser games much more difficult to perform.

Building upon my previous release, this version of my No Permissions app includes many new actions I could accomplish without requesting permissions. Like before, it can gather version and identifier information and send that information to my server. It can also upload any readable file, whether from the SD card or from a world-readable directory, and send that data to the server. I also created a location mining function that would read the pictures stored on the SD card, scan them for EXIF tags, and upload the location within the EXIF along with the time the picture was taken. While not as good as having the FINE_LOCATION permission, this information can contribute to creating an accurate impression of where a person was, assuming that the storing of location data within pictures is enabled. The app uses the browser trickery described in the previous paragraph for two-way communications, as such the server is able to instruct the app to download a file, store it in the app's private storage, and either open the file with the default handler for that file type or execute it as Java code. While not allowing any new permissions, executing new Java code can further expand the possible activities. Lastly, I created the option to take a base64-encoded string and execute that on the command line -- if the app detects that the device may be rooted, it attempts to perform that command line action as root, otherwise it executes it as our unprivileged app's user id.

At some point, when attempting to implement code to run shell commands given by the server, I noticed that ping was the only binary in /system/bin that was setgid and world-executable. After some testing, I quickly realized that my application with zero permissions was able to successfully ping Internet hosts. I tested this out on a device running 2.3.5 and it worked, but when I tested the code on a device running 4.0.4, it didn't. After a bit of reverse engineering, I found that the ping binaries were different on the two devices. In fact, the differences corresponded to this commit by Nick Kralevich. According to the git logs, the fix was committed after the deadline for the 2.3.x branch. This means that any device running Android 2.3.x or below is vulnerable to this slight oversight in permission checking. Just to be thorough, I downloaded a ROM for a Honeycomb-based tablet. The ROM, which was at Android 3.0.1, contains a ping binary with the changes present in the 4.0.x branch.

It's important to realize this is not a regular ICMP tunnel; we can only execute the ping binary from a command-line. The end result is that one can send 24 bytes of data to a remote server and receive a small integer in the range 8-64 (less than 6 bits of data) in return. This technique allows for a slow trickle of ICMP packets to carry information off of a device and is able to receive enough information back to know what data to send next from a list of possible options. While being a much smaller channel on which to exfiltrate data, this is undetectable to the end-user. In fact, when testing this on a rooted device with a firewall app, I found that the firewall only filters TCP and UDP, missing the ICMP packets completely.

As a side note, execution of the ping binary is also the only way I determined that an application with zero permissions can perform DNS lookups. It may be possible to achieve a wider communication channel using DNS lookups, but I did not look into this option.

Based on this research, I strongly advise that users update to Ice Cream Sandwich (4.0.x), if possible. In addition, disable location data within pictures you take if you don't need the feature. Do not store any files or data on your SD card or external storage any longer than necessary; move them to your computer or delete them if you no longer need them on your phone or tablet. Finally, be careful about what apps you install; even if it doesn't ask for many permissions, a malicious app can perform all sorts of nasty actions on your device.

Android app developers should continue to write apps that only request the permissions needed by the app, and they also need to be wary of any 3rd-party code, such as advertising libraries, that they include in their apps. App developers should also be mindful of file-level permissions of any locally stored files and avoid using external storage where possible; I suggest that they consider using libraries such as SQLCipher to increase the privacy of any stored user data.

Google has some work to do, too. It's my opinion that Google needs to take responsibility for device updates away from the neglectful carriers and push the latest version to every device that can run it. If 4.0 isn't possible for all devices, security updates like fixing the ping binary should be back-ported to the older 2.3 branch. Further, Google should be adding more permissions, such as restricting access to browser launching and restricting the execution of shell commands.

The source for the revised No Permissions app is available here: NoPermissions-1.5.zip

The source for the Ping tunneling app is available here: Ping.zip

Zero-Permission Android Applications

There's been a lot of research in the Android security space. The most notable examples are Jon Oberheide's fake Twilight app, Georgia Weidman's SMS bot, and the numerous clever root exploits. Recently in the mainstream media, there's been buzz about apps (allegedly) misusing permissions; some of these apps include Facebook, Skype, Path, and just about every advertisement library.

One question that was posed internally was: what data can an app access when it has no permissions? I thought this was an interesting question, so I decided to make a proof-of-concept app to explore this idea. Some previous work had been demonstrated by Thomas Cannon of viaForensics. I wanted to develop that work further through a discussion backed by source code.

I created a ''No Permissions'' Android app that explores what data is available to be harvested from an Android device even when the installed app has no permissions.

The following three actions can be completed by pressing the corresponding button in the app:

The first privileged access area is the SD Card. Every application has at least read-only access to the contents of this external storage. ''No Permissions'' scans the /sdcard directory and returns a list of all non-hidden files. While it's possible to fetch the contents of all those files, I’ll leave it to someone else to decide what files should be grabbed and which are going to be boring. It's worth noting that even though the Android developer docs state that there's no security enforced upon files stored on external storage, many things are stored on the SD Card, including photos, backups, and any external configuration files -- on my own device, I found that OpenVPN certificates were stored on the SD card (which I promptly corrected!)

Secondly, I can fetch the /data/system/packages.list file to determine what apps are currently installed on the device. From there, I can scan each directory used by those applications to determine whether sensitive data can be read from those directories. In the ‘’No Permissions’’ app, this functionality returns a list of installed apps and a list of any readable files. When testing this on the Android emulator, I am only able to read the app's own directory, but when testing on a real device, I am able to read some files belonging to other apps. This feature could be used to find apps with weak-permission vulnerabilities, such as those that were reported in Skype last year.

The third action I was able to take was to grab identifiable information about the device itself. Without the PHONE_STATE permission, it's not possible to read the IMEI or IMSI, however the GSM & SIM vendor IDs can still be read. The /proc/version pseudofile, which reveals the kernel version and possibly the name of the custom ROM installed, can also be read. In addition to those identifying values, the app reads the Android ID, which is a 64-bit number randomly generated when a device is first booted and remains constant thereafter. More information about the Android ID is available in the Android Developer Docs.

Though this app uses buttons to activate the three different actions detailed above, it's trivial for any installed app to execute these actions without any user interaction.

What can be done with the data once it’s collected? Without the INTERNET permission, how can it be sent anywhere? While it's true that most network access is restricted, there is one network call that can be made without any permissions: the URI ACTION_VIEW Intent opens a browser. By passing data via GET parameters in a URI, the browser will exfiltrate any collected data. In my tests, I found that the app is able to launch the browser even after it has lost focus, allowing for transmission of large amounts of data by creating successive browser calls.

The attached code was tested against Android 4.0.3 and Android 2.3.5
Source code: NoPermissions.zip
APK file: NoPermissions.apk

The Double-Edged Sword of HSTS Persistence and Privacy

HTTP Strict Transport Security or more commonly known as HSTS is a draft policy by the IETF WebSec working group currently proposed that would extend the standard list of HTTP response headers to allow domain owners to enroll complying browsers into exclusively secure communications with the web server for an asserted period of time.

This is accomplished by rewriting all HTTP requests to that particular domain regardless of entry (be it via link, image or manually typed in the address bar) over HTTPS and validating the certificate chain. If a secure connection cannot be established or the certificate chain cannot be verified then the request fails with a transport level error and is abandoned.

The actual implementation of this is nearly trivial. Over a secure connection the server simply has to return the header specifying how long the browser should exclusively attempt HTTPS connections and a flag whether it should include sub-domains:

Strict-Transport-Security: max-age=31536000; includeSubDomains

Under normal circumstances as long as the user has been to that domain within the max-age of the policy, this is an effective mitigation against sslstrip type attacks which rely on users to initiate an HTTP connection to perform a man-in-the-middle attack against the browsers.

One of the less understood implications of this proposal is the role that wildcard SSL certificates play. When purchasing an SSL certificate the domain owner must decide between a standard certificate that covers only one particular FQDN such as store.domain.com or a (more expensive) wildcard certificate issued to *.domain.com that would encompass multiple sub-domains such auth.domain.com and store.domain.com.

As the certificate wildcard feature is decoupled from the HSTS includeSubDomains flag it leads to interesting behavior that allows an actor such as an advertising company or any other entity to store, retrieve, and edit data in the browser's database. When a wildcard SSL certificate is used it allows the owner to have a near unlimited number of entires in the HSTS databases as currently implemented by supporting browsers.

An entry in the HSTS database can grant a single-bit of information to an interested party that can be retrieved at a later time. Lets look at an example where we want to store and retrieve the word "HELLO" in a browser's HSTS database using nothing but forum image tags and a trivial encoding.

To set the bits we would simply need to create a post with the following tags:

[img]https://charcount-5.trackingdomain.com/setbit.png[/img]

[img]https://0-H.trackingdomain.com/setbit.png[/img]
[img]https://1-E.trackingdomain.com/setbit.png[/img]
[img]https://2-L.trackingdomain.com/setbit.png[/img]
[img]https://3-L.trackingdomain.com/setbit.png[/img]
[img]https://4-O.trackingdomain.com/setbit.png[/img]

 

When a browser goes to each of these URLs over HTTPS the web server would see the /setbit.png key and include a HSTS header with a large max-age value in the response and create an entry in the browser's HSTS table for each of the sub-domains.

To read this data back out a javascript block on a different domain than the original forum would first brute force the character count by creating resource requests enumerating possible values and having the server respond whether the request came in over HTTP or HTTPS as the requests would have been rewritten by the browser if the sub-domain is present in HSTS database. These requests would look like:

http://charcount-1.trackingdomain.com/getbit.png [ Server: HTTP ]
http://charcount-2.trackingdomain.com/getbit.png [ Server: HTTP ]
http://charcount-3.trackingdomain.com/getbit.png [ Server: HTTP ]
http://charcount-4.trackingdomain.com/getbit.png [ Server: HTTP ]
http://charcount-5.trackingdomain.com/getbit.png [ Server: HTTPS! ]

The same brute-force enumeration process would be performed to retrieve the individual characters of the message body. This enumeration is more effective than the current history enumeration attacks via CSS (here.)

At first this approach looks like a Bloom filter. Seemingly akin to burning in bits permanently and not having the ability to change them but thanks to the max-age specifier of the header it is possible to also clear bits by setting their maximum age to 0:

Request URL: https://charcount-5.trackingdomain.com/clearbit.png
Strict-Transport-Security: max-age=0;

Initially this doesn't look worse than standard tracking cookie as long as it is cleared on a regular basis but clearing the HSTS database frequently renders it much less effective in preventing the very attacks it sought guard against. Therein lies the classic trade-off of security versus privacy. Of the currently two HSTS supporting browsers there is no consensus on this topic. Chrome opts for increased privacy by clearing HSTS database when cookies are cleared while Firefox 4 opts to store HSTS settings in the separate and infrequently cleared site-preferences database.

So what can be done about this?

My proposal is to amend the draft to force the includeSubDomains flag on wildcard certificates. This would limit them to only one entry in the browsers HSTS database and make the technique above prohibitively expensive to non-CA owners as a separate signed SSL certificate would be needed for every bit of information stored and limit encoding options. That way we can have the best of both worlds, privacy and security.

Analysis of Adler32

Original Research May 9, 2011

Introduction

Adler32 is a checksum designed to detect corruption in Zlib's compressed data or corruption in the algorithms. Since it was much faster than its more reliable competitor, CRC32 it was chosen for this task [1]. If the uncompressed data does not match the Adler32 checksum, the application can inform its protocol or the user to retransmit the data. However, the main use of Adler32 was to debug implementations of Zlib. Adler32 as well as CRC32 are insufficient for any purpose that requires a high degree of certainty. The chance of a collision for an ideal 32-bit checksum is 1 in 4 billion, which can be easily computed in a few hours by anyone with a reasonable computer. An attacker can even do it in realtime if they had precomputed all checksums beforehand. Adler32 has known weaknesses making it much more susceptible to collision than the more reliable CRC32.

In 2002, RFC 3309 changed the checksum of the SCTP protocol from Adler32 to CRC32. In doing so it explained a weakness of Adler32 with short messages [2]. Since it is not relied upon for important cryptographic security and is used for speed, few research papers have been published on it. Adler32 also contains a weakness in long messages where the end of the message is modified but the start of the message is the same. This means that Zlib cannot be transmitted by itself without use of a slower checksum or hash. Gzip and PNG formats are designed to use Zlib and choose to have an optional CRC32 checksum of the compressed zlib data instead of the original data.

In this blog post I will describe a method to generate collisions between Adler32 checksums. This has no security implications but may be interesting to cryptographers looking for a trivial distinguisher [3]. CRC32 and all 32-bit checksums can be analyzed with the same scripts with no modifications to the algorithm. By dividing secure hashes into 32-bit chunks, they can be tested not fully but spot checked using this same algorithm.

Collision

The following is a Python script to generate Adler32 collisions. I have tested this on millions of pieces of data and fails only on a small subset of data.

from zlib import adler32
from random import randint
def rand_data(l = 4):
     return ''.join([chr(randint(0, 255)) for i in xrange(l)])
#end def rand_data(l = 4)

def generateAdler32Collision(data):
     y = list(data)
     adler_input = adler32(data)
     y[-1] = chr((ord(y[-1])+1)%256)
     y[-3] = chr((ord(y[-3])+1)%256)
     y[-2] = chr((ord(y[-2])-2)%256)
     output = ''.join(y)
     adler_output = adler32(output)
     if adler_output == adler_input:
         return output
     #end if
     #print 'Looking for a different collision'
     y[-1] = chr((ord(data[-1])-1)%256)
     y[-3] = chr((ord(data[-3])-1)%256)
     y[-2] = chr((ord(data[-2])+2)%256)
     output = ''.join(y)
     adler_output = adler32(output)
     if adler_output == adler_input:
         return output
     #end if
     #print 'I give up'
     return None
 #end def generateAdler32Collision(data)

a = 'A'*1013 + 'test'
adler32(a) == -1892941051
adler32(generateAdler32Collision(a)) == -1892941051
generateAdler32Collision(a) != a
generateAdler32Collision(a) == ('A'*1013 + 'tfqu')

successes = 0
failures = 0
for i in xrange(1<<24):
    a = rand_data(1024)
    a32 = adler32(a)
    coll = generateAdler32Collision(a)
    if coll == None or coll == a:
        failures += 1
        continue
    #end if
    successes += 1
#next i
print 'Successes:', successes
print 'Failures: ', failures
print 'Success rate:', (1.0 * successes)/(successes+failures)

Method

To find this method, I took a subset of the first 4 billion checksums and found all the collisions. The most frequent collision repeated for almost every piece of data. The difference between the two data is only 3 bits in the last 3 bytes.

     y[-3] += 1
     y[-2] -= 2
     y[-1] += 1

Also:

     y[-3] -= 1
     y[-2] += 2
     y[-1] -= 1

The exceptions to this algorithm are when the last 3 bytes are an edge case.

Mark Adler himself explains this in an e-mail correspondence:

The particular case you found, and many more, are easily derivable from the algorithm. When you append three bytes a, b, and c, the sums are updated as follows:

s1' += a + b + c s2' += 3s1 + 3a + 2b + c

If we want to find deltas to a, b, and c, call them da, db, and dc that result in the same Adler-32 value (from the sequence a + da, b + db, c + dc), then you can easily derive the deltas:

pick da to be any value, set db = -2da, and set dc = da

i.e. any multiple of the vector {1, -2, 1}.

Then the sums will be the same. So your example is da = dc = 1, db = -2, or the base vector {1, -2, 1}. You could just as well pick something like da = dc = -3, db = 6, or {-3, 6, -3}. So there are many ways to change three bytes to return to the same Adler-32 value. Therefore the Hamming distance is at most three bytes, or when the distance is measured in bits, three bits in distinct bytes (when there happens to be no carry to adjacent bits for those deltas). You can find many, many more such vectors where the three bytes not adjacent. E.g. multiples of {1, 0, -3, 2}.

It is confusing to say that Adler32 has a weakness against short messages (less than a few hundred bytes). If you choose a 1024-byte psuedorandom prefix and modify the last bytes, you will find the same number of collisions as a short message. I chose to use a 1024-byte prefix (which was unnecessary) computed the Adler32 checksum of all 0-4 byte combinations appended to the prefix. Using this I was able to catalog many collisions among Adler32 checksums. The table in its current optimized format is 16GB.

By simply indexing the data and removing unnecessary duplicates, a full set of collisions can be created that allows the generation of collisions with Adler32 in real-time. Since collisions should not affect the security of any application using Adler32 or CRC32, this information is only useful for research purposes.

Data

The first search for collisions found 32 checksums with 64 collisions. The Adler32 checksum of each item the following list is 15728760:

 ['w\x00', '\x00v\x01', '\x01t\x02', '\x02r\x03', '\x03p\x04', '\x04n\x05', '\x05l\x06', '\x06j\x07', '\x07h\x08', '\x08f\t', '\td\n', '\nb\x0b', '\x0b`\x0c', '\x0c^\r', '\r\\\x0e', '\x0eZ\x0f', '\x0fX\x10', '\x10V\x11', '\x11T\x12', '\x12R\x13', '\x13P\x14', '\x14N\x15', '\x15L\x16', '\x16J\x17', '\x17H\x18', '\x18F\x19', '\x19D\x1a', '\x1aB\x1b', '\[email protected]\x1c', '\x1c>\x1d', '\x1d<\x1e', '\x1e:\x1f', '\x1f8 ', ' 6!', '!4"', '"2#', '#0$', '$.%', '%,&', "&*'", "'((", '(&)', ')$*', '*"+', '+ ,', ',\x1e-', '-\x1c.', '.\x1a/', '/\x180', '0\x161', '1\x142', '2\x123', '3\x104', '4\x0e5', '5\x0c6', '6\n7', '7\x088', '8\x069', '9\x04:', ':\x02;', ';\x00<']
prefix = "739b1478e08a126d24148678c4f7b35b8f5976ef06e02d04118dc0f568512bf747c7c7e29d980aee2968b490e8874abcb80c320f363bc14e6aa9cae809de0af5f3ce262d547867bd085a2ddda8339e530f0252038d31d5b52bfd879236b340040708881521bc8f473482cc20b9dcbd134c7f43398bede1abf5e15030942b7f4ee7fb77c68968752398fcb68fbde2f3f53fa5ed5ac470d79d9456f4bde059885905017eb333fe8057e4429b4d463735ebf0f947840de7c16221fdc6a0cf2d3f18a8fec11408fbee629a004830bd1103bf1211e59e5db0c753482ed76b1dc0860aadaad259e7cc091fd935170d83ce95d8b0e7c07d512e8f42bd6c8344c5d716cb70e68c501e62a024cf749fe60be4ecd2db4780a60f953e534f80de44a54d50783d66c211abc33558c5dae6f6a879bc5264ae79a2f78d04f508400de3097b643e72eeb6d222fba2eea933c05efe71b212fe7a9d173189ff83ba389c04b4c6f3b7ecf7f7edca307211ce61d381c7cdf9c1075aa1071abd1b59ee5acd3318aded06295c961e1d30611d4a9c81880b9c5ea66048aa8f60962257a30c8c3c3147e86200b8dde4c8d934b388914bccbc74dc5fe419899c443cbe30023adfbaee83d78a57c0c9774ff04ae4cbfc3e1be5099e43494bc4b608fe9d317309d31a22a771c72dcbb474770e55bc724d83001e08309e0de4a3814789c619e0a9f7df4f1347a77b6bbde8086b212fe70324b1243164e703af194ed76236dd3225e4c04dd2b60f89266e6231afa52c6c4391953355a8151ec8ddde3ad737b3ad87fcf7720efda5032d26efb36079ce579c8f5030637e2d6f1ad43ac18083e80d49aef56404858c657ec97e3a4a4a6cf8815db68e275a1c507aab209ad70d9d1e711de03471d90252e6f896a9e4d1e5634335f9dd30eb088930ffd18c844ef463f4242afe9611becaedfc95f11ed4471f295ab45fddc04f3f77dcfaa818fa8b14aa93dbbcee559700fed9db70acb4ca0f40a679d1d51ab02aba02b0e7bc0bdf9f9ce144133079ac9a1e1f1552c04345e90922cc4badf85a5f0c0373c8d7f66b308a41609b5f5f6271787fa839664606f8bfad3ea8bbdcdc781c981a05263951edb960033ea4f1b8fb822c319b2ad021f462c40bb1071307142a07e5df70dc3e2f07fac56310e4860665c939c4480fd083e08764f225f5a04ee2d5ac64755db03c76e394dc853613273ba0216528814eb9d4bf8fc4ca7b430e9977096f58e833abcbdda66701fb6480ab75f08818524e628128c806756da025114f6767b6c23545f8bf645b4d7ea85c551f9a706647f3add10fbc67653ebf04e7cbc8da19d1dc9bcc795442cb047157015d59ea4915c49eb0f9b96678dc1e2ffac28abecb23bfdb26a7b8a22bb907ffc4c57017bc5b47c403ac3844c0099bd1e36f7edf613fed1b8b364494f6d389".decode('hex')

Using the pseudorandom prefix above, my program found the following collisions:

Adler32 checksums starting with byte 0b: 
Adler32 checksums with no collision: 457 
Adler32 checksums with collisions: 12285 
Adler32 checksums max collisions: 803 
Adler32 checksums total: 2190952

Adler32 checksums starting with byte 0c: 
Adler32 checksums with no collision: 365 
Adler32 checksums with collisions: 51047 
Adler32 checksums max collisions: 4480 
Adler32 checksums total: 63387000

Adler32 checksums starting with byte 0d: 
Adler32 checksums with no collision: 256 
Adler32 checksums with collisions: 68800 
Adler32 checksums max collisions: 11011 
Adler32 checksums total: 282481503

Adler32 checksums starting with byte 0e: 
Adler32 checksums with no collision: 226 
Adler32 checksums with collisions: 82300 
Adler32 checksums max collisions: 18096 
Adler32 checksums total: 619301010

Adler32 checksums starting with byte 0f: 
Adler32 checksums with no collision: 554 
Adler32 checksums with collisions: 93047 
Adler32 checksums max collisions: 21680 
Adler32 checksums total: 913874100

Adler32 checksums starting with byte 10: 
Adler32 checksums with no collision: 642 
Adler32 checksums with collisions: 127437 
Adler32 checksums max collisions: 21846 
Adler32 checksums total: 993022364

Adler32 checksums starting with byte 11: 
Adler32 checksums with no collision: 502 
Adler32 checksums with collisions: 140245 
Adler32 checksums max collisions: 20569 
Adler32 checksums total: 795505939

Adler32 checksums starting with byte 12: 
Adler32 checksums with no collision: 545 
Adler32 checksums with collisions: 139404 
Adler32 checksums max collisions: 14851 
Adler32 checksums total: 459911612

Adler32 checksums starting with byte 13: 
Adler32 checksums with no collision: 3005 
Adler32 checksums with collisions: 114560 
Adler32 checksums max collisions: 7619 
Adler32 checksums total: 161746754

Adler32 checksums starting with byte 14: 
Adler32 checksums with no collision: 27671 
Adler32 checksums with collisions: 72271 
Adler32 checksums max collisions: 2371 
Adler32 checksums total: 20276876

Adler32 checksums starting with byte 15: 
Adler32 checksums with no collision: 30761 
Adler32 checksums with collisions: 5131 
Adler32 checksums max collisions: 102 
Adler32 checksums total: 105776

Adler32 checksums starting with byte 16: 
Adler32 checksums with no collision: 6162 
Adler32 checksums with collisions: 0 
Adler32 checksums max collisions: 0 
Adler32 checksums total: 6162

Adler32 checksums starting with byte 17: 
Adler32 checksums with no collision: 55 
Adler32 checksums with collisions: 0 
Adler32 checksums max collisions: 0 
Adler32 checksums total: 55

Adler32 checksums starting with byte 18: 
Adler32 checksums with no collision: 201 A
dler32 checksums with collisions: 0 
Adler32 checksums max collisions: 0 
Adler32 checksums total: 201

Adler32 checksums starting with byte 1b: 
Adler32 checksums with no collision: 1 
Adler32 checksums with collisions: 0 
Adler32 checksums max collisions: 0 
Adler32 checksums total: 1

 

With the given prefix and a suffix of less than 5 bytes, there are no checksums that do not start with 0b, 0c, 0d, 0e, 0f, 10, 11, 12, 13, 14, 15, 16, 17, 18, or 1b.

Summary of all Adler32 checksums found: 
Adler32 checksums with no collision: 71403 
Adler32 checksums with collisions: 906527 
Adler32 checksums max collisions: 21846 
Adler32 checksums total: 4311810305

The largest collision found (most number of values with the same checksum) had 21846 data with the same checksum. It is available in the attached text file (adler32_mpmp3_1043a_21846.txt). To verify I suggest the following code:

from zlib import adler32
from hashlib import sha1
coll_data = file('adler32_mpmp3_1043a_21846.txt', 'r').read()
if sha1(coll_data).hexdigest() != 'd1bec182b115bd874487b603c7d5eaaf3c1bd9c8':
    print "Error: Invalid data"
    exit(1)
#end if
coll_array = coll_data.split('\n')
coll_d = coll_array[1][coll_array[1].index('['):]
# Note that this command is insecure if the data is above is untrusted.
collisions = eval(coll_d)
a_colls = [adler32(prefix + coll) for coll in collisions]
a_colls == ([a_colls[0]] * len(a_colls))

Analysis

A large scale test for collisions in Adler32 shows that collisions are more common than values that do not collide. 4311738902 values that collide vs 906527 checksums that do not collide gives us a 99.998% chance of there being a nearby collision for any checksum. The reason behind this high chance of collision is in the design of the algorithm.

adler32('\x00\x22\x33\x00') = 13631574
adler32('\x00\x22\x33\x01') = 13697111
13697111 - 13631574 = 65537
adler32('\x9f\x39\xf3\xbc') = 97321608
adler32('\x9f\x39\xf3\xbd') = 97387145
97387145 - 97321608 = 65537

Two adjacent pieces of data with the same prefix except the last byte will have a difference of 65537.

a = [adler32(prefix + 'rew' + chr(i)) for i in range(256)]
b = [a[i+1]-a[i] for i in range(255)]
b == [65537] * 255
c = [adler32(prefix + 're' + chr(i) + '\xff') for i in range(256)]
d = [adler32(prefix + 're' + chr(i+1) + '\x00') for i in range(255)]
e = [d[i] - c[i] for i in range(255)]
e == [-16580862] * 255

Two adjacent pieces of data with last byte being ff and 00 will have a difference of -16580862.

f = [adler32('\x00' * i) for i in range(256)]
g = [f[i+1]-f[i] for i in range(255)]
g == [65536] * 255

The difference between different length strings of 00 is 65536.

adler32('\x00' * i) == adler32('\x00' * (i + 1)) + 65536

Adler32 addition is done modulo 65521 [4] for better mixing. This benefit allows it to outperform Fletcher-16, which was its original intent. All hashes must collide due to the Pigeonhole principle [5], but Adler32's design does not use all pigeonholes effectively for similar data. When given the same prefix and different suffixes Adler32 performs much worse than CRC32. To find a collision, I can modify the end of the stream to find collisions. If a collision is required between a good value and a specific bad value, the start of the stream must be modified to change the most significant bits of the Adler32 checksum. Changing the last 4 bytes of the stream is insufficient to produce all possible checksums.

Also, it should be noted that some checksums are difficult to collide since Adler32 does not produce them regularly.

Performance

With advances in algorithms CRC32 is now much faster than it originally was. It is still slower than Adler32, but by a factor of 20% - 100%. For 1kB of data, Adler32 takes 31 seconds per 16M (in Python) while CRC32 takes 38 seconds per 16M. In C, Adler32 takes 25 seconds while CRC32 takes 34 seconds.

In Python:

from time import time
from zlib import adler32, crc32

c = 'test'
start = time()
for i in range(256*256*256): c = adler32(str(c) + 'A' * 1000)
print 'Adler32 benchmark: %3.1f' % (time() - start), 'seconds'

start = time()
for i in range(256*256*256): c = crc32(str(c) + 'A' * 1000)
print 'CRC32 benchmark: %3.1f' % (time() - start), 'seconds'

Adler32 benchmark: 30.9 seconds
CRC32 benchmark: 37.8 seconds

In C:

/*
Benchmark for Adler32 and CRC32
by Joel R. Voss
Jan 23, 2012
 */

#include <zlib.h>
#include <sys/time.h>
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char **argv)
{
	int i;
	char z[1024];
	for(i = 0; i < 1024; i++)
	{
		z[i] = random() % 256;
	}
	z[1023] = 0;
	struct timeval start;
	gettimeofday(&start, 0);
	int b = 0;
	for(i = 0; i < 16000000; i++)
	{
		long d[10];
		int j;
		for(j = 0; j < 10; j++)
		{
			d[j] = random();
		}
		// Switch the first 8-16 * 10 = 80 - 160
		snprintf(z, 1024, "%016lx%016lx%016lx%016lx%016lx%016lx%016lx%016lx%016lx%016lx", d[0], d[1], d[2], d[3], d[4], d[5], d[6], d[7], d[8], d[9]);
		uLong a = adler32(0L, Z_NULL, 0);
		a = adler32(a, (Byte *)z, 1024);
		b += a;
	}
	struct timeval end;
	gettimeofday(&end, 0);
	printf("Adler32 total: %i\n", b);
	printf("Adler32 benchmark: %3.3f\n", (end.tv_sec - start.tv_sec) + (end.tv_usec - start.tv_usec)/1000000.0f);
	gettimeofday(&start, 0);
	b = 0;
	for(i = 0; i < 16000000; i++)
	{
		long d[10];
		int j;
		for(j = 0; j < 10; j++)
		{
			d[j] = random();
		}
		// Switch the first 8-16 * 10 = 80 - 160
		snprintf(z, 1024, "%016lx%016lx%016lx%016lx%016lx%016lx%016lx%016lx%016lx%016lx", d[0], d[1], d[2], d[3], d[4], d[5], d[6], d[7], d[8], d[9]);
		uLong a = crc32(0L, Z_NULL, 0);
		a = crc32(a, (Byte *)z, 1024);
		b += a;
	}
	gettimeofday(&end, 0);
	printf("CRC32   total: %i\n", b);
	printf("CRC32   benchmark: %3.1f\n", (end.tv_sec - start.tv_sec) + (end.tv_usec - start.tv_usec)/1000000.0f);
	return 0;
}

Adler32 benchmark: 24.8 seconds
CRC32 benchmark: 32.5 seconds

This could explain why Gzip, BZip2 and LZMA chose CRC32 instead of Adler32 for checksum.

Knowing that the Adler32 checksum suffers from far greater occurrence of collisions than its competitor should convince anyone to reconsider its use in protocols and file formats. The computation, memory and storage required to make this conclusion may not have been available in 1996 when the algorithm was written. However, its author was well aware of the issues with the algorithm.

This raises the question of what criteria an algorithm should meet. In 2011, cryptographic hashes hoping to become the SHA-3 standard have to pass a multi-stage contest with peer review and public comments [6]. Each is tested by the strictest standards, which must involve far more than a trivial distinguisher. As computational power grows attacks can improve using as much computational power as available to researchers. However the takeaway from this blog post should not be that brute force should find collisions in algorithms. Instead as we look for collisions, we must remember that our current computational power is enough to test weak algorithms.

Conclusion

The Adler32 algorithm is not complex enough to compete with comparable checksums. A class of collisions exists when the end of the stream is modified in a certain way. The likeliness of this occurring is low enough that it can only be expected to occur in rare circumstances, but when data is modified very often (e.g. HTTP over a wireless connection) Adler32 alone is not enough to ensure integrity. Since CRC32 is only slightly more computationally expensive and cryptographic hashes are necessary for many uses, developers today have better choices for this purpose.

More detailed information about Adler32 and comparable checksums can be found in the 2009 Maxino-Coopman paper, "The Effectiveness of Checksums for Embedded Control Networks" [7].

Works Cited

1. Gailly, Jean-Loup. "RFC 1950: ZLIB Compressed Data Format Specification version 3.3". May, 1996. URL: http://www.gzip.org/zlib/rfc1950.txt

2. "RFC 3309: Stream Control Transmission Protocol (SCTP) Checksum Change." September, 2002. URL: http://tools.ietf.org/html/rfc3309

3. Ferguson, Niels, Schneier, Bruce. "Practical Cryptography". 2003. Page 47.

4. "Revisiting Fletcher and Adler Checksums". 2006. URL: http://www.zlib.net/maxino06_fletcher-adler.pdf

5. Various. "Pigeonhole principle". Retrieved: Oct 26, 2011. URL: http://en.wikipedia.org/wiki/Pigeonhole_principle

6. NIST. "Cryptographic Hash Algorithm Competition". Retrieved: Dec 8, 2011. URL: http://csrc.nist.gov/groups/ST/hash/sha-3/index.html

7. Maxino, Theresa, Koopman, Philip. "The Effectiveness of Checksums for Embedded Control Networks". 2009. URL: http://www.ece.cmu.edu/~koopman/pubs/maxino09_checksums.pdf

Afterword

I wish to personally thank Mark Adler for his time in providing a detailed explanation of the Adler32 checksum and known issues. His feedback on my research made this blog post factually correct and far more useful for readers.

Why Authenticity Is Not Security

Code signing aims to solve two specific problems. The first problem is how to identify the author of code. The second problem is how to verify that code has not been modified after release. Truly solving one or the other would be a good step forward. That both may be solved simultaneously is a great step forward. There is a common misconception, however, that code signing represents a giant leap forward by making the signed code itself "secure". There is not a little danger in trusting code simply because it is signed, without regard to the security of its development before release.

First, a primer on how code signing works. Code signing calculates a cryptographic hash of the code, which is then digitally signed by the author and appended to the code itself. After release, the appended hash can be compared to a newly calculated hash in order to detect modification of the code. The digital signature can be compared to the author's certificate in order to also detect modification of the appended hash. Optionally, the author's certificate may be compared against a certificate authority to detect an inauthentic certficiate. Remember that this process was only designed to gurantee the authenticity of the code, that it has not been modified after release and that it was released by the expected author. None of those steps, however, can attest to the safety of the signed code; inversely, code that is not signed is not necessarily less safe.

Safety is introduced into the code signing process as a byproduct of an organization's criteria for signing code and revoking certificates. For example, developers may request that Microsoft sign the developers' drivers through Microsoft's Windows Logo program, to indicate that they have passed Microsoft's tests for reliability and compatibility. In addition, x64 architecture Windows drivers are required to be signed by the author or they will be prevented from loading, and will also be prevented from loading if the signing certificate is revoked. The signature certificate could be revoked in the event the driver is observed to be misbehaving, but by then the damage is already done and only the scope can be minimized. Microsoft could decline to sign a driver that fails to pass their tests, but in this case, security is not something for which they are testing. For signed code to be trusted as secure, and not merely as authentic, it must be reviewed for vulnerabilities.

In fact, the code signing process could incorporate an actual level of security. Large software vendors could require security reviews as part of their code signing process. In addition, large organizations with their own security team could take this responsibility for reviewing the security of applications they use, regardless of whether the author signed them, and re-distributing them internally with the organization's own signature. Smaller organizations might outsource this responsibility to a security vendor. Finally, security vendors could counter-sign code as a seal of approval to encourage adoption of a product.

The primary advantage of this approach is enabling end-users to decide on the trustworthiness of an application based on whether it has been signed by a trustworthy security team, and not merely by the author, if it is signed at all. The secondary advantage is the possibility of automatically enforcing these trust decisions at an administrative level, rather than leaving it up to careless users on a case-by-case basis. That, however, is a post for another time....