Scudo Hardened Allocator
Introduction
The Scudo Hardened Allocator is a user-mode allocator based on LLVM Sanitizer’sCombinedAllocator, which aims at providing additional mitigations against heapbased vulnerabilities, while maintaining good performance.
Currently, the allocator supports (was tested on) the following architectures:
- i386 (& i686) (32-bit);
- x86_64 (64-bit);
- armhf (32-bit);
- AArch64 (64-bit);
- MIPS (32-bit & 64-bit).
The name “Scudo” has been retained from the initial implementation (Escudomeaning Shield in Spanish and Portuguese).
Design
Allocator
Scudo can be considered a Frontend to the Sanitizers’ common allocator (laterreferenced as the Backend). It is split between a Primary allocator, fast andefficient, that services smaller allocation sizes, and a Secondary allocatorthat services larger allocation sizes and is backed by the operating systemmemory mapping primitives.
Scudo was designed with security in mind, but aims at striking a good balancebetween security and performance. It is highly tunable and configurable.
Chunk Header
Every chunk of heap memory will be preceded by a chunk header. This has twopurposes, the first one being to store various information about the chunk,the second one being to detect potential heap overflows. In order to achievethis, the header will be checksummed, involving the pointer to the chunk itselfand a global secret. Any corruption of the header will be detected when saidheader is accessed, and the process terminated.
The following information is stored in the header:
- the 16-bit checksum;
- the class ID for that chunk, which is the “bucket” where the chunk residesfor Primary backed allocations, or 0 for Secondary backed allocations;
- the size (Primary) or unused bytes amount (Secondary) for that chunk, which isnecessary for computing the size of the chunk;
- the state of the chunk (available, allocated or quarantined);
- the allocation type (malloc, new, new[] or memalign), to detect potentialmismatches in the allocation APIs used;
- the offset of the chunk, which is the distance in bytes from the beginning ofthe returned chunk to the beginning of the Backend allocation;
This header fits within 8 bytes, on all platforms supported.
The checksum is computed as a CRC32 (made faster with hardware support)of the global secret, the chunk pointer itself, and the 8 bytes of header withthe checksum field zeroed out. It is not intended to be cryptographicallystrong.
The header is atomically loaded and stored to prevent races. This is importantas two consecutive chunks could belong to different threads. We also want toavoid any type of double fetches of information located in the header, and uselocal copies of the header for this purpose.
Delayed Freelist
A delayed freelist allows us to not return a chunk directly to the Backend, butto keep it aside for a while. Once a criterion is met, the delayed freelist isemptied, and the quarantined chunks are returned to the Backend. This helpsmitigate use-after-free vulnerabilities by reducing the determinism of theallocation and deallocation patterns.
This feature is using the Sanitizer’s Quarantine as its base, and the amount ofmemory that it can hold is configurable by the user (see the Options sectionbelow).
Randomness
It is important for the allocator to not make use of fixed addresses. We usethe dynamic base option for the SizeClassAllocator, allowing us to benefitfrom the randomness of the system memory mapping functions.
Usage
Library
The allocator static library can be built from the LLVM build tree thanks tothe scudo
CMake rule. The associated tests can be exercised thanks to thecheck-scudo
CMake rule.
Linking the static library to your project can require the use of thewhole-archive
linker flag (or equivalent), depending on your linker.Additional flags might also be necessary.
Your linked binary should now make use of the Scudo allocation and deallocationfunctions.
You may also build Scudo like this:
- cd $LLVM/projects/compiler-rt/lib
- clang++ -fPIC -std=c++11 -msse4.2 -O2 -I. scudo/*.cpp \
- $(\ls sanitizer_common/*.{cc,S} | grep -v "sanitizer_termination\|sanitizer_common_nolibc\|sancov_\|sanitizer_unwind\|sanitizer_symbol") \
- -shared -o libscudo.so -pthread
and then use it with existing binaries as follows:
- LD_PRELOAD=`pwd`/libscudo.so ./a.out
Clang
With a recent version of Clang (post rL317337), the allocator can be linked witha binary at compilation using the -fsanitize=scudo
command-line argument, ifthe target platform is supported. Currently, the only other Sanitizer Scudo iscompatible with is UBSan (eg: -fsanitize=scudo,undefined
). Compiling withScudo will also enforce PIE for the output binary.
Options
Several aspects of the allocator can be configured on a per process basisthrough the following ways:
- at compile time, by defining
SCUDO_DEFAULT_OPTIONS
to the options stringyou want set by default; - by defining a
scudo_default_options
function in one’s program thatreturns the options string to be parsed. Said function must have the followingprototype:extern "C" const char*
scudo_default_options(void)
, with adefault visibility. This will override the compile time define; - through the environment variable SCUDO_OPTIONS, containing the options stringto be parsed. Options defined this way will override any definition madethrough
__scudo_default_options
.
The options string follows a syntax similar to ASan, where distinct optionscan be assigned in the same string, separated by colons.
For example, using the environment variable:
- SCUDO_OPTIONS="DeleteSizeMismatch=1:QuarantineSizeKb=64" ./a.out
Or using the function:
- extern "C" const char *__scudo_default_options() {
- return "DeleteSizeMismatch=1:QuarantineSizeKb=64";
- }
The following options are available:
Option | 64-bit default | 32-bit default | Description |
QuarantineSizeKb | 256 | 64 | The size (in Kb) of quarantine used to delaythe actual deallocation of chunks. Lower valuemay reduce memory usage but decrease theeffectiveness of the mitigation; a negativevalue will fallback to the defaults. Settingboth this and ThreadLocalQuarantineSizeKb tozero will disable the quarantine entirely. |
QuarantineChunksUpToSize | 2048 | 512 | Size (in bytes) up to which chunks can bequarantined. |
ThreadLocalQuarantineSizeKb | 1024 | 256 | The size (in Kb) of per-thread cache use tooffload the global quarantine. Lower value mayreduce memory usage but might increasecontention on the global quarantine. Settingboth this and QuarantineSizeKb to zero willdisable the quarantine entirely. |
DeallocationTypeMismatch | true | true | Whether or not we report errors onmalloc/delete, new/free, new/delete[], etc. |
DeleteSizeMismatch | true | true | Whether or not we report errors on mismatchbetween sizes of new and delete. |
ZeroContents | false | false | Whether or not we zero chunk contents onallocation and deallocation. |
Allocator related common Sanitizer options can also be passed through Scudooptions, such as allocator_may_return_null
or abort_on_error
. A detailedlist including those can be found here:https://github.com/google/sanitizers/wiki/SanitizerCommonFlags.
Error Types
The allocator will output an error message, and potentially terminate theprocess, when an unexpected behavior is detected. The output usually starts with"Scudo ERROR:"
followed by a short summary of the problem that occurred aswell as the pointer(s) involved. Once again, Scudo is meant to be a mitigation,and might not be the most useful of tools to help you root-cause the issue,please consider ASanfor this purpose.
Here is a list of the current error messages and their potential cause:
"corrupted chunk header"
: the checksum verification of the chunk headerhas failed. This is likely due to one of two things: the header wasoverwritten (partially or totally), or the pointer passed to the function isnot a chunk at all;"race on chunk header"
: two different threads are attempting to manipulatethe same header at the same time. This is usually symptomatic of arace-condition or general lack of locking when performing operations on thatchunk;"invalid chunk state"
: the chunk is not in the expected state for a givenoperation, eg: it is not allocated when trying to free it, or it’s notquarantined when trying to recycle it, etc. A double-free is the typicalreason this error would occur;"misaligned pointer"
: we strongly enforce basic alignment requirements, 8bytes on 32-bit platforms, 16 bytes on 64-bit platforms. If a pointer passedto our functions does not fit those, something is definitely wrong."allocation type mismatch"
: when the optional deallocation type mismatchcheck is enabled, a deallocation function called on a chunk has to match thetype of function that was called to allocate it. Security implications of sucha mismatch are not necessarily obvious but situational at best;"invalid sized delete"
: when the C++14 sized delete operator is used, andthe optional check enabled, this indicates that the size passed whendeallocating a chunk is not congruent with the one requested when allocatingit. This is likely to be a compiler issue,as was the case with Intel C++ Compiler, or some type confusion on the objectbeing deallocated;"RSS limit exhausted"
: the maximum RSS optionally specified has beenexceeded;
Several other error messages relate to parameter checking on the libc allocationAPIs and are fairly straightforward to understand.