Memory Safety Bugs Mitigation on Linux

TL;DR

Common memory safety bugs mitigation techniques include marking memory pages as non-executable (the so called NX bit), address space layout randomization (ASLR) and stack canaries (which specifically protect against buffer-overflows; they are sometimes referred to as "stack cookies").

NX is usually useless by itself, while both ASLR and stack canaries, in favorable conditions, typically prevent memory safety bugs from being exploited.

Introduction

A common way in which an attacker may take control of a program written in a language that does not provide memory safety (such as C/C++) is by making use of a situation where the program, accesses unintended portions of memory. I will focus on buffer overflows, although other types of memory safety vulnerabilities exist. Specifically, I will use the example of a stack-based buffer overflow (unintented write beside a buffer located on the stack). I will also assume that I am talking about a 32-bit x86 CPU architecture, running Linux.

How Buffer Overflows Work

Let's take a look at the following code snippet

int foo(void)
{
    char buf[64];

    /*
     * ...........................
     *
     * Logic-handling code follows
     * ...........................
     */
}

Here we've got a function that has one local (and therefore on the stack) variable - a buffer of size 64. When this function gets called, this will be its stack frame:

                    +--------------------+
                    |   Return address   |  <-- EBP + 4
                    +--------------------+
                    |      Old EBP       |  <-- EBP
                    +--------------------+
                    |                    |
                    |        buf         |
                    |                    |
                    |                    |
                    |                    |
                   ///                  ///
                    |                    |
                    +--------------------+  <-- EBP - 64

Return address is the address the program will jump to, when it returns from the function foo. Assuming that the contents of the buffer are filled with data fed by the user, and no checking of the length of the data is performed,it is easy to see a simple exploitation scenario - if we can write past the buffer buf, we can conveniently put any code in buf and then overwrite the return address with the address of our code. Hence, we have a remote code execution scenario.

Keeping your Apples and Oranges Separate

An obvious technique to prevent this kind of scenario from occurring - even in the presence of buffer overflow bugs, is to make all user-supplied data non-executable. In this example, it would mean making the buffer non-executable. Since there is no plausible reason why you would want to have executable data on the stack, a simple solution would be to apply this protection to the whole stack. Same for other pages of memory, except for those which actually hold code.

This is exactly what the NX bit is. It is an attribute, set on a page of memory, saying that trying to jump to any address on this page will terminate the program. For this to work, the CPU has to provide this kind of protection (as all modern CPUs do), and the operating system itself has to make use of it (i.e mark appropriate pages as NX when loading the executable). This way your code is clearly separated from your data in memory.

Let me go off on a tangent for a paragraph or too. Here we encounter a common theme in exploit mitigation, namely that some things are best implemented in hardware. Since while the CPU is executing in the userspace mode, there is no easy way for the kernel to control what goes on and where - unless of course this control is implemented with the help of the CPU itself, I don't see a trivial way to enforce something like this purely in the kernel, without the assistance of the CPU.

NX Bit on Linux

An implementation of the NX bit mechanism requires awareness on part of many elements of the Linux ecosystem. Supposing that we have a CPU that supports this kind of protection, the process begins with a binary executable format, which allows to mark certain parts of the images as having certain access protections. In the case of ELF, which is pretty much the standard for binary executables on Linux today, it allows marking "segments" of the image as readable, executable or writable, or a combination of thereof. This is interpreted as: when the image is loaded into memory, pages of memory which these segments occupy will have these attributes set.

In the segment header, there is a member p_flags, which holds this information. From my /usr/include/elf.h:

/* Legal values for p_flags (segment flags).  */

#define PF_X		(1 << 0)	/* Segment is executable */
#define PF_W		(1 << 1)	/* Segment is writable */
#define PF_R		(1 << 2)	/* Segment is readable */
#define PF_MASKOS	0x0ff00000	/* OS-specific */
#define PF_MASKPROC	0xf0000000	/* Processor-specific */

When the Linux kernel ELF loader (in fs/binfmt\_elf.c) reads the executable image, it will set the memory pages attributes based on these flags:

			if (eppnt->p_flags & PF_R)
		    		elf_prot = PROT_READ;
			if (eppnt->p_flags & PF_W)
				elf_prot |= PROT_WRITE;
			if (eppnt->p_flags & PF_X)
				elf_prot |= PROT_EXEC;

Making Something out of Nothing

This would seem like a reasonable protection against these kinds of attacks. Unfortunately, there exists a highly reliable method of getting around that. It is known as the return-to-libc attack and it has a more sophisticated cousin - return oriented programming (ROP).

Before we discuss those let's take a moment to think what we are actually trying to accomplish - we said that the previous technique allows us to execute arbitrary code - but what kind would like to run? We can imagine the greatest flexibility would be achieved by taking interactive control of the system, say by loading a shell binary via an execve() call or spawning a child shell with system() or similar.

The idea of the return-to-libc attack is very simple, if we can't point the execution to our code, let's jump to the address of a known libc function - e.g system(). Before we do that, we overwrite appropriate entries of the stack with the path of the shell (let us remember that arguments for a function on 32-bit x86 typically go on the stack). The first working exploit of this kind was given by Solar Designer in 1997.

ROP uses a clever idea based on the fact that since we can write on the stack, we can put a series of "fake" return addresses on the stack, which will in fact point to short sequences of instructions ending with a RET (all in the original binary or any of the libraries that it links to) and we build our exploit out of such sequences (called "gadgets"). To me it seems really remarkable that it can be done this way, but it is in fact quite easy (Hint: you need the int 0x80 instruction to execute a syscall, and a bunch of registers filled with the arguments of that syscall; look for pop's in your gadgets... Remember you can put anything you want on the stack).

Address Space Layout Randomization (ASLR)

Notice how the above exploitation method assumes we know the addresses where any gadget is located. A (very effective) way to defeat this is by loading the code at a random address in the virtual memory space. This technique is called Address Space Layout Randomization (ASLR).

Linux has a runtime parameter that controls ASLR:

$ cat /proc/sys/kernel/randomize_va_space
2

0 means no randomization. 1 means that the base address of the stack, shared libraries, and of memory returned from mmap(2) (unless MAP_FIXED was used) is randomized. 2 means that in additional to those, the initial brk(2) address is randomized.

Let's take a look at fs/binfmt_elf.c in the Linux kernel. The function randomize_stack_top() is responsible for generating the base address of the stack.

static unsigned long randomize_stack_top(unsigned long stack_top)
{
	unsigned long random_variable = 0;

	if (current->flags & PF_RANDOMIZE) {
		random_variable = get_random_long();
		random_variable &= STACK_RND_MASK;
		random_variable <<= PAGE_SHIFT;
	}
#ifdef CONFIG_STACK_GROWSUP
	return PAGE_ALIGN(stack_top) + random_variable;
#else
	return PAGE_ALIGN(stack_top) - random_variable;
#endif
}

The base is offset by a random value, if and only if PF_RANDOMIZE is set. PF_RANDOMIZE is a process flag that is set if the kernel parameter discussed at the beginning of this section is set, unless Linux is emulating a system which does not support ASLR.

	if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space)
		current->flags |= PF_RANDOMIZE;

Similarly the base of the heap is randomized.

	if ((current->flags & PF_RANDOMIZE) && (randomize_va_space > 1)) {
		current->mm->brk = current->mm->start_brk =
			arch_randomize_brk(current->mm);

And so are any shared libraries.

		} else if (loc->elf_ex.e_type == ET_DYN) {

            /* snip */

			if (elf_interpreter) {
				load_bias = ELF_ET_DYN_BASE;
				if (current->flags & PF_RANDOMIZE)
					load_bias += arch_mmap_rnd();
				elf_flags |= elf_fixed;
			} else
				load_bias = 0;

Interestingly, the base of the "main" executable itself, is not randomized. (see ELF type in the above code being ET_DYN).

There are two helper functions in the above code that warrant a comment, namely arch_map_rnd() and arch_randomize_arch(). They are declared in include/linux/elf-randomize.h and non-trivial definitions are provided only when CONFIG_ARCH_HAS_ELF_RANDOMIZE is enabled and the CPU architecture supports this option. That is to say, there are several flavors of ASLR only Linux. Various features of ASLR are available depending on whether the kernel was compiled with CONFIG_ARCH_HAS_ELF_RANDOMIZE and on the value of randomize_va_space, those two being independent of each other. In the kernel, Documentation/features/vm/ELF-ASLR/arch-support.txt lists wchich architectures that support CONFIG_ARCH_HAS_ELF_RANDOMIZE (spoiler: not that many do).

Stack Cookies

Now we turn to a mitigation technique that protects specifically against stack-based buffer overflow vulnerabilities (unlike NX and ASLR which attempt to protect against all memory safety bugs). Another difference is that this method works mostly on the compiler level, with a little help from the libc/dynamic linker, while the previously described methods make use of an interplay between the hardware and the kernel.

The key observation here is that an attacker wanting to exploit a stack-based buffer overflow will typically want to overwrite the return address of the current function (see diagram above), and in doing so will overwrite everything between the buffer and the return address (this assumption does not hold in every case, but is a useful heuristic for making common bugs useless for exploitation).

The mitigation works as follows: the compiler generates additional code that puts a random (generated at runtime) value on the stack, just after the return address, and then checks that, shortly before the function returns, if the value saved on the stack matches the one that was previously generated. In case of a mismatch, the program aborts.

                    +--------------------+
                    |   Return address   |  <-- EBP + 4
                    +--------------------+
                    |      Old EBP       |  <-- EBP
                    +--------------------+
                    |      Canary        |  <-- EBP - 4
                    +--------------------+
                    |                    |
                    |        buf         |
                    |                    |
                    |                    |
                    |                    |
                   ///                  ///
                    |                    |
                    +--------------------+  <-- EBP - 68

In this way, just like a canary in a coal mine's death would alert the miners to the presence of carbon monoxide, the program can detect tampering with stack buffers. The random value is called a "stack canary", "stack cookie" or a "sentinel value".

GCC originally provided two switches to control generation of such canaries, namely -fstack-protector-all and -fstack-protector. As the name suggests, -fstack-protector all applies the mitigation to all functions (quite understandably this carries a performance penalty), while -fstack-protector is quite limited. Later, -fstack-protector-strong was added using a more useful heuristic for deciding which functions should be protected. -fstack-protector-strong serves as a compromise between performance and safety.

Let me also emphasize again that this method is only for stack-based buffer overflows, it doesn't generalize to heap-based buffers, out of bound reads (whether on stack or heap), use-after-free bugs, etc, etc.

Canary Checking Code

I compiled this program:

int main(void) { char buf[64]; return 0; }

with GCC 5.4.0:

$ gcc -fstack-protector-all stack_canary.c

I ran objdump on it to see the disassembled code:

$ objdump -M intel -d ./a.out

And got the following output (unimportant parts omitted):

0000000000400546 <main>:
  400546:       55                      push   rbp
  400547:       48 89 e5                mov    rbp,rsp
  40054a:       48 83 ec 50             sub    rsp,0x50
  40054e:       64 48 8b 04 25 28 00    mov    rax,QWORD PTR fs:0x28
  400555:       00 00
  400557:       48 89 45 f8             mov    QWORD PTR [rbp-0x8],rax
  40055b:       31 c0                   xor    eax,eax
  40055d:       b8 00 00 00 00          mov    eax,0x0
  400562:       48 8b 55 f8             mov    rdx,QWORD PTR [rbp-0x8]
  400566:       64 48 33 14 25 28 00    xor    rdx,QWORD PTR fs:0x28
  40056d:       00 00
  40056f:       74 05                   je     400576 <main+0x30>
  400571:       e8 aa fe ff ff          call   400420 <__stack_chk_fail@plt>
  400576:       c9                      leave
  400577:       c3                      ret
  400578:       0f 1f 84 00 00 00 00    nop    DWORD PTR [rax+rax*1+0x0]
  40057f:       00

We can see that after the initial stack frame setup, it pushes a value from the thread local storage onto the stack (40054e-40055b; the segment register fs points to thread local storage). Then, just before returning, it copies the value on the stack to a register (400562-400566), compares it to the initial value in thread local storage (400566-40056f) and jumps over a function call if the two are equal (40056f-400571). This way the call __stack_chk_fail() will only be executed when the values don't match. Unsurprisingly, __stack_chk_fail() is a glibc function that aborts the program.

A small correction: I realized I compiled the program as a 64-bit x86 executable, and previously I said I would discuss 32-bit only. In this case, it doesn't affect the discussion though.

Generating the Canary Value

The only thing left to explain is how the initial sentinel value ends up in thread local storage. We know this N bit value has to be generated at runtime (otherwise it would be trivial for the attacker to just use the known value when overwriting the buffer), and it has to be uniformly distributed (otherwise the chance that the attacker can guess the value is greater than 2^(-N)).

In glibc, the dynamic initializes the canary by generating a random value using an initial seed put in the auxiliary vector by the kernel. This way it doesn't have to make a syscall (which is costly) to get a random seed. In etf/rtld.c:

static void
security_init (void)
{
  /* Set up the stack checker's canary.  */
  uintptr_t stack_chk_guard = _dl_setup_stack_chk_guard (_dl_random);
#ifdef THREAD_SET_STACK_GUARD
  THREAD_SET_STACK_GUARD (stack_chk_guard);
#else
  __stack_chk_guard = stack_chk_guard;
#endif

  /* ... */

}

THREAD_SET_STACK_GUARD is defined in cases where the sentil value is put in thread local storage (as it is on x86).

Defeating ALSR and Canaries

Usually writing a succesful exploit requires knowing at least parts of the address space layout. Therefore, exploits that defeat ASLR typically focus on information leaks that reveal addresses of interesting ROP gadgets. The same technique can be used to bypass stack canaries - if the attacker can read the stack frame, they can also read the sentinel value, and craft the exploit in a way that will the correct cookie value at the appropriate offset.

As a concrete example of such an information leak, imagine a network service that has an out of bounds read bug - which is a close relative of buffer overflows that we've been looking at, except that the missing check is when performing a read operation. Heartbleed was a notable bug of this kind.

Both ASLR and canaries are typically very effective. Unless an information leak is found, they prevent these types of bugs from being used as gateways to executing arbitrary code.

Conclusion and Further Reading

Memory safety vulnerabilites have been around for a long time, and in spite of an increasing awareness of them in the programming community, as well as the rising popularity of memory-safe languages such as Rust, they are unlikely to go away in the near future. People make mistakes (even security experts!) and large codebases in languages such as C exist. Mitigation techniques such as NX, ASLR, stack canaries and others, as well as separation measures such as namespaces, SELinux/AppArmor, seccomp and hypervisors limit the impact of those bugs.

The interested reader may take a look at the original Phrack paper about stack buffer overflows. An extensive discussion of return oriented programming is given by Ryan Roemer et al.. The paper introducing the concept of Address Space Layour Randomization was published by the PaX team in 2003.