# C Development Under DragonFly BSD Volume 6: Secure Programming Concepts

[[!toc levels=2]]

## Security

When thinking about security, one must never forget that **putting a lock on something is not secure per se**. Even if you have the biggest and strongest front door, when your roommate leaves the key under the doormat, a stranger or a friend can just walk right in. If the reason the key is left under the doormat is that it weighs 1 kg, then maybe you bought the wrong lock.

As C is a medium-level programming language, it provides only very basic functionality and checks for incorrect usage, barely more than the processor itself. If you program in C, you can shoot yourself in the foot as easily as you can when operating as root on a UNIX™ system.

There are a variety of possible mistakes and even more ways to make use of them, which is called ***exploiting***. Successfully exploiting a bug can affect a target system in various ways. An attacker might prevent the target system from providing any service by crashing or hanging it, he could gain access to the target system that he did not have before (remote exploit), he might elevate his privileges on a target system that he could already access (local exploit), or he could simply obtain information that was disclosed.

But what are the common security flaws, how and why do they happen, and how do they get exploited? One example is a bug that allows an attacker to write to memory locations. This can happen if data supplied by an attacker or user is checked incorrectly or not at all. Once the attacker has the ability to modify memory, he can use clever tricks to gain control over the exploited program, which may lead to further system-wide exploitation and to the results mentioned above.

### Common security issues

#### Buffer overflows

A buffer overflow can be compared to what occurs when you fill a five pound sack with 10 pounds of apples.
Basically, what happens is that the apples end up spilling over. While some might fall harmlessly at your feet, others can end up rolling onto the not-so-clean portion of your kitchen. While that doesn't sound too bad, assume we take it another step and say your small child eats these apples. If you're lucky, nothing will happen. However, if the not-so-clean portion of your kitchen is like many others, chances are that they'll get ill. This analogy is very similar to what happens in a buffer overflow.

First, let's take a minute and describe what a buffer is. A buffer is something (an array, a pointer, or whatever) a programmer uses to store data. This in itself isn't a problem; the issue arises when the programmer places a limit on how much this buffer can hold. Now, let's assume we have the following innocent prompt within a program:

    char FirstName[10];

    printf("Please enter your first name\n");
    scanf("%s", FirstName);    /* no field width: the input length is unchecked */

That seems simple enough; what could be wrong with it? Well, let's assume you enter a first name that is 11 characters long. What is going to happen? C, being the "I'm only doing what you told me to do" language that it is, will put those 11 characters, plus a terminating NUL, into this 10-byte buffer. Now, where do the extra characters go?

Let's rewind to the mid-1940s for the answer. John von Neumann was theorizing how data should appear in a computer. He determined it would be easier for the computer if data and functions occupied the same regions of memory. This little decision is the root of all buffer overflows, because the extra characters start writing over the regions of memory where the next portion of your program resides. Now, just as we didn't know where the apples were going to fall, this code can overwrite something relatively unimportant (the next ***printf(3)*** call) or something much more important (a return statement). The keyword ***return*** means something special to C and the computer on which it runs.
It informs the computer that it should adjust some internal components (on most machines, a return address saved on the stack), which afterwards dictate what portion of memory to move to next. What an exploiter of a weakness of this type will do is craft the incoming data (in this case your first name) in such a way that this internal component is told to move to some other function on the machine, which the exploiter will then use to further subvert the machine in question.

#### Off by one errors

An off by one error is what results when the programmer has an incomplete understanding of a counting problem. Generally there are two groups of counting problems. One group deals with values which are counts and sizes, and the other group deals with values which are indices and offsets. Counts usually start at one, but indices usually start at zero. When a value from one of these groups is confused with a value from the other, an ***off by one*** error occurs. If a count is confused with an index, the value as used is one too high. Likewise, if an index is confused with a count, the value is one too low. In either case, whether a value is a count or an index depends on the context and on what is being counted. Therefore, it is extremely important to define what is included in a count, as we will see with the standard library string functions.

Off by one errors are often introduced when an array, such as a string of characters or a byte-addressed buffer, is referenced. Furthermore, what is being counted is often misunderstood. Many standard library string functions count the number of characters in a string, and many more count the number of bytes in a string. The major difference is whether the terminating NUL character in the string is counted. An ***off by one*** error often leads to the exploitation of a common security hole known as a ***buffer overflow*** or ***buffer attack***.
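
To make the difference between a count and an index concrete, here is a minimal sketch. The helper names `storage_size` and `last_index` are hypothetical and exist only for illustration; the only library fact relied on is that strlen(3) excludes the terminating NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: number of bytes needed to store s,
 * including the terminating NUL that strlen(3) does not count.
 */
static size_t
storage_size(const char *s)
{
	assert(s != NULL);
	return strlen(s) + 1;
}

/*
 * Hypothetical helper: index of the last element of an array that
 * holds `count` elements. Counts start at one, indices at zero.
 */
static size_t
last_index(size_t count)
{
	assert(count > 0);
	return count - 1;
}
```

Under these assumptions, a string of 80 visible characters needs `storage_size()` bytes, that is 81, and an array declared as `char localstr[80]` has 79 as its last valid index.
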
As a simple example, consider the following:

    void
    do_something(char *instr)
    {
        char localstr[80];

        /* ... */
        if (strlen(instr) > 80) {    /* bug: lets 80 characters plus NUL through */
            fprintf(stderr, "Line must be 80 characters or less.\n");
            exit(1);
        }
        strcpy(localstr, instr);
        /* ... */
    }

The strlen(3) library function returns the byte count ***excluding the terminating NUL character***, so the programmer is not testing the size of the incoming string against the allocated space. The programmer should have used a buffer of size 81 to make room for the NUL, or rejected any string for which strlen(3) returns 80 or more.

This type of error allows an attacker to modify data that is adjacent to the buffer in memory. For example, suppose the program's idea of how long the string is were stored in memory directly following the buffer. An attacker could then follow a number of steps: first modify the value following the buffer, next adjust the buffer to a desired size, and finally store the desired data into the buffer. With such an attack, the attacker would be able to store data in memory almost anywhere after the start of the buffer, possibly modifying other data that is always addressable or even statically addressable at a point after the start of the buffer. A buffer attack allows an attacker to potentially bypass tests designed to keep the system secure. Depending on the severity of the hole, arbitrary exploitation is possible (a single one-byte write into the saved base pointer on x86 may lead to arbitrary control over program flow).

To avoid mistakes like these:

* Always review the documentation of functions like strlen(3), and be completely sure of what they return. Is it a count of characters in a string or of characters operated on, or is it an index into a string? Does the value include or exclude a terminating ***NUL*** character?
* Avoid functions like strcpy(3). Instead, use functions that take a size argument, like strncpy(3). In the above case it still would have been a bug, but it would have been harder to exploit. Keep in mind that strncpy(3) does not NUL-terminate the destination when the source fills it completely.
* Dynamically allocate memory rather than using stack variables. While errors can still occur there, they are less likely to affect stack variables and return addresses.
* Try to develop a clear mental picture of what is going on, rather than "checking if it works". The most plentiful source of off by one errors is programs that contain the error without crashing.

#### Format string vulnerabilities

A format string that can be provided by an attacker causes a vulnerability. The attacker can write to memory or gain information through a leak. A simple example:

    /* wrong */
    void
    print(const char *msg)
    {
        printf(msg);
    }

    /* right */
    void
    print(const char *msg)
    {
        printf("%s", msg);
    }

When the format of the data passed to printf(3) is not specified and the variable msg is simply passed as the format, any format characters contained in msg are expanded internally by printf(3). By abusing the various format characters available, an attacker can cause an information leak (for example, passing %s makes printf(3) dereference a pointer taken from the stack, which may or may not be valid). Clever crafting of format strings may allow an attacker not only to leak arbitrary information but also to redirect program flow (by abusing the %n conversion, which allows writes to user-supplied memory addresses). Always pass a constant format string to the printf(3) family of functions to avoid the format string bug.

#### Integer overflow

Consider the following code:

    void
    func(char *s)
    {
        char *ptr;
        int size;                  /* implicit: signed */

        size = strlen(s);          /* size_t is silently narrowed to int */
        if (size > 0) {
            ptr = (char *)malloc(size + 1);    /* size + 1 can overflow */
            memcpy(ptr, s, size);
        }
    }

If size is the maximum value of a signed integer, size + 1 overflows. Signed overflow is undefined behavior in C; on typical two's-complement machines the result wraps to the most negative value, which malloc(3) then interprets as a very large unsigned size. Note also that the result of malloc(3) is never checked before it is used.

#### Race Conditions

Lack of synchronization in certain areas of your application code may lead to the creation of a race condition. When non-atomic code is written with the assumption of indivisibility (being atomic), a new category of programming bugs opens, many of which are potentially exploitable.
Race conditions can crop up in signal handling code, I/O code, multi-threaded logic code and more. Take the following code as an example:

    int answer;

    /* MD5() stands for a checksum helper that is not shown here. */
    if (MD5(file, "4cf91ea4a6a10dbe5f55438b1a4a7d55") == TRUE) {
        printf("Are you sure you want to continue [y/n]: ");
        answer = getchar();
        if (answer == 'y')
            execlp(file, file, (char *)NULL);
    }

Imagine a suid application that allows execution of any file whose MD5 checksum matches a known value, as in the example above. What if a user were to replace the file after the MD5() function is called and before execlp()? Rather than the expected file, an arbitrary file would be executed. Similar race conditions may be found in the insecure usage of mktemp() and friends; mkstemp(3), which creates and opens the file in a single step, avoids that particular race.

#### Using libraries to avoid these issues

#### List of insecure and potentially insecure functions in the standard library

String manipulation functions:

* strcpy
* strcat
* strncpy with wrong length parameter
* strncat with wrong length parameter
* gets
* sprintf
* snprintf with wrong length parameter
* all printf-like functions with a harmful format string

Other functions:

* memcpy

#### Overview, exercises and examples

#### Social engineering and non-programming security flaws

##### Filching passwords from their keeper(s)

## Section Notes
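
As a note on the list of insecure functions above, the following is a minimal sketch of a bounded copy that reports truncation instead of silently overflowing or cutting the string short. The helper name `copy_checked` is hypothetical and not part of any library; the sketch relies only on snprintf(3) returning the length the full output would have needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: copy src into dst, which holds dstsize bytes.
 * Returns 0 if the whole string (including its NUL) fit, -1 otherwise.
 * dst is always NUL-terminated on return.
 */
static int
copy_checked(char *dst, size_t dstsize, const char *src)
{
	int n;

	assert(dst != NULL && src != NULL && dstsize > 0);

	/* A constant format string also rules out the format string bug. */
	n = snprintf(dst, dstsize, "%s", src);
	if (n < 0 || (size_t)n >= dstsize)
		return -1;	/* error, or src was truncated */
	return 0;
}
```

A caller would pass `sizeof(buf)` for a real array, never the size of a pointer, and treat a return value of -1 as a rejected input rather than continuing with the truncated copy.
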