Buffer Overflow: The Hidden Threat to Your Computer Systems and How to Prevent It

Buffer overflow attacks are a serious concern for anyone involved in computer security. As our reliance on technology continues to grow, it’s crucial to understand the risks associated with buffer overflow vulnerabilities and how to prevent them.

In this article, we’ll explore what buffer overflows are, the different types of buffer overflow attacks, and the potential consequences of these vulnerabilities. We’ll also dive into some of the common and lesser-known techniques for preventing buffer overflow attacks, including input validation, safe programming techniques, memory-safe languages, code signing and execution prevention, stack canaries, sandboxing, and fuzz testing.

Whether you’re a programmer, system administrator, or simply interested in computer security, this article will provide valuable insights into the world of buffer overflow vulnerabilities and how to safeguard your digital systems against them. So, let’s jump right in and explore how to protect your computer systems from this potentially devastating threat.

Points to Cover

What is Buffer Overflow
What Is a Buffer Overflow Attack?
Buffer Overflow Exploits
Buffer Overflow Consequences
Types of Buffer Overflow Attacks
Stack buffer overflow attack example
Executing a Buffer Overflow Attack
Buffer Overflow Causes
Which Programming Languages Are More Vulnerable?
Buffer overflows and the Web
How to Prevent Buffer Overflows
FAQ
Conclusion

What is Buffer Overflow

Buffer Overflow is a computer security vulnerability that arises when a program tries to store more data in a buffer than it can hold. In simple terms, a buffer is a temporary storage area in a computer’s memory that holds data while it is being processed. Buffers are used in many parts of a program, such as input/output operations, network communication, and memory allocation.

However, when a program tries to write more data to a buffer than it can hold, the excess data overflows into adjacent memory locations, which can cause unpredictable behavior. This vulnerability can be exploited by attackers to execute arbitrary code, crash a program, or even take control of a system.

Think of it like a glass of water. If you keep pouring water into the glass beyond its capacity, the water overflows and spills onto the table, creating a mess. Similarly, when a program writes more data to a buffer than it can hold, the excess data overflows into adjacent memory locations, creating a mess in the computer’s memory.

The consequences of a buffer overflow can be severe, ranging from data loss and corruption to system crashes and unauthorized access. Attackers can exploit buffer overflow vulnerabilities to inject malicious code that can take control of a system, steal sensitive data, or spread malware.

For example, a buffer for log-in credentials may be designed to expect username and password inputs of 8 bytes, so if a transaction involves an input of 10 bytes (that is, 2 bytes more than expected), the program may write the excess data past the buffer boundary.
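The 8-byte credential buffer described above can be sketched in C as follows. This is a minimal illustration, not code from any real login system: the names CRED_SIZE and store_credential are hypothetical, and the sketch assumes the input is a NUL-terminated string.

```c
#include <stddef.h>
#include <string.h>

#define CRED_SIZE 8  /* hypothetical fixed buffer size, taken from the example above */

/*
 * Copy `input` into `dest` (which must hold CRED_SIZE bytes) only if it fits,
 * including the terminating NUL. Returns 0 on success, or -1 if the input is
 * too long, in which case `dest` is left untouched.
 */
int store_credential(char dest[CRED_SIZE], const char *input)
{
    size_t len = strlen(input);

    if (len >= CRED_SIZE) {
        return -1;  /* reject instead of writing past the buffer boundary */
    }
    memcpy(dest, input, len + 1);  /* +1 also copies the terminating NUL */
    return 0;
}
```

Because C strings need one byte for the terminator, an 8-byte buffer holds at most 7 characters here. A 10-byte input is rejected rather than truncated or written past the end of the buffer.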

Buffer overflows can affect all types of software. They typically result from malformed inputs or failure to allocate enough space for the buffer. If the transaction overwrites executable code, it can cause the program to behave unpredictably and generate incorrect results, memory access errors, or crashes.

What Is a Buffer Overflow Attack?

A buffer overflow attack is an exploit that takes advantage of a program that can be made to store more data in a buffer than the buffer can hold. Think of it like a Trojan horse that sneaks in and wreaks havoc on a system by exploiting its weakness.

A buffer is like a temporary storage container in a computer’s memory that holds data while it’s being processed. When a program tries to write more data to a buffer than it can hold, the excess data overflows into adjacent memory locations, which can cause unpredictable behavior.

An attacker can exploit this vulnerability by crafting a malicious input that contains more data than the buffer can hold. This excess data then overflows into adjacent memory locations, allowing the attacker to modify the program’s behavior or inject malicious code that will execute when the program returns from the current function.

This exploit can lead to a range of malicious activities, such as arbitrary code execution, denial of service, or even taking control of the system. Attackers can use buffer overflow attacks to spread malware, steal sensitive data, or gain unauthorized access to a system.

A buffer overflow vulnerability will typically occur when code:

Is reliant on external data to control its behavior
Is dependent on data properties that are enforced beyond its immediate scope
Is so complex that programmers are not able to predict its behavior accurately

Buffer Overflow Exploits

Buffer overflow exploits are a type of cyber attack that can have devastating consequences for a system’s security. It’s like a hacker sneaking through an unlocked window and then opening the doors to let all their friends in.

In a buffer overflow exploit, the attacker sends more data to a buffer than it can hold. This extra data then overflows into adjacent memory locations, which can modify the program’s behavior or allow the attacker to execute arbitrary code.

Attackers can use buffer overflow exploits to gain unauthorized access to a system, steal sensitive data, or spread malware. Once the exploit has been successful, the attacker can then take control of the system, manipulate data, or use the system as a platform for launching further attacks.

Attackers use a buffer overflow to corrupt a web application’s execution stack, execute arbitrary code, and take over a machine. Buffer overflow flaws can exist in both application servers and web servers, especially in web applications that use native libraries such as graphics libraries. They can also exist in custom web application code, where they may be more likely because such code typically receives less scrutiny from security teams. At the same time, flaws in custom code are less likely to be discovered by attackers and are more difficult to exploit.

One common way that attackers can execute buffer overflow exploits is through a technique known as a “stack-based buffer overflow.” This technique involves overflowing the stack, which is a section of memory used for storing function call frames and local variables. By overflowing the stack, the attacker can modify the program’s return address and execute their own code.

Buffer Overflow Consequences

Buffer overflow is a security vulnerability that can have severe consequences for a system’s integrity and security. It’s like a small leak in a dam that can quickly turn into a catastrophic breach.

System crashes: A buffer overflow attack will typically lead to the system crashing. It may also result in a lack of availability and programs being put into an infinite loop.
Access control loss: A buffer overflow attack will often involve the use of arbitrary code, which is often outside the scope of programs’ security policies.
Further security issues: When a buffer overflow attack results in arbitrary code execution, the attacker may use it to exploit other vulnerabilities and subvert other security services.

Types of Buffer Overflow Attacks

There are several types of buffer overflow attacks, each with its own unique characteristics and methods of execution. Here are three common types:

Stack-based buffer overflow: This type of attack targets a buffer on the stack, which is a region of memory used for storing function call frames and local variables. By writing more data into the buffer than it can hold, an attacker can overwrite neighboring stack data such as the return address, modify the program’s behavior, and execute arbitrary code.
Heap-based buffer overflow: This type of attack targets a buffer on the heap, which is a region of memory used for dynamic memory allocation. By writing more data into a heap buffer than it can hold, an attacker can corrupt adjacent heap data and modify the program’s behavior.
Format string attacks: This type of attack exploits a vulnerability in the way a program handles format strings. By providing a malformed format string, an attacker can modify the program’s memory and execute arbitrary code.
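The format string problem in the last item comes down to whether untrusted text is passed as the format argument itself. A minimal sketch, in which format_user_message is a hypothetical helper, keeps the format string constant and passes the input as data:

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Write untrusted text into `out` (capacity `out_size`) as plain data.
 * The format string is the constant "%s", so conversion specifiers inside
 * `user_input` (such as %x or %n) are copied verbatim instead of being
 * interpreted. Returns the length the full text would need, as reported by
 * snprintf, or a negative value on error.
 *
 * The unsafe variant would be snprintf(out, out_size, user_input), which
 * lets whoever supplies the input control the format string.
 */
int format_user_message(char *out, size_t out_size, const char *user_input)
{
    return snprintf(out, out_size, "%s", user_input);
}
```

Since snprintf also takes the destination size, output that is too long is truncated rather than written past the end of the buffer.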

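Related flaws that often lead to buffer overflows include integer overflows and off-by-one errors. An integer overflow can make a computed buffer size far too small, and an off-by-one error writes a single element past the end of a buffer. Each has its own techniques and methods of exploitation.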
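To illustrate how an integer overflow can turn into a buffer overflow, consider allocating an array whose element count comes from untrusted input. If the size multiplication wraps around, the allocation succeeds but is much smaller than the code that fills it expects. The helper below, with the hypothetical name alloc_array_checked, is one way to guard against this:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate room for `count` elements of `elem_size` bytes each.
 * Returns NULL if the multiplication would wrap around SIZE_MAX, which would
 * otherwise yield a buffer much smaller than the caller expects.
 * The caller owns the returned memory and must free() it.
 */
void *alloc_array_checked(size_t count, size_t elem_size)
{
    if (elem_size != 0 && count > SIZE_MAX / elem_size) {
        return NULL;  /* count * elem_size does not fit in size_t */
    }
    return malloc(count * elem_size);
}
```

The check is written as a division so that the test itself cannot overflow.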

Stack buffer overflow attack example

Now that we know a program can overflow an array and overwrite a fragment of memory that it should not overwrite, let’s see how this can be used to mount a buffer overflow attack. In a typical scenario (called stack buffer overflow), the problem is caused – like so many problems in information security – by mixing data (meant to be processed or displayed) with commands that control program execution.

In C, like in most programming languages, programs are built using functions. Functions call each other, pass arguments to each other, and return values. For instance, our code, which reads an IP address from a file, could be part of a function called readIpAddress, which reads an IP address from a file and parses it. This function could be called by some other function, for example, readConfiguration.

When readConfiguration calls readIpAddress, it passes a filename to it and then the readIpAddress function returns an IP address as an array of four bytes.

Fig. 1. The arguments and the return value of the readIpAddress function

During this function call, three different pieces of information are stored side-by-side in computer memory. For each program, the operating system maintains a region of memory which includes a part called the stack or call stack (hence the name stack buffer overflow). When a function is called, a fragment of the stack is allocated to it. This piece of the stack (called a frame) is used to:

Remember the line of code from which program execution should resume when function execution completes (in our case, this will be a specific line in the readConfiguration function)
Store the arguments passed to the function by its caller (in our case, let’s assume /home/someuser/myconfiguration/ip.txt)
Store the return value that the function returns to its caller (in our case, it’s a four-byte array, let’s say (192, 168, 0, 1))
Store local variables of the called function while this function is being executed (in our case, the variable char buf[MAX_IP_LENGTH])

So if a program has a buffer allocated in the stack frame and tries to insert more data than can fit there, user input data may spill over and overwrite the memory location where the return address is stored.

Fig. 2. Contents of the stack frame when the readIpAddress function is called

If the problem was caused by random malformed user input data, the new return address most likely will not point to a memory location where any other program is stored, so the original program will simply crash. However, if the data is carefully prepared, it may lead to unintended code execution. The first step for the attacker is to prepare special data that can be interpreted as executable code and will work for the attacker’s benefit (this is called shellcode). The second step is to place the address of this malicious data in the exact location where the return address should be.

Fig. 3. The content of ip.txt overwrites the return address

In effect, when the function reads the IP character string and places it into the destination buffer, the return address is replaced by the address of the malicious code. When the function ends, program execution jumps to malicious code.
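For contrast, here is a hedged sketch of how the read step could be bounded so that the contents of ip.txt cannot reach the return address. The function name read_ip_line and the value of MAX_IP_LENGTH are assumptions made for illustration, since the original readIpAddress code is not shown here.

```c
#include <stdio.h>
#include <string.h>

#define MAX_IP_LENGTH 16  /* assumed: "255.255.255.255" plus the terminating NUL */

/*
 * Read one line holding an IP address from `fp` into `buf` of `size` bytes.
 * fgets() never writes more than `size` bytes, so data from the file cannot
 * spill into the rest of the stack frame. Returns 0 on success, or -1 on a
 * read error, end of file before any data, or a line that does not fit.
 */
int read_ip_line(FILE *fp, char *buf, size_t size)
{
    if (size == 0 || fgets(buf, (int)size, fp) == NULL) {
        return -1;
    }

    size_t len = strcspn(buf, "\n");
    if (buf[len] == '\n') {
        buf[len] = '\0';  /* complete line: strip the newline */
        return 0;
    }

    /* No newline stored: the line fits only if it ends right here. */
    int next = fgetc(fp);
    if (next == '\n' || next == EOF) {
        return 0;
    }
    return -1;  /* overlong line: reject rather than use truncated data */
}
```

A caller would declare char buf[MAX_IP_LENGTH] and pass sizeof buf, so the limit always matches the buffer that was actually allocated.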

Executing a Buffer Overflow Attack

Cybercriminals exploit buffer overflow problems to alter the execution path of the application by overwriting parts of its memory. The malicious extra data may contain code designed to trigger specific actions, in effect sending new instructions to the attacked application that could result in unauthorized access to the system. Exploitation techniques vary by architecture and operating system.

Buffer Overflow Causes

Coding errors are typically the cause of buffer overflow. Common application development mistakes that can lead to buffer overflow include failing to allocate large enough buffers and neglecting to check for overflow problems. These mistakes are especially problematic with C/C++, which does not have built-in protection against buffer overflows. Consequently, C/C++ applications are often targets of buffer overflow attacks.
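One way to avoid the first mistake, a buffer that is too small, is to derive the buffer size from the data instead of guessing a constant. The sketch below uses a hypothetical helper named duplicate_input and assumes a NUL-terminated input string:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return a heap copy of `input` whose size is derived from the input itself
 * rather than from a guessed constant. Returns NULL if allocation fails.
 * The caller must free() the result.
 */
char *duplicate_input(const char *input)
{
    size_t len = strlen(input);
    char *copy = malloc(len + 1);  /* +1 for the terminating NUL */

    if (copy == NULL) {
        return NULL;               /* allocation failure is reported, not ignored */
    }
    memcpy(copy, input, len + 1);
    return copy;
}
```

When the size of the data is not known in advance, or an upper limit is required, the alternative is to keep a fixed buffer and check every write against its size.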

Buffer Overflow Attack Example

[Adapted from “Buffer Overflow Attack Explained with a C Program Example,” Himanshu Arora, June 4, 2013, The Geek Stuff]

In some cases, an attacker injects malicious code into the memory that has been corrupted by the overflow. In other cases, the attacker simply takes advantage of the overflow and its corruption of the adjacent memory. For example, consider a program that requests a user password in order to grant the user access to the system. In the code below, the correct password grants the user root privileges. If the password is incorrect, the program will not grant the user privileges.

        printf ("\n Correct Password \n");
        pass = 1;
    }

    if(pass)
    {
        /* Now Give root or admin rights to user*/
        printf ("\n Root privileges given to the user \n");
    }

    return 0;

However, there is a possibility of buffer overflow in this program because the gets() function does not check the array bounds.

Here is an example of what an attacker could do with this coding error:

$ ./bfrovrflw
Enter the password :
hhhhhhhhhhhhhhhhhhhh
Wrong Password
Root privileges given to the user

In the above example, the program gives the user root privileges even though the user entered an incorrect password. The attacker supplied an input longer than the buffer can hold, causing a buffer overflow that overwrote the memory of the integer “pass.” As a result, despite the incorrect password, the value of “pass” became nonzero, and the attacker received root privileges.
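A hedged sketch of how the password check could be written without gets() is shown below. It is not the original author’s fix: the function name password_matches, the parameter names, and the buffer size are assumptions chosen for illustration.

```c
#include <stdio.h>
#include <string.h>

#define PASSWORD_BUF_SIZE 16  /* assumed size; the original declaration is not shown */

/*
 * Read one line from `in` and compare it with `expected`.
 * Returns 1 if the password matches, 0 otherwise. fgets() stores at most
 * PASSWORD_BUF_SIZE - 1 characters, so overlong input cannot spill into
 * neighboring variables such as `pass`.
 */
int password_matches(FILE *in, const char *expected)
{
    char buff[PASSWORD_BUF_SIZE];
    int pass = 0;

    if (fgets(buff, (int)sizeof buff, in) == NULL) {
        return 0;  /* no input at all */
    }

    size_t len = strcspn(buff, "\n");
    if (buff[len] != '\n') {
        /* No newline stored: reject if more characters follow on the line. */
        int next = fgetc(in);
        if (next != EOF && next != '\n') {
            return 0;
        }
    }
    buff[len] = '\0';  /* drop the trailing newline, if any */

    if (strcmp(buff, expected) == 0) {
        pass = 1;
    }
    return pass;
}
```

With this version, the 20-character input from the example is rejected as too long and pass stays 0. Rejecting overlong input, rather than comparing a truncated prefix, also avoids accepting a password that merely starts with the right characters.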

Which Programming Languages Are More Vulnerable?

Nearly all applications, web servers, and web application environments can be affected by buffer overflows. Environments written in managed or interpreted languages, such as Java and Python, are largely immune to these attacks, with the exception of overflows in the interpreter or runtime itself.

Buffer overflow attacks are typically made possible by coding errors: the application does not allocate appropriately sized buffers and fails to check for overflow. These issues are particularly problematic in C and C++, which have no built-in buffer overflow protection.

C and C++ are not the only languages affected. Programs written in Assembly, C, C++, or Fortran are particularly vulnerable and more likely to enable attackers to compromise a system, whereas applications written in JavaScript or Perl are typically less vulnerable to buffer overflow attacks.

Buffer overflows and the Web

However, even programmers who use high-level languages should know and care about buffer overflow attacks. Their programs are often executed within operating systems that are written in C or use runtime environments written in C, and this C code may be vulnerable to such attacks. In order to see how a buffer overflow vulnerability may affect a programmer using such a high-level programming language, let’s analyze CVE-2015-3329 – a real-life security vulnerability discovered in the PHP standard library in 2015.

A PHP application is a collection of *.php files. In order to make it easier to distribute such an application, it may be packed into a single file archive – as a zip file, a tar file, or using a custom PHP format called phar. A PHP extension called phar contains a class that you can use to work with such archives. With this class, you can parse an archive, list its files, extract the files, etc. Using this class is quite simple. For example, to extract all files from an archive, use the following code:

$phar = new Phar('phar-file.phar');
$phar->extractTo('./directory');

When the Phar class parses an archive (that’s new Phar('phar-file.phar')), it reads all filenames from the archive, concatenates each filename with the archive filename, and then calculates the checksum. For example, for an archive called myarchive.phar that contains files index.php and components/hello.php, the Phar class calculates checksums of two strings: myarchive.pharindex.php and myarchive.pharcomponents/hello.php. The reason why the authors implemented it this way is not important here – what is important is how they implemented it. Until 2015, this operation was done using the following function (see the old PHP source code):

static inline void phar_set_inode(phar_entry_info *entry TSRMLS_DC) /* {{{ */
{
        char tmp[MAXPATHLEN];
        int tmp_len;

        tmp_len = entry->filename_len + entry->phar->fname_len;
        memcpy(tmp, entry->phar->fname, entry->phar->fname_len);
        memcpy(tmp + entry->phar->fname_len, entry->filename, entry->filename_len);
        entry->inode = (unsigned short)zend_get_hash_value(tmp, tmp_len);
}

As you can see, this function creates a char array called tmp. First, the name of the phar archive (in our example, myarchive.phar) is copied into this array using the following command:

memcpy(tmp, entry->phar->fname, entry->phar->fname_len);

In this command:

The first argument, tmp, is a destination where bytes should be copied.
The second argument, entry->phar->fname, is a source from where bytes should be copied – in our case, the filename of the archive (myarchive.phar).
The third argument, entry->phar->fname_len, is a number of bytes that should be copied – in our case it is the length (in bytes) of the archive filename.

Next, the function copies the name of the file inside the archive (in our example, index.php or components/hello.php) into the tmp char array using the following command:

memcpy(tmp + entry->phar->fname_len, entry->filename, entry->filename_len);

In this command:

The first argument, tmp + entry->phar->fname_len, is a destination where bytes should be copied – in our case, it is a location in the tmp array just after the end of the archive filename.
The second argument, entry->filename, is a source from where bytes should be copied.
The third argument, entry->filename_len, is a number of bytes that should be copied.

Then the zend_get_hash_value function is called to calculate the hashcode. Notice how the size of the buffer is declared:

char tmp[MAXPATHLEN];

The buffer has a size of MAXPATHLEN, a constant defined as the maximum length of a filesystem path on the current platform. The authors assumed that concatenating the filename of the archive with the name of a file inside the archive would never exceed the maximum allowed path length. In normal situations, this assumption holds. However, if an attacker prepares an archive with unusually long filenames, the two memcpy calls write past the end of the tmp array and overflow it. An attacker can use this to crash PHP (causing a denial of service) or even make it execute malicious code.

The problem is similar to our simple example from above: the programmer made a simple mistake, trusted user input too much, and assumed that the data would always fit in a fixed-size buffer. Fortunately, this vulnerability was discovered in 2015 and fixed.
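The missing check can be illustrated with a small standalone helper. This is a sketch of the general technique, not the patch that was applied to PHP: the function name concat_bounded and its signature are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Concatenate `a` (a_len bytes) and `b` (b_len bytes) into `dest`, which has
 * room for `dest_size` bytes. Returns the number of bytes written, or
 * (size_t)-1 if the combined length does not fit, in which case nothing is
 * written. No NUL terminator is added, as in the memcpy-based function above.
 */
size_t concat_bounded(char *dest, size_t dest_size,
                      const char *a, size_t a_len,
                      const char *b, size_t b_len)
{
    /* Two comparisons, so that a_len + b_len cannot wrap around. */
    if (a_len > dest_size || b_len > dest_size - a_len) {
        return (size_t)-1;
    }
    memcpy(dest, a, a_len);
    memcpy(dest + a_len, b, b_len);
    return a_len + b_len;
}
```

Applied to the example, the archive name and the entry name would be passed as a and b, with sizeof tmp as dest_size. An archive with unusually long filenames then produces an error return instead of an overflow.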

How to Prevent Buffer Overflows

Developers can protect against buffer overflow vulnerabilities via security measures in their code, or by using languages that offer built-in protection.

In addition, modern operating systems and compilers provide runtime protection, and several development practices reduce the likelihood or impact of overflows. Common measures are:

Data execution prevention (DEP): Flags certain areas of memory as non-executable, which stops an attack from running code injected into those regions.
Structured exception handler overwrite protection (SEHOP): Helps stop malicious code from attacking Structured Exception Handling (SEH), a built-in Windows system for managing hardware and software exceptions. It thus prevents an attacker from being able to make use of the SEH overwrite exploitation technique. At a functional level, an SEH overwrite is achieved using a stack-based buffer overflow to overwrite an exception registration record, stored on a thread’s stack.
Use Address Space Layout Randomization (ASLR): ASLR is a security technique that randomly arranges the address space positions of key data areas of a process. This can make it harder for attackers to exploit buffer overflow vulnerabilities by making it more difficult to predict where their injected code will reside in memory.
Use Code Signing and Execution Prevention: Code signing helps to ensure that only trusted code is executed, which makes it harder for an attacker to run an untrusted program after gaining a foothold. Execution prevention techniques like Data Execution Prevention (DEP) block the execution of code in memory regions marked as non-executable, so shellcode written into a data buffer cannot run directly.
Implement Stack Canaries: A stack canary is a random value that the compiler places in the stack frame between the local buffers and the return address. An overflow that reaches the return address overwrites the canary first; the program checks the value before the function returns and aborts if it has changed.
Use Sandboxing: Sandboxing is a technique that involves running software in a controlled environment that restricts its access to system resources. It does not prevent a buffer overflow, but it limits the damage that an attacker can do after exploiting one.
Use Fuzz Testing: Fuzz testing is a technique that involves providing random input to an application in order to detect unexpected behavior. This can help to uncover buffer overflow vulnerabilities that may not be apparent during normal testing.
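As a rough illustration of the fuzz testing idea, the sketch below feeds pseudo-random inputs of random length to a small copy routine and checks a sentinel byte placed directly after the destination buffer. Everything here is hypothetical: copy_truncated stands in for the code under test, and the sentinel is a hand-made check that only loosely resembles a compiler-generated stack canary.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define BUF_SIZE 16
#define GUARD 0x5A  /* sentinel byte placed right after the buffer */

/* Function under test (hypothetical): copy at most BUF_SIZE - 1 bytes. */
static void copy_truncated(char *dest, const char *src, size_t src_len)
{
    size_t n = src_len < BUF_SIZE - 1 ? src_len : BUF_SIZE - 1;

    memcpy(dest, src, n);
    dest[n] = '\0';
}

/*
 * Feed `iterations` pseudo-random inputs of random length to copy_truncated()
 * and count how often the sentinel after the buffer was modified.
 * A nonzero result means the function wrote out of bounds.
 */
int fuzz_copy(unsigned int seed, int iterations)
{
    int failures = 0;
    char input[64];

    srand(seed);
    for (int i = 0; i < iterations; i++) {
        char region[BUF_SIZE + 1];  /* buffer plus one sentinel byte */
        size_t len = (size_t)rand() % (sizeof input + 1);  /* 0..64 */

        for (size_t j = 0; j < len; j++) {
            input[j] = (char)(rand() % 256);
        }
        region[BUF_SIZE] = GUARD;
        copy_truncated(region, input, len);
        if (region[BUF_SIZE] != GUARD) {
            failures++;
        }
    }
    return failures;
}
```

Calling fuzz_copy with a fixed seed makes a failing run reproducible. In practice, dedicated coverage-guided fuzzers such as AFL or libFuzzer, combined with a memory error detector such as AddressSanitizer, explore inputs far more effectively than a plain random loop.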

Security measures in code and operating system protection are not enough. When an organization discovers a buffer overflow vulnerability, it must react quickly to patch the affected software and make sure that users of the software can access the patch.

FAQ

What is a buffer overflow attack?

A buffer overflow attack exploits a security vulnerability that occurs when a program tries to write more data to a buffer than it can hold. The excess data overflows into adjacent memory areas, potentially overwriting important data or allowing an attacker to execute malicious code.

What programming languages are more vulnerable to buffer overflows?

Low-level programming languages like C and C++ are more vulnerable to buffer overflows because they provide direct access to memory and don’t have built-in safety features like automatic bounds checking.

What are the consequences of a buffer overflow attack?

The consequences of a buffer overflow attack can range from crashing a program to allowing an attacker to gain remote control of a system. Attackers can use buffer overflows to execute malicious code, steal sensitive data, or escalate their privileges.

How can buffer overflow attacks be prevented?

Buffer overflow attacks can be prevented by using input validation, safe programming techniques, memory-safe languages, code signing and execution prevention, stack canaries, sandboxing, and fuzz testing. By implementing these measures, programmers and system administrators can reduce the risk of buffer overflow vulnerabilities and keep their systems secure.

How do I know if my system is vulnerable to buffer overflows?

You can use vulnerability scanning tools or hire a security expert to conduct a vulnerability assessment to determine if your system is vulnerable to buffer overflows. Additionally, staying up-to-date with security patches and best practices can help prevent new and emerging buffer overflow vulnerabilities.

Conclusion

In conclusion, buffer overflow attacks can pose a serious threat to the security and integrity of computer systems. As we’ve seen, these attacks can result in a wide range of consequences, from program crashes to remote code execution by malicious actors. However, by implementing a variety of preventive measures, including input validation, safe programming techniques, memory-safe languages, code signing and execution prevention, stack canaries, sandboxing, and fuzz testing, we can significantly reduce the risk of buffer overflow vulnerabilities.

It’s important for programmers and system administrators to stay vigilant and up-to-date on the latest security best practices and technologies to ensure that their systems remain secure against emerging threats. By taking proactive measures to prevent buffer overflow attacks, we can help ensure the safety and reliability of our digital systems and protect against the potentially catastrophic consequences of these vulnerabilities.
