Robust programming to bomb-proof your code

Robust: Strong; able to survive and not likely to break

Introduction

Secure programming is a style of coding that focuses on preventing known vulnerabilities. Robust programming goes a step further: it emphasizes building well-structured programs that anticipate and handle potential problems rather than just avoiding common pitfalls. In that sense, robust programming is the foundation for secure coding.

Robust programming is a style of programming that focuses on handling unexpected termination and unexpected actions.

Robust programming demands handling invalid inputs, program terminations, and user actions gracefully. This means providing clear and unambiguous error messages that are easy for the user to understand. These informative messages help users identify and fix problems more easily.

💡
The consequences of poorly written software can range from minor inconveniences to critical failures. A prime example occurred on September 23rd, 2010, when a software error in Facebook's system, likely related to error handling, caused an outage lasting over two hours. This incident highlights the potential impact of software bugs, not just on social media platforms, but also in critical areas like medical software and electronic voting systems.

A system's security is defined by its security policy, which outlines the authorized states the system can be in. Any deviation from these permitted states is a security breach. Secure programming therefore emphasizes writing code that adheres to the established security policy. Consider a program vulnerable to a stack-based buffer overflow. An attacker could exploit this flaw to overwrite the return address on the stack, forcing the program to execute malicious code. If the vulnerable program runs with elevated privileges (e.g., a setuid-to-root program on Linux), the injected code runs with those privileges too, which violates the system's security policy by allowing unauthorized actions. In essence, secure programming practices aim to prevent such vulnerabilities and uphold the security policies that safeguard the system.

In the same program, even if the attacker's exploited code doesn't gain elevated privileges, the buffer overflow vulnerability itself remains a security concern. While it might not directly violate a policy focused solely on privilege escalation, it still creates an exploitable weakness. Secure programming principles aim to eliminate such vulnerabilities altogether, preventing attackers from gaining any unauthorized access or control, regardless of privilege levels. This highlights the broader scope of secure programming – it's not just about preventing specific policy violations, but about building robust systems resistant to various attack vectors.

Exploiting the buffer overflow without privilege escalation might not directly violate a security policy, but it certainly exposes a robustness issue. Robust code anticipates and handles unexpected inputs, including those that could trigger buffer overflows. In this scenario, a robust program would gracefully terminate execution, informing the user about the invalid input that caused the problem. This controlled termination prevents unpredictable behavior and potential system crashes, even if the attacker's goals weren't focused on privilege escalation.

Basic Principles

  • Robust Code: Code that prevents crashes and unexpected behavior by anticipating and handling errors and bad inputs.

  • Fragile Code: Code that is prone to crashes and unexpected behavior because of unhandled errors and bad inputs. It lacks error-handling mechanisms and clear error messages, which makes troubleshooting difficult.

4 Basic Principles

Robust programming follows four basic principles:

  1. Be Paranoid

  2. Assume Stupidity

  3. Don't Hand Out Dangerous Implements

  4. Be Prepared For Can Happen

Be Paranoid

The core principle of defensive programming can be summarized as -

If you didn't generate it, don't trust it.

This approach acknowledges the potential for errors and unexpected behavior, both within your own code and from external sources. Defensive programmers write code with the assumption that their own work might have flaws or bugs. They employ techniques to proactively identify and mitigate these issues as early as possible.

This cautious approach extends to user input. Defensive programming treats all incoming data with suspicion, assuming it could be invalid or malicious. Code written with this philosophy includes robust checks on function calls to ensure successful execution.
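As a minimal sketch of "if you didn't generate it, don't trust it", the helper below converts text to an int while checking every way the conversion can go wrong. The name parse_int and its 0 / -1 return convention are assumptions made for this illustration; only strtol and errno come from the C standard library.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: converts text to an int, trusting nothing about it.
 * Returns 0 and stores the value in *out on success, -1 otherwise. */
int parse_int(const char *text, int *out) {
    char *end = NULL;
    long value;

    if (text == NULL || out == NULL) {
        return -1;              /* the caller handed us a bad pointer */
    }

    errno = 0;
    value = strtol(text, &end, 10);

    if (end == text) {
        return -1;              /* no digits were found at all */
    }
    if (*end != '\0') {
        return -1;              /* trailing junk such as "12abc" */
    }
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX) {
        return -1;              /* the number does not fit in an int */
    }

    *out = (int)value;
    return 0;
}
```

A caller would check the result before touching the output, for example: if (parse_int(text, &n) != 0) { report the problem and stop }.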

Assume Stupidity

This means the code shouldn't rely on users having in-depth knowledge of the system or having read the manuals. In practice, this involves validating input to ensure data conforms to the expected format. Instead of cryptic error codes that require a manual lookup, error messages should be user-friendly and self-contained, explaining the problem in a way that's easy to understand. The code should detect errors as early as possible during execution and, on encountering one, take appropriate action to keep it from propagating and causing further problems. This might involve logging the error for analysis, informing the user with a clear message, or performing a controlled termination to maintain system stability. Early detection and clear reporting make debugging and system recovery easier.
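One way to picture self-contained error messages is to pair each failure with a sentence that says what went wrong and what to do about it. The sketch below is an illustration only: the age example, the 150 limit, and the names check_age and age_status_message are all made up for this post.

```c
#define MAX_AGE 150   /* assumed upper bound, chosen only for this sketch */

typedef enum {
    AGE_OK = 0,
    AGE_NEGATIVE,
    AGE_TOO_LARGE
} age_status;

/* Detect the problem as early as possible, right where the value arrives. */
age_status check_age(int age) {
    if (age < 0) {
        return AGE_NEGATIVE;
    }
    if (age > MAX_AGE) {
        return AGE_TOO_LARGE;
    }
    return AGE_OK;
}

/* Turn each status into a message the user can act on without a manual. */
const char *age_status_message(age_status status) {
    switch (status) {
    case AGE_OK:
        return "Age accepted.";
    case AGE_NEGATIVE:
        return "Age cannot be negative. Please enter a number from 0 to 150.";
    case AGE_TOO_LARGE:
        return "Age is too large. Please enter a number from 0 to 150.";
    }
    return "Unknown problem with the age value.";
}
```

The point of the design is that the caller never has to print a bare number like "error 2" and send the user hunting through documentation.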

Don't Hand Out Dangerous Implements

This principle emphasizes the concept of encapsulation. It encourages the isolation of a code module's internal state. Data structures, libraries, and pointers to data should be hidden from external entities, including the user. By hiding internal details, the code becomes less susceptible to accidental modifications from external sources (like user interaction). Additionally, segregating internal details promotes modularity, making the code more organized and easier to maintain.
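A rough sketch of this idea in C: instead of giving callers a pointer into the module's data, hand them a small integer "ticket" and validate it on every call. Everything here (the counter example, the pool size of 8, and the counter_* names) is hypothetical and only meant to show the shape of the technique.

```c
#include <stddef.h>

#define MAX_COUNTERS 8   /* assumed fixed pool size for this sketch */

/* Internal state: 'static' keeps these arrays invisible outside this file. */
static int counter_values[MAX_COUNTERS];
static int counter_in_use[MAX_COUNTERS];

/* Every ticket is validated before it is ever used as an index. */
static int ticket_is_valid(int ticket) {
    return ticket >= 0 && ticket < MAX_COUNTERS && counter_in_use[ticket];
}

/* Returns a ticket (a small integer), or -1 if the pool is full. */
int counter_create(void) {
    int i;
    for (i = 0; i < MAX_COUNTERS; i++) {
        if (!counter_in_use[i]) {
            counter_in_use[i] = 1;
            counter_values[i] = 0;
            return i;
        }
    }
    return -1;
}

/* Adds one to the counter; returns 0 on success, -1 for a bad ticket. */
int counter_increment(int ticket) {
    if (!ticket_is_valid(ticket)) {
        return -1;
    }
    counter_values[ticket]++;
    return 0;
}

/* Copies the current value into *out; the caller never sees the array. */
int counter_read(int ticket, int *out) {
    if (!ticket_is_valid(ticket) || out == NULL) {
        return -1;
    }
    *out = counter_values[ticket];
    return 0;
}

/* Releases the ticket; using it afterwards is rejected, not a crash. */
int counter_destroy(int ticket) {
    if (!ticket_is_valid(ticket)) {
        return -1;
    }
    counter_in_use[ticket] = 0;
    return 0;
}
```

Because the caller only ever holds an integer, a stale or made-up ticket produces an error return instead of a write into memory the module does not own.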

Be Prepared For Can Happen

While certain conditions might seem highly unlikely, good practices dictate considering and handling them nonetheless. This principle goes beyond merely anticipating user errors. Code modifications and additions over time can introduce inconsistencies that trigger previously "impossible" scenarios. By incorporating checks for these unlikely yet potential conditions, the code becomes more robust. Even if such checks simply return an error indicator, they serve a valuable purpose.
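A tiny illustration of a check that "simply returns an error indicator": a switch over an enum that still has a default branch, even though every current value is handled. The shape_kind type and the shape_sides function are invented for this sketch.

```c
typedef enum {
    SHAPE_CIRCLE,
    SHAPE_SQUARE,
    SHAPE_TRIANGLE
} shape_kind;

/* Returns the number of sides, or -1 for a value that "cannot happen". */
int shape_sides(shape_kind kind) {
    switch (kind) {
    case SHAPE_CIRCLE:
        return 0;
    case SHAPE_SQUARE:
        return 4;
    case SHAPE_TRIANGLE:
        return 3;
    default:
        /* Reached if someone adds a new shape later and forgets this
         * function, or if a corrupted value is passed in. */
        return -1;
    }
}
```

If a later change adds a new shape without updating this function, the caller gets -1 instead of a silently wrong answer.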

In essence, defensive programming promotes a culture of anticipating the unexpected. It's not about dwelling on worst-case scenarios, but rather about incorporating safeguards to catch potential problems before they cause critical failures.

Fragile Code

Let's explore some common fragile code examples and their robust counterparts.

Fragile Code Example 1 - Program to calculate average from a list of numbers -

#include <stdio.h>
int main() {
    int nums[5] = {5, 3, 6, 2, 8};
    int sum = 0;
    int avg;
    int i;
    for (i = 0; i < 5; i++) {
        sum += nums[i];
    }
    avg = sum / 5;
    printf("Average: %d\n", avg);
    return 0;
}

This code snippet exemplifies fragile programming. It calculates the average of a list by summing the elements and dividing by a fixed value of 5. This approach assumes the list always contains precisely five numbers. Any deviation from this assumption, such as an empty list or a list with a different size, would lead to incorrect results or even program crashes.

Robust Counterpart

#include <stdio.h>

int main() {
    int nums[] = {5, 3, 6, 2, 8}; // Array size inferred from initializer
    int num_elements = sizeof(nums) / sizeof(nums[0]); // Calculating number of elements
    int sum = 0;
    int i;

    if (num_elements == 0) {
        printf("Error: Empty list. Cannot calculate average.\n");
        return 1; // Indicates error
    }

    for (i = 0; i < num_elements; i++) {
        sum += nums[i];
    }

    float avg = (float)sum / num_elements; // Use float for non-integer results

    printf("Average: %.2f\n", avg); // Print with 2 decimal places

    return 0;
}

Let's see why the above code is more robust. The array size is inferred from the initializer, so the code keeps working if the number of elements changes. The element count is computed by dividing sizeof of the entire array by the size of a single element; this works because the array is declared in the same scope, so its size is known at compile time. The code checks whether the list is empty and prints an error message if so, returning 1 to indicate an error condition. Finally, it stores the average in a float to handle non-integer results and prints it with %.2f, that is, with two decimal places.
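One caveat worth sketching: the sizeof trick stops working once the array is passed to a function, because the parameter is then just a pointer. A common workaround is to pass the element count explicitly and validate it. The average function below is a hypothetical helper written for this post, not part of the original program.

```c
#include <stddef.h>

/* Hypothetical helper: averages 'count' ints starting at 'values'.
 * Returns 0 and stores the result in *out, or -1 on bad arguments. */
int average(const int *values, size_t count, double *out) {
    long long sum = 0;   /* wider accumulator lowers the overflow risk */
    size_t i;

    if (values == NULL || out == NULL || count == 0) {
        return -1;       /* nothing to average, or nowhere to put it */
    }

    for (i = 0; i < count; i++) {
        sum += values[i];
    }

    *out = (double)sum / (double)count;
    return 0;
}
```

The caller can still use sizeof(nums) / sizeof(nums[0]) at the place where the array is declared and pass that number in as count.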

Fragile Code Example 2 - Program for modifying variables using pointers -

#include <stdio.h>
int main() {
    int num = 10;
    printf("Num: %d\n", num); // Printing the value of num
    char *ptr = (char*)&num; // Changing the value of num by mistake
    ptr[0] = 0;
    ptr[1] = 0;
    ptr[2] = 0;
    ptr[3] = 0;
    printf("Num: %d\n", num); // Print the value of num again
    return 0;
}

This code demonstrates a potential pitfall when modifying variables through pointers. Initially, the variable num is assigned a value of 10. While the code then prints this value using printf(), a later section mistakenly modifies num indirectly through a pointer to its memory address.

Robust Counterpart

#include <stdio.h>
#define DEFAULT_NUM 10 // Named constant for the initial value of num
int main() {
    const int num = DEFAULT_NUM; // const: the compiler rejects direct assignments to num
    printf("Num: %d\n", num); // Print the value of num
    // num = 20; // Left commented out: with const, this line is a compile-time error
    printf("Num: %d\n", num); // Print the value of num again
    return 0;
}

Instead of hard-coding the value, this code defines a constant DEFAULT_NUM for the initial state and uses it to initialize num, which makes the intent clear. The real protection comes from declaring num as const int. A plain int can be reassigned at any time, so the constant alone would not stop an accidental num = 20; with const, the compiler rejects that assignment before the program ever runs. The byte-by-byte write through a char pointer from the fragile version is also gone: writing to a const object through a cast pointer is undefined behavior in C, so robust code simply avoids such casts. Together, the named constant and the const qualifier safeguard num's value and contribute to a more robust program.

Fragile Code Example 3 - Program for dividing a number by 0 -

#include <stdio.h>
int main() {
    int x = 10;
    int y = 5;
    int z = x / y; // Dividing x by y and store the result in z
    printf("Result: %d\n", z); // Printing the result
    return 0;
}

This code snippet seems basic, but it harbors a fragility. It hinges on the assumption that the variable y holds a non-zero value, and that assumption becomes a critical point of failure if y is zero. In C, integer division by zero is undefined behavior; on many platforms it raises a signal that terminates the program. This scenario could easily arise if y receives its value from an external source like user input or a file.

Robust Counterpart

#include <stdio.h>
int main() {
    int x = 10;
    int y = 5;
    int z;
    if (y == 0) {
        printf("Error: Cannot divide by zero\n");
        return 1; // Return an error code
    }
    z = x / y; // Divide x by y and store the result in z
    printf("Result: %d\n", z); // Print the result
    return 0;
}

The revised code addresses the critical issue of division by zero. It checks the value of y before attempting the division. If y is zero, the code prints an informative error message and returns a non-zero error code (1) instead of dividing.
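The same check can be wrapped in a reusable function. The sketch below (safe_divide is a made-up name) also covers a second case that is easy to miss: dividing INT_MIN by -1, whose mathematical result does not fit in an int and is therefore undefined behavior in C as well.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: stores numerator / denominator in *quotient.
 * Returns 0 on success, -1 if the division cannot be done safely. */
int safe_divide(int numerator, int denominator, int *quotient) {
    if (quotient == NULL) {
        return -1;               /* nowhere to store the result */
    }
    if (denominator == 0) {
        return -1;               /* division by zero */
    }
    if (numerator == INT_MIN && denominator == -1) {
        return -1;               /* result would not fit in an int */
    }

    *quotient = numerator / denominator;
    return 0;
}
```

The caller decides how to report the failure, which keeps the arithmetic check separate from the user-facing message.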

Fragile Code Example 4 - Program for accepting values from user -

#include <stdio.h>
int main() {
    int num1, num2, sum;
    printf("Enter two numbers separated by a space: "); // Prompt the user to enter two numbers
    scanf("%d %d", &num1, &num2);
    sum = num1 + num2; // Calculate the sum of the two numbers
    printf("The sum of %d and %d is %d\n", num1, num2, sum); // Print the result
    return 0;
}

The above program exemplifies a fragility in user input handling. It calculates the sum of two integers retrieved from user input. However, the code lacks validation mechanisms to safeguard against unexpected or erroneous user entries. For instance, if the user enters non-numeric characters instead of integers, scanf stops converting and leaves num1 and num2 uninitialized. The program then adds and prints indeterminate values, which is undefined behavior in C and usually shows up as garbage output rather than a clean error.

Robust Counterpart

#include <stdio.h>
int main() {
    int num1, num2, sum;
    printf("Enter two numbers separated by a space: ");
    if (scanf("%d %d", &num1, &num2) != 2) { // Read user input and check for errors
        printf("Invalid input. Please enter two numbers separated by a space.\n");
        return 1;
    }
    sum = num1 + num2;
    printf("The sum of %d and %d is %d\n", num1, num2, sum);
    return 0;
}

Here, scanf reads two integers from the user, but the code goes beyond simply reading the input: it checks scanf's return value. scanf returns the number of items it successfully converted and stored, or EOF if the input ends or a read error occurs before any conversion. The code expects two integers, so it checks that the return value is exactly 2. For any other value, it displays an error message and exits with a non-zero status.
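The return-value check still accepts input like "3 4 five", because scanf stops after the second number and ignores the rest. One possible tightening, sketched here under the assumption that a whole line has already been read (for example with fgets), is to parse the line with sscanf and use %n to confirm nothing is left over. The name parse_two_ints is hypothetical, and the sketch does not protect against numbers too large for an int.

```c
#include <stdio.h>

/* Hypothetical helper: returns 0 if 'line' holds exactly two integers
 * (surrounding whitespace allowed), -1 otherwise. */
int parse_two_ints(const char *line, int *first, int *second) {
    int consumed = 0;

    if (line == NULL || first == NULL || second == NULL) {
        return -1;
    }

    /* The trailing " %n" skips whitespace (including the newline left by
     * fgets) and records how many characters were consumed so far. */
    if (sscanf(line, "%d %d %n", first, second, &consumed) != 2) {
        return -1;               /* fewer than two numbers were found */
    }

    if (line[consumed] != '\0') {
        return -1;               /* leftover text such as "3 4 five" */
    }
    return 0;
}
```

Reading a full line first also means a bad entry does not stay stuck in the input buffer, so the program can prompt again instead of exiting.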

Lessons To Learn

Lesson 1: Prioritize Parameter Clarity

This lesson highlights the importance of designing function parameters for clarity to reduce the risk of errors. One example is the 'flag argument' often used to select between actions like 'create' and 'delete'. Imagine a flag where 1 represents 'create' and 0 represents 'delete'. Programmers can easily misremember which value means what, leading to unintended behavior (e.g., deleting a queue when they meant to create one). So, instead of a flag, it is better to use names that explicitly convey the intended action. For instance, provide separate create_queue and delete_queue functions instead of a single function with a flag parameter.

Lesson 2: Validate Function Inputs

Check function parameters to avoid crashes caused by invalid values (null pointers, non-positive sizes). If parameters are invalid, handle the errors appropriately (e.g., error messages, return codes). For a queue, this means validating the queue pointer (qptr) and the requested size (size) during creation and deletion to prevent memory allocation issues.

Lesson 3: Avoid Double Free with Pointers

Passing pointers by reference can lead to errors if the function doesn't track allocation history. Consider a function qmanage that manages a queue using a pointer (qptr). Suppose qmanage allocates memory for the queue in the first call (e.g., qmanage(&qptr, 1, 100)). If the caller then issues the deallocation request (e.g., qmanage(&qptr, 0, 1)) more than once, the second request frees memory that has already been released, which is undefined behavior and can crash the program. The function must track allocations or rely on mechanisms that prevent double frees. This could involve ownership flags or smart pointers (depending on the programming language) to manage memory deallocation.
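Here is one possible shape for this in plain C, reusing the create_queue and delete_queue names from Lesson 1. The queue struct and both signatures are assumptions made for this sketch. The delete function takes the address of the caller's pointer and sets it to NULL, so a repeated delete is detected instead of freeing twice.

```c
#include <stdlib.h>

/* Hypothetical queue type for this sketch. */
typedef struct {
    int *items;
    size_t capacity;
} queue;

/* Allocates a queue; returns NULL on a bad size or allocation failure. */
queue *create_queue(size_t capacity) {
    queue *q;

    if (capacity == 0) {
        return NULL;
    }
    q = malloc(sizeof *q);
    if (q == NULL) {
        return NULL;
    }
    /* calloc also checks that capacity * element size does not overflow. */
    q->items = calloc(capacity, sizeof *q->items);
    if (q->items == NULL) {
        free(q);
        return NULL;
    }
    q->capacity = capacity;
    return q;
}

/* Frees the queue and clears the caller's pointer.
 * Returns 0 on success, -1 if there was nothing to free. */
int delete_queue(queue **qptr) {
    if (qptr == NULL || *qptr == NULL) {
        return -1;               /* already deleted, or never created */
    }
    free((*qptr)->items);
    free(*qptr);
    *qptr = NULL;                /* a second delete is now harmless */
    return 0;
}
```

This only protects the pointer that was passed in. If the caller has made other copies of the pointer, those copies still dangle, which is one reason the ticket approach from the principles section can be safer.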

Lesson 4: Don't Ignore Return Values

Always check the return values of functions, especially those that allocate memory or can otherwise fail. For example, check that malloc did not return NULL before using the pointer. Operations that have no return value to check, such as a multiplication that might overflow, need an explicit range check before they run.
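Both checks often show up together when allocating an array: the size calculation is a multiplication that can overflow, and the allocation itself can fail. A minimal sketch, with make_filled_array as a made-up helper name:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocates 'count' ints, each set to 'value'.
 * Returns NULL if the request is invalid or the allocation fails.
 * The caller owns the result and must free() it. */
int *make_filled_array(size_t count, int value) {
    int *block;
    size_t i;

    /* Check the multiplication before doing it. */
    if (count == 0 || count > SIZE_MAX / sizeof *block) {
        return NULL;
    }

    block = malloc(count * sizeof *block);
    if (block == NULL) {
        return NULL;             /* never write through a failed allocation */
    }

    for (i = 0; i < count; i++) {
        block[i] = value;
    }
    return block;
}
```

The caller then has exactly one thing to check: whether the returned pointer is NULL.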

Lesson 5: Guard Against Arithmetic Overflow/Underflow

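An arithmetic result overflows when it is larger than the largest value the type can hold; a result smaller than the smallest representable value is often called underflow. Overflow typically arises when both operands are large and positive, and underflow when both are large and negative. In C, signed integer overflow is undefined behavior, so check the operands before performing the operation, and use larger data types (if applicable) to accommodate wider ranges of values.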
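For addition, the check can be done before the operation by comparing against INT_MAX and INT_MIN, so the overflowing sum is never actually computed. The checked_add function below is a hypothetical helper that sketches this approach.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: stores a + b in *sum.
 * Returns 0 on success, -1 if the result would not fit in an int. */
int checked_add(int a, int b, int *sum) {
    if (sum == NULL) {
        return -1;
    }
    if (b > 0 && a > INT_MAX - b) {
        return -1;               /* result would exceed INT_MAX */
    }
    if (b < 0 && a < INT_MIN - b) {
        return -1;               /* result would drop below INT_MIN */
    }

    *sum = a + b;
    return 0;
}
```

The comparisons themselves are safe: INT_MAX - b is only evaluated when b is positive, and INT_MIN - b only when b is negative, so neither subtraction can overflow.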

End Note

It's kinda weird to notice in retrospect how much more engaging some concepts become once the external pressure from professors is not there. While studying for exams, I really struggled with this particular concept from my coursework. But, I'll be honest, it's not that bad to learn. The other thing I noticed is that when I stop writing for too long, there's definitely an outpour of content, like today. Sorry not sorry.