Undefined C: Common mistakes

Posted on April 10th, 2015 by Dwight Guth

This blog post is the first in a planned series detailing undefined and unspecified behavior in the ISO C standard, and their impact on development. To start, we will summarize the domain and describe some of the undefined behaviors which we have found to be most widespread in production-deployed C code from the open source projects we have tested.

As many of you probably know, C is a programming language defined by a standards committee. Some of you may know the details: periodically, ISO/IEC Working Group 14 gets together to write the latest version of ISO standard 9899, the international standard for the C programming language. However, I suspect few of you have read the standard itself: in practice, I have found that very little production-level C code conforms fully to it.

What this means for C software can essentially be grouped into two categories. The first, and significantly more benign, effect is that the vast majority of C software is extremely non-portable. Use of GCC extensions abounds, as does reliance on the fact that GCC merely emits a warning for many issues which the standard actually considers to be constraint violations. In an effort to deal with this, many developers end up writing significant amounts of code to port their application to every compiler and platform they can, usually compounding the problem by introducing yet more reliance on non-portable behavior.

By contrast, code which conforms to the ISO C standard can essentially be guaranteed to run correctly on any modern compiler that implements the standard. While many applications do rely on implementation-defined behavior, or sometimes unspecified behavior, this behavior is typically much easier to port and therefore does not interfere with the overall portability of a codebase. And yet many developers rely heavily on functionality of GCC and other popular compilers that does not conform to any standard at all.
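
As one illustration of implementation-defined behavior that ports easily: the width of unsigned int varies between implementations, but <limits.h> lets code ask rather than assume. The following is only a minimal sketch, and the function name uint_value_bits is hypothetical rather than taken from any project we tested.

```c
#include <limits.h>

/* Number of value bits in unsigned int on this implementation.
   Counting the bits of UINT_MAX avoids hard-coding 32, and it ignores
   any padding bits that sizeof(unsigned int) * CHAR_BIT would include. */
unsigned int uint_value_bits(void)
{
  unsigned int bits = 0;
  for (unsigned int max = UINT_MAX; max != 0; max >>= 1)
    bits++;
  return bits;
}
```

The standard only guarantees that the result is at least 16; code that needs more should check for it instead of assuming it.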

What some people do not realize is that GCC is actually quite a powerful checker for many of these issues. I strongly encourage all users to make sure their code compiles even when GCC is passed the flags -Wall -pedantic -Werror -std=c11. With these flags, GCC treats as an error every GCC extension and every undefined behavior for which it emits a warning, and it also enables the C11 version of all headers (you can also use -std=c99 or -std=c89 if you prefer). Note that this will not cause GCC to reject all undefined programs, nor will it even reject all programs which can trivially be determined to be undefined using static analysis. However, it is a good first step if your goal is to create highly portable code.

Because detecting the remaining undefined programs is so difficult, we at Runtime Verification have developed our proprietary tool, RV-Match for C, which compiles and executes C programs in an interpreted mode that allows detection of a much larger subset of all possible undefined behaviors in C. While our tool can be used to detect quite serious errors like memory access after an object is freed, buffer overflows, and segfaults, I will not focus on them in this post. Instead, I will examine the undefined behaviors which are quite common even in production code because of a lack of commonly used checkers for these issues.

Minor GCC Extensions

There are a number of very minor differences between GNU C and ISO Standard C. Among the most popular extensions are binary constants, zero-length arrays, the double constant suffix, returning a value of type void from a void function, taking the size of the void type, arithmetic on void pointers, and unsigned enum values. See below:

int x = 0b1111;
struct foo {
  int a;
  char buffer[0];
};
union bar {
  int b;
  char buffer[0];
};
double y = 0.0d;
void recurse() {
  return recurse();
}
size_t z = sizeof(void);
int *p = &x;
void *p2 = (void *)p + 1;
enum {
  A = 0xffffffff
};

Generally in each of these cases, the ISO C standard provides another behavior that can be mechanically substituted for the extension, leading to standards-conforming C code. To summarize the examples given, I personally choose to replace binary constants with hexadecimal constants and zero-length arrays with either a flexible array member or an array of known constant size (depending on whether the array is in a struct or a union), to remove the d suffix entirely (0.0 is already a double), to evaluate void expressions as a separate statement, to take the size of char instead of void, to cast void pointers to char * before doing arithmetic on them, and to use an unsigned int instead of an enum. Compare the above code with the equivalent code below, which is strictly conforming:

int x = 0xf;
struct foo {
  int a;
  char buffer[];
};
union bar {
  int b;
  char buffer[sizeof(int)];
};
double y = 0.0;
void recurse() {
  recurse();
  return;
}
size_t z = sizeof(char);
int *p = &x;
void *p2 = (char *)p + 1;
unsigned int A = 0xffffffff;

These are all generally quite easy to replace, and many of them can even be detected by GCC using the flags I mentioned above. However, some common errors are harder to detect.
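
To show how the flexible array member replacement is used in practice, here is a minimal sketch of allocating such a struct. The helper foo_new is hypothetical and not part of the examples above; it assumes the caller wants a copy of len bytes stored in buffer.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Same shape as struct foo above: a fixed part followed by a
   flexible array member (standard since C99). */
struct foo {
  int a;
  char buffer[];
};

/* Hypothetical helper: allocates a struct foo with room for len bytes
   in buffer and copies len bytes from src into it. Returns NULL on
   overflow or allocation failure; the caller frees the result. */
struct foo *foo_new(int a, const char *src, size_t len)
{
  if (len > SIZE_MAX - sizeof(struct foo))
    return NULL;
  /* sizeof(struct foo) does not include the flexible array member,
     so the storage for buffer is added explicitly. */
  struct foo *f = malloc(sizeof(struct foo) + len);
  if (f == NULL)
    return NULL;
  f->a = a;
  memcpy(f->buffer, src, len);
  return f;
}
```

A call such as foo_new(7, "abc", 4) yields a struct whose buffer holds the string "abc" including its terminator.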

ISO C inline semantics

Many developers are not completely familiar with the semantics of the inline function specifier in C99 and C11. For example, consider the following program:

inline int foo(void) { return 5; }
int main() {
  return foo();
}

When I compile this program with gcc -std=c11 test.c -O2, everything looks great. I have no idea that I have written an undefined C program. Then one day, for whatever reason, I forget to specify -O2 when compiling, and see an error: undefined reference to 'foo'. What is going on?

Well, it turns out I have fallen afoul of one of the side effects of the way GCC inlines functions. When I compile with -O2, GCC performs inline substitution as one of its optimization passes. This causes the function foo to be inlined at its call site in the main function, such that the main function no longer contains any calls to foo. As a result, the program compiles and links successfully.

But wait, you say, why does that matter? Isn't foo defined right there? Well, no, not really. According to section 6.7.4 of the C11 standard, which deals with the inline specifier, a definition of a function with external linkage (i.e., generally, one not declared with the static keyword) for which all file-scope declarations in the translation unit include the inline keyword but not the extern keyword is what is called an inline definition. An inline definition does not count as what is called an external definition of the function (i.e., storage allocated in the binary for the function's compiled instructions). So the definition given for foo is never provided to the linker, which causes the linker error above. You can fix this error in one of three ways. The first is to give the function internal linkage (i.e., to specify static inline). The second is to turn the definition into an external definition by adding a file-scope declaration that includes extern:

extern int foo(void);
inline int foo(void) {
  return 5;
}

The third is to keep the inline definition, but provide an external definition in another file. In the last case, the C compiler is free to either call the external definition or inline the inline definition. Which behavior you get at any given call site is unspecified, and GCC typically uses heuristics to guess which will perform better. Any of these three alterations will result in a defined C program.
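
The first two fixes can be sketched side by side in a single translation unit. The names foo_internal, foo_external, and call_both are hypothetical, chosen only so that both variants can coexist in one file; the third fix needs a second source file and is not shown.

```c
/* Fix 1: internal linkage. The definition is private to this
   translation unit, so no external definition is required. */
static inline int foo_internal(void) { return 5; }

/* Fix 2: because this file-scope declaration uses extern, not every
   declaration of foo_external is "inline without extern", so the
   definition below is an external definition that the linker can see. */
extern int foo_external(void);
inline int foo_external(void) { return 5; }

/* Calls both variants. This should link with or without optimization,
   because neither call depends on the compiler choosing to inline. */
int call_both(void)
{
  return foo_internal() + foo_external();
}
```

Building this with and without -O2 is a quick way to confirm that neither function depends on inlining to resolve.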

Undeclared identifiers

Another common pitfall in writing defined C programs has to do with declarations and definitions. According to the ISO C standard, any identifier that is used at least once in an expression must be defined somewhere in the program (either in the same translation unit, in the case of an identifier with internal linkage, or elsewhere in the program, in the case of an identifier with external linkage). However, GCC does not perform this check on identifiers with external linkage. The identifiers are instead resolved only by ld, the linker, which operates over object code and therefore does not have access to the declarations themselves. As a result, you can declare the same function with different types in different translation units, and have everything compile correctly until the time when you call the function, at which point the compiled code uses the wrong ABI and ends up corrupting memory. If this function is called by code that runs only very rarely, but in a critically important component, this is a complete disaster.

One way this can happen is if you attempt to call a function in the C standard library without remembering to include the correct header file. Worse, if you are using glibc, sometimes you can include the correct header file and still not have access to the declaration you wanted, because you forgot to define the right feature test macro. Often this will silently work correctly, because of how GCC infers the type of a function that is called without a declaration. But if you are calling a function that operates over chars or shorts, you may end up with broken code because GCC has inferred an int where really you wanted a short, due to C's promotion semantics. Again, you will not detect this unless your code calls the function in question, by which point it may be too late to prevent disaster.
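
A common defense against mismatched declarations is to put the single prototype in a header that both the defining file and every caller include, so that the compiler can check each definition and call against it. The sketch below shows the three pieces in one listing for brevity; the file names sample.h, sample.c, and caller.c and the functions halve and half_of_ten are hypothetical.

```c
/* ---- sample.h (hypothetical header, shown inline here) ---- */
#ifndef SAMPLE_H
#define SAMPLE_H
/* The one declaration that every translation unit shares. */
short halve(short value);
#endif

/* ---- sample.c (would #include "sample.h") ---- */
/* With the prototype in scope, a definition with a different type
   (say, int halve(int)) is a constraint violation that the compiler
   must diagnose, instead of a silent ABI mismatch at link time. */
short halve(short value)
{
  return (short)(value / 2);
}

/* ---- caller.c (would also #include "sample.h") ---- */
/* The int constant 10 is converted to short as the prototype requires;
   no type is inferred from the call itself. */
short half_of_ten(void)
{
  return halve(10);
}
```

This only helps when the header really is included everywhere the function is used, which is why compiling with warnings for missing declarations treated as errors is a useful complement.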

In the next post in this series, we will focus on more serious undefinedness errors, and compare how effectively valgrind and RV-Match catch them.