Saturday, October 17, 2009

What Is the C Preprocessor?

If a constant appears in several places in your program, it's a good idea to associate a symbolic name with the
constant and then use that name in place of the constant throughout the program. There are two advantages
to doing so. First, your program will be more readable. Second, your program will be easier to maintain. For instance, if
the value of the constant needs to change, you find the one statement that associates the constant with the symbolic
name and replace the value there. Without the symbolic name, you have to search the whole program for every
occurrence of the constant. Sounds great, but can we do this in C?
Well, C has a special program called the C preprocessor that allows you to define symbolic names and associate them
with constants. In the C preprocessor's terminology, the symbolic name is the macro name and the constant it stands
for is the macro body. The C preprocessor runs before the compiler. During preprocessing, the operation that replaces
a macro name with its associated macro body is called macro substitution or macro expansion.
You can put a macro definition anywhere in your program. However, a macro name has to be defined before it can be
used in your program.
In addition, the C preprocessor gives you the ability to include other source files. For instance, we've been using the
preprocessor directive #include to include C header files, such as stdio.h, stdlib.h, and string.h, in the programs
throughout this book. Also, the C preprocessor enables you to compile different sections of your program under
specified conditions.

The C Preprocessor Versus the Compiler

One important thing you need to remember is that the C preprocessor is conceptually a separate step from the C
compiler proper, even though most toolchains run it automatically when you compile.
The C preprocessor uses a different syntax. All directives in the C preprocessor begin with a pound sign (#). In other
words, the pound sign denotes the beginning of a preprocessor directive, and it must be the first nonspace character
on the line.
The C preprocessor is line oriented. Each macro statement ends with a newline character, not a semicolon. (Only C
statements end with semicolons.) One of the most common mistakes programmers make is to place a semicolon
at the end of a macro statement. The preprocessor does not reject the semicolon; it treats it as part of the macro
body, so the compiler reports an error only where the expanded text is not valid C.

TIP
Macro names, especially those that will be substituted with constants, are normally written in
uppercase letters so that they can be distinguished from the variable names in the program.

The #define and #undef Directives

The #define directive is the most common preprocessor directive. It tells the preprocessor to replace every
occurrence of a particular identifier (that is, a macro name) with a specified value (that is, a macro body).
Occurrences inside string literals and comments are left alone.
The syntax for the #define directive is
#define macro_name macro_body
Here macro_name is an identifier that can contain letters, digits, and underscores, but cannot begin with a digit.
macro_body is the text, such as a string or a number, that is substituted for each macro_name found in the program.
As mentioned earlier, the operation to replace occurrences of macro_name with the value specified by macro_body is
known as macro substitution or macro expansion.
The value of the macro body specified by a #define directive can be any character string or number. For example, the
following definition associates STATE_NAME with the string "Texas" (including the quotation marks):
#define STATE_NAME "Texas"
Then, during preprocessing, all occurrences of STATE_NAME will be replaced by "Texas".
Likewise, the following statement tells the C preprocessor to replace SUM with the text (12 + 8), parentheses included:
#define SUM (12 + 8)
On the other hand, you can use the #undef directive to remove the definition of a macro name that has been
previously defined.
The syntax for the #undef directive is
#undef macro_name
Here macro_name is normally an identifier that has been previously defined by a #define directive. (Applying #undef
to a name that is not currently defined is allowed and has no effect.)
The #undef directive "undefines" a macro name. For instance, the following segment of code:
#define STATE_NAME "Texas"
printf("I am moving out of %s.\n", STATE_NAME);
#undef STATE_NAME
defines the macro name STATE_NAME first, and uses the macro name in the printf() function; then it removes the
macro name.