FAQ: C Pre-Processor

01 What is Difference between Preprocessor and Compiler?
02 What are the Steps of Compilation of a C Program on Linux System
03 What is a C Preprocessor?
04 Explains Preprocessor Directives and their Uses in C Program
05 Explain #define Directive and its Uses in a C Program
06 Explain Macros in C with Examples
01 What is Difference between Preprocessor and Compiler?
Question 1: What is the difference between the preprocessor and the compiler?
Answer: The preprocessor is the first to act on the contents of a source file, performing several preprocessing operations before the code is compiled. Even so, the source file, say “hello.c”, goes through a few early phases of translation before preprocessing proper begins. In effect, these phases prepare the file for the preprocessor.

First, the characters in the source file are mapped to the source character set, which takes care of multi-byte characters and trigraphs. Second, every backslash ‘\’ immediately followed by a newline character is located and both are deleted, splicing the two physical lines into one. For example, the two physical lines,

printf("hello \
world\n");
become,

printf("hello world\n");

The newline character here is the one you get by pressing the Enter key, not the two-character escape sequence ‘\n’ written inside a string or character constant.

This splicing is useful because a preprocessor directive must fit on one logical line, although each logical line can span one or more physical lines.

Next, the text is broken into a sequence of tokens, white space, and comments. A token is the smallest meaningful unit of the program, such as a keyword, an identifier, a constant, or an operator. Each comment is then replaced by a single space. For example,

int/* declaring an integer variable */value;
becomes,

int value;
After this, the preprocessor processes the source code. It is called the preprocessor because it works on the source code before the code is compiled. It carries out the # directives: it includes header files on request, substitutes the replacement text for macros and symbolic constants, and tells the compiler which blocks of code to accept or ignore. The preprocessor doesn’t know C. It simply takes text in one form and converts it to another.

02 What are the Steps of Compilation of a C Program on Linux System
This C Tutorial explains Steps Involved in Compiling a C Program on Linux System.
Let’s understand the GCC compilation process, which comprises the following steps.

1. Preprocessing
2. Compilation
3. Assembly
4. Linking
5. Program Translation

Before talking about compiling and running a C program on Linux, let’s see why C has been so popular ever since it was created. Dennis Ritchie developed the C language between 1969 and 1973. C was designed from the beginning as the system programming language for UNIX, and most of the UNIX kernel, along with its supporting tools and libraries, was written in C. Even today, C is the first choice for system-level programming. Here I explain the compilation and execution of a simple C program in detail.

Let’s take a very simple C Program as an example and compile it in Linux – The Classic Hello World!

/* helloworld.c -- a simple C program */
#include <stdio.h>
int main()
{
printf("Hello World!\n");

return 0;
}
To compile and run this C program, every part of the system has to perform in concert. We will start right from the creation of the program. The “Hello World!” program starts its life as a source file, created with a text editor and saved as helloworld.c. The source file is stored as a sequence of bytes, each with a value corresponding to some character. For example, the ‘#’ that begins the #include line is stored as the byte value 35, and the ‘i’ that follows it as the value 105. This illustrates that all information in a system is represented as a bunch of bits.

To compile and run the C program helloworld.c, its C statements must be translated into a sequence of low-level instructions that the machine can understand. These instructions are then packaged in a form called an executable object program. Other programs perform this translation for us. On a Linux system, the translation from source code to executable object code is performed by a compiler driver. Here we will use gcc.

The following command compiles the C program helloworld.c and creates an executable file called helloworld. Make sure you have read permission on helloworld.c and write permission in the current directory, so that you won’t get permission errors.

gcc -o helloworld helloworld.c

While compiling helloworld.c the gcc compiler reads the source file helloworld.c and translates it into an executable helloworld. The compilation is performed in four sequential phases by the compilation system (a collection of four programs – preprocessor, compiler, assembler, and linker).

Now, let’s perform all four steps one by one and understand each independently.

1. Preprocessing

The compilation of a C program starts with preprocessing the # directives (e.g., #include and #define). The preprocessor (cpp, the C preprocessor) exists as a separate program, but the compiler invokes the preprocessing step automatically. For example, the #include directive near the top of helloworld.c tells the preprocessor to insert the contents of the system header file stdio.h directly into the program text in its place. The result is another file, typically with the .i suffix. In practice, the preprocessed file is not saved to disk unless the -save-temps option is used.

This is the first stage of the compilation process, in which preprocessor directives (macros, header files, etc.) are expanded. This step is equivalent to running the following command.

[root@host ~]# cpp helloworld.c > helloworld.i

The result is a file helloworld.i that contains the source code with all macros expanded. If you execute the above command in isolation then the file helloworld.i will be saved to disk and you can see its content using vi or any other editor you have on your Linux box.

2. Compilation

In this phase, compilation proper takes place. The compiler (cc1) translates helloworld.i into helloworld.s, a file that contains assembly code. You can explicitly tell gcc to translate helloworld.i to helloworld.s by executing the following command.

[root@host ~]# gcc -S helloworld.i
The command line option -S tells the compiler to convert the preprocessed code to assembly language without creating an object file. After having created helloworld.s you can see the content of this file. While looking at assembly code you may note that the assembly code contains a call to the external function printf.

3. Assembly

Here, the assembler (as) translates helloworld.s into machine language instructions and generates an object file, helloworld.o. You can invoke the assembler yourself by executing the following command.

[root@host ~]# as helloworld.s -o helloworld.o
The above command generates helloworld.o, as specified with the -o option. The resulting file contains the machine instructions for the classic “Hello World!” program, with an undefined reference to printf.

4. Linking

This is the final stage in the compilation of the “Hello World!” program. This phase links object files to produce the final executable file. An executable file requires many external resources (system functions, C run-time libraries, etc.). In our “Hello World!” program, you will have noticed the call to the printf function, which prints the ‘Hello World!’ message on the console. This function resides in a separate precompiled object file, printf.o, which must somehow be merged with our helloworld.o file. The linker (ld) performs this task for you. The result is the file helloworld, an executable that is ready to be loaded into memory and executed by the system.

The actual link command is rather complicated. Still, if you are curious enough, you can execute the following command to produce the executable file helloworld yourself.

[root@host ~]# ld -dynamic-linker /lib64/ld-linux-x86-64.so.2 /usr/lib64/crt1.o /usr/lib64/crti.o /usr/lib64/crtn.o helloworld.o /usr/lib/gcc/x86_64-redhat-linux/4.1.2/crtbegin.o -L /usr/lib/gcc/x86_64-redhat-linux/4.1.2/ -lgcc -lgcc_eh -lc -lgcc -lgcc_eh /usr/lib/gcc/x86_64-redhat-linux/4.1.2/crtend.o -o helloworld
You can then run the final executable file helloworld as follows:

[root@host ~]# ./helloworld

Output:

Hello World!
I executed the above command on an x86_64 system with gcc 4.6.0. The command may not work as is on your system; it all depends on where the libraries are located.

For you, there is no need to type the complex ld command directly – the entire linking process is handled transparently by gcc when invoked, as follows.

[root@host ~]# gcc helloworld.c -o helloworld

5. Program Translation

Other files besides the source file play a role during compilation. The very first directive in helloworld.c is #include, which pulls in a header file. In general, while compiling a C program you work with the following types of files.

Source code files: These files contain high level program code which can be read and understood by programmers. Such files carry .c extension by convention.

Header files: These types of files contain function declarations (also known as function prototypes) and various preprocessor statements. They are used to allow source code files to access externally-defined functions. As a convention header files have .h extension.

Object files: These files are produced as an intermediate output by the gcc compiler during program compilation. They consist of function definitions in binary form, but they are not executable by themselves. Object files end with .o extension by convention.

Binary executables: These are produced as the output of a program called a linker. The linker links together a number of object files to produce a binary file which can be directly executed. Binary executables have no special suffix on Linux-like operating systems.

Along with the above four types of files, you may come across .a and .so files, static and shared libraries respectively, while compiling a C program, but you would not normally deal with them directly.

03 What is a C Preprocessor?
Question3: What is a C Preprocessor?
Answer: The preprocessor is a program that performs textual substitutions on the source code before the program is compiled. It removes comments, inserts the contents of #included headers into the program, defines and substitutes #define symbols and macros, and decides, through conditional compilation, which fragments of code are processed or skipped by the compiler. Let’s take an example,

/*
* cpp_functions.c -- program shows different functions performed by the C
* preprocessor
*/
#include <stdio.h>
#include <stdlib.h>

/* TRUE and FALSE are symbolic constants */
#define TRUE 1
#define FALSE 0

/* conditional compilation */
#if (TRUE)
#include <string.h>
#elif (!FALSE)
#include <ctype.h>
#else
#include <math.h>
#endif

#define MAX(a,b) ((a) > (b) ? (a) : (b)) /* MAX is a macro */


int main(void)
{
int x = 10, y = 20;
float u = 12.34, v = -0.98;
double s = 113.563, t = 34.65;

/* Let's use Macro MAX(a,b) to compare two values */
printf("The greater of %d and %d is: %d \n", x, y, MAX(x, y));
printf("The greater of %f and %f is: %.2f\n", u, v, MAX(u, v));
printf("The greater of %lf and %lf is: %.6lf\n", s, t, MAX(s, t));

return 0;

}
The above program doesn’t do anything useful. Nevertheless, it uses the preprocessor directives #include and #define, which the preprocessor processes before the program is compiled. First, it inserts the contents of the headers named by #include <stdio.h> and #include <stdlib.h> at their respective places, as if we had typed those contents there. Then it defines the symbolic constants TRUE and FALSE. Next comes a piece of conditional compilation code that selects which further header to include; because TRUE is 1, the contents of string.h are inserted. Last comes the definition of a macro called MAX(a,b).

In main() we used MAX() to compare values. The preprocessor performs textual substitution on the source code: wherever it finds a symbolic constant or a macro, it substitutes the corresponding definition. For example,

TRUE
is substituted with 1,

FALSE
is substituted with 0, and

MAX(10, 20)
is substituted with

((10) > (20) ? (10) : (20))
How does this substitution take place? The preprocessor first examines the macro arguments to see whether they contain any #defined symbols, and expands those. The replacement text is then inserted in place of the original text, with each parameter name replaced by its argument. Finally, the resulting text is scanned again for #defined symbols; if any are found, the process is repeated.
04 Explains Preprocessor Directives and their Uses in C Program
This C Tutorial explains Preprocessor Directives or Symbols and their Uses in a C Program.
Most C programs call functions whose prototypes live in standard library header files. To use such a function, we include its header in the program with the preprocessor directive #include. When the preprocessor comes upon an #include directive in the source code, for ex. #include <stdio.h>, it substitutes the contents of that header file in place of the directive. For ex.,

/* cprog.c */
#include <stdio.h>

int main(void)
{
printf("Hello World!\n");
return 0;
}
In the above program, printf() is prototyped in the stdio.h header file, so stdio.h is included in the program. The same header also defines the symbol EOF, the value that input functions return to signal end of file. In one C library it is defined as,

/*
* End of file character.
* Some things throughout the library rely on this being -1.
*/
#ifndef EOF
# define EOF (-1)
#endif
Preprocessor directives all begin with the ‘#’ character. They include the #include and #define directives, for ex.,

#include <stdio.h>
#include <stdlib.h>

#define TRUE 1
#define FALSE 0

#define SIZE 512

#define MAX(a,b) ((a) > (b) ? (a) : (b)) /* Macro */
Preprocessor directives work by textual substitution. The #include directive inserts the contents of a header file into the program. A #define directive, for ex.,

#define SIZE 512
replaces every occurrence of SIZE with the value 512. A #define macro, for ex.,

#define MAX(a,b) ((a) > (b) ? (a) : (b))

inserts the replacement text ((a) > (b) ? (a) : (b)), with a and b replaced by the actual arguments, wherever MAX(a,b) is used in the program. Besides, conditional compilation tells the compiler which fragments of code to compile and which to skip. For ex.,

#if constant_expression
linux_ver1();
#elif constant_expression
linux_ver2();
#else
unix_os();
#endif
Every #if construct ends with its matching #endif. Notice that constant_expression above may consist of #define symbols, macros, or literal constants. The preprocessor evaluates the expression itself, before the program is compiled, so anything whose value can’t be ascertained at that time isn’t a valid candidate. In particular, the preprocessor knows nothing about C program variables, so they can’t take part in these constant expressions.


05 Explain #define Directive and its Uses in a C Program
This C Tutorial explains #define Directive in a C Program.
#define is a preprocessor directive. It is used to define either symbolic constants or macros. Let’s first see symbolic constants, for ex.,

#define NAME "What is your name?"
#define SIZE 512
#define PI 3.14

#define FOREVER for(;;)
#define PRINT printf("values of x = %d and y = %d.\n", x, y)
Notice that we haven’t used ‘;’ to terminate the replacement text of any #defined symbol above. The ‘;’ goes where the symbol is used in the program, for ex.

int main(void)
{
char buf[SIZE];
int x = 10, y = 20;

FOREVER; /* ; used here */

PRINT;
x++;
y++;

PRINT;

return 0;

}
Here ‘;’ terminates each symbolic statement like any other C statement. Let’s see what main() becomes after the preprocessor has operated on it. (FOREVER; expands to an empty infinite loop, so this program serves only to show the expansion.)

int main(void)
{
char buf[512];
int x = 10, y = 20;

for(;;);

printf("values of x = %d and y = %d.\n", x, y);
x++;
y++;

printf("values of x = %d and y = %d.\n", x, y);

return 0;

}
So how do you think it would affect the program if you put ‘;’ in the replacement text, for ex.,

#define PRINT printf("values of x = %d and y = %d.\n", x, y); /* ';' */
If you then use PRINT; in the program and preprocess it, the result is as follows,

int main(void)
{
printf("values of x = %d and y = %d.\n", x, y);;

return 0;
}
In the main() above, the extra ‘;’ is just an empty statement and doesn’t affect program execution. But now consider,

#include <stdio.h>
#define TRUE 1
#define PRINT printf("values of x = %d and y = %d.\n", x, y);

int main(void)
{
int x = 10, y = 20;

if (TRUE)
PRINT;
else
printf("Bye!\n");

return 0;
}
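Notice that the extra ‘;’ causes a compilation error in this program: PRINT; expands to a printf() call followed by two semicolons, and the second one is an empty statement that ends the if, leaving the else without a matching if. So, as a rule, put the ‘;’ where the symbolic constant is used in the program and not in its definition.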

Let’s now consider #define macros, for ex.,
#define MUL(a,b) a * b
and use MUL(a,b) in a program as,

int main(void)
{
printf("product of 5 and 10 is %d\n", MUL(5, 10));
return 0;
}

Of course, the output is

product of 5 and 10 is 50
Now, guess what the output will be if you use expressions as macro arguments,

int main(void)
{
int x = 10, y = 10;

printf("product of %d and %d is %d\n",
x + 1, y + 1, MUL(x + 1, y + 1));

}
The output is,

product of 11 and 11 is 21
That’s not correct at all! Why? Where has the evaluation gone wrong? Let’s explore this by replacing the macro MUL(x + 1, y + 1) with its definition in the printf() below,

printf("product of %d and %d is %d\n",
x + 1, y + 1, MUL(x + 1, y + 1));
The preprocessor does the substitution as follows,

printf("product of %d and %d is %d\n",
x + 1, y + 1, x + 1 * y + 1);
And the cause becomes clear! Because * binds more tightly than +,

1 * y
is multiplied first, and the expression becomes

10 + 10 + 1
which results in 21. This problem is easily fixed by using parentheses in the macro definition, as

#define MUL(a,b) ((a) * (b))
Remember that every parameter and the entire expression in a macro definition should be properly parenthesised, to avoid unexpected results caused by adjacent operators, both within the macro definition and in the expression where the macro is used.

In fact, symbolic constants make programs easier to maintain: a value can be changed at its one place of definition, irrespective of the size of the program and of how many places use it. For ex.,

#include <stdio.h>
#define SIZE 10

void disp(char []);

int main()
{
char name[SIZE] = {'a', 'b', 'c', 'd', 'e', 'f', 'g',
'h', 'i', 'j'};
disp(name);
return 0;

}

void disp(char name[])
{
int i;

printf("name is: ");
for (i = 0; i < SIZE; i++) {
printf("%c", name[i]);
}

printf("\n");
}
Notice that if the symbolic constant SIZE ever needs to be changed, it has to be updated in only one place, irrespective of program size. Further, the scope of a #define symbol extends from its point of definition to the end of the source file, unless it is removed earlier with #undef.

Let’s come to function-like macros. Where do macros play a role in C programs? We have already used the MUL(a,b) macro to compute the product of two values. What type of values were those? We used integers there. Of course, we can use MUL(a,b) to compute the product of values of any arithmetic type, whether integers, floats, doubles, unsigned integers, etc. This means macros are typeless! We could use functions instead, but then we would have to declare one function for each type of argument. In addition, there’s overhead in a function call and return. Therefore, very small macros can be more efficient than functions.


06 Explain Macros in C with Examples
This C Tutorial explains Macros in C with examples.
Macros are like functions but aren’t true functions. For ex.,

#define MAX(a,b) ((a) > (b) ? (a) : (b))
Let’s use it in a C program,

#include <stdio.h>
#define MAX(a,b) ((a) > (b) ? (a) : (b))

int main(void)
{
int x = 10, y = 15;
float u = 2.0, v = 3.0;
double s = 5, t = 5;

printf("Max of two integers %d and %d is: %d\n", x, y, MAX(x,y));
printf("Max of two floats %.2f and %.2f is: %.2f\n", u, v, MAX(u,v));
printf("Max of two doubles %.2lf and %.2lf is: %lf\n", s, t, MAX(s,t));

return 0;
}
The output is as follows,

Max of two integers 10 and 15 is: 15
Max of two floats 2.00 and 3.00 is: 3.00
Max of two doubles 5.00 and 5.00 is: 5.000000
Notice that the same macro MAX(a,b) found the larger of two integers, two floats, and two doubles. This means macros are typeless. Had we used functions instead, we would have needed three, one for each type of value. Besides, there’s overhead in calling functions and returning from them.

Then which of the two is more efficient? Each has its own merits and demerits. Macros can be more efficient than true functions when they are very short, one or two lines of code, as for ex. MAX(a,b). Since the preprocessor performs textual substitution, it replaces every use of a macro with its replacement text, which makes the program larger unless the macro is very small. This does not by itself hurt run-time efficiency, though it adds to the work of compilation. A function, by contrast, exists as a single copy in the program, and the calling code calls it whenever required.

