
Comparing Compiler Options and Output Files (Lab 2)

Depending on how a file is compiled, the resulting program can have different run-time speed, file size, and assembly code. This is explored below through 7 different compilations of a C program:
 
**All images will be hosted here [click the Assembly Code Pictures.docx green text to download] to keep this post lightweight; every picture reference will be made by "figure" followed by a number.**

1: gcc lab2.c -g (debugging) -O0 (no optimization) -fno-builtin (no built-in function optimization)

When exploring the assembly version of this code we use objdump -d on the compiled binary (a.out by default), which tells us a few things. Firstly, the code I wrote is in the <main> section, shown in figure 1. Secondly, the call I make (a simple "Hello World!" printf statement) is displayed in figure 2. Lastly, with objdump -h we can see there are 30 section headers, and using ls -lh we can see the file is approximately 72 KB.
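For reference, the source being compiled is not shown in the post, but based on the description it is presumably a minimal hello-world program along these lines:

#include <stdio.h>

int main(void)
{
    printf("Hello World!\n");
    return 0;
}

Compiling this with the command above and then running objdump -d on the resulting binary is what produces the <main> listing referenced in figure 1.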

2:  gcc lab2.c -static -g -O0 -fno-builtin

Without explaining -static, let's see what it does to our file size, section headers, and function call. After compilation the file size is approximately 815 KB, many times bigger than our dynamically linked program! We can also see that the section header count has grown to 31, and the extra headers appear to be unneeded for such a simple program. Lastly, the function call now references a different symbol, <_IO_printf>, shown in figure 3. This routine appears to be much larger (and presumably slower) than its dynamically linked printf counterpart, as seen in figure 4.

3: gcc lab2.c -g -O0

Without the -fno-builtin option, we notice an important change in the function call: it now references <puts@plt>, as noted in figure 5. This routine is also quite small, as seen in figure 6.
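For context, <puts@plt> appears because GCC's built-in handling recognizes a printf call whose format string is a constant with no conversion specifiers and a trailing newline, and replaces it with the cheaper puts call. Conceptually (this is a sketch of the transformation, not the compiler's actual output):

printf("Hello World!\n");   /* what the source says */
puts("Hello World!");       /* what the compiler emits instead; puts appends the newline itself */

With -fno-builtin, as in the earlier compilations, this substitution is disabled and the call to printf remains.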

4: gcc lab2.c -O0 -fno-builtin

Without including debugging information, we notice some differences in size, section headers, and disassembly output. Firstly, the size is slightly smaller, now approximately 70 KB instead of 72 KB. Secondly, there are now only 25 section headers, significantly fewer than the original 30. Lastly, the disassembly output is significantly smaller than the original output.

5: gcc lab2.c -g -O0 -fno-builtin

This time, I added more printf arguments; the statement looks like the one below.

printf("Hello World!\n%d\n%d\n%d\n%d\n%d\n%d\n%d\n%d\n%d\n%d", 1,2,3,4,5,6,7,8,9,10);

We notice in figure 7 that each number is assigned a register specific to the architecture being used, which in my case is AArch64.

6: gcc lab2.c -g -O0 -fno-builtin

Changing it up again, I now moved the basic "Hello World!" statement into a void function called output() and called it in the main function. The result is the code in figure 8, where the <main> section now calls the <output> section, which in turn calls the previously mentioned <printf@plt>. This happens because execution must now go through an extra function call before reaching the printf code.
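Based on that description, the restructured source is presumably something like this (a sketch, since the post shows only the disassembly):

#include <stdio.h>

/* the printf now lives in its own function */
void output(void)
{
    printf("Hello World!\n");
}

int main(void)
{
    output();   /* main calls output, which in turn calls printf */
    return 0;
}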

7: gcc lab2.c -g -O3 -fno-builtin

Using full optimization with the -O3 option, while again using the standard "Hello World!" printf statement, yields a very unexpected result. With built-in function optimization still disabled, the compiler cannot rewrite very much of this tiny program, and the result is nearly identical to the -O0 compilation.

These tests with compiler options allow for some conclusions. Firstly, for the best run-time speed of the finished program, it is crucial to use the -O3 compilation flag and minimize function calls. Secondly, for debugging purposes, the -g option should always be added as a compilation option. Lastly, the -static flag should only be used knowingly, as it appears to copy the required library code into the executable instead of calling the shared library when needed.




