Part 1

After being given an inline assembler version of the volume program I made in the last lab, I got some results that shocked me. Running it with the same 500,000,000 sample size took only 1.2 seconds of computing time, which beats even the best variant of my own program (bit-shifting) by over 50%. I answered some questions below to further my understanding:

1. What is another way of defining variables instead of the (type name register) format?
This can be done using normal typed variables, as the compiler will automatically assign their values to registers.

2. For the line vol_int = (int16_t) (0.5 * 32767.0); should 32767 or 32768 be used?
32767 should be used because it is the largest value an int16_t can hold (the range is -32768 to 32767). The cast truncates the result toward zero rather than rounding it, and with 32768 a volume factor of 1.0 would produce 32768, which is not in the int16_t range.

3. What does __asm__("dup v1.8h,w22"); do?
dup (duplicate) copies the low 16 bits of the general-purpose register w22 into each of the eight 16-bit lanes of the vector register v1. This prepares the value for SIMD instructions.

4. What happens if we remove : "=r"(in_cursor)
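The reasoning behind question 2 can be checked with a small portable sketch. This is not the lab's code: the helper names volume_to_fixed, multiplier_fits_int16, and scale_sample are hypothetical, and the divide by 32768 is an assumed way of applying the fixed-point multiplier (the lab version does this step with SIMD instructions instead).

```c
#include <stdint.h>

/* Hypothetical helper: convert a volume factor in [0.0, 1.0] to a
   fixed-point multiplier. The cast truncates toward zero, so
   0.5 * 32767.0 = 16383.5 becomes 16383. */
static int16_t volume_to_fixed(double volume) {
    return (int16_t)(volume * 32767.0);
}

/* Hypothetical helper: does the multiplier for full volume (1.0 * scale)
   fit in an int16_t? The check is done on the double so that no
   out-of-range conversion is ever performed. */
static int multiplier_fits_int16(double scale) {
    double full_volume = 1.0 * scale;
    return full_volume >= INT16_MIN && full_volume <= INT16_MAX;
}

/* Assumed scaling step: multiply in 32 bits, then divide by 2^15 to
   return to the 16-bit sample range. Integer division truncates toward
   zero for both positive and negative samples. */
static int16_t scale_sample(int16_t sample, int16_t vol_int) {
    return (int16_t)(((int32_t)sample * vol_int) / 32768);
}
```

Calling multiplier_fits_int16(32767.0) reports that the full-volume multiplier fits, while multiplier_fits_int16(32768.0) reports that it does not, which is why 32767 is the safer scale.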
In conclusion, the -O3 flag was the most important discovery I made while trying to optimize clib. It offered a significant speed-up without interfering with anything else, and it provided the chance to unify a frequently used function, strdup. Overall the function is built extremely well, with very advanced logic. Attempting to alter that logic produced many errors and warnings, which left only simple compiler optimizations such as loop unrolling, and those made only small differences in speed. Clib as a whole is a great idea, offering many compartmentalized features for the C programming language that programmers could definitely find useful when developing. I hope that in the future I can get more involved in writing code for open source projects such as clib, whether that means doing optimization work or building from the ground up. This project not only gave me insight into how these open source projects work but also a look at real-world coding and improvement opportunities. I can honestly say now that I have had some experience