
Loops in Assembly Architectures (Lab 3)

The modern C and C++ languages are often taken for granted for their simple syntax and helpful compilers. In assembly, values have to be moved explicitly between registers and memory, a simple line of output takes several statements, and a number can only be printed one ASCII digit (0 to 9) at a time. Below I examine two different architectures, x86_64 and aarch64:


x86_64

The loop I have created that outputs Loop: number thirty times can be found here. Writing this was extremely difficult, as I was constantly fighting a lack of good documentation and examples online. To write this code I viewed many examples online, wrote pseudocode, and wrote the program in C++ before beginning. Debugging was difficult as the errors given were vague or ridiculous at times (the lack of // style commenting comes to mind). The Visual Studio tooling for C or C++, by contrast, is much more informative and allows for easy breakpoints, live variable information, and detailed information on error codes. I dislike x86_64 for the limited number of available registers but like it for its simpler syntax.
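As a rough sketch of what the loop has to do, here is the same logic in C++. This is not the original program: `format_line` and `print_loop` are hypothetical names, starting the count at 0 is an assumption, and the sketch only handles one or two digits. It shows the step the assembly cannot skip, which is splitting the counter into a tens digit and a ones digit and turning each into an ASCII character by adding '0'.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: builds one output line the way the assembly has to,
// one ASCII character at a time.
std::string format_line(unsigned n) {
    assert(n < 100);  // sketch only handles one or two digits

    std::string line = "Loop: ";
    unsigned tens = n / 10;  // quotient of the division by 10
    unsigned ones = n % 10;  // remainder of the division by 10

    // Skip the leading zero for single-digit numbers.
    if (tens != 0) {
        line += static_cast<char>('0' + tens);
    }
    line += static_cast<char>('0' + ones);
    line += '\n';
    return line;
}

// Hypothetical driver: writes `count` lines, starting at 0 (an assumption).
void print_loop(unsigned count = 30) {
    for (unsigned i = 0; i < count; ++i) {
        std::fputs(format_line(i).c_str(), stdout);
    }
}
```

Calling `print_loop()` from `main` writes thirty lines. In assembly, each of the steps above (the divide, the two additions, the store into the message buffer, and the write call) becomes one or more separate instructions.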

aarch64

The loop previously mentioned can be found here. I found aarch64 harder to write because of its complicated syntax, and was very grateful I wrote the x86_64 code first, as this version was simply a rewrite of it. The debugging in aarch64, conversely, is superb and actually offers suggestions on how to fix code (e.g. did you mean...?). aarch64 has many available registers; however, some syntax issues are strange: although the x form of a register works in most cases, some lines need the w form (the 32-bit view of the same register) instead. I found it not too time-consuming to write this; however, I did get stuck on the division for a little while due to the unnecessarily complicated syntax.
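To illustrate why the division is awkward, here is a small C++ sketch of the usual aarch64 pattern. The `udiv` instruction produces only a quotient, so the remainder is typically computed afterwards with `msub` as dividend minus quotient times divisor. The function names and the register names in the comments are illustrative assumptions, not taken from the original program. The second function mirrors the x versus w distinction: a w register is the low 32 bits of the matching x register.

```cpp
#include <cassert>
#include <cstdint>

struct DivResult {
    std::uint64_t quotient;
    std::uint64_t remainder;
};

// Mirrors the two-step aarch64 sequence (register names are illustrative).
DivResult div_with_remainder(std::uint64_t dividend, std::uint64_t divisor) {
    assert(divisor != 0);  // the sketch does not model division by zero

    std::uint64_t q = dividend / divisor;      // udiv x2, x0, x1
    std::uint64_t r = dividend - q * divisor;  // msub x3, x2, x1, x0
    return {q, r};
}

// Reading an x register through its w name keeps only the low 32 bits.
std::uint32_t low_word(std::uint64_t x_value) {
    return static_cast<std::uint32_t>(x_value);
}
```

For the loop counter, `div_with_remainder(29, 10)` yields a quotient of 2 and a remainder of 9, which are the two digits to print. On x86_64 the `div` instruction returns both values at once (quotient in `rax`, remainder in `rdx`), which is one reason the same step feels simpler there.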


In conclusion, given the amount of time it took to write in assembly, the pros of smaller size and faster compile time are not worth the frustration of writing it. This lab, however, was an excellent experience for seeing how the compiler interprets code from higher-level languages.
