
Inline Assembler (Lab 7)

Part 1

After being given an inline assembler version of the volume program I wrote in the last lab, I got some results that surprised me. Running it with the same 500,000,000-sample input took only 1.2 seconds of computing time, beating even the best variant (bit-shifting) of my own program by over 50%. I answered some questions below to further my understanding:

1. What is another way of defining variables instead of the (type name register) format?

This can also be done with ordinary C variables, since the compiler will automatically allocate registers for any values referenced in the inline assembler's operand lists.

2. For the line vol_int = (int16_t) (0.5 * 32767.0); should 32767 or 32768 be used?

32767 should be used: int16_t holds values from -32768 to 32767, so 32768 is outside the type's range and would overflow. Since this is the scaling factor for a maximum-amplitude sample, 32767 (the largest representable positive value) is the correct choice.

3. What does __asm__("dup v1.8h,w22"); do?

The dup (duplicate) instruction copies the 16-bit volume factor held in general-purpose register w22 into all eight 16-bit lanes (8h) of SIMD vector register v1. This broadcast lets a single SIMD multiply instruction scale eight samples at once.

4. What happens if we remove : "=r"(in_cursor) : "r"(limit), "r"(in_cursor), "r"(out_cursor));

If these operand lists are removed, the inline assembler will fail. The output constraint "=r"(in_cursor) tells the compiler which register receives a result, and the input constraints "r"(limit), "r"(in_cursor), "r"(out_cursor) tell it which values to load into registers. Without them, the compiler has no way to connect the C variables to the registers the assembly uses, and the extended-asm syntax itself becomes invalid.

5. Are the results correct?

Since each sample is simply multiplied by the volume factor and written to the output without any other manipulation (besides rounding), the results are correct and usable.

This is a fantastic way to speed the program up; however, assembly is architecture-specific, so the performance gain applies only on this platform, and alternate logic (or a C fallback) would need to be written for everywhere else.

Part 2

I chose the aMule package for the individual research part of this lab. According to their site, it is an "all-platform p2p client". I investigated its source to find and analyze inline assembler code, and answered the following questions about my findings:

1. How much assembly-language code is present?

Only trace amounts of assembly are present, used mainly for bit rotation (i.e. rol) inside a #define statement.

2. Which platform(s) is it used on?

The assembly is used on x86_64 architectures.

3. Why is it there (what does it do)?

It is there to define the rotate operation efficiently using the processor's built-in rol instruction.

4. What happens on other platforms?

On other platforms, a C implementation using bit shifting is used instead.

5. What is your opinion of the value of the assembler code VS the loss of portability/increase in complexity of the code?

Personally, I think this assembly implementation is very smart: it is easy to write, and as the volume program showed, such instructions can improve performance dramatically. The code is only a few lines in either variant, so in my opinion the speedup is definitely worth it, especially for a function that is called many times.
