Decoding Multiple Instructions and Suffixes
Things get more complicated when different opcodes and variable instruction lengths are added to the mix!
This is the second video in Part 1 of the Performance-Aware Programming series. Please see the Table of Contents to quickly navigate through the rest of the course as it is updated weekly. Homework data is available on the github. A lightly-edited transcript of the video appears below.
In the previous video I gave you some homework where I asked you to decode a register-to-register move instruction on the original 8086. I hope it wasn't too difficult! I tried to pick something simple for the first assignment, but at the end of the day, nothing is ever really that simple in the land of 8086. Even a simple instruction is kind of tricky to decode, as you may have noticed!
I hope it went okay, but if you struggled with it, that's perfectly fine! At the end of these instruction decode videos, I will go over the code that I wrote as a reference, and you can see potential solutions to any problems you may have had. For now, what I'd like to do is add a little bit more to the problem. I'd like to give you a chance to work out a few more details of 8086 instruction decode, just to get you in the mindset of a CPU and what it really does.
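Last time we talked about the two-byte encoding for register-to-register move: the first byte holds the six-bit opcode 100010 followed by the D and W bits, and the second byte holds the two-bit MOD field followed by the three-bit REG and R/M fields.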
In this encoding, we relied on the MOD field to tell us it was a register-to-register move — when the MOD field was 11, that meant register-to-register.
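As a rough illustration of that two-byte layout, here is a minimal Python sketch that splits the two bytes into their bit fields. The function name `split_fields` and the dictionary keys are made up for this example; they are not taken from the course's reference code.

```python
def split_fields(byte1, byte2):
    """Split a two-byte 8086 MOV encoding into its bit fields.

    Hypothetical helper for illustration only.
    Byte 1: oooooodw  (6-bit opcode, D bit, W bit)
    Byte 2: mmrrrbbb  (2-bit MOD, 3-bit REG, 3-bit R/M)
    """
    return {
        "opcode": byte1 >> 2,          # top six bits
        "d": (byte1 >> 1) & 1,         # direction bit
        "w": byte1 & 1,                # wide (16-bit) bit
        "mod": byte2 >> 6,             # top two bits
        "reg": (byte2 >> 3) & 0b111,   # middle three bits
        "rm": byte2 & 0b111,           # bottom three bits
    }


fields = split_fields(0x89, 0xD9)      # the bytes for "mov cx, bx"
print(f"opcode={fields['opcode']:06b} mod={fields['mod']:02b}")
```

For the bytes 0x89 0xD9, the MOD field comes out as 11, which is the register-to-register case described above.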
But in the homework assignment, you could have just skipped that check because I was only asking you to decode this specific kind of instruction. You didn't really have to check the MOD field, and honestly you didn’t really even have to check the 100010 opcode because every instruction was going to have those two things set the same way.
What I'd like to do now is expand the decoding so that you do have to check everything. This is just to really make sure you’ve got it all down, and also so you can see how the actual decoding process changes dramatically from instruction to instruction even when you're looking at the same assembly language mnemonic. Even if we just stick with MOV, as you’ll see, the assembler may generate radically different instruction encodings depending on what is being copied where.
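To make "checking everything" concrete, here is a small Python sketch of a decoder that only accepts the register-to-register form and refuses anything else. It is one possible shape under stated assumptions, not the reference solution: the name `decode_reg_to_reg_mov` and the choice to return `None` for unhandled encodings are invented for this example.

```python
# Register names indexed by the 3-bit REG or R/M field.
REGS_BYTE = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]  # W = 0
REGS_WORD = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]  # W = 1

MOV_REG_OPCODE = 0b100010  # six-bit opcode for register/memory to/from register MOV
MOD_REGISTER = 0b11        # MOD value meaning "R/M names a register"


def decode_reg_to_reg_mov(byte1, byte2):
    """Return assembly text for a register-to-register MOV, or None.

    Hypothetical helper: None means "not this encoding", so a fuller
    decoder would go on to try other opcodes and MOD values.
    """
    if (byte1 >> 2) != MOV_REG_OPCODE:
        return None  # some other instruction (or another MOV encoding)
    if (byte2 >> 6) != MOD_REGISTER:
        return None  # a memory operand is involved, not handled here

    d = (byte1 >> 1) & 1
    w = byte1 & 1
    table = REGS_WORD if w else REGS_BYTE
    reg = table[(byte2 >> 3) & 0b111]
    rm = table[byte2 & 0b111]

    # D = 1 means REG is the destination; D = 0 means REG is the source.
    dest, src = (reg, rm) if d else (rm, reg)
    return f"mov {dest}, {src}"


print(decode_reg_to_reg_mov(0x89, 0xD9))  # mov cx, bx
print(decode_reg_to_reg_mov(0x8B, 0x0E))  # None: MOD is 00, so a memory operand
```

Returning `None` rather than raising keeps the two checks visible as separate early exits, which is where a fuller decoder would branch to other instruction forms.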