Performance-Aware Programming Series
This series is designed for programmers who know how to write programs but don’t know how hardware runs them. It aims to bring you up to speed on how modern CPUs work, how to estimate the expected speed of performance-critical code, and the basic optimization techniques every programmer should know.
The course is broken into parts, with the first part (the “prologue”) being strictly a demonstration with no associated homework. Later parts feature weekly homework.
Q&A session videos are posted every Monday and cover the comments on the previous week’s posts.
Prologue: The Five Multipliers (3 1/2 hours, no homework)
Welcome to the Performance-Aware Programming Series! (22:05)
Waste (32:56)
Instructions Per Clock (22:05)
Caching (22:55)
Multithreading (32:11)
Python Revisited (36:22)
Interlude (1 hour, no homework)
The Haversine Distance Problem (30:28)
Part 1: 8086/8088 (in progress)
Instruction Decoding on the 8086 (28:28 +2)
Decoding Multiple Instructions and Suffixes (43:51 +2)
Opcode Patterns in 8086 Arithmetic (20:01 +2)
8086 Decoder Code Review (1:17:49)
Simulating Non-memory MOVs (18:00 +3)
[Multi-part: When Branches Were Branches]
[Multi-part: When Memory Was Memory]
[Multi-part: When Cycles Were Cycles]
Part 2 of the course has not yet been scheduled.