Q&A #43 (2024-02-06)

The full episode is only available to paid subscribers of Computer, Enhance!

Answers to questions from the last Q&A thread.

Feb 07, 2024 ∙ Paid

In each Q&A video, I answer questions from the comments on the previous Q&A video, which can be from any part of the course.

Questions addressed in this video:

[00:03] Going over the challenge homework from last week.
[12:38] “Could you please clarify the notion of a false dependency?
After watching the video, I got the impression that a false dependency is when we don't read from a register. Such a dependency doesn't actually create a dependency chain.
But the AMD manual says the reason for the slower 'mov al, [rdx]' is a false dependency on the previous rax value. Why do they call it a 'false' dependency when its effects are real?”
[16:46] “On the homework of the execution ports video, when we're making a single read per loop, shouldn't the processor be able to use both ports by doing reads of different iterations of the loop in parallel given that it does speculative execution?”
[23:48] “I ran a read test on a Zen3 machine and found that performance increases linearly up to 5 memory reads per iteration, and then increases further up to 7 loads per iteration, albeit marginally.
However, the Zen3 documentation says that the core can do 3 loads per cycle. Am I seeing some other effect that will become clear later in the course?”
[25:35] “Regarding execution ports, is a SIMD operation like 'add 4 f32 numbers' handled as submitting 4 micro-ops to 4 'add' ports, or are there separate SIMD ports? Does this vary between CPUs?”
[28:52] “In relation to the discussion in the video, do you see any value in a computer architecture where the entire thing is only a 'GPU' plus memory, which runs the OS, and all software as well as rendering graphics?”
