The full episode is only available to paid subscribers of Computer, Enhance!

Q&A #41 (2024-01-23)

Answers to questions from the last Q&A thread.

In each Q&A video, I answer questions from the comments on the previous Q&A video, which can be from any part of the course. Transcripts are not available for Q&A videos due to length. I do produce closed captions for them, but Substack still has not enabled closed captions on videos :(

Questions addressed in this video:

  • [02:46] “For position data (float3) do you use SoA/AoSoA layouts or do you use AoS?”

  • [15:54] “You say that there's no reason to free memory if your program is about to exit since that memory will be reclaimed. Is that reclaiming handled by the OS? If so, what happens if you are working in a scenario where you don't have an OS to take care of that for you? What do you have to do differently?”

  • [23:03] “I've heard that some LLM/AI software does run-time code compilation/generation in order to get the best performance out of the CPU, since CPU perf varies so widely between architectures and even between different CPUs from the same architecture. What are your thoughts on this, and more generally on the practice of run-time "tests" used to compile/generate high-performance code?”

  • [26:55] “So someone optimizing assembly code can totally ignore any potential false dependencies created by reusing registers? Or are there some patterns that can trip up the RAT?”

  • [30:58] “Does a register file have a slot for zero value? That would make 'mov rax, 0' a slot change.”

  • [33:13] “Casey, I'd like to hear your perspective on the following (and disagreement is absolutely welcome). Many people have talked about trying to find software jobs that are more performance-oriented. That's not a bad thing, but those people (myself included) should equally consider staying where they already are and planting a performance-oriented mindset there. Yes, our industry is suffering from an epidemic of self-serving waste. And yes, there are certainly organizations beyond repair. But I'm sure many others can be salvaged through effort, hard data, and compelling results. To put it another way, sometimes you've got to abandon ship, but sometimes you just need to start bailing water.”

  • [36:31] “As you described the RAT, it keeps track of every register value change. Something bothers me though: inherently the RAT will grow and have useless entries.

    I can see a hardware component checking from time to time and cleaning the RAT's entries, like a garbage collector. Is that a thing (it seems like it would have poor performance, though)? Or is it the scheduler's role (more probable)?”

  • [42:33] “Can the back end make execution optimizations on uops? Like skip an instruction because you did, let's say, add rax, 1; dec rax, 1?”

  • [47:26] “Is status register also translated through RAT?”
