Each Monday I answer questions from the comments on the prior week’s videos. Transcripts are not available for Q&A videos due to length. I do produce closed captions for them, but Substack still has not enabled closed captions on videos :(
Questions addressed in this video:
[00:10] “In your example you choose not to write Label to profiler slot in profile_block(), but rather hold it in the struct until the ~profile_block(), where you write everything all at once into the slot. Is it the way to better utilise cache lines?”
[15:51] “Somewhat off-topic, but would this C++ profiler still function well if used as bindings to another language? Other than target-language-specific details that may not map perfectly to C++'s features, are there any caveats when calling rdtsc + others as a binding to another language?”
[17:42] “Tangential question, is this type of work someone's job currently? Profiling and such of code? If so, what are those roles called? Ideally, I'd hope most devs know this stuff, but how is profiling or performance handled these days?”
[21:14] “There seems to be another edge case, which is that the code I'm profiling makes use of coroutines in Unity (somehow getting 126% without children). I haven't tried to solve the recursive problem yet for this week's homework, but I'm wondering if this is a separate issue we'd have to consider.”
[24:06] “I had two approaches in mind for the homework, one based on a central stack and one adding a bit more data to the individual profile blocks and linking blocks to their parent instance via a pointer. What is a good way to measure which approach is faster and less invasive? We can't profile the profiler code with itself.”
[27:52] “Are there any tricks that make it easier to navigate the optimized assembly or is this more a question of better tooling?”