<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Computer, Enhance!: Programming Courses]]></title><description><![CDATA[A series of courses on programming topics.]]></description><link>https://www.computerenhance.com/s/programming-courses</link><image><url>https://substackcdn.com/image/fetch/$s_!7DRL!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4c646b6-92ad-4e9d-95da-e629e19689f4_800x800.png</url><title>Computer, Enhance!: Programming Courses</title><link>https://www.computerenhance.com/s/programming-courses</link></image><generator>Substack</generator><lastBuildDate>Wed, 29 Apr 2026 14:19:26 GMT</lastBuildDate><atom:link href="https://www.computerenhance.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Casey Muratori]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[computerenhance@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[computerenhance@substack.com]]></itunes:email><itunes:name><![CDATA[Casey Muratori]]></itunes:name></itunes:owner><itunes:author><![CDATA[Casey Muratori]]></itunes:author><googleplay:owner><![CDATA[computerenhance@substack.com]]></googleplay:owner><googleplay:email><![CDATA[computerenhance@substack.com]]></googleplay:email><googleplay:author><![CDATA[Casey Muratori]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Block Interleaving]]></title><description><![CDATA[Breaking up dependency chains to better suit the processor's out-of-order scheduling gets most of the benefit of in-order interleaving without requiring a fully interleaved instruction 
stream.]]></description><link>https://www.computerenhance.com/p/block-interleaving</link><guid isPermaLink="false">https://www.computerenhance.com/p/block-interleaving</guid><pubDate>Tue, 28 Apr 2026 22:45:56 GMT</pubDate><enclosure url="https://substack-video.s3.amazonaws.com/video_upload/post/195810514/9101c701-4c74-4b8f-a18d-de21b13ce504/transcoded-44129.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fEGb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704fff5b-a28e-4f85-9ea7-c36f478d496f_1920x826.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fEGb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704fff5b-a28e-4f85-9ea7-c36f478d496f_1920x826.jpeg 424w, https://substackcdn.com/image/fetch/$s_!fEGb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704fff5b-a28e-4f85-9ea7-c36f478d496f_1920x826.jpeg 848w, https://substackcdn.com/image/fetch/$s_!fEGb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704fff5b-a28e-4f85-9ea7-c36f478d496f_1920x826.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!fEGb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704fff5b-a28e-4f85-9ea7-c36f478d496f_1920x826.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!fEGb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704fff5b-a28e-4f85-9ea7-c36f478d496f_1920x826.jpeg" width="1456" height="626" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/704fff5b-a28e-4f85-9ea7-c36f478d496f_1920x826.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:626,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:625880,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.computerenhance.com/i/195810514?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704fff5b-a28e-4f85-9ea7-c36f478d496f_1920x826.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fEGb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704fff5b-a28e-4f85-9ea7-c36f478d496f_1920x826.jpeg 424w, https://substackcdn.com/image/fetch/$s_!fEGb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704fff5b-a28e-4f85-9ea7-c36f478d496f_1920x826.jpeg 848w, https://substackcdn.com/image/fetch/$s_!fEGb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704fff5b-a28e-4f85-9ea7-c36f478d496f_1920x826.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!fEGb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F704fff5b-a28e-4f85-9ea7-c36f478d496f_1920x826.jpeg 
1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p><em>This is the eleventh video in Part 5 of the Performance-Aware Programming series. Please see the <a href="https://www.computerenhance.com/p/table-of-contents">Table of Contents</a> to quickly navigate through the rest of the course as it is updated, and <a href="https://github.com/cmuratori/computer_enhance">the code repository</a> for downloadable code listings.</em></p>
      <p>
          <a href="https://www.computerenhance.com/p/block-interleaving">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Q&A #84 (2026-04-20)]]></title><description><![CDATA[Answers to questions from the last Q&A thread.]]></description><link>https://www.computerenhance.com/p/q-and-a-84-2026-04-20</link><guid isPermaLink="false">https://www.computerenhance.com/p/q-and-a-84-2026-04-20</guid><pubDate>Tue, 21 Apr 2026 02:46:08 GMT</pubDate><enclosure url="https://substack-video.s3.amazonaws.com/video_upload/post/194840638/9e584af3-4ea6-4c55-ba8d-3951879bbcc0/transcoded-61489.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>In each Q&amp;A video, I answer questions from the comments on the previous Q&amp;A video, which can be from any part of the course.</em></p><p>The questions addressed in this video are:</p><ul><li><p><strong>[00:04]</strong> &#8220;Quick question (for Casey or anyone with a suggestion!): I am neither a novice nor an experienced programmer, and would like to learn enough C++ to write some useful code, but I am struggling to find good material.&#8221;</p></li><li><p><strong>[02:18]</strong> &#8220;In the episode &#8216;In-order Interleaving&#8217;, you have a loop that handles multiple elements per loop iteration. When the list of elements isn&#8217;t a multiple of your &#8216;elements per loop&#8217;, we need something after the loop to handle the last few remaining elements. My question is: Is there a particular way to write this &#8216;residual handling&#8217; part? Whenever I&#8217;ve had to do this in my own code it always felt a little awkward.&#8221;</p></li><li><p><strong>[13:25]</strong> &#8220;Hey Casey, what&#8217;s your take on Arm making their own chips now? Sincerely, an Arm engineer.&#8221;</p></li><li><p><strong>[17:05]</strong> &#8220;On the topic of growable arenas, concretely, how do you implement them with respect to the &#8216;layered&#8217; architecture? 
To grow them, naturally you would need to mmap or VirtualAlloc more memory, but if you&#8217;re far removed from the platform layer and aren&#8217;t writing a program that is frame-based (for example, a compiler) and thus have no opportune time to go back to the platform layer and request more memory, what are your strategies? I can only come up with round-tripping back to the OS using a platform-layer API like &#8216;RequestMoreMemory()&#8217;, or something like this. For more context, this would be for the kind of program that also cannot place upper-bounds on memory usage (again, like in a compiler), but still does not want to be malloc&#8217;ing/free&#8217;ing excessively. I&#8217;m keen to hear how you would approach this problem?&#8221;</p></li><li><p><strong>[26:14]</strong> &#8220;Have you had occasion to use coroutines to write algorithms that must pause/resume? E.g., protocols in networking or software in a hardware simulator.&#8221;</p></li></ul>
      <p>
          <a href="https://www.computerenhance.com/p/q-and-a-84-2026-04-20">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[In-order Interleaving]]></title><description><![CDATA[By handing the CPU an instruction stream it can execute in order, we can exceed the limits we hit when we rely on its out-of-order execution capabilities.]]></description><link>https://www.computerenhance.com/p/in-order-interleaving</link><guid isPermaLink="false">https://www.computerenhance.com/p/in-order-interleaving</guid><pubDate>Thu, 16 Apr 2026 03:09:52 GMT</pubDate><enclosure url="https://substack-video.s3.amazonaws.com/video_upload/post/194365258/709f78f7-1627-4ee5-bf05-c53a76e185a0/transcoded-47060.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CKM1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc63b4f35-7718-4927-85e6-2f53f12fff82_5634x2864.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CKM1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc63b4f35-7718-4927-85e6-2f53f12fff82_5634x2864.jpeg 424w, https://substackcdn.com/image/fetch/$s_!CKM1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc63b4f35-7718-4927-85e6-2f53f12fff82_5634x2864.jpeg 848w, https://substackcdn.com/image/fetch/$s_!CKM1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc63b4f35-7718-4927-85e6-2f53f12fff82_5634x2864.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!CKM1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc63b4f35-7718-4927-85e6-2f53f12fff82_5634x2864.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CKM1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc63b4f35-7718-4927-85e6-2f53f12fff82_5634x2864.jpeg" width="1456" height="740" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c63b4f35-7718-4927-85e6-2f53f12fff82_5634x2864.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:740,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2362184,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.computerenhance.com/i/194365258?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc63b4f35-7718-4927-85e6-2f53f12fff82_5634x2864.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CKM1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc63b4f35-7718-4927-85e6-2f53f12fff82_5634x2864.jpeg 424w, https://substackcdn.com/image/fetch/$s_!CKM1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc63b4f35-7718-4927-85e6-2f53f12fff82_5634x2864.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!CKM1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc63b4f35-7718-4927-85e6-2f53f12fff82_5634x2864.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!CKM1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc63b4f35-7718-4927-85e6-2f53f12fff82_5634x2864.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p><em>This is the tenth video in Part 5 of the Performance-Aware Programming series. 
Please see the <a href="https://www.computerenhance.com/p/table-of-contents">Table of Contents</a> to quickly navigate through the rest of the course as it is updated, and <a href="https://github.com/cmuratori/computer_enhance">the code repository</a> for downloadable code listings.</em></p>
      <p>
          <a href="https://www.computerenhance.com/p/in-order-interleaving">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Q&A #83 (2026-03-11)]]></title><description><![CDATA[Answers to questions from the last Q&A thread.]]></description><link>https://www.computerenhance.com/p/q-and-a-83-2026-03-11</link><guid isPermaLink="false">https://www.computerenhance.com/p/q-and-a-83-2026-03-11</guid><pubDate>Thu, 12 Mar 2026 04:19:57 GMT</pubDate><enclosure url="https://substack-video.s3.amazonaws.com/video_upload/post/190667397/dd549d8f-4ef2-4367-8a4f-573842929d35/transcoded-194421.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>In each Q&amp;A video, I answer questions from the comments on the previous Q&amp;A video, which can be from any part of the course.</em></p><p>The questions addressed in this video are:</p><ul><li><p><strong>[0:00:03]</strong> &#8220;Building on using free lists to manage objects with complex lifetimes, how do you handle cases where an object type can vary in size?&#8221;</p></li><li><p><strong>[0:02:33]</strong> &#8220;Can you go into more detail about how to handle IO with queues vs callbacks? A small example on the fancy light-board would be nice.&#8221;</p></li><li><p><strong>[0:07:14]</strong> &#8220;With the recent widespread usage of AI to program in the industry and a fierce push from upper management to enforce usage of AI in development tasks, I am finding myself less and less motivated to program, and also depressed. I have mostly ceased my programming activities outside of working hours.</p><p>Do you have any advice on how to deal with the AI wave and also to continue motivated to program as a hobby and professionally in this environment?&#8221;</p></li><li><p><strong>[0:20:00]</strong> &#8220;Now that you are switching to Linux in full swing have you ported your codebases or what&#8217;s your strategy? 
I did the opposite thing and fled from Linux because I wanted to start my first big codebase in a more stable environment so it has been the year of the Windows desktop for me which after running a debloat script hasn&#8217;t actually been bad, nevertheless seeing Microsoft&#8217;s behavior does have me yearning for the penguin, and knowing how painful releasing something for Linux is has me wondering if programming against the windows API and using Wine for most of one&#8217;s graphical software is a valid way to proceed.&#8221;</p></li><li><p><strong>[0:29:50]</strong> &#8220;How much of the software you currently write is based on your own thoughts and creativity? I&#8217;ve found myself relying heavily on reading and using open-source projects where similar problems were already solved. After a while, this started to make me feel like an impostor. Do great programmers like yourself write their own code, or do they adapt and build on existing work?&#8221;</p></li><li><p><strong>[0:35:38]</strong> &#8220;This maybe jumping the gun a fair bit. but when you say back of the envelop calculation, does that include the time complexity of whatever algorithm to do correctly?&#8221;</p></li><li><p><strong>[0:39:45]</strong> &#8220;Have you ever written anything in ISPC? From my experience it seems much easier to write than rolling the simd myself, and much more portable.&#8221;</p></li><li><p><strong>[0:42:10]</strong> &#8220;Do you have any recommendations on how to use &#8216;pen &amp; paper&#8217; in the software development process? I&#8217;ve noticed that drawing a problem helps me visualize the problem easier than jumping straight into coding. I&#8217;m working with MIDI and writing a document explaining in my own words how does MIDI encode data helped me in unexpected ways. I&#8217;m wondering if I&#8217;m missing out on any other &#8216;pen &amp; paper&#8217; practice. 
Any useful tips?&#8221;</p></li><li><p><strong>[0:46:47]</strong> &#8220;Hello I come with another question. How does one go about implementing smooth window resizing with Dangerous Thread Crews? I have the two threads and do get non-blocking resizing but the UI contents of the window jiggle and stretch a bit when resizing (which gets way worse with vsync) and nothing I try (like adding synchronization and using WM_WINDOWPOSCHANGING to allow only one resize per frame) gets rid of the apparent discrepancy between the window size and the size my renderer targets that causes some ugliness. The only reference I have of a program that has smooth resizing and doesn&#8217;t achieve this by updating on each WM_PAINT is your refterm program but I don&#8217;t see any glaring differences between the way you rendered stuff there and it seems like there you just called GetClientRect on each update with no need for anything else... so am I just doing something dumb and missing something or is there some computer wizardry I need to invoke to fix this when programming more complex UIs?&#8221;</p></li><li><p><strong>[0:51:06]</strong> &#8220;In this talk about pathfinding in Age of Empires 2, around the 15-16 minute mark, he mentions turning off SIMD because they were losing floating point precision. I don&#8217;t understand why that was the case for them, can you provide more insight?&#8221;</p></li><li><p><strong>[1:01:33]</strong> &#8220;I&#8217;ve been trying to test how many simultaneous loads can I get with the repetition tester when I do 1, 2, 3 or 4 moves in a loop on the two machines I have access to. My laptop (which has Intel Skylake chip) reports that the memory bandwidth doubles only when I go from 1 to 2 moves per loop, which is expected and reproduces your results from the course. But when I do the same on my desktop (which has Intel Raptor lake chip), the results are the same. 
The Raptor lake apparently have two type of cores: P-cores and E-cores, where E-cores also have 2 ports capable of executing loads, while P-cores are supposed to be equipped with 3 ports of that type (at least that&#8217;s what I read in Agner Fog manual). To my understanding it means that I should see a bandwidth bump when going from 1 to 2 moves per loop, and when going from 2 to 3 moves per loop. But that doesn&#8217;t happen, I see only one bump (from 1 to 2 moves). I guess that there are some nuances with running the tester on this system that I might not be aware of. But one of them - which is clear to me - is that it is the OS who decides on which core should the tester be run on. So I set the affinity of the tester to CPU1 to make sure that it runs on the P-core. Process Explorer confirmed that it runs on CPU1. But I still could not see the improvement when going from 2 to 3 memory reads. Then I repeated the test with all the cores (one at a time), but I saw no difference in the results.</p><p>It is either my test setup that&#8217;s completely broken, or some other factor that I can&#8217;t see which prevents the bandwidth improvement of a 3 reads loop. I would be grateful if you could give me some pointers here.&#8221;</p></li><li><p><strong>[1:07:25]</strong> &#8220;What are your thoughts about the new dynamicdeopt in msvc that lets you run optimized builds that you can debug with full information because they switch the executable on the fly?&#8221;</p></li><li><p><strong>[1:08:15]</strong> &#8220;I&#8217;d like to ask you about the pass-by-ref vs pass-by-value &#8216;debate&#8217;. Traditional C++ advice is &#8216;always pass by const&amp; anything bigger than 8 bytes&#8217;, but I&#8217;ve recently started seeing some people advocate that 16 byte structs should also be pass-by-value. I know that you couldn&#8217;t care less about const, and you do seem to pass by value small stuff without worrying too much about it in your own code, so .. 
is there anything I&#8217;m failing to consider here? is this a stupid thing to worry about in the abstract, or is there some general principle that could be useful to keep in mind here?&#8221;</p></li></ul>
      <p>
          <a href="https://www.computerenhance.com/p/q-and-a-83-2026-03-11">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Dependency Chain Stalls]]></title><description><![CDATA[The CPU's ability to extract parallelism has its limits.]]></description><link>https://www.computerenhance.com/p/dependency-chain-stalls</link><guid isPermaLink="false">https://www.computerenhance.com/p/dependency-chain-stalls</guid><pubDate>Wed, 04 Mar 2026 03:31:39 GMT</pubDate><enclosure url="https://substack-video.s3.amazonaws.com/video_upload/post/189840675/0e97fb25-8257-4e41-9a4f-67b0d50eadf4/transcoded-77573.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AcnN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee94ce40-8301-48bb-be91-4060d4336723_1920x826.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AcnN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee94ce40-8301-48bb-be91-4060d4336723_1920x826.jpeg 424w, https://substackcdn.com/image/fetch/$s_!AcnN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee94ce40-8301-48bb-be91-4060d4336723_1920x826.jpeg 848w, https://substackcdn.com/image/fetch/$s_!AcnN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee94ce40-8301-48bb-be91-4060d4336723_1920x826.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!AcnN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee94ce40-8301-48bb-be91-4060d4336723_1920x826.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!AcnN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee94ce40-8301-48bb-be91-4060d4336723_1920x826.jpeg" width="1456" height="626" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ee94ce40-8301-48bb-be91-4060d4336723_1920x826.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:626,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:625880,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.computerenhance.com/i/189840675?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee94ce40-8301-48bb-be91-4060d4336723_1920x826.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!AcnN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee94ce40-8301-48bb-be91-4060d4336723_1920x826.jpeg 424w, https://substackcdn.com/image/fetch/$s_!AcnN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee94ce40-8301-48bb-be91-4060d4336723_1920x826.jpeg 848w, https://substackcdn.com/image/fetch/$s_!AcnN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee94ce40-8301-48bb-be91-4060d4336723_1920x826.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!AcnN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee94ce40-8301-48bb-be91-4060d4336723_1920x826.jpeg 
1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p><em>This is the ninth video in Part 5 of the Performance-Aware Programming series. Please see the <a href="https://www.computerenhance.com/p/table-of-contents">Table of Contents</a> to quickly navigate through the rest of the course as it is updated, and <a href="https://github.com/cmuratori/computer_enhance">the code repository</a> for downloadable code listings.</em></p>
      <p>
          <a href="https://www.computerenhance.com/p/dependency-chain-stalls">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Q&A #82 (2026-01-27)]]></title><description><![CDATA[Answers to questions from the last Q&A thread.]]></description><link>https://www.computerenhance.com/p/q-and-a-82-2026-01-27</link><guid isPermaLink="false">https://www.computerenhance.com/p/q-and-a-82-2026-01-27</guid><pubDate>Tue, 27 Jan 2026 23:58:15 GMT</pubDate><enclosure url="https://substack-video.s3.amazonaws.com/video_upload/post/186010676/f8099312-2207-4fc3-a203-77f3cc4aaad6/transcoded-03188.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>In each Q&amp;A video, I answer questions from the comments on the previous Q&amp;A video, which can be from any part of the course.</em></p><p>The questions addressed in this video are:</p><ul><li><p><strong>[01:02]</strong> &#8220;Could you suggest a resource to learn how to work with large datasets that don&#8217;t fit into VRAM, or even regular RAM?&#8221;</p></li><li><p><strong>[07:24]</strong> &#8220;I do not know how this compares to other approaches, as I&#8217;ve not really tried to optimize my profilers much as it&#8217;s not really my main area of work, but your profiling explanation made me think of a lock-debug profiler I built. It sounds similar to me at least. Do you agree?</p><p>In my solution, each worker thread owns its profiling buffers completely. I keep a thread-local pointer to a block-based buffer that grows in chunks when needed, and all writes are done by the thread with no locks. The only shared step is that when a thread creates its first block, it registers it once by pushing it into a global linked list so the UI can later iterate all threads. To avoid stalling the writers, I double-buffered it. Each thread allocates two buffers and writes into one of them.&#8221;</p></li><li><p><strong>[09:46]</strong> &#8220;About memory usage and the program stack, I believed that this stack could more easily fit in the L1 cache. 
Isn&#8217;t there a higher risk of cache misses at that level if the data to use has been allocated elsewhere? Would it even be noticeable in terms of performance?&#8221;</p></li><li><p><strong>[14:32]</strong> &#8220;What&#8217;s the best way to multithread a software rasterizer? In my experience, scan line interlacing per triangle was horrific.&#8221;</p></li><li><p><strong>[18:18]</strong> &#8220;Do you know why in the file processing test on the same machine on Linux, mmapping the file could be outperforming all other methods on medium and large mapped chunk sizes, and on Windows the same test was worse than everyone else?&#8221;</p></li><li><p><strong>[21:18]</strong> &#8220;Regarding callbacks, I think I understood the argument about moving things to a queue instead. However, I also can see the benefit of a callback because that&#8217;s synchronous. What if the file IO read some data and stored it in a buffer, and the callback now is supposed to handle that read data? If instead you move things to a queue, you&#8217;d have to keep allocating new buffers to keep working on those asynchronous reads, because you don&#8217;t know when the client code will handle those reads. If instead you can very quickly handle this buffer of read data synchronously in the callback, the file IO can reuse the same buffer to store the next chunk.&#8221;</p></li><li><p><strong>[25:45]</strong> &#8220;Can you steelman cases for which an arena/bump allocator (are these the same thing?) is not the preferred way to allocate memory (I imagine it is when lifetimes are not a priori known, but perhaps I am missing more subtlety)? In such cases, what is your preferred method of allocation? 
Are you forced to go back to new/delete?&#8221;</p></li><li><p><strong>[33:01]</strong> &#8220;Re: Dead Code Elimination Prevention Macros, I&#8217;ve got identical results with `asm volatile ("" : "+v"(Value));` which tells the compiler that `Value` is both input and output, forcing it to initialize it, as well as preventing it from assuming a specific value. It would generate `vpxor` for `0.0` and `vmovaps` for `0.5`. The advantage here is that it&#8217;s quite generic and doesn&#8217;t depend on operand size/type. Could you please elaborate on your choice of explicitly using instructions?&#8221;</p></li><li><p><strong>[35:14]</strong> &#8220;Concerning callbacks, is there a benefit to using callbacks for print-outs/messaging? We have a part of the program that does some calculations which can take time, and it uses callbacks to notify the user about the progress, and waiting until the end of all the computations is not good enough. We also use a callback for a type of mesh calculation that depends on things that this other part of the program shouldn&#8217;t know about. (These things were not my decision, but I guess making the module as isolated as possible makes it easier to use it as a module in another program)&#8221;</p></li><li><p><strong>[37:42]</strong> &#8220;How do you determine if a solution to a problem is more complex than it needs to be? And for inherently complex and interconnected problems, how do you determine if it needs to be subdivided or not? Is there a general approach for working on a complex system with lots of moving parts? (other than me complaining about it :) )&#8221;</p></li></ul>
      <p>
          <a href="https://www.computerenhance.com/p/q-and-a-82-2026-01-27">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Dead Code Elimination Prevention Macros]]></title><description><![CDATA[Watch now (26 mins) | This is the eighth video in Part 5 of the Performance-Aware Programming series.]]></description><link>https://www.computerenhance.com/p/dead-code-elimination-prevention</link><guid isPermaLink="false">https://www.computerenhance.com/p/dead-code-elimination-prevention</guid><pubDate>Mon, 29 Dec 2025 17:03:03 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!s3T5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadaddecc-49f3-4dd3-98d5-816573e1352d_1920x754.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!s3T5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadaddecc-49f3-4dd3-98d5-816573e1352d_1920x754.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!s3T5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadaddecc-49f3-4dd3-98d5-816573e1352d_1920x754.jpeg 424w, https://substackcdn.com/image/fetch/$s_!s3T5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadaddecc-49f3-4dd3-98d5-816573e1352d_1920x754.jpeg 848w, https://substackcdn.com/image/fetch/$s_!s3T5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadaddecc-49f3-4dd3-98d5-816573e1352d_1920x754.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!s3T5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadaddecc-49f3-4dd3-98d5-816573e1352d_1920x754.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!s3T5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadaddecc-49f3-4dd3-98d5-816573e1352d_1920x754.jpeg" width="1456" height="572" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/adaddecc-49f3-4dd3-98d5-816573e1352d_1920x754.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:572,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:989517,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.computerenhance.com/i/182383215?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadaddecc-49f3-4dd3-98d5-816573e1352d_1920x754.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!s3T5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadaddecc-49f3-4dd3-98d5-816573e1352d_1920x754.jpeg 424w, https://substackcdn.com/image/fetch/$s_!s3T5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadaddecc-49f3-4dd3-98d5-816573e1352d_1920x754.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!s3T5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadaddecc-49f3-4dd3-98d5-816573e1352d_1920x754.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!s3T5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadaddecc-49f3-4dd3-98d5-816573e1352d_1920x754.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p><em>This is the eighth video in Part 5 of the Performance-Aware Programming series. 
Please see the <a href="https://www.computerenhance.com/p/table-of-contents">Table of Contents</a> to quickly navigate through the rest of the course as it is updated, and <a href="https://github.com/cmuratori/computer_enhance">the code repository</a> for downloadable code listings.</em></p>
      <p>
          <a href="https://www.computerenhance.com/p/dead-code-elimination-prevention">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Q&A #81 (2025-12-22)]]></title><description><![CDATA[Answers to questions from the last Q&A thread.]]></description><link>https://www.computerenhance.com/p/q-and-a-81-2025-12-22</link><guid isPermaLink="false">https://www.computerenhance.com/p/q-and-a-81-2025-12-22</guid><pubDate>Tue, 23 Dec 2025 00:10:14 GMT</pubDate><enclosure url="https://substack-video.s3.amazonaws.com/video_upload/post/182146727/3e1b2055-4dc9-4e9e-b1a5-55add88adf7f/transcoded-119548.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>In each Q&amp;A video, I answer questions from the comments on the previous Q&amp;A video, which can be from any part of the course.</em></p><p>The questions addressed in this video are:</p><ul><li><p><strong>[00:04]</strong> &#8220;What are your top 5 recommendations for must have projects or applications to build at least once in your life to make you a better programmer ?&#8221;</p></li><li><p><strong>[14:32]</strong> &#8220;My question is regarding your previous comment on stack below, where you mentioned that we don't want to do a lot with the stack for performance oriented code. Before this course, I had impression that I should prefer stack allocation to heap allocation (maybe just C++) because I read/watched somewhere saying that 1) stack allocation/dealloc is faster (just move the stack pointer) than heap allocation 2) stack object has more obvious lifetime, 3) stack has no fragmentation concern which improves locality 4) you call on stack object directly while heap object involves indirection.</p><p>I wonder whether my understanding above is correct or not. If not, does that mean I can prefer using heap over stack allocation most of the time? Could you please further comment? Thanks!&#8221;</p></li><li><p><strong>[32:48]</strong> &#8220;One idea is to avoid dynamic allocation in the critical path by pushing each measurement into a dedicated logging/profiler thread through a channel. 
The worker thread would record measurements while the computation threads only perform the minimal &#8216;send&#8217; operation. But atomic operations, queues, and cross-thread communication can also add overhead, but also distort the original program execution &#8230; I&#8217;d like advice on how to judge whether this dedicated-thread idea is sound, and in general how to think about designing a low-overhead profiler.&#8221;</p></li><li><p><strong>[44:20]</strong> &#8220;What is your opinion on using callbacks for say signal handling? They often seem necessary but I may be overusing them, which leads me to believe my overall architecture could be flawed. But generally speaking, what&#8217;s your view on callbacks? Love them or hate them? Do you try to avoid them?&#8221;</p></li></ul>
      <p>
          <a href="https://www.computerenhance.com/p/q-and-a-81-2025-12-22">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Our Nemesis Returns]]></title><description><![CDATA[To avoid spoiling the surprise for people who have not yet done the homework, I cannot be any more specific in the title.]]></description><link>https://www.computerenhance.com/p/our-nemesis-returns</link><guid isPermaLink="false">https://www.computerenhance.com/p/our-nemesis-returns</guid><pubDate>Mon, 03 Nov 2025 23:19:53 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!yaoM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd957f3c7-f9b3-44ab-85cf-52e2fda1e1ff_1920x1089.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yaoM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd957f3c7-f9b3-44ab-85cf-52e2fda1e1ff_1920x1089.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yaoM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd957f3c7-f9b3-44ab-85cf-52e2fda1e1ff_1920x1089.jpeg 424w, https://substackcdn.com/image/fetch/$s_!yaoM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd957f3c7-f9b3-44ab-85cf-52e2fda1e1ff_1920x1089.jpeg 848w, https://substackcdn.com/image/fetch/$s_!yaoM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd957f3c7-f9b3-44ab-85cf-52e2fda1e1ff_1920x1089.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!yaoM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd957f3c7-f9b3-44ab-85cf-52e2fda1e1ff_1920x1089.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yaoM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd957f3c7-f9b3-44ab-85cf-52e2fda1e1ff_1920x1089.jpeg" width="1456" height="826" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d957f3c7-f9b3-44ab-85cf-52e2fda1e1ff_1920x1089.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:826,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:420478,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.computerenhance.com/i/177937774?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd957f3c7-f9b3-44ab-85cf-52e2fda1e1ff_1920x1089.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yaoM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd957f3c7-f9b3-44ab-85cf-52e2fda1e1ff_1920x1089.jpeg 424w, https://substackcdn.com/image/fetch/$s_!yaoM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd957f3c7-f9b3-44ab-85cf-52e2fda1e1ff_1920x1089.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!yaoM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd957f3c7-f9b3-44ab-85cf-52e2fda1e1ff_1920x1089.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!yaoM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd957f3c7-f9b3-44ab-85cf-52e2fda1e1ff_1920x1089.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p><em>This is the seventh video in Part 5 of the Performance-Aware Programming series. 
Please see the <a href="https://www.computerenhance.com/p/table-of-contents">Table of Contents</a> to quickly navigate through the rest of the course as it is updated, and <a href="https://github.com/cmuratori/computer_enhance">the code repository</a> for downloadable code listings.</em></p>
      <p>
          <a href="https://www.computerenhance.com/p/our-nemesis-returns">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Q&A #80 (2025-10-31)]]></title><description><![CDATA[Answers to questions from the last Q&A thread.]]></description><link>https://www.computerenhance.com/p/q-and-a-80-2025-10-31</link><guid isPermaLink="false">https://www.computerenhance.com/p/q-and-a-80-2025-10-31</guid><pubDate>Sat, 01 Nov 2025 03:18:21 GMT</pubDate><enclosure url="https://substack-video.s3.amazonaws.com/video_upload/post/177680510/727f848b-cbfd-4080-bd9e-2556b1252cc8/transcoded-62883.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>In each Q&amp;A video, I answer questions from the comments on the previous Q&amp;A video, which can be from any part of the course.</em></p><p>The questions addressed in this video are:</p><ul><li><p><strong>[00:05]</strong> &#8220;Question regarding the &#8216;fat struct&#8217; approach: do you ever find yourself thinking about excess memory consumption caused by some entities having unused fields?&#8221;</p></li><li><p><strong>[04:29]</strong> &#8220;About fat structs, you said we would initialize the struct to be able to be A or B depending on the situation, if you imagine that fat struct to be of type Result with cases Success or Error, how would you be able to initialize the Error in the case you have a Success as there was no Error, and vice-versa. This type for example exists in F# and it makes sense to initialize Success only or Error only.&#8221;</p></li><li><p><strong>[07:40]</strong> &#8220;Do you have any suggestions for an off-the-shelf tool to measure bandwidth and flops for a machine you run it on?&#8221;</p></li><li><p><strong>[13:10]</strong> &#8220;I sense a recurring theme of reliability and predictability &#8211; preferring simple control flow to early returns, preferring simple compilers to strict aliasing, preferring large blocks of memory to pointer festivals, etc. 
You&#8217;ve also spoken about improving _robustness_ by preferring zero/dummy values that just flow through code to null pointers, and preferring handles to pointers (from the iCloud example we just saw).</p><p>What do you mean by &#8220;robustness&#8221;, and what other techniques can I use to make my code more robust in this way?&#8221;</p></li><li><p><strong>[18:00]</strong> &#8220;Do you have any thoughts on Apple&#8217;s approach to SIMD?&#8221;</p></li><li><p><strong>[20:10]</strong> &#8220;Hi Casey, on the recent podcast with Marco you said that if you could choose a single piece of software to be magically redesigned it would definitely be the browser because the software platform it defines is bad. Could you please elaborate on that? What are the main problems with today&#8217;s browsers from your perspective?&#8221;</p></li><li><p><strong>[22:30]</strong> &#8220;I&#8217;m a bit confused when analyzing the bandwidth I get when reading directly from the volume (using ReadFile with the path &#8216;\\.\C:&#8217; with an offset, for example). As far as I know, that&#8217;s the correct way to read system files like the $MFT (which I can currently read properly, by the way).</p><p>When reading 20 GB of contiguous data from the beginning of the volume, I get 2.6 GB/s, and it doesn&#8217;t matter whether I use the FILE_FLAG_NO_BUFFERING flag or not&#8212;the result is the same. I&#8217;d expect something closer to a non-cached read (4.9 GB/s), but I&#8217;m getting the same throughput as a cold cached read. I&#8217;m not sure where this penalty comes from (assuming the read isn&#8217;t triggering extra cache operations since it&#8217;s non-buffered).</p><p>Any idea what might be going on here? 
Do you think these read bandwidths make sense?&#8221;</p></li><li><p><strong>[25:46]</strong> &#8220;Will we be able to reuse the coefficients we&#8217;re currently using for f64 sine for f32 sine or will we need new ones?&#8221;</p></li><li><p><strong>[28:24]</strong> &#8220;I have a question about PC hardware components. I&#8217;m not clear on things like the motherboard and chipset. Do they play any role in performance? They vary a lot in price even with similar features, so I assume some aspects must affect overall system performance. Could you give a brief overview of these system parts, if possible?</p><p>Or, to rephrase my question: When you&#8217;re building a PC, what do you specifically look at besides just the CPU, RAM, disk and GPU in terms of performance? How do you decide what&#8217;s suitable for your specific builds?&#8221;</p></li><li><p><strong>[36:09]</strong> &#8220;In my code, I ended up having a single loop over the input that directly produced the haversine sum, rather than splitting parsing and math into two loops. But that means if I want to time parsing vs math, I have to put blocks into the loop, which (seemingly) inevitably introduces a lot of overhead.</p><p>Is there a good way to handle this? The best way I can think of is to instead temporarily comment out parts and just time the rest, though while that&#8217;s easy to do with the math part, it seems harder to do for the parsing part, since you still have to somehow produce dummy data for the math while making sure this doesn&#8217;t lead to any compiler optimizations you wouldn&#8217;t otherwise get.&#8221;</p></li><li><p><strong>[39:28]</strong> &#8220;Sorry for repeating the question from the last Q&amp;A, but here is: I have just gotten to it, did it and looked up your solution in QA47 to cross reference. 
There is one thing that we got differently, and I don&#8217;t quite understand your reasoning about it:</p><p>You said that shl rbx, 0 should be recognized by the frontend as a nop and not do anything with flags, but would produce rbx.</p><p>1) If the frontend sees it as a nop, why would it RAW the value of rbx, and not just be a pure nop?</p><p>2) I actually thought that it would not be recognized as a nop (I didn&#8217;t find anything about this kind of optimization, I presumed it would be somewhere near zero idiom stuff in the manual), and then it seems like shl will have not only a RAW on rbx, but also on all the flags, as it has to be ready that the ALU will say that the shift was 0 and the previous value of flags should be preserved (i.e. RAW)</p><p>So the question is, why is it that rbx is a RAW and flags are skipped, and do you know if there is any place in the docs where such a frontend optimization might be mentioned?&#8221;</p></li></ul>
      <p>
          <a href="https://www.computerenhance.com/p/q-and-a-80-2025-10-31">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Better Prevention of Dead Code Elimination - Or Is It?]]></title><description><![CDATA[The most straightforward way of isolating a section of optimized code has a hidden gotcha.]]></description><link>https://www.computerenhance.com/p/better-prevention-of-dead-code-elimination</link><guid isPermaLink="false">https://www.computerenhance.com/p/better-prevention-of-dead-code-elimination</guid><pubDate>Wed, 22 Oct 2025 20:47:25 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!A3vD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F432ddd68-15ae-4a28-a4ab-dc40c4a96688_2912x1133.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!A3vD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F432ddd68-15ae-4a28-a4ab-dc40c4a96688_2912x1133.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!A3vD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F432ddd68-15ae-4a28-a4ab-dc40c4a96688_2912x1133.jpeg 424w, https://substackcdn.com/image/fetch/$s_!A3vD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F432ddd68-15ae-4a28-a4ab-dc40c4a96688_2912x1133.jpeg 848w, https://substackcdn.com/image/fetch/$s_!A3vD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F432ddd68-15ae-4a28-a4ab-dc40c4a96688_2912x1133.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!A3vD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F432ddd68-15ae-4a28-a4ab-dc40c4a96688_2912x1133.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!A3vD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F432ddd68-15ae-4a28-a4ab-dc40c4a96688_2912x1133.jpeg" width="1456" height="567" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/432ddd68-15ae-4a28-a4ab-dc40c4a96688_2912x1133.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:567,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:385791,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.computerenhance.com/i/176867812?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F432ddd68-15ae-4a28-a4ab-dc40c4a96688_2912x1133.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!A3vD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F432ddd68-15ae-4a28-a4ab-dc40c4a96688_2912x1133.jpeg 424w, https://substackcdn.com/image/fetch/$s_!A3vD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F432ddd68-15ae-4a28-a4ab-dc40c4a96688_2912x1133.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!A3vD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F432ddd68-15ae-4a28-a4ab-dc40c4a96688_2912x1133.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!A3vD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F432ddd68-15ae-4a28-a4ab-dc40c4a96688_2912x1133.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p><em>This is the sixth video in Part 5 of the Performance-Aware Programming series. 
Please see the <a href="https://www.computerenhance.com/p/table-of-contents">Table of Contents</a> to quickly navigate through the rest of the course as it is updated, and <a href="https://github.com/cmuratori/computer_enhance">the code repository</a> for downloadable code listings.</em></p>
      <p>
          <a href="https://www.computerenhance.com/p/better-prevention-of-dead-code-elimination">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Q&A #79 (2025-09-28)]]></title><description><![CDATA[Answers to questions from the last Q&A thread.]]></description><link>https://www.computerenhance.com/p/q-and-a-79-2025-09-28</link><guid isPermaLink="false">https://www.computerenhance.com/p/q-and-a-79-2025-09-28</guid><pubDate>Mon, 29 Sep 2025 04:17:35 GMT</pubDate><enclosure url="https://substack-video.s3.amazonaws.com/video_upload/post/174646817/1658adc6-67d6-4d53-a840-0e2a6bff63ab/transcoded-368844.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>In each Q&amp;A video, I answer questions from the comments on the previous Q&amp;A video, which can be from any part of the course.</em></p><p>The questions addressed in this video are:</p><ul><li><p><strong>[00:00:09]</strong> &#8220;How far can we take SIMD use? What if we put a way bigger onus on eliminating branching and developing parallelizable code? The language K, the simdjson library, the Co-dfns compiler, and the new Box2D engine all blast competitors out of the water by thinking of data parallelism. SIMD use is arduous now - should we start designing our languages and APIs around it? What&#8217;s the path to broader cross pollination as an industry? It really seems an untapped potential, right?&#8221;</p></li><li><p><strong>[00:04:01]</strong> &#8220;Could you give examples of the kind of substrate work you&#8217;re hoping more people take seriously? Are you referring to teams like the WSL or Visual Studio performance teams, or something even deeper (or higher-level)? You&#8217;ve cited companies rewriting websites for performance and Microsoft&#8217;s console output claims - are these the kinds of things you mean? And if you mean something different, how do you think someone can get involved in that kind of work? I&#8217;m trying to understand the bigger vision. 
Is it that everything on the internet runs as fast as the McMaster-Carr Supply Store website?&#8221;</p></li><li><p><strong>[00:07:45]</strong> &#8220;When you critique tools like Git, are you expressing a personal frustration, or do you think there&#8217;s an objectively better way these tools could work? It&#8217;s hard for me to imagine a world where I don&#8217;t need to memorize Git or AWS minutiae - are there products today that represent what you think &#8220;just working&#8221; should look like?&#8221;</p></li><li><p><strong>[00:13:00]</strong> &#8220;What do you think about coroutines?&#8221;</p></li><li><p><strong>[00:15:50]</strong> &#8220;Will a VOD of the &#8216;Research Overview for &#8220;The Big OOPs&#8221;&#8217; livestream be made available later?&#8221;</p></li><li><p><strong>[00:16:10]</strong> &#8220;Do you find that this performance aware development you are talking about also improves the quality of the software in general? Is there any connection? Further, once you decide that something needs to have its own test or tests, what techniques (as in test code organization, handling of test data, etc.) and tools do you find useful in creating those tests?&#8221;</p></li><li><p><strong>[00:23:18]</strong> &#8220;I&#8217;m currently onboarding at my first ever job. It&#8217;s a giant legacy PHP codebase which primarily uses OOP. I don&#8217;t think OOP is great. But if you were forced to use classes for everything, what would be the best way to do it?&#8221;</p></li><li><p><strong>[00:26:40]</strong> &#8220;It&#8217;s interesting how I get much better performance on Zen3 on Linux than reference haversine even for the replacement case.&#8221;</p></li><li><p><strong>[00:30:47]</strong> &#8220;I&#8217;ve been watching nearly all the BSC talks, and they&#8217;ve made me realize I might have a wrong understanding of what a type is. 
What would be the best definition?&#8221;</p></li><li><p><strong>[00:35:19]</strong> &#8220;It seems like there has been a push in recent years towards languages with stronger type systems and static analysis (such as Rust&#8217;s borrow checker). Do you think that this trend meaningfully improves software quality, and if so what static analysis tools (both existing and hypothetical) do you think would be the most beneficial for a performance-minded programmer?&#8221;</p></li><li><p><strong>[00:40:35]</strong> &#8220;I watched a YouTube video about the Montana mini-computer, and I understood how the concept of a function is implemented at the assembly level. I was wondering: what is a virtual function as defined in a high-level language, and how does that translate to a CPU? Along the same lines, I didn&#8217;t fully understand the concept of volatile. It seems to be related to the stack&#8212;could you explain how a volatile variable is represented at the assembly/CPU level?&#8221;</p></li><li><p><strong>[01:04:10]</strong> &#8220;With Intel AMX becoming more mature or more widely known, do you think it is or will be possible to start doing things typically done on GPUs (texturing, filtering, convolving, etc.) on CPUs in the future? Assuming it will be, do you think GPU vendors will finally start opening up &#224; la the 30 million line problem in a fight to remain competitive with CPU vendors?&#8221;</p></li><li><p><strong>[01:13:00]</strong> &#8220;Unlike games, a lot of the value in the Apple world comes from tight integration with all of their ecosystem and design language (e.g., iPhone widgets, Watch now-playing view, Siri searchability, sharing to other apps or AirDrop, accessibility integration, etc.) 
But of course Apple has a lot of OOP style and &#8220;declarative frameworks,&#8221; which means that I wouldn&#8217;t have control over the code that actually runs these features.</p><p>How can I still use a handmade philosophy of writing my own simpler, more focused code rather than depending on lots of slow and volatile libraries?&#8221;</p></li><li><p><strong>[01:15:36]</strong> &#8220;Do you have any advice for gracefully avoiding or recovering when Apple helpfully deletes your stuff in the background (kills your process when you switch apps, makes you redownload files from iCloud)?&#8221;</p></li><li><p><strong>[01:19:21]</strong> &#8220;In your talk at the Better Software Conference you mention the &#8220;fat struct&#8221; as a good default option for programming in a systems level language. I think I know the gist of what you mean by this, but I am curious if you have a slightly more formal definition for the term and a general explanation for why it&#8217;s a good default approach.&#8221;</p></li><li><p><strong>[01:27:32]</strong> &#8220;Most of the course so far talks about programs and assembly for actual chips. How do the aspects of concern under performance aware programming change, if at all, if the target is WASM?&#8221;</p></li><li><p><strong>[01:30:13]</strong> &#8220;I perfectly understand why writing to al in a loop is slower than writing to rax, but I get different results for al, ax, eax and rax. The loop writing to al and ax goes at nearly 1/4 the speed of the loop writing to rax, but the eax loop goes at 1/2 that speed. Shouldn&#8217;t they run the same since writing to eax does not preserve the upper bits?&#8221;</p></li><li><p><strong>[01:31:01]</strong> &#8220;If there&#8217;s really no &#8216;rax&#8217; in a CPU at any given point of time, why (how?) do debuggers show a single value when you stop them, and why will Linux only write a single value in the core dump file? Shouldn&#8217;t there be some kind of a tree? 
I just don&#8217;t know if I can trust this information for debugging... What if it shows a register value from one branch, but the bug was caused by the value from another? Won&#8217;t I be misled?&#8221;</p></li><li><p><strong>[01:40:21]</strong> &#8220;I was trying to reproduce your results from the RAT and register file lecture. I am running them on an Alder Lake chip (i7-12700H). My results were quite the opposite of yours: the add-only loop either had similar performance or ran much faster than the mov-and-add one. I found your article that hints that Alder Lake is able to decouple those chained adds.&#8221;</p></li></ul>
      <p>
          <a href="https://www.computerenhance.com/p/q-and-a-79-2025-09-28">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Reading CPU Diagrams]]></title><description><![CDATA[If you've followed the Performance-Aware Programming course up to this point, you already know everything you need to know to ballpark CPU performance with nothing more than IHV marketing slides.]]></description><link>https://www.computerenhance.com/p/reading-cpu-diagrams</link><guid isPermaLink="false">https://www.computerenhance.com/p/reading-cpu-diagrams</guid><pubDate>Wed, 20 Aug 2025 22:17:02 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!_TPy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F955cff14-04c3-4c35-a89b-f81a9525c2aa_1920x589.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_TPy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F955cff14-04c3-4c35-a89b-f81a9525c2aa_1920x589.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_TPy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F955cff14-04c3-4c35-a89b-f81a9525c2aa_1920x589.jpeg 424w, https://substackcdn.com/image/fetch/$s_!_TPy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F955cff14-04c3-4c35-a89b-f81a9525c2aa_1920x589.jpeg 848w, https://substackcdn.com/image/fetch/$s_!_TPy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F955cff14-04c3-4c35-a89b-f81a9525c2aa_1920x589.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!_TPy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F955cff14-04c3-4c35-a89b-f81a9525c2aa_1920x589.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_TPy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F955cff14-04c3-4c35-a89b-f81a9525c2aa_1920x589.jpeg" width="1456" height="447" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/955cff14-04c3-4c35-a89b-f81a9525c2aa_1920x589.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:447,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:669958,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.computerenhance.com/i/171311364?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F955cff14-04c3-4c35-a89b-f81a9525c2aa_1920x589.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_TPy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F955cff14-04c3-4c35-a89b-f81a9525c2aa_1920x589.jpeg 424w, https://substackcdn.com/image/fetch/$s_!_TPy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F955cff14-04c3-4c35-a89b-f81a9525c2aa_1920x589.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!_TPy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F955cff14-04c3-4c35-a89b-f81a9525c2aa_1920x589.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!_TPy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F955cff14-04c3-4c35-a89b-f81a9525c2aa_1920x589.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p><em>This is the fifth video in Part 5 of the Performance-Aware Programming series. 
Please see the <a href="https://www.computerenhance.com/p/table-of-contents">Table of Contents</a> to quickly navigate through the rest of the course as it is updated.</em></p>
      <p>
          <a href="https://www.computerenhance.com/p/reading-cpu-diagrams">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Q&A #78 (2025-07-21)]]></title><description><![CDATA[Answers to questions from the last Q&A thread.]]></description><link>https://www.computerenhance.com/p/q-and-a-78-2025-07-21</link><guid isPermaLink="false">https://www.computerenhance.com/p/q-and-a-78-2025-07-21</guid><pubDate>Mon, 21 Jul 2025 22:33:09 GMT</pubDate><enclosure url="https://substack-video.s3.amazonaws.com/video_upload/post/168897645/177a70f8-f447-4c1e-be75-53990d2f215a/transcoded-27507.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>In each Q&amp;A video, I answer questions from the comments on the previous Q&amp;A video, which can be from any part of the course.</em></p><p>The questions addressed in this video are:</p><ul><li><p><strong>[00:03]</strong> &#8220;Hi, I've seen you talk before about the state of GPU APIs, and I am aware that you were talking about an SoC solution to this and many other problems with current computers. My question is this: how would you design a GPU API, if you had a magic stick that you could shake to make it appear and be common on all computers today? If I understand your position, you'd have the programmer talk directly (or at least through a paper thin layer) to the GPU, but how would that work in practice? If that's not too much to ask, I'd love seeing a short snippet of pseudocode of how it would work on the consumer side, thanks!&#8221;</p></li><li><p><strong>[04:10]</strong> &#8220;When dealing with async IO, do you think an await style interface like in JavaScript/Go is a good model, or do you think some simple state machine or any other method is better? I find that writing state machines for this kind of stuff gets overly explicit in a lot of cases. 
Thanks!&#8221;</p></li><li><p><strong>[07:52]</strong> &#8220;Given the course, I was trying to apply some techniques to my own toy problem that, given a list of words and an NxN grid, tries to generate a word puzzle where each word is added horizontally, vertically or diagonally.</p><p>I tried to reduce pessimization as much as possible. The program performs a recursive search where each word added successfully to the grid increases the depth by one. Memory is pushed and popped with a single memory arena of max 512KiB. The hot code is the check that, given a word and the current board, determines whether the word will fit.</p><p>And I am having a hard time vectorizing this loop such that it's actually more performant. The single byte checks seem to outperform vectorization by a factor of 10 as the data is too sparse? I also could not find an AVX "scatter" function that does the opposite of a movemask_epi8. Was wondering if you have any thoughts on how one would optimize this further.&#8221;</p></li><li><p><strong>[13:03]</strong> &#8220;I was doing a homework about asm volatile, and I noticed that all fma instructions were using memory operands and now I'm wondering why there are no fma instructions with immediate operands? Surely it should be beneficial to bake values in some cases, right? Or maybe there are never immediate operands for floating point instructions.&#8221;</p></li><li><p><strong>[17:54]</strong> &#8220;Will estimating the cost of more &#8216;branchy&#8217; workloads like lexing or e.g. JSON parsing be covered?</p><p>For context I am trying to apply what I've learned from this course to optimize a lexer for a programming language and have gotten from ~0.8 GB/s to ~1.3 GB/s lexing the Linux source code. However, it seems impossible for me to get it to run any faster, even though 1.3 GB/s is nowhere near memory bandwidth and most of the work is just deciding what kind of token to spit out and how much to advance. 
It feels to me like ~2GB/s or so could be the limit to how fast you could lex one token at a time, and going above that would require producing more than one token like I believe simdjson does. However, I have no clue if this is remotely correct, since my intuition about the cost of branchy code is provably very bad.&#8221;</p></li><li><p><strong>[20:12]</strong> &#8220;Do you have any good resources that outline how to improve a user's experience? The course emphasizes performance, but I'm curious what other things you would consider to improve a user's experience.&#8221;</p></li></ul>
      <p>
          <a href="https://www.computerenhance.com/p/q-and-a-78-2025-07-21">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Selectively Preventing Optimizations]]></title><description><![CDATA[When we want to microbenchmark code in a high-level language, we want almost all optimizations applied - except for the ones that would remove the code entirely.]]></description><link>https://www.computerenhance.com/p/selectively-preventing-optimizations</link><guid isPermaLink="false">https://www.computerenhance.com/p/selectively-preventing-optimizations</guid><pubDate>Sun, 29 Jun 2025 05:11:19 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!hsX2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F561b2ddf-8537-4134-b0f4-b6ce49c2942c_1920x1279.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hsX2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F561b2ddf-8537-4134-b0f4-b6ce49c2942c_1920x1279.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hsX2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F561b2ddf-8537-4134-b0f4-b6ce49c2942c_1920x1279.jpeg 424w, https://substackcdn.com/image/fetch/$s_!hsX2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F561b2ddf-8537-4134-b0f4-b6ce49c2942c_1920x1279.jpeg 848w, https://substackcdn.com/image/fetch/$s_!hsX2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F561b2ddf-8537-4134-b0f4-b6ce49c2942c_1920x1279.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!hsX2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F561b2ddf-8537-4134-b0f4-b6ce49c2942c_1920x1279.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hsX2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F561b2ddf-8537-4134-b0f4-b6ce49c2942c_1920x1279.jpeg" width="1456" height="970" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/561b2ddf-8537-4134-b0f4-b6ce49c2942c_1920x1279.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:970,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:430452,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.computerenhance.com/i/167083580?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F561b2ddf-8537-4134-b0f4-b6ce49c2942c_1920x1279.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hsX2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F561b2ddf-8537-4134-b0f4-b6ce49c2942c_1920x1279.jpeg 424w, https://substackcdn.com/image/fetch/$s_!hsX2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F561b2ddf-8537-4134-b0f4-b6ce49c2942c_1920x1279.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!hsX2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F561b2ddf-8537-4134-b0f4-b6ce49c2942c_1920x1279.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!hsX2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F561b2ddf-8537-4134-b0f4-b6ce49c2942c_1920x1279.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p><em>This is the fourth video in Part 5 of the Performance-Aware Programming series. 
Please see the <a href="https://www.computerenhance.com/p/table-of-contents">Table of Contents</a> to quickly navigate through the rest of the course as it is updated. The listing referenced in the video (listing 196) is available <a href="https://github.com/cmuratori/computer_enhance/tree/main/perfaware/part4">on the github</a>.</em></p>
      <p>
          <a href="https://www.computerenhance.com/p/selectively-preventing-optimizations">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Q&A #77 (2025-06-19)]]></title><description><![CDATA[Answers to questions from the last Q&A thread.]]></description><link>https://www.computerenhance.com/p/q-and-a-77-2025-06-19</link><guid isPermaLink="false">https://www.computerenhance.com/p/q-and-a-77-2025-06-19</guid><pubDate>Fri, 20 Jun 2025 05:51:21 GMT</pubDate><enclosure url="https://substack-video.s3.amazonaws.com/video_upload/post/166360526/023acde1-d79a-4696-96ca-dba146b1d2ef/transcoded-98329.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>In each Q&amp;A video, I answer questions from the comments on the previous Q&amp;A video, which can be from any part of the course.</em></p><p>The questions addressed in this video are:</p><ul><li><p><strong>[00:02]</strong> &#8220;This is more of a learning personal project and I'm looking for something I can do that has a little bigger impact. Do you have some other tool/program/improvement you'd like to see but don't have the bandwidth to do yourself that you can share or just an avenue you'd like to see explored more that I could look into to produce something of value to other fellow programmers?&#8221;</p></li><li><p><strong>[02:57]</strong> &#8220;Hey Casey, I posted a question about non-text programming languages and fear I was a little late on the Q&amp;A cycle, Substack didn't send me an email for it :( , for the question to make it into this Q&amp;A's rotation. I am still very curious what your thoughts about a non-text language are, or if you have talked about them before and have a link to those articles/videos!&#8221;</p></li><li><p><strong>[03:51]</strong> &#8220;I wanted to know if you have already covered &#8216;False Sharing&#8217; from a cache point of view and its impact on performance. If not, do you have any plans for it? 
Do you plan to also cover effects of NUMA on performance?&#8221;</p></li><li><p><strong>[08:05]</strong> &#8220;I just rewatched your video on the GJK algorithm from 2006, and when you talk about the triangle case you say that, although there are six ifs and elses written down, you only ever execute three of them at most, so it's at most three tests and jumps. Since the logic only relies on dot products and cross products, I would assume that the multiplies and the adds would have a lesser impact on the performance compared to the penalty of mispredicting multiple branches so close to each other.</p><p>Am I getting something wrong?&#8221;</p></li><li><p><strong>[11:50]</strong> &#8220;According to you, what would be the impact of AI on performance aware programming? I mean, is it likely to become more common that organizations / individuals offload this task to sophisticated AI coding agents? I know it's a very vague query with possibly no clear answers but just wanted to know what this community thinks about it.&#8221;</p></li><li><p><strong>[17:34]</strong> &#8220;Hi Casey, I was working on a simple CSV reader to analyze some logs but I have many huge files and it takes a while. I wanted to try out Intel VTune and it says that the code is mostly front-end bound and has 100% DSB misses. I was wondering what are those? Are they a problem? How can we solve them?&#8221;</p></li><li><p><strong>[23:10]</strong> &#8220;Why is memory management so obscure, and why are people inventing so many languages to ( ostensibly ) fix that, while criticizing C for being unsafe? Why are use-after-free / double-free and so on, issues mentioned when talking about C being unsafe? 
In other words, why would I need to free memory, when I know exactly how much memory a program will ever need and I can keep on reusing it ( you can't know at a given time how much memory you need, but there is a limit, since you can't allocate infinite memory ).</p><p>In real life, when we discover new concepts ( like in physics ) we don't invent a new language to explain those concepts in; instead, we improve our current language / language practices.<br>No worries if you can't include everything from the msg in the Q&amp;A.&#8221;</p></li><li><p><strong>[29:33]</strong> &#8220;Hi Casey, before taking the course I would consider using SIMD instead of scalar operations and utilizing memory caches properly to be optimizations. But it seems like you prefer to think about them as performance aware programming. Then I wonder what would be some concrete examples of optimizations? I just can't imagine something beyond that to get even more performance.&#8221;</p></li><li><p><strong>[32:59]</strong> &#8220;Hello, does anybody have any idea why there could be a huge difference between reading a file to mmap'ed memory and malloc'ed memory on Linux, AMD Zen3 chip?&#8221;</p></li></ul>
      <p>
          <a href="https://www.computerenhance.com/p/q-and-a-77-2025-06-19">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Simplified Haversine Candidates]]></title><description><![CDATA[Even for a computation as simple as our haversine loop, removing waste yields a surprisingly large performance improvement for very little effort.]]></description><link>https://www.computerenhance.com/p/simplified-haversine-candidates</link><guid isPermaLink="false">https://www.computerenhance.com/p/simplified-haversine-candidates</guid><pubDate>Sun, 15 Jun 2025 00:00:11 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08a56680-75ec-45de-8c92-223125bcac35_5616x2535.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dboe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08a56680-75ec-45de-8c92-223125bcac35_5616x2535.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dboe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08a56680-75ec-45de-8c92-223125bcac35_5616x2535.jpeg 424w, https://substackcdn.com/image/fetch/$s_!dboe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08a56680-75ec-45de-8c92-223125bcac35_5616x2535.jpeg 848w, https://substackcdn.com/image/fetch/$s_!dboe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08a56680-75ec-45de-8c92-223125bcac35_5616x2535.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!dboe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08a56680-75ec-45de-8c92-223125bcac35_5616x2535.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dboe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08a56680-75ec-45de-8c92-223125bcac35_5616x2535.jpeg" width="1456" height="657" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/08a56680-75ec-45de-8c92-223125bcac35_5616x2535.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:657,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1489541,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.computerenhance.com/i/165824301?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08a56680-75ec-45de-8c92-223125bcac35_5616x2535.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dboe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08a56680-75ec-45de-8c92-223125bcac35_5616x2535.jpeg 424w, https://substackcdn.com/image/fetch/$s_!dboe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08a56680-75ec-45de-8c92-223125bcac35_5616x2535.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!dboe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08a56680-75ec-45de-8c92-223125bcac35_5616x2535.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!dboe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08a56680-75ec-45de-8c92-223125bcac35_5616x2535.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p><em>This is the fifteenth video in Part 4 of the Performance-Aware Programming series. 
Please see the <a href="https://www.computerenhance.com/p/table-of-contents">Table of Contents</a> to quickly navigate through the rest of the course as it is updated. The listings referenced in the video (listings 194 and 195) are available <a href="https://github.com/cmuratori/computer_enhance/tree/main/perfaware/part4">on GitHub</a>.</em></p>
      <p>
          <a href="https://www.computerenhance.com/p/simplified-haversine-candidates">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Q&A #76 (2025-05-23)]]></title><description><![CDATA[Answers to questions from the last Q&A thread.]]></description><link>https://www.computerenhance.com/p/q-and-a-76-2025-05-23</link><guid isPermaLink="false">https://www.computerenhance.com/p/q-and-a-76-2025-05-23</guid><pubDate>Fri, 23 May 2025 21:08:03 GMT</pubDate><enclosure url="https://substack-video.s3.amazonaws.com/video_upload/post/164261039/0326dfe4-4347-4852-971a-73c622dbc001/transcoded-86101.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>In each Q&amp;A video, I answer questions from the comments on the previous Q&amp;A video, which can be from any part of the course.</em></p><p>The questions addressed in this video are:</p><ul><li><p><strong>[00:02]</strong> &#8220;Not the original author but I have a follow up question on the BTS/bits thread. On x86, writes are atomic only with byte granularity if I'm not mistaken, so in order for BTS to work properly and have multiple threads toggling bits _on the same byte_, there needs to be some sort of atomicity baked in for the RMW series of operations. Seeing the hand rolled series of operations doesn't have any lock/atomic op in there, wouldn't it have a bug when multiple threads try to toggle bits on the exact same byte?&#8221;</p></li><li><p><strong>[08:00]</strong> &#8220;I have only dabbled in low-level OS and/or bare-metal hypervisor code, but when I did, I was surprised to find how much setup was required to get multiple cores working in x86. Surprisingly little was set up out of the box - it seemed that the cores all started off running the same code and accessing the same data, and I assume that each core's registers started off in the same state (not sure about the details). 
So, what are the main considerations to get from that initial blank state (post BIOS/entering 64-bit mode) to where your OS/hypervisor can have multiple cores running separate threads of code (and maybe scheduling them)?</p><p>A related question, at the level of OS applications: You mentioned once that you prefer to roll your own multithreading code instead of relying on libraries. Having only ever used threading libraries in various languages, it's not clear where to even start, or how my hand-rolled threading code would even benefit me over generic threading libraries. What are the main considerations for rolling my own threading code instead of using something like pthreads? Or am I confusing threading libraries with OS-level threading APIs?&#8221;</p></li><li><p><strong>[18:12]</strong> &#8220;As a developer with 10 years of experience in Java, I&#8217;m considering a career move&#8212;not due to layoffs, but in search of a more fulfilling role. (I am also considering other languages, not just Java.)</p><p>I know companies like RAD Game Tools are rare (if not unique), so I&#8217;m curious:</p><p>Where would you recommend looking for opportunities that prioritize meaningful work?&#8221;</p></li><li><p><strong>[25:26]</strong> &#8220;I imagine that you don't write your own math implementations for all your projects, since you emphasized that we do this for educational purposes. What do you actually use?&#8221;</p></li><li><p><strong>[29:36]</strong> &#8220;I recently had an assignment for a 2D graphics course at my university, where I had to implement an edge detection filter in Python and C. The goal of the assignment was to compare the performance and later implement a Cython version of the function. I implemented an AVX version and the professor was super stoked. He told me that he used to fiddle with MMX but that it was always a pain to have it compile and that it never ran on other machines because of incompatibility. Was it really that bad back in the day? 
I feel like even AVX2 now has pretty widespread support on most computers today, and getting something to compile is just one compiler flag. How do you deal with different architectures / different feature sets to make it more convenient to program? Do you have a wrapper? If so, would you mind sharing it?&#8221;</p></li><li><p><strong>[34:57]</strong> &#8220;I finally put together a minimal reproducer for the BTS question.&#8221;</p></li><li><p><strong>[35:25]</strong> &#8220;What is the performance difference between the original haversine versus the new haversine with your own sin/cos/etc. functions? I feel like that was missing from the last video.&#8221;</p></li><li><p><strong>[36:00]</strong> &#8220;If you could design your own computer science undergrad program, what sorts of classes would you include and focus on and what would you change from the way current universities do things?&#8221;</p></li></ul>
      <p>
          <a href="https://www.computerenhance.com/p/q-and-a-76-2025-05-23">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Removing Waste]]></title><description><![CDATA[As we saw in the very beginning of the Performance Aware Programming series, a CPU can be brought to a crawl by drowning it in unnecessary work. How does this "waste" accrue, and how do we remove it?]]></description><link>https://www.computerenhance.com/p/removing-waste</link><guid isPermaLink="false">https://www.computerenhance.com/p/removing-waste</guid><pubDate>Thu, 15 May 2025 21:15:47 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00f634b8-b46b-4e2d-bf98-d3c0bbbfb5bd_2803x1152.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Mb8u!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00f634b8-b46b-4e2d-bf98-d3c0bbbfb5bd_2803x1152.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Mb8u!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00f634b8-b46b-4e2d-bf98-d3c0bbbfb5bd_2803x1152.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Mb8u!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00f634b8-b46b-4e2d-bf98-d3c0bbbfb5bd_2803x1152.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Mb8u!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00f634b8-b46b-4e2d-bf98-d3c0bbbfb5bd_2803x1152.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!Mb8u!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00f634b8-b46b-4e2d-bf98-d3c0bbbfb5bd_2803x1152.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Mb8u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00f634b8-b46b-4e2d-bf98-d3c0bbbfb5bd_2803x1152.jpeg" width="1456" height="598" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/00f634b8-b46b-4e2d-bf98-d3c0bbbfb5bd_2803x1152.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:598,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:508170,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.computerenhance.com/i/163662753?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00f634b8-b46b-4e2d-bf98-d3c0bbbfb5bd_2803x1152.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Mb8u!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00f634b8-b46b-4e2d-bf98-d3c0bbbfb5bd_2803x1152.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Mb8u!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00f634b8-b46b-4e2d-bf98-d3c0bbbfb5bd_2803x1152.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!Mb8u!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00f634b8-b46b-4e2d-bf98-d3c0bbbfb5bd_2803x1152.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Mb8u!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00f634b8-b46b-4e2d-bf98-d3c0bbbfb5bd_2803x1152.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p><em>This is the fourteenth video in Part 4 of the Performance-Aware Programming series. 
Please see the <a href="https://www.computerenhance.com/p/table-of-contents">Table of Contents</a> to quickly navigate through the rest of the course as it is updated. The listings referenced in the video (listings 192 and 193) are available <a href="https://github.com/cmuratori/computer_enhance/tree/main/perfaware/part4">on GitHub</a>.</em></p>
      <p>
          <a href="https://www.computerenhance.com/p/removing-waste">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Our Very Own Haversine]]></title><description><![CDATA[We've built all the pieces - now it's time to assemble them into a haversine distance function that uses only math we've hand-coded ourselves so we can analyze it in its entirety.]]></description><link>https://www.computerenhance.com/p/our-very-own-haversine</link><guid isPermaLink="false">https://www.computerenhance.com/p/our-very-own-haversine</guid><pubDate>Mon, 05 May 2025 19:46:57 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F340e4b4b-b8fe-45a1-af57-dc3f08414a89_1920x862.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vp-d!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F340e4b4b-b8fe-45a1-af57-dc3f08414a89_1920x862.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vp-d!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F340e4b4b-b8fe-45a1-af57-dc3f08414a89_1920x862.jpeg 424w, https://substackcdn.com/image/fetch/$s_!vp-d!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F340e4b4b-b8fe-45a1-af57-dc3f08414a89_1920x862.jpeg 848w, https://substackcdn.com/image/fetch/$s_!vp-d!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F340e4b4b-b8fe-45a1-af57-dc3f08414a89_1920x862.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!vp-d!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F340e4b4b-b8fe-45a1-af57-dc3f08414a89_1920x862.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vp-d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F340e4b4b-b8fe-45a1-af57-dc3f08414a89_1920x862.jpeg" width="1456" height="654" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/340e4b4b-b8fe-45a1-af57-dc3f08414a89_1920x862.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:654,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:341418,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.computerenhance.com/i/162914952?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F340e4b4b-b8fe-45a1-af57-dc3f08414a89_1920x862.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vp-d!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F340e4b4b-b8fe-45a1-af57-dc3f08414a89_1920x862.jpeg 424w, https://substackcdn.com/image/fetch/$s_!vp-d!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F340e4b4b-b8fe-45a1-af57-dc3f08414a89_1920x862.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!vp-d!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F340e4b4b-b8fe-45a1-af57-dc3f08414a89_1920x862.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!vp-d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F340e4b4b-b8fe-45a1-af57-dc3f08414a89_1920x862.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p><em>This is the thirteenth video in Part 4 of the Performance-Aware Programming series. 
Please see the <a href="https://www.computerenhance.com/p/table-of-contents">Table of Contents</a> to quickly navigate through the rest of the course as it is updated. The listings referenced in the video (listings 190 and 191) are available <a href="https://github.com/cmuratori/computer_enhance/tree/main/perfaware/part4">on GitHub</a>.</em></p>
      <p>
          <a href="https://www.computerenhance.com/p/our-very-own-haversine">
              Read more
          </a>
      </p>
   ]]></content:encoded></item></channel></rss>