The full episode is only available to paid subscribers of Computer, Enhance!

Q&A #31 (2023-10-23)

Answers to questions from the last Q&A thread.

In each Q&A video, I answer questions from the comments on the previous Q&A video, which can be from any part of the course. Transcripts are not available for Q&A videos due to length. I do produce closed captions for them, but Substack still has not enabled closed captions on videos :(

Questions addressed in this video:

  • [00:25] “How is it possible then to use debuggers to poke around any random process's memory? What's stopping any process whatsoever to act as a debugger in this manner? Is the protection guaranteed only for dead processes?”

  • [03:56] “Why not use mmap (Unix) or CreateFileMapping (Windows) to map the 1gb of OS cache into the application's address space instead of copying it? Is it necessary to copy the data so you can modify it in place later? Or does creating a file mapping have some other downside?”

  • [05:31] “How does an empty table that generates a page fault look like? Does it mean that entries in this table will have the first bit, aka ‘Present’ bit set to 0, and ‘walking’ them will trigger a page fault that OS would have to resolve?”

  • [08:05] “I could be wrong, but if it is true that sequentially reading from DRAM is faster than random reads, is there a way we can make the VM subsystem provide us contiguous pages in physical memory?”

  • [11:10] “Could initializing memory with multiple threads result in a speedup? I think you mentioned this idea in a past video.”

  • [13:11] “Do GPUs have virtual memory?”

  • [15:43] “Even when I have millions of profile blocks, the run-time of my program is consistently lower (by about 800-1000ms) when I enable the profiler vs when I do not. Any idea, what might be causing this?”

  • [17:10] “Wouldn't it have been better to check (Anchor->HitCount == 0) before writing the label, thereby allowing us to replace writing every time with reading every time but writing only once?”

  • [18:42] “Why didn't you write any tests for the profiler as you were making changes to it?”

  • [22:25] “To avoid repeatedly using new memory and suffering page faults, is there a way to reuse some allocated memory for some `struct Foo` for another unrelated `struct Bar`?”

  • [25:31] “Is there a technique in game development to leverage dirty pages to restore a previously saved game state? I'm sure virtual machines do this but I'm curious if there is a performant way of doing it.”

  • [27:02] “For 4-level paging on x64, am I right in saying that the operating system must at least create a single top level 4k page to inform the CPU where to start when translating virtual addresses. Where this top level 4k page essentially acts as a head to a tree that starts out with null children. And then only when the CPU encounters a null child while processing a virtual address does it somehow ask the OS, "what should this entry be?" And, if so, is that the part that is considered a "page fault"? Where the page fault is more an interaction between the CPU & OS and not really the OS & my application?”

  • [29:01] “As fun as I think it is, why would Intel/AMD prioritize a mode where using the top 7 bits of a pointer is officially supported. To you, does it seem like those 7 bits could really be of use to many developers?”

  • [31:27] “If one VirtualAllocs 1 gigabyte of memory, aside from guaranteeing a contiguous chunk of virtual memory, how much does Windows attempt to map that contiguous chunk of virtual memory to a contiguous chunk of physical memory? Would it matter if it didn't?”

  • [33:19] “In response to the "Sparse Memory" section of "Powerful Page Mapping Techniques" video, are you saying we could literally call malloc(64 gigabytes) on startup? I feel like that would be awesome but as we've seen in the course, it seems like debug mode malloc might actually touch all that memory to initialize it to something like 0xDEADBEEF. And if it did that, I imagine we'd have a high likelihood of just immediately crashing on most systems, no? So instead of using malloc/new we'd have to make sure we call VirtualAlloc or the specific OS's equivalent to use this technique and still have a properly working debug environment. It seems like a problem that our programming languages play ignorant as if we are allocating physical memory...”

  • [36:42] “‘Virtual functions are mainly bad because of code fusion.’ I understand you’ll be going down this rat hole at a later time but can you expand ever so slightly? Just a short synopsis of what “code fusion” is and how it is related to virtual functions?”

  • [41:28] “How would one go about figuring out minimum specs for a project (tool, game, etc)? I know the best way is to test on actual hardware; but, for a small starting indie developer, funds are tight and thus hardware is limited.”
