The full episode is only available to paid subscribers of Computer, Enhance!

Four-Level Paging

What do the bits of a pointer actually mean?

This is the sixth video in Part 3 of the Performance-Aware Programming series. Please see the Table of Contents to quickly navigate through the rest of the course as it is updated weekly. The listings referenced in the video (listings 117 and 118) are available on GitHub.

As with the previous post, this one is not essential to performance-aware programming. I’ve added it in since subscribers seemed interested in how memory paging works, and I always want to encourage people to follow their curiosity!

So today we ask the two (rather advanced!) questions we did not have answers for when we did our page fault analysis:

  1. Why did we see extra page faults? We allocated and then touched a specific number of 4k pages, and yet we saw more than that number of page faults. Why?

  2. What were those weird “flat anomalies” we saw in our Windows pre-mapping sawtooth pattern? Can we construct a theory that explains when they happen?

These questions are harder to answer than the one in the previous post. In fact, it would be difficult or impossible to answer them without knowing what a modern x64 pointer actually is.

Most programmers do not know what a pointer is. I don’t mean that in the “because they program in JavaScript, which doesn’t have pointers” way. I mean even most C/C++ programmers have no idea that a pointer is actually a packed structure, not an arbitrary series of bits.

This is “by design”. Operating systems are intentionally trying to provide a consistent programming model that does not depend on the specifics of the hardware on which they run. In a sense, we don’t want programmers assuming too much about what a pointer actually is, because if they did, then we could not run the same binaries on different CPUs that had different pointer representations!

But, as with all abstractions, this one ceases to be helpful when we run into anomalies caused by the mismatch between what the abstraction pretends is going on vs. what is actually going on. At that point, the abstraction becomes obfuscation, and we have to look below it to fully understand.

That is exactly what is happening with our two unanswered questions. In order to answer them, we are going to have to learn what the bits of a pointer actually mean.

We typically say that modern x64 memory addresses are “64 bit”:
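To make the "packed structure" idea concrete, here is a minimal sketch of how an address could be pulled apart into fields. It assumes the common x64 configuration of four-level paging with 4k pages and 48 meaningful address bits: four 9-bit table indexes followed by a 12-bit offset into the page. The names `DecomposedAddress` and `decompose_address` are hypothetical and chosen purely for illustration. This is not one of the listings from the video.

```python
from typing import NamedTuple


class DecomposedAddress(NamedTuple):
    """Hypothetical container for the fields of a virtual address.

    Assumes four-level paging with 4k pages (48 meaningful bits).
    """
    pml4_index: int         # bits 39..47: index into the top-level table
    directory_ptr_index: int  # bits 30..38: index into the second-level table
    directory_index: int    # bits 21..29: index into the third-level table
    table_index: int        # bits 12..20: index into the last-level table
    offset: int             # bits 0..11: byte offset inside the 4k page


def decompose_address(address: int) -> DecomposedAddress:
    """Split a 64-bit virtual address into its paging fields.

    Bits 48..63 are not extracted here; under the assumptions above
    they carry no table index of their own.
    """
    return DecomposedAddress(
        pml4_index=(address >> 39) & 0x1FF,           # 9 bits
        directory_ptr_index=(address >> 30) & 0x1FF,  # 9 bits
        directory_index=(address >> 21) & 0x1FF,      # 9 bits
        table_index=(address >> 12) & 0x1FF,          # 9 bits
        offset=address & 0xFFF,                       # 12 bits
    )


if __name__ == "__main__":
    # Build an address from known field values, then take it apart again.
    example = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5
    print(hex(example), decompose_address(example))
```

Since 9 + 9 + 9 + 9 + 12 = 48, the five fields account for every meaningful bit under these assumptions. Two addresses that differ only in `offset` land on the same 4k page, while a change in any of the index fields means a different entry somewhere in the table hierarchy.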
