The full episode is only available to paid subscribers of Computer, Enhance!

A First Look at Profiling Overhead

How much does our profiler affect the performance of our haversine program?

This is the ninth video in Part 2 of the Performance-Aware Programming series. Please see the Table of Contents to quickly navigate through the rest of the course as it is updated weekly. The listings referenced in the video (listings 89, 90, 91, 92, 93, 94, and 95) are available on GitHub.

We now have a quick and easy way to capture performance data throughout the entire development cycle of any program. Because our approach is so lightweight and so trivial to deploy, we really don't have to do anything to continuously monitor this data. We can mark up blocks anywhere in our program, and every time we run, we will get the basic profiling information for those blocks alongside the result of the program automatically.

This is fantastic for building performance intuition. From now on, every time we make a change to our program, we will immediately get detailed feedback about how that change affected its performance. Constant feedback like this across the entire development cycle is invaluable for learning which kinds of changes make code slower, and which make it faster.

That's fantastic, and that's a great reason to build one of these integrated profilers. But you might also be able to find this functionality in an existing commercial profiler somewhere. So why did I have us build our own, instead of looking for an existing solution and teaching that?

As with most things in the course, the main goal is learning what’s actually happening. If we used a commercial or open-source profiler of some kind, it would be much more difficult to learn what it’s actually doing to gather that performance data. And believe it or not, knowing exactly how performance data is captured is every bit as important as the data itself!

Absent special hardware, there really isn’t a way to capture performance data in modern programs — especially multi-core programs — without affecting the way the program performs. Every time we collect performance data, we must also take care to understand what effects may have occurred solely as a result of our collection process.

We’re going to take our first look at these effects today. It will be a brief look, because we haven’t yet gained enough knowledge about modern compilers and microarchitectures to know why some of these effects are happening. But even with our current level of knowledge, we can easily see that they are happening, and take steps to ensure that our profile measurements remain useful.

Let’s begin by trying something that would seem fairly straightforward: we built a great little TimeFunction macro, so let’s time all the functions in our JSON parser! All we have to do is put “TimeFunction;” as the first line of every function, and we should get a complete breakdown of the performance of the JSON parser, right?

