The application (as usual) is my N-Body modeling code. In order to discuss profiling there’s very little you need to know about N-body models. Suffice to say once the user starts the application it updates the model and periodically displays the results on the screen.
To keep the UI responsive and to get the best possible performance, the application makes extensive use of the Task Parallel Library (for .NET) and the Parallel Patterns Library (for Visual C++). This means there are lots of threads running at once.
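For context, the per-step force calculation is the kind of loop the model parallelizes. This is a minimal, portable sketch of the idea, not the application's actual code: the real code uses TPL/PPL, here plain std::thread stands in for a PPL parallel_for, and Body and UpdateAccelerations are illustrative names.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <thread>
#include <vector>

struct Body { double x, y, ax, ay; };

// Portable stand-in for a PPL concurrency::parallel_for over the bodies:
// split the index range across hardware threads. Each body's acceleration
// is written only by one worker, so no locks are needed.
void UpdateAccelerations(std::vector<Body>& bodies) {
    const std::size_t n = bodies.size();
    const unsigned workers = std::max(1u, std::thread::hardware_concurrency());
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&bodies, n, workers, w] {
            for (std::size_t i = w; i < n; i += workers) {
                double ax = 0.0, ay = 0.0;
                for (std::size_t j = 0; j < n; ++j) {
                    if (i == j) continue;
                    const double dx = bodies[j].x - bodies[i].x;
                    const double dy = bodies[j].y - bodies[i].y;
                    // Softened magnitude-squared, then inverse-cube for the
                    // gravitational kernel (masses omitted for brevity).
                    const double d2 = dx * dx + dy * dy + 1e-9;
                    const double inv = 1.0 / (d2 * std::sqrt(d2));
                    ax += dx * inv;
                    ay += dy * inv;
                }
                bodies[i].ax = ax;
                bodies[i].ay = ay;
            }
        });
    }
    for (auto& t : pool) t.join();
}
```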
What are they doing and are they doing it efficiently? What follows is a short story as to why Scenario Markers are really, really useful…
Click on the image to see the full Visual Studio IDE. You’ll see I’ve removed all but the threads which appear to be running my code. The highlighted block has a call stack that is inside a call to DomainModel::MagnitudeSquared (my code).
But… the real problem here is not just the amount of information. When I’m profiling, the basic questions I’d like to answer are: what is the application doing (from the user’s perspective), what code executes as a result, and how efficiently does it do so?
From the profiler output as-is I can see that lots of work is being done, but not how it relates to the individual model updates.
That’s where Scenario markers come in…
What’s a Scenario marker?
“A Scenario object is a highly-accurate timer that logs ETW events (Event Tracing for Windows) when you start and stop it. It’s designed to be used for performance instrumentation and benchmarking, and comes in both C# and C++ versions.”
Simply put, it’s a way to add flags to your profiling data. Scenario markers are the Debug.WriteLine of profiling. Like WriteLine, adding scenario markers is very straightforward:
Scenario.Scenario marker = new Scenario.Scenario();
marker.Begin();
// Code to include in this scenario...
marker.End(); // (exact Begin/End overloads per the Scenario library docs)
// More code...
The marker can take a long denoting the size of the scenario and a string as its name. The name will appear in the profiler window when you mouse over the grey bar. I’ve placed my marker code inside #if/#endif blocks so it doesn’t get compiled into my final binaries.
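On the C++ side it can be convenient to tie a scenario to a lexical scope with a small RAII guard. This is a sketch under stated assumptions: the Scenario struct below is only a stub with the Begin/End shape (the real class, from the MSDN Code Gallery library, fires ETW events), and ScopedScenario is a hypothetical helper, not part of the library.

```cpp
#include <string>
#include <utility>

// Stub standing in for the real Scenario class; it counts calls so the
// sketch is self-contained, where the real class would log ETW events.
struct Scenario {
    int begins = 0;
    int ends = 0;
    void Begin() { ++begins; }
    void End(long /*size*/, const std::string& /*name*/) { ++ends; }
};

// Hypothetical RAII guard: Begin fires when the guard is constructed and
// End fires when the enclosing scope exits, so an early return or an
// exception still closes the scenario.
class ScopedScenario {
public:
    ScopedScenario(Scenario& s, std::string name, long size = 0)
        : s_(s), name_(std::move(name)), size_(size) {
        s_.Begin();
    }
    ~ScopedScenario() { s_.End(size_, name_); }
    ScopedScenario(const ScopedScenario&) = delete;
    ScopedScenario& operator=(const ScopedScenario&) = delete;
private:
    Scenario& s_;
    std::string name_;
    long size_;
};
```

The guard means a scenario can never be left open by a forgotten End call, at the cost of the scenario always matching a lexical scope.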
That’s about it. Now when you re-build and re-profile the application, the threads view contains new information that gives more context about what’s going on.
In this example I added two markers: one to show the calculation steps (the upper blue bar) and another to show when the application updates the display (the lower bar). If you look at the full screen shot (click on the image) you’ll see this in more detail. I also used the measuring tool to get the length of a calculation: 32.8ms.
The problem is pretty obvious. The bottom set of bars denotes screen updates. These are relatively quick and occur more frequently than the calculation steps. In fact, on average I’m doing two screen updates for every calculation step!
Scenario markers have made a potential performance issue very clear: doing a second screen update when the model hasn’t changed is wasted work. I might as well skip that update completely and devote the saved processor cycles to calculation instead.
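One way to implement that skip, sketched here as an assumption rather than the post’s actual code: have the calculation thread bump a version counter after each step, and have the UI redraw only when the counter has moved. The names (ModelVersion, ShouldRedraw) are illustrative.

```cpp
#include <atomic>

// Bumped by the calculation thread after every completed model step.
std::atomic<long> ModelVersion{0};

void OnCalculationStepDone() {
    ModelVersion.fetch_add(1, std::memory_order_release);
}

// Called by the UI before drawing; returns true only when the model has
// changed since the last frame we actually drew.
bool ShouldRedraw(long& lastDrawnVersion) {
    const long current = ModelVersion.load(std::memory_order_acquire);
    if (current == lastDrawnVersion) {
        return false; // model unchanged since last draw: skip this update
    }
    lastDrawnVersion = current;
    return true;
}
```

The release/acquire pairing ensures that when the UI sees the new version number, it also sees the model data written before the counter was bumped.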
With that fix in place, if you look at the profiler thread view you’ll see that the screen updates are happening much less frequently, roughly once per iteration.
This is just the start… Clearly there’s a lot going on here and the worker threads seem to spend far too much time not doing useful work (i.e. not enough green). At least now it’s possible to see what happens when, in relation to each update of the model.
I’ve already made some improvements there by replacing some heavyweight code using Handles and Waits with lock-free code which relies on .NET’s atomic read/write guarantees and Interlocked.CompareExchange. This also improved the number of updates per second the UI could handle. Like the book says, “Locks are the gotos of parallel programming”.
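To illustrate the idea (the post’s code uses .NET’s Interlocked.CompareExchange; this is the analogous compare-exchange technique in C++ with std::atomic, and HandoffSlot is a hypothetical name): a single slot where the calculation thread publishes its latest snapshot and the UI thread takes it, with no handles or waits involved.

```cpp
#include <atomic>

// Lock-free single-slot handoff: the producer overwrites the slot with the
// newest snapshot, the consumer claims it with a compare-exchange.
// Snapshot ownership/reclamation is simplified for illustration.
template <typename T>
class HandoffSlot {
public:
    // Producer: publish a new snapshot; returns any unconsumed previous
    // snapshot so the caller can recycle it.
    T* Publish(T* fresh) {
        return slot_.exchange(fresh, std::memory_order_acq_rel);
    }

    // Consumer: atomically take the waiting snapshot, or get nullptr if
    // nothing new has been published.
    T* TryTake() {
        T* current = slot_.load(std::memory_order_acquire);
        while (current != nullptr &&
               !slot_.compare_exchange_weak(current, nullptr,
                                            std::memory_order_acq_rel)) {
            // A failed CAS refreshed `current` with the latest value; retry.
        }
        return current;
    }

private:
    std::atomic<T*> slot_{nullptr};
};
```

Because the producer never blocks on the consumer (it just swaps in the newest snapshot), a slow UI can never stall the calculation threads, which is exactly the property waits and handles tend to break.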
You can find out more about using the profiler and scenario markers at:
The scenario marker binaries are available on MSDN Code gallery.