7 Profiler2
Langbart edited this page 2021-01-29 16:08:17 +01:00

Using Profiler2 for profiling

Profiler2 is 0 A.D.'s advanced profiler. It is defined for very finely grained profiling, since it can measure events down to every single occurrence. A simple use of Profiler 2 would be to detect frame rate spikes or measure absolute performance improvements over a simulation turn.

Creating a profiler2.jsonp file

Non-visual replay (command line)

You will need a replay file (command.txt) that you can create yourself or get from someone else, make sure it is compatible with your Alpha version. When you run the following command, a profiler2.jsonp file is automatically created in your working directory (~/0ad).

pyrogenesis -mod=public -replay=[PATH to a command.txt file]

Non-visual replay is also recommended if you want to compare two different states of your working copy with the same replay file. This ensures that your two .jsonp files are in sync and can be better compared using the source/tools/profiler2/profiler2.html file.

Visual replay (game interface)

By default, profiler2 is not enabled. To see the results, you need to enable the HTTP server. You can do this by setting the following configuration in your user.cfg file: "profiler2.autoenable = true". Alternatively, you can press Ctrl+F11 (by default) to activate the HTTP profiler in-game. Currently, enabling Profiler2 in replay mode can only be done by recompiling the game with the following patch:

Index: source/ps/Replay.cpp
===================================================================
--- source/ps/Replay.cpp	(revision 18423)
+++ source/ps/Replay.cpp	(working copy)
@@ -133,6 +133,8 @@
 	g_ScriptStatsTable = new CScriptStatsTable;
 	g_ProfileViewer.AddRootTable(g_ScriptStatsTable);
 
+	g_Profiler2.EnableHTTP();
+
 	const int runtimeSize = 384 * 1024 * 1024;
 	const int heapGrowthBytesGCTrigger = 20 * 1024 * 1024;
 	g_ScriptRuntime = ScriptInterface::CreateRuntime(shared_ptr<ScriptRuntime>(), runtimeSize, heapGrowthBytesGCTrigger);

Once the HTTP server is enabled, you simply need to open source/tools/profiler2/profiler2.html in a modern web browser. You should see that the " Live" report is loaded. If you don't have HTTP profiler enabled, nothing will be displayed, otherwise you should get a screen with a timeline of frames and events. You can save report sessions by clicking "Save Live Report to file". This will save the file to your log folder instead of your working directory as with non-visual replay. If you do this over a few runs, you can get several report files to compare.

Interface and comparison

The Report interface

Profiler can give you a very detailed view of the game.

The top timeline shows you the time taken by each frame to complete. The higher and wider, the slower. This allows you to easily notice spikes. The scale is logarithmic.
Clicking anywhere on the frame timeline will show you the events around that point.

The events timeline gives you a finely grained view of all profiled events, as well as their duration. Clicking on this view allows you to zoom in.
This view can help you notice which parts of the engine have suddenly spiked, and how frames generally go.

Profiler2 allows you to open several reports (even in addition to the live report). You can open older profiling run using the top button. If you have several reports open, you can select the one you which to examine by clicking on it.

The Analysis interface

The lower part of the screen shows you useful graphs. On the "choices" panel, you will see a list of all profiled events. If you click on one or more of these events, you will get 2 graphs showing you information.

The Frequency graph gives you a global representation of the average runtime of this event. It is constructed by computing the time spent in this event in each frame, then sorting them by time. Thus, a function that always takes the same time will appear flat. You can notice spikes on the right side, if there are any.

The images below show profiling of pathfinder functions. As you can see from the sudden rise on the right-side, those function sometimes take much longer than their average case.

This graph shows you a function which generally takes under 5ms, but about 50% of the time takes 40ms or more.

The frame-by-frame graphs shows you how long an event took frame-by-frame, which can be used along with the report timelines to find out precisely what frames are slow and thus what causes the slowness.

Comparing reports

You can easily compare multiple runs using the source/tools/profiler2/profiler2.html file by loading multiple profiling reports. Load them in the interface and click on some of the events. The graphs will then show you the event for all the reports so you can easily compare them. You will also get a table recapitulating some information to help you decide if an optimisation is worth it.

Choosing what you wish to profile

This part requires you to change the c++ and recompile the game.

The Profiler2 uses a 4Mb buffer internally. This can occasionally overflow and you will lose earlier frames. If you wish to profile very frequent events, or over a very large number of frames, you may need to change what is profiled from SVN.

Profiler2 offers 3 ways to profile events: simple profiling, profiling above a certain runtime, and "aggregating" sub-events over a certain time.

Simple profiling

This is the simplest solution: all events will be recorded uniquely in the timeline. To use simple profiling, use the 'PROFILE2(name of your event)' macro. This will profile the current scope, if you need to split, note that this is just a wrapper for a CProfile2Region so you can use that directly to get more control.

Profiling in case of spikes

Sometimes, you don't really care about a certain event, unless it suddenly goes above a certain treshold and causes the game to hang. To profile such events, you can use the 'PROFILE2_IFSPIKE(name,minimal time in seconds)' macro. The events will only show up in the timeline if they are above this treshold. This can help you profile events over a long time if you were otherwise running in the buffer limit. Note that any "nested" profiler2 call inside a spiked region will be discarded if the spiked event does not hold, so you can get very finely grained profiling in case of spikes without cluttering your timeline by using this function.

Aggregated profiling

Other times, you do not particularly care about any specific event but rather about all those events as a whole. An example of this is pathfinder calls: you rarely care about any call in particular but rather the calls as a whole. If you want to get profiling information about some specific part of the pathfinder, and you use simple profiling, you will have thousands of events per frame which will quickly overrun your buffer and make you only able to hold a handful of frames. By using the aggregated buffer, you will see only a summary of these events. Note however that this has a small runtime cost (usually < 5ms) so it should only be used if you actually intend to profile events, not left running in the background.

The Aggregated profiling is also a Spike profiling, so you can simply discard it if it's below a certain treshold to avoid the cost of aggregating the data.

To use it, call PROFILE2_AGGREGATED(name, minimal runtime). Any nested call to PROFILE2 or PROFILE2_IFSPIKE will be aggregated.

Note that this currently levels the hierarchy and you cannot nest Aggregated calls.