For those of you who work with both the python codebase and the c++
backend, I found a pretty useful tool. Seeing as we work with
performance-sensitive software, profiling is very useful, but it can be
a pain to profile our c++ code when it is called through python, which
otherwise means writing c++ wrappers around functions for basic profiling.
The solution I found is a python module made specifically to profile
c++ extensions called from python.
In order to install, simply run:
For khmer, you should also be sure to turn on debugging at compile time:
Of the two things installed above, the first is the python module implementing the profiler; the second, google-pprof, is the tool for analyzing the resulting profile information.
There are a couple of ways to use it. You can call it directly from the command line with:
The -- is necessary: it is the conventional end-of-options marker, so
the profiler stops parsing flags at that point and passes the remaining
arguments on to the script being profiled instead of choking on them
itself. Thanks for this trick, @mr-c. Also make sure to use the absolute
path to the script to be profiled.
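The command-line form described above can be sketched as a small helper that assembles the argument list. This is only a sketch under assumptions: it assumes the profiler module is yep, invoked as python -m yep (the module name is not given in the text above), and build_profile_command, the script path, and the script arguments are made-up names for illustration.

```python
import os
import sys


def build_profile_command(script, script_args=()):
    """Assemble an argv list that runs `script` under the profiler.

    Assumption: the profiler module is `yep`, run as `python -m yep`.
    """
    return [
        sys.executable, "-m", "yep",
        "--",                     # end of the profiler's own options
        os.path.abspath(script),  # absolute path to the script being profiled
        *script_args,             # passed through to the script untouched
    ]


# Hypothetical script name and arguments, for illustration only.
cmd = build_profile_command("scripts/example.py", ["-k", "20", "reads.fa"])
print(" ".join(cmd))
```

The list can be handed to subprocess.run(cmd, check=True) to actually launch the profiled run. Because everything after -- is treated as a plain argument, the script's own -k flag is not mistaken for a profiler option.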
You can also use the module directly in your code, with:
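A minimal sketch of in-code use, assuming the module is yep and that its yep.start(filename) and yep.stop() calls bracket the region of interest. The profiled helper, the output file name, and the stand-in workload are hypothetical; the fallback branch is a design choice so the script still runs on machines where the profiler is not installed.

```python
from contextlib import contextmanager


@contextmanager
def profiled(profile_path):
    """Profile the enclosed block if the profiler is available.

    Assumption: the profiler module is `yep`. If it (or the underlying
    profiling library) cannot be loaded, the block runs unprofiled.
    """
    try:
        import yep
        yep.start(profile_path)  # begin collecting samples into profile_path
        active = True
    except ImportError:
        active = False
    try:
        yield active
    finally:
        if active:
            yep.stop()  # flush and close the profile file


# Hypothetical workload standing in for a call into a c++ extension.
with profiled("example.prof") as active:
    total = sum(i * i for i in range(1000))

print("profiled" if active else "ran without profiler", total)
```

Wrapping only the calls of interest keeps interpreter start-up and imports out of the profile, which tends to make the resulting call graph easier to read.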
The resulting file is then visualized using google-pprof, with:
In order to get python debugging symbols, you need to use the debugging executable. So, while you may run the script in your virtualenv if using one, you give google-pprof the debug executable so it can properly construct callgraphs:
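The visualization step can be sketched the same way, as a helper that builds the google-pprof argument list from the debug executable and the profile file. The paths below are hypothetical, and the sketch assumes the --text output mode and that the tool may be installed under either google-pprof or pprof depending on the system.

```python
import shutil


def build_pprof_command(debug_python, profile, output_format="--text"):
    """Assemble an argv list for google-pprof.

    `debug_python` should be the debugging interpreter, even if the
    profiled script itself was run inside a virtualenv.
    """
    # Assumption: the binary is called google-pprof or plain pprof.
    tool = shutil.which("google-pprof") or shutil.which("pprof") or "google-pprof"
    return [tool, output_format, debug_python, profile]


# Hypothetical paths, for illustration only.
cmd = build_pprof_command("/usr/bin/python-dbg", "example.py.prof")
print(" ".join(cmd))
```

Swapping output_format for one of the graphical modes produces a call graph instead of the flat text listing.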
Here is some example output:
In this call graph, the python debugging symbols were not properly included; this is resolved by using the debugging executable.
The call graph is in standard form, where the first percentage is the time spent in that particular function alone, and the second percentage is the cumulative time spent in that function plus all the functions it calls. See the description for more details.
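To make the two percentages concrete, here is a toy calculation over a made-up call tree. It treats the second number as the function's own samples plus those of everything it calls, which is how pprof reports cumulative time. The function names and sample counts are invented, and the sketch assumes each function has a single caller, which real call graphs do not guarantee.

```python
# Hypothetical sample counts: samples landing in each function's own code.
self_samples = {"main": 5, "consume": 15, "hash_kmer": 60, "count": 20}

# Hypothetical call tree: which functions each function calls.
callees = {
    "main": ["consume"],
    "consume": ["hash_kmer", "count"],
    "hash_kmer": [],
    "count": [],
}


def cumulative(name):
    """Samples in `name` itself plus everything reachable below it."""
    return self_samples[name] + sum(cumulative(c) for c in callees[name])


total = cumulative("main")


def percentages(name):
    """Return (self %, cumulative %) for one node of the call graph."""
    return (
        100.0 * self_samples[name] / total,
        100.0 * cumulative(name) / total,
    )


for name in self_samples:
    own, cum = percentages(name)
    print(f"{name}: {own:.1f}% of {cum:.1f}%")
```

A node with a small first number and a large second number is mostly a dispatcher; the hot spots are the nodes where the first number itself is large.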
And that’s it. Happy profiling!