Tools for debugging and profiling code

This page gives a brief introduction to a couple of useful tools for debugging and profiling code. Totalview is a debugger which can also track memory use throughout the running of a program. Allinea is a profiler which provides a comprehensive overview of the various parts of the code: how much time each takes to run, how long the program spends executing each command and where the code could be optimised to improve efficiency. Both are installed and available on the lbnegpvm nodes.

Totalview

This is a general purpose debugger with many features which will not be mentioned here (see the website for a more complete overview). Set up totalview (on an lbnegpvm node) using

setup totalview

assuming you have already set up larsoft. If you are using totalview outside of larsoft, source the products setup script first:

source /grid/fermiapp/products/setups.sh
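The setup steps above can be wrapped in a small guard so that a missing setup is reported before the debugger is launched. This is only a sketch: have_cmd is a hypothetical helper written for this page, not part of totalview or larsoft, and the script assumes the products setup script lives at the path quoted above.

```shell
#!/bin/sh
# Path to the products setup script, as quoted in the text above.
SETUPS=/grid/fermiapp/products/setups.sh

# have_cmd is a hypothetical helper: succeeds if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Source the setup script and set up totalview only if the script is readable
# (i.e. we are on a node where the products area is mounted).
if [ -r "$SETUPS" ]; then
    . "$SETUPS"
    setup totalview
fi

# Report whether the totalview executable can now be found.
if have_cmd totalview; then
    echo "totalview is available"
else
    echo "totalview is not on PATH; check the setup steps" >&2
fi
```

If the final message reports that totalview is not on PATH, the setup has not taken effect in the current shell.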

Start the debugger by giving the program executable (e.g. lar) to the totalview executable:

totalview lar -a <lar run options>

For example,

totalview lar -a -c prodsingle_lbne35t.fcl

This will start the GUI and present you with a box to configure startup parameters. Click OK without selecting any.

Totalview is now ready to go. Hit go to start the program; note that it will run much slower than when not debugging. Tools such as pause, next and step allow the user to follow the execution of the code as it happens and see what is taking a lot of time or memory (although better options for these are discussed below). If the code crashes, totalview will stop and show the line of code being run at the time (useful for seg faults).

Note: totalview will work more effectively with a larsoft build compiled using 'debug'.
See the debugging page on the LArIAT wiki for more information.

Memscape

Memscape is included in totalview and is a specialist tool for looking at the memory usage during the running of a program. It is very useful for finding memory leaks.

Set up totalview (on an lbnegpvm node):

setup totalview

and run using the memscape executable:

memscape lar -a <lar run options>

This brings up the Memory Debugging Session GUI; click 'Start memory debugging all programs in your current session'. This will start the program running and wait for you to interrupt before showing anything (note this will run very slowly!). You can follow the progress of the code by looking at the normal output to screen in the terminal.

When ready to look at the memory usage, memscape offers various options such as 'Leak detection source report', 'Heap graphical report' etc. Go for 'Leak detection source report' to begin with. It will stop running briefly and collate all relevant information to present the report. When done, it will ask whether or not you want to carry on running the code whilst you look at the current results.

The report details the total memory used so far during running and breaks it down by the various parts of the code. If there is a memory leak, it will show up very obviously in the part of the code where the leak is! These sections are all expandable, so the leak can be traced down to a single line.

Allinea

Allinea is an incredibly useful tool to comprehensively detail all parts of a program to reveal where the most time is being used during execution. Highly recommended if wanting to streamline code!

It is available on the gpvm nodes using the following instructions:

Ensure your PRODUCTS environment variable includes the path /grid/fermiapp/products/local/

export PRODUCTS=${PRODUCTS}:/grid/fermiapp/products/local/
echo ${PRODUCTS}

(as of Jun 15 this is not set up in the setup_lbne.sh script).
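Running the export line more than once in the same session appends the path each time. A minimal sketch of a guard against that is shown below; add_products_dir is a hypothetical helper written for this page, not something provided by the products area.

```shell
#!/bin/sh
# add_products_dir (hypothetical helper): append a directory to PRODUCTS
# only if it is not already one of the colon-separated entries.
# The comparison is textual, so a trailing slash must match exactly.
add_products_dir() {
    dir=$1
    case ":${PRODUCTS:-}:" in
        *":${dir}:"*) ;;                                  # already present
        *) PRODUCTS="${PRODUCTS:+${PRODUCTS}:}${dir}" ;;  # append
    esac
    export PRODUCTS
}

add_products_dir /grid/fermiapp/products/local/
echo "${PRODUCTS}"
```

Calling the function again with the same path leaves PRODUCTS unchanged.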

To check the versions which are available use

ups list -aK+ allinea

and set up the version required (e.g. v5_0_1):

setup allinea v5_0_1

Once set up, a map needs to be generated to run the profiling over. Note that the profiling is done in two stages, creating a map and then running the profiling over the map, so consider running just a few events initially to avoid waiting an hour for an initial map before you can even look at the results. It will probably be obvious from a few events (~100) where the inefficient code is.

To generate the map, use the following command:

map --profile --nompi --start `which lar` -c ...

Note that the full path to the executable must be provided; using `which lar` (or `which art` etc.) will provide this. The --profile option generates the map without user interaction so that it can be examined afterwards.
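For a quick first pass over a limited number of events, the command can be wrapped as sketched below. This assumes lar accepts art's -n option to limit the number of events processed; profile_events is a hypothetical wrapper written for this page, and the fcl file and event count in the usage comment are only illustrative.

```shell
#!/bin/sh
# profile_events (hypothetical wrapper): generate a map for a limited
# number of events.
#   $1 = fcl configuration file
#   $2 = number of events (assumes lar accepts art's -n option)
profile_events() {
    fcl=$1
    nevents=$2
    # Resolve the full path to lar, since map needs the full path.
    lar_path=$(command -v lar) || {
        echo "lar not found; set up larsoft first" >&2
        return 1
    }
    map --profile --nompi --start "$lar_path" -c "$fcl" -n "$nevents"
}

# Usage: profile_events prodsingle_lbne35t.fcl 100
```

Keeping the event count as an argument makes it easy to start small and rerun with more events once the obvious hot spots have been dealt with.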

You can then use the same executable to start the GUI to look over the map:

map examplemap.map
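If several maps have accumulated in a directory, the newest one can be picked out automatically. The sketch below assumes the profiling run wrote a file ending in .map into the current directory; latest_map is a hypothetical helper written for this page.

```shell
#!/bin/sh
# latest_map (hypothetical helper): print the most recently modified
# .map file in a directory (default: the current directory).
# Prints nothing if there are no .map files.
latest_map() {
    ls -t "${1:-.}"/*.map 2>/dev/null | head -n 1
}

# Open the newest map in the GUI, if there is one and map is available.
newest=$(latest_map)
if [ -n "$newest" ] && command -v map >/dev/null 2>&1; then
    map "$newest"
fi
```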

There are many tools available to play around with to get a full appreciation of the efficiency of various parts of the code. The illustrations at the top show the memory usage etc. over the course of running the code. Possibly more useful is to step through the code using the collapsible sections at the bottom, which break down the time spent in each function/line as a percentage of the total running time. Double clicking on a particular function or line pulls up the source code (if this is available to Allinea) and allows the same evaluation to be followed all the way through it.

There are many more functions which prove hugely useful when profiling code; this page is intended only to start the process!