Project

General

Profile

Geant4 EM Physics Code Review Phone Meeting

Meeting

Date: July 22, 2013 at 02.00 PM US CDT (Fermilab local time)

Participants:

Andrea Dotti
Krzysztof Genser
Boyana Norris
Soon Yung Jun

Discussion (Summary by Krzysztof)

We met today for the initial review session of the G4PhysicsVector code.

We decided that we would review ref07 version of the code and that we
would concentrate on G4PhysicsVector itself today and skip other
potential classes like G4PhysicsTable for now.

Our current conclusions, directions:

0.) move comment about G4PhysicsVector being pure virtual to the class description
0.1.) document/comment about all the choices made if we implement them
1.) reorder the if statement in the Value function
2.) replace some ifs with the ?: operator
3.) test various variants of SplineInterpolation
4.) see if one can avoid certain ifs/calls by not testing useSpline
as its value is known after the initialization
5.) replace
G4PVDataVector dataVector; // Vector to keep the crossection/energyloss
G4PVDataVector binVector; // Vector to keep energy
G4PVDataVector secDerivative; // Vector to keep second derivatives
with arrays (as the size of the objets is know at the
initialization) and may be with something like an array of structs
of 3 (2?) values to simplify the data structures and improve the
data locality

Soon will provide Boyana with an example which could be made into a
unit test to try the above ideas. Andrea would also make some tests
related to 1. and 2. above. We shall meet again on Friday (9am Central).
Boyana shall send the phone instructions.

Follow up Discussion

  • Krzysztof : Here are my two suggestions re ?: etc...
inline
  G4bool G4PhysicsVector::IsFilledVectorExist() const
{
return (numberOfNodes > 0);
}

inline
  G4double G4PhysicsVector::Interpolation(size_t idx, G4double e) const
{
return (useSpline ? SplineInterpolation(idx, e) : LinearInterpolation(idx, e));
}

  • Soon : add a unit test program into g4hpcbenchmarks/cmsExp.
To use it

1) udpate your local git (actually only for cmsExp)

git clone http://cdcvs.fnal.gov/projects/g4hpcbenchmarks

2) create physics table files (this step is same for your usual test except
    additional flag for writing physics tables with CMSEXP_LAMBDA_TABLE=1)

cd cmsExp
mkdir bin ; cd bin
cmake -DProject=cmsExp -DGeant4_DIR=<your_geant4_build_dir> ..
make
cd ..

export PHYSLIST=cmsExpEmPhysics
export EMPROCESS=EmStandard
export CMSEXP_LAMBDA_TABLE=1
./bin/cmsExp run_eGamma.g4

All EM tables (for one material, PbWO4) will be created under ./table directory

3) build the unit test program (testPV.cc)

cd bin
cmake -DProject=testPV -DGeant4_DIR=<your_geant4_build_dir> ..
make
cd ..
./bin/testPV

if you put your modified version of G4PhysicsVector directly under /src and /include,
and recompile it to test performance without recompiling the whole geant4;

src/G4PhysicsVector.cc
include/G4PhysicsVector.hh
include/G4PhysicsVector.icc

In case that you want to use the latest geant4 reference release (geant4-09-06-ref-07)
that we talked about at the meeting, please download it with wget

wget http://oink.fnal.gov/perfanalysis/g4p/download/geant4.9.6.r07.tar.gz

(data files should be compatible with the geant4 10-beta release, which you already have)

  • Andrea :
I have tried out the simple modifications you have proposed, plus I have re-organized "if" statements (most probable case on top) and change the signature of one function that was always used as in the following to update "a" itself from a "hint":
a = FindBin( … , a);
changed from:
int FindBin( …, int a);
to:
void FindBin( … , int& a);    -to avoid copies around of the parameter a-

All these changes do not make any measurable difference in performances in the simple test I am running.
(however they clean up the code, so we may want to keep them anyway).

I would like to come back to a possible study that we briefly mentioned during the phone meeting. The main interface we are analyzing is inlined:

inline G4double Value(G4double theEnergy) const;
This calls a non inlined method (that could become inlined too):
G4double Value(G4double theEnergy, size_t& lastidx) const;

This method calls FindBin (that is also inlined).
However FindBin calls the *pure virtual* method
virtual size_t FindBinLocation(G4double theEnergy) const=0;

Due to the fact that this method is pure virtual I think the compiler can not inline the chain of calls. Am I correct? Plus there is an additional indirection through the vtable.
I imagine we could optimize out all this by removing the virtual specification and doing something else (this method implements a different algorithm for each type of G4PhysicsVector: linear, logarithmic, …).

I tried out a possible design with template parameters and specialization (to avoid any call to virtual methods). Attached a minimalistic example.

There is a non trivial change required in several places in G4 (I can estimate something like O(100) lines of code). What currently is done is something like:
aPhysicsVector = new G4PhysicsLogVector( …. );
should become:
aPhysicsVector = new G4PhysicsVector( …. , LOG ); //E.g. no inheritance anymore, specify vector "type" in constructor

I am not expert enough to understand if removing the call to virtual and a more aggressive use of "inline" could actually save CPU at all or if it is worth (basically the virtual call is replaced by a switch statement).

Could the unit test created by Soon help on taking this decision?
What do you think?

To tell the truth I have been a little bit simplistic , there are "virtual" methods that would require template specialization (but all these are much less critical).
  • Boyana: Response to Andrea
Some comments below...

I would like to come back to a possible study that we briefly mentioned during the phone meeting. The main interface we are analyzing is inlined:

inline G4double Value(G4double theEnergy) const;
This calls a non inlined method (that could become inlined too):
G4double Value(G4double theEnergy, size_t& lastidx) const;

This method calls FindBin (that is also inlined).
However FindBin calls the *pure virtual* method
virtual size_t FindBinLocation(G4double theEnergy) const=0;

Due to the fact that this method is pure virtual I think the compiler can not inline the chain of calls. Am I correct? Plus there is an additional indirection through the vtable.

Yes, most likely. This is why I was suggesting to implement a source transformation tool to do this inlining specifically for Geant4.

I imagine we could optimize out all this by removing the virtual specification and doing something else (this method implements a different algorithm for each type of G4PhysicsVector: linear, logarithmic, …).

I tried out a possible design with template parameters and specialization (to avoid any call to virtual methods). Attached a minimalistic example.

There is a non trivial change required in several places in G4 (I can estimate something like O(100) lines of code). What currently is done is something like:
aPhysicsVector = new G4PhysicsLogVector( …. );
should become:
aPhysicsVector = new G4PhysicsVector( …. , LOG ); //E.g. no inheritance anymore, specify vector "type" in constructor

I am not expert enough to understand if removing the call to virtual and a more aggressive use of "inline" could actually save CPU at all or if it is worth (basically the virtual call is replaced by a switch statement).

Yes, inlining will help in two ways -- the first is in removing the overhead of one or more (virtual) function calls.  The second is the potential for more optimizations because the compiler has more straight-line code to work with. On the negative side, inlining would result in creating several specialized functions for each process, i.e., code bloat.

Could the unit test created by Soon help on taking this decision?
What do you think?

I think so, if nothing else, we can try inlining by hand for a couple of specific processes and measure the difference.