Geant4 EM Physics Code Review Weekly Phone Meeting

Meeting

Date: August 23, 2013 at 10:00 AM US CDT (Fermilab local time)
Place: 1-866-740-1260 (ReadyTalk line) (Host: Krzysztof Genser)

Participants:

Andrea Dotti
Krzysztof Genser
Boyana Norris
Soon Yung Jun

Discussion (Summary by Krzysztof)

Boyana was able to set up TAU and has added preliminary results to the
wiki; she will send instructions on how to use it

Soon has profiled r07 and r07a with -fno-inline

The CLHEP::MTwistEngine::flat() count seems similar with those options;
-fno-inline makes the new code slightly slower (as was the case with
-O0)

will post results shortly

Boyana: inlining is a "global" operation, so it may be difficult to
concentrate on very specific functions

Soon: the r07 -> r07a modification changed the library structure a lot
(based on objdump output)

We decided to concentrate on SimplifiedCalo for now (and use the global
profiling done by Soon if needed); we will probably not use cmsExp

When running SimplifiedCalo one needs to set specific environment
variables; see the README file
(Soon added ways to select EM processes, as was the case in cmsExp)

Boyana will try to change G4VMultipleScattering and eliminate one
layer (and maybe make some functions non-virtual)
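
The kind of change meant here (removing one inheritance layer and making a helper non-virtual) can be illustrated with a minimal sketch. All class names below are hypothetical and only mimic a layered process hierarchy; this is not the G4VMultipleScattering code, and the actual change may look different.

```cpp
#include <cassert>

// All class names below are hypothetical; they only mimic a layered
// process hierarchy and are not Geant4 classes.

// Before: three layers, with a virtual helper called from a virtual method.
struct ProcessBase {
    virtual ~ProcessBase() = default;
    virtual double StepLength(double energy) const = 0;
};

struct MidLayer : ProcessBase {            // the layer to be eliminated
    double StepLength(double energy) const override {
        return Scale() * energy;           // second virtual call per step
    }
    virtual double Scale() const = 0;
};

struct ConcreteOld : MidLayer {
    double Scale() const override { return 0.5; }
};

// After: the middle layer is folded into the concrete class and the helper
// is non-virtual, so the compiler is free to inline it.
struct ConcreteNew final : ProcessBase {
    double StepLength(double energy) const override {
        return Scale() * energy;
    }
    double Scale() const { return 0.5; }   // plain member function
};

// Callers keep using the base-class interface in both versions.
double Step(const ProcessBase& process, double energy) {
    assert(energy >= 0.0);
    return process.StepLength(energy);
}
```

Both versions return the same value through the base-class interface; whether the flattened version is measurably faster would have to be checked by profiling.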

We need to know what needs to be reimplemented if
G4VContinuousDiscreteProcess and G4VDiscreteProcess are removed:
there is a flag saying which is which; this needs to be checked

Boyana will try to find out which derived class caused G4VEmProcess
to be so prominent in the profiling reports

Andrea: using the random number generators' array interface seems 10% faster

The count fraction of G4SampleFluctuations goes down after switching to
the array interface, but the global change is negligible

It is very frequently used, ~50k/event (in a loop of random length)

We should use the array interface if possible (as it can allow for more
future optimizations)

We should suggest that CLHEP use the array interface internally
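
The difference between the scalar and the array interface can be shown in a minimal sketch. SketchEngine is a hypothetical stand-in built on std::mt19937, not the CLHEP engine; the method names flat and flatArray only mirror the CLHEP naming, and no timing claim is implied.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical stand-in for a random engine; NOT the CLHEP implementation.
class SketchEngine {
public:
    explicit SketchEngine(unsigned seed) : gen_(seed) {}

    // Scalar interface: one uniform number in [0,1) per call.
    double flat() { return next(); }

    // Array interface: fill n numbers in a single call, so any per-call
    // overhead is paid once per batch instead of once per number.
    void flatArray(std::size_t n, double* out) {
        assert(out != nullptr || n == 0);
        for (std::size_t i = 0; i < n; ++i) out[i] = next();
    }

private:
    // Map the 32-bit generator output to [0,1).
    double next() { return gen_() * (1.0 / 4294967296.0); }

    std::mt19937 gen_;
};

// Sum n uniform deviates drawn one at a time.
double SumScalar(SketchEngine& engine, std::size_t n) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) sum += engine.flat();
    return sum;
}

// Same sum, but the deviates are drawn in one batch.
double SumArray(SketchEngine& engine, std::size_t n) {
    std::vector<double> buffer(n);
    engine.flatArray(n, buffer.data());
    double sum = 0.0;
    for (double v : buffer) sum += v;
    return sum;
}
```

With the same seed both functions consume the same sequence and return the same sum in this sketch. When the loop length is itself random, the batch size has to be known (or bounded) before the call, which the sketch simply assumes.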

We should suggest caching the pointer to the random number generator singleton
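
A minimal sketch of what caching the singleton pointer means in a hot loop. SketchRandom, its simple generator, and the lookup counter are hypothetical and exist only for illustration; this is not the CLHEP or Geant4 accessor.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical singleton standing in for a global random-engine accessor.
class SketchRandom {
public:
    static SketchRandom* Instance() {
        ++LookupCount();                   // count accessor calls (sketch only)
        static SketchRandom instance;
        return &instance;
    }

    // Number of times Instance() has been called so far.
    static std::size_t& LookupCount() {
        static std::size_t count = 0;
        return count;
    }

    // Uniform number in [0,1) from a simple 64-bit linear congruential
    // generator; good enough for the sketch, not for physics.
    double flat() {
        state_ = state_ * 6364136223846793005ULL + 1442695040888963407ULL;
        return static_cast<double>(state_ >> 11) * (1.0 / 9007199254740992.0);
    }

private:
    SketchRandom() = default;
    unsigned long long state_ = 1ULL;
};

// Looks up the singleton on every iteration.
double SumUncached(std::size_t n) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += SketchRandom::Instance()->flat();
    }
    return sum;
}

// Looks up the singleton once and reuses the pointer inside the loop.
double SumCached(std::size_t n) {
    SketchRandom* const rng = SketchRandom::Instance();
    assert(rng != nullptr);
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += rng->flat();
    }
    return sum;
}
```

In the sketch the cached version calls the accessor once per loop instead of once per random number; how much this matters in practice depends on what the real accessor does.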
Soon will put G4Physics2DVector info on the wiki

Performance of GXPhysics2DVector: total of 4096*32 invocations of Value(x,y)
--------------------------------------------------------------------------------
input (tracks)   double  double  double  double  float   float   float   float
table (phys2d)   double  float   float   double  float   float   double  double
calcu (return)   double  double  float   float   float   double  double  float
--------------------------------------------------------------------------------
CPU[ms]          14.352  16.177  14.060  15.730  12.175  13.523  13.038  14.396
GPU[ms]           0.377   0.333   0.312   0.390   0.252   0.271   0.315   0.349
CPU/GPU          38.057  48.571  45.086  40.289  48.314  49.985  49.985  41.303
--------------------------------------------------------------------------------
CPU: AMD Opteron(tm) Processor 6136 (800x4 MHz)
GPU: NVIDIA M2090 with <<<NBlocks,NThreads>>> = <<<32,128>>> thread organization
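
The three precision choices in the table (input, table, calculation/return) can be expressed as template parameters. The sketch below is a CPU-only illustration under stated assumptions: Sketch2DVector is a hypothetical class, not the G4Physics2DVector or GXPhysics2DVector API; it assumes bilinear interpolation on a regular grid with nodes at integer coordinates, and it treats the calculation and return precision as one type, as the row label suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical table class, not the G4Physics2DVector/GXPhysics2DVector API.
// Table = storage precision of the table ("table" row).
template <typename Table>
class Sketch2DVector {
public:
    Sketch2DVector(std::size_t nx, std::size_t ny)
        : nx_(nx), ny_(ny), data_(nx * ny) {
        assert(nx >= 2 && ny >= 2);        // need at least one cell
    }

    void Put(std::size_t i, std::size_t j, Table v) { data_[j * nx_ + i] = v; }

    // Bilinear interpolation with grid nodes at integer coordinates.
    // In  = precision of the input coordinates ("input" row),
    // Ret = precision of the arithmetic and the result ("calcu" row).
    template <typename Ret, typename In>
    Ret Value(In x, In y) const {
        const Ret fx = Clamp(static_cast<Ret>(x), static_cast<Ret>(nx_ - 1));
        const Ret fy = Clamp(static_cast<Ret>(y), static_cast<Ret>(ny_ - 1));
        // Lower-left corner of the cell, kept inside the grid.
        std::size_t i = static_cast<std::size_t>(fx);
        std::size_t j = static_cast<std::size_t>(fy);
        if (i > nx_ - 2) i = nx_ - 2;
        if (j > ny_ - 2) j = ny_ - 2;
        const Ret tx = fx - static_cast<Ret>(i);
        const Ret ty = fy - static_cast<Ret>(j);
        const Ret one = static_cast<Ret>(1);
        return (one - tx) * (one - ty) * At<Ret>(i, j) +
               tx * (one - ty) * At<Ret>(i + 1, j) +
               (one - tx) * ty * At<Ret>(i, j + 1) +
               tx * ty * At<Ret>(i + 1, j + 1);
    }

private:
    // Clamp v to the range [0, hi].
    template <typename Ret>
    static Ret Clamp(Ret v, Ret hi) {
        return v < Ret(0) ? Ret(0) : (v > hi ? hi : v);
    }

    // Read one table entry, converted to the calculation precision.
    template <typename Ret>
    Ret At(std::size_t i, std::size_t j) const {
        return static_cast<Ret>(data_[j * nx_ + i]);
    }

    std::size_t nx_, ny_;
    std::vector<Table> data_;
};
```

For example, Sketch2DVector<float> with Value<double>(x, y) called on float arguments would correspond to the float/float/double column of the table. The sketch says nothing about the relative timings, which depend on the hardware and on the real implementation.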

Boyana added vectorization reports to the wiki (look for "WAS VECTORIZED")

Krzysztof will contact Vladimir regarding the talk at the Geant4
collaboration workshop

Next meeting in 2 weeks: Sept 6th 10am Central