Geant4 EM Physics Code Review Weekly Phone Meeting

Meeting

Date: August 2, 2013 at 10:00 AM US CDT (Fermilab local time)
Place: 1-866-740-1260 (ReadyTalk line) (Host : Boyana Norris)

Participants:

Andrea Dotti
Krzysztof Genser
Boyana Norris
Soon Yung Jun

Discussion (Summary by Krzysztof)

We had another phone meeting today (2013 08 02) to continue the review
of G4PhysicsVector code.

Boyana: switched to 9.6.ref07, then could not reproduce the ~1.5%
improvement seen with 9.6.ref06, working on the details
(including getting FNAL Kerberos credentials)

Krzysztof: working with 9.6.ref07:

Introduced a new helper class G4xyd (G4xyd.cc) with three G4double data
members, operator<, and c'tors: () and (G4double x, G4double y, G4double d)

Then modified:

G4PhysicsVector class replacing

G4PVDataVector dataVector; // Vector to keep the crossection/energyloss
G4PVDataVector binVector; // Vector to keep energy
G4PVDataVector secDerivative; // Vector to keep second derivatives

with

std::vector<G4xyd> comboVector;

initialized all data members in the c'tors, removed the copy c'tor and operator=,
used FillSecondDerivatives in ScaleVector, and used ?: in Interpolation
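
For illustration only, the data-layout change described above could look
roughly like the sketch below. The meaning assigned to each member (x = energy,
y = cross section/energy loss, d = second derivative), the comparison on x
only, the stand-alone G4double typedef, and the MakeComboVector helper are all
assumptions made for this sketch; the actual ref07a code is not reproduced in
these notes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for Geant4's G4double so the sketch compiles on its own.
typedef double G4double;

// Hypothetical layout of the helper: one node of the physics vector.
struct G4xyd {
  G4double x;  // assumed: energy (formerly an element of binVector)
  G4double y;  // assumed: cross section / energy loss (formerly dataVector)
  G4double d;  // assumed: second derivative (formerly secDerivative)

  G4xyd() : x(0.0), y(0.0), d(0.0) {}
  G4xyd(G4double ax, G4double ay, G4double ad) : x(ax), y(ay), d(ad) {}

  // Assumed ordering: compare on energy only.
  bool operator<(const G4xyd& rhs) const { return x < rhs.x; }
};

// Hypothetical helper: pack three parallel vectors into one vector of nodes,
// so that x, y and d of the same node sit next to each other in memory.
std::vector<G4xyd> MakeComboVector(const std::vector<G4double>& bins,
                                   const std::vector<G4double>& data,
                                   const std::vector<G4double>& secDeriv)
{
  assert(bins.size() == data.size() && bins.size() == secDeriv.size());
  std::vector<G4xyd> combo;
  combo.reserve(bins.size());
  for (std::size_t i = 0; i < bins.size(); ++i) {
    combo.push_back(G4xyd(bins[i], data[i], secDeriv[i]));
  }
  return combo;
}
```

The point of the single vector is that an interpolation step reads x, y and d
for neighbouring nodes from adjacent memory instead of from three separate
arrays.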

Then modified

G4LPhysicsFreeVector.cc
G4LPhysicsFreeVector.icc
G4PhysicsFreeVector.cc
G4PhysicsLinearVector.cc
G4PhysicsLnVector.cc
G4PhysicsLogVector.cc
G4PhysicsLogVector.hh
G4PhysicsOrderedFreeVector.cc
G4PhysicsOrderedFreeVector.icc
G4PhysicsVector.cc
G4PhysicsVector.hh
G4PhysicsVector.icc

accordingly, plus replaced the hand-coded binary search in
G4LPhysicsFreeVector::FindBinLocation with std::lower_bound
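
A minimal sketch of what a std::lower_bound based bin lookup can look like is
given below. It is written as a free function over a vector of nodes; the
node layout, the ordering on x, and the clamping at both ends (return the index
of the lower edge of the bin, limited to the range 0 .. size-2) are assumptions
for this sketch and are not taken from the ref07a sources.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for Geant4's G4double so the sketch compiles on its own.
typedef double G4double;

// Hypothetical node type, ordered by energy x.
struct G4xyd {
  G4double x, y, d;
  G4xyd(G4double ax, G4double ay, G4double ad) : x(ax), y(ay), d(ad) {}
  bool operator<(const G4xyd& rhs) const { return x < rhs.x; }
};

// Return the index i of the lower edge of the bin containing theEnergy,
// i.e. v[i].x < theEnergy <= v[i+1].x, clamped to [0, v.size() - 2].
// The vector must be sorted by x and hold at least two nodes.
std::size_t FindBinLocation(const std::vector<G4xyd>& v, G4double theEnergy)
{
  assert(v.size() >= 2);
  // First node whose x is not less than theEnergy (binary search).
  std::vector<G4xyd>::const_iterator it =
      std::lower_bound(v.begin(), v.end(), G4xyd(theEnergy, 0.0, 0.0));
  const std::size_t idx = static_cast<std::size_t>(it - v.begin());
  if (idx == 0) return 0;                    // at or below the first node
  if (idx >= v.size()) return v.size() - 2;  // above the last node
  return idx - 1;
}
```

With nodes at x = 1, 2, 4, 8, a lookup at 3 gives bin 1, a lookup at 2 gives
bin 0 (the upper edge belongs to the bin), and lookups below 1 or above 8 are
clamped to bins 0 and 2 respectively.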

Soon profiled this modified version (let's call it ref07a) using
the standard Geant4 profiling methodology and saw a ~1-2% improvement
over ref07; see: https://oink.fnal.gov/perfanalysis/g4p/

A comment regarding reproducibility of results: the final random numbers
for modified and unmodified (Geant4) SimplifiedCalo runs were the same
even when using the simple profiler.

The printout of the track parameters in the cmsExp case was also the same
for modified and unmodified Geant4. However, this was not the case when
running cmsExp under the (simple) profiler, indicating a problem with the
cmsExp executable.

Valgrind reports many uninitialized variables and memory problems in cmsExp.

Discussion of other points:

We shall look briefly at the G4Physics2DVector classes to be able to
comment on the class structure etc., but we shall not attempt any code
modifications there.

We shall then review the G4VEmProcess class and afterwards decide which one to
look at next.

Regarding the review report:

We shall divide it into strong recommendations, soft recommendations, and
an informational part (where we may, e.g., suggest using the Intel MKL
(Math Kernel Library) that Boyana mentioned).

Soon will create a new project/repository in Redmine (done)
https://cdcvs.fnal.gov/redmine/projects/g4emreview

Krzysztof will send info on the simple profiler (done)
https://cdcvs.fnal.gov/redmine/projects/fast

We shall meet again on Friday (10am Central).

Follow up Discussion

  • Krzysztof : link to the profiler code
    Here is the link to the profiler code etc.:
    
    https://cdcvs.fnal.gov/redmine/projects/fast
    
    Start from:
    
    https://cdcvs.fnal.gov/redmine/attachments/5220/README.pdf
    
    the tarball is in:
    https://cdcvs.fnal.gov/redmine/projects/fast/files
    
    Marc promised to make the code mods this afternoon; until he does, the
    required change(s) are in:
    
    SimpleProfiler/SimpleProfiler.h
    
    #include "CommonTypedefs.h" 
    #include <vector>
    #include <stdint.h>
    #include <sys/time.h>   // <------- the required change
    #include <string>
    #include <pthread.h>
    #include <signal.h>
    #include <stdio.h>
    #include "Defines.h" 
    
    one runs it like
    
    profrun ./cmsExp run_eGamma.g4
    
    Look at profdata_nnn_nnn_mmm_names for function ids (first column) and their
    fractional times etc...
    
    to make call graphs do e.g. :
    
    profgraph -n profdata_nnn_nnn_mmm fid 50 50
    
    (each program prints help if run with no arguments)
    
    the manual explains the fields in the profdata_nnn_nnn_mmm_names file
    
  • Soon : profiling result with 9.6.ref07a
Just for your information, the result of profiling with the update by Krzysztof
is available at

https://oink.fnal.gov/perfanalysis/g4p/

see the item 9.6.r07a under "8) Other Test Results", which can be directly
compared to "9.6.r07" under "2) Profiling Results".

As a short summary, the CPU changes compared to 9.6.ref07 are listed below.

# 9.6.r07a        9.6.r07   (r07a-r07)/r07 [%]  sample (particle.physics.energy.magnetic field)
-----------------------------------------------------------------------------------------------
  374.4500       380.7700  -1.65979    higgs.FTFP_BERT.1400.4
    0.0185         0.0189  -2.1164     e-.FTFP_BERT.1.0
    0.0926         0.0952  -2.73109    e-.FTFP_BERT.5.0
    0.1858         0.1902  -2.31335    e-.FTFP_BERT.10.0
    0.9131         0.9388  -2.73754    e-.FTFP_BERT.50.0
    0.0241         0.0247  -2.42915    e-.FTFP_BERT.1.4
    0.1211         0.1240  -2.33871    e-.FTFP_BERT.5.4
    0.2434         0.2488  -2.17042    e-.FTFP_BERT.10.4
    1.2006         1.2293  -2.33466    e-.FTFP_BERT.50.4
    0.0320         0.0319  0.31348     pi-.FTFP_BERT.1.0
    0.1398         0.1396  0.143266    pi-.FTFP_BERT.5.0
    0.2680         0.2676  0.149477    pi-.FTFP_BERT.10.0
    1.2116         1.2185  -0.56627    pi-.FTFP_BERT.50.0
    0.0356         0.0357  -0.280112   pi-.FTFP_BERT.1.4
    0.1598         0.1599  -0.0625391  pi-.FTFP_BERT.5.4
    0.3085         0.3098  -0.419626   pi-.FTFP_BERT.10.4
    1.4261         1.4405  -0.999653   pi-.FTFP_BERT.50.4
    0.0356         0.0357  -0.280112   pi-.QGSP_BERT.1.4
    0.1561         0.1575  -0.888889   pi-.QGSP_BERT.5.4
    0.3048         0.3064  -0.522193   pi-.QGSP_BERT.10.4
    1.3749         1.3916  -1.20006    pi-.QGSP_BERT.50.4
    0.0321         0.0322  -0.310559   pi-.QGSP_BIC.1.4
    0.1504         0.1508  -0.265252   pi-.QGSP_BIC.5.4
    0.2905         0.2921  -0.547758   pi-.QGSP_BIC.10.4
    1.3391         1.3536  -1.07122    pi-.QGSP_BIC.50.4
    0.0841         0.0846  -0.591017   anti_proton.FTFP_BERT.1.4
    0.2163         0.2167  -0.184587   anti_proton.FTFP_BERT.5.4
    0.3617         0.3638  -0.57724    anti_proton.FTFP_BERT.10.4
    1.5070         1.5165  -0.626442   anti_proton.FTFP_BERT.50.4
    0.0272         0.0271  0.369004    proton.FTFP_BERT.1.4
    0.1629         0.1627  0.122926    proton.FTFP_BERT.5.4
    0.3159         0.3161  -0.0632711  proton.FTFP_BERT.10.4
    1.4779         1.4839  -0.40434    proton.FTFP_BERT.50.4
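
The third column can be reproduced from the first two. The sketch below assumes
only the formula given in the table header, (r07a-r07)/r07, expressed in
percent; the function name is made up for this example.

```cpp
#include <cassert>

// Relative CPU change of the modified build with respect to the reference
// build, in percent. Negative values mean the modified build is faster.
double RelativeChangePercent(double modified, double reference)
{
  assert(reference != 0.0);
  return 100.0 * (modified - reference) / reference;
}
```

For the higgs.FTFP_BERT.1400.4 row, RelativeChangePercent(374.45, 380.77)
gives about -1.66, matching the table.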