Driven by the needs of the HEP scientific program for better physics and increased time/memory performance, we would like to focus on a Geant4 redesign that takes full advantage of new computing technology.
- Profiling tools and protocols applied to relevant Geant4 simulation applications.
- Framework driving parallel processing including event-level parallelism.
- Geant4 high level architecture re-design and implementation.
Our understanding is that Geant4 has plans for re-engineering, aimed at performance improvement and at efficient use of multi-core and many-core hardware.
- How pervasive is the effort? How much change is allowed in internals, and in user interface?
- What is the organization of the re-engineering project? Who is the leader, which institutions are involved, and how is the work subdivided?
- What are SLAC's and CERN's roles at the moment?
- What is the conclusion of CERN's R&D studies (Rene's efforts)?
Please see Geant4ReengineeringIdeas
Sequential performance of Geant4 code remains important. Measuring the performance of realistic experiment codes is not easy, because substantial "hotspots" have usually been detected and removed before we become involved in the project.
We are interested in learning what capabilities the PERI tools provide, since we do not have deep expertise in their use.
We use our own tool (the FAST profiler) to sample full call-stack information for single-threaded applications, and to perform statistical analysis on the collected data. We have found that full call-stack information is important in understanding large applications that involve complicated libraries, because the performance characteristics of a library routine often depend on the context from which it is called. We have found statistical analysis to be necessary because performance analysis is mostly a data exploration exercise.
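To illustrate why calling context matters, here is a minimal sketch that aggregates sampled call stacks in two ways: a flat count per leaf function, and a count per (caller, leaf) pair. The `CallStack` type, the function names `flatProfile` and `contextProfile`, and the representation of a sample as a vector of strings are assumptions made for this example; they do not describe the FAST profiler's actual data format or API.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical representation: one sample is the call stack at the moment
// of sampling, outermost frame first, innermost (leaf) frame last.
using CallStack = std::vector<std::string>;

// Flat profile: count samples per leaf function, ignoring who called it.
std::map<std::string, int> flatProfile(const std::vector<CallStack>& samples) {
    std::map<std::string, int> counts;
    for (const auto& stack : samples) {
        if (!stack.empty()) {
            ++counts[stack.back()];
        }
    }
    return counts;
}

// Context profile: count samples per "caller -> leaf" pair, so the same
// library routine is reported separately for each immediate caller.
std::map<std::string, int> contextProfile(const std::vector<CallStack>& samples) {
    std::map<std::string, int> counts;
    for (const auto& stack : samples) {
        if (stack.empty()) {
            continue;
        }
        const std::size_t n = stack.size();
        const std::string caller = (n >= 2) ? stack[n - 2] : "<root>";
        ++counts[caller + " -> " + stack.back()];
    }
    return counts;
}
```

In the flat view, a routine such as a math function shows up as a single total; the context view splits that total by caller, which is the kind of distinction that only full call-stack sampling makes possible. A real analysis would likely key on the whole stack rather than only the immediate caller.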
Code transformation tools
The PERI project includes ROSE, a "source-to-source" translation tool. Here are some ways in which such tools might be of use in our involvement with Geant4:
1. ROSE would be useful for analysis of code; ROSE understands C++ at a much deeper level than do text-manipulation tools (e.g. grep)
2. ROSE might be useful in helping to generate tests; the PERI binary instrumentation tools might be useful in this as well. Testing is a critically weak spot.
3. It might be worth investigating the use of ROSE to generate code for things like translating between array-of-struct and struct-of-array formats, which are of interest for both vectorized instruction use and GPU programming.
4. ROSE might be helpful in automated analysis of the code, to augment the analysis of the performance results. For example, we might be able to use ROSE to enhance our "metadata" about functions the profiler observes. Right now we have only function names and addresses. It would be great to be able to identify functions as belonging to specific types of classes, as taking certain kinds of arguments, as being instantiated from a certain template, etc. Parsing names to do this is very hard, but this should be trivial for ROSE.
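As an illustration of item 3, the following is a hand-written sketch of the kind of conversion code that a source-to-source tool might be asked to generate. The `Particle` struct, its fields, and the `toSoA` / `toAoS` helpers are hypothetical and are not taken from Geant4 or produced by ROSE; they only show the two layouts and a round trip between them.

```cpp
#include <cstddef>
#include <vector>

// Array-of-structs (AoS) layout: one record per particle.
// The fields are hypothetical, chosen only for illustration.
struct Particle {
    double x, y, z;
    double energy;
};

// Struct-of-arrays (SoA) layout: one contiguous array per field,
// which is the form usually preferred for vector instructions and GPUs.
struct ParticleSoA {
    std::vector<double> x, y, z, energy;
    std::size_t size() const { return x.size(); }
};

// Convert AoS -> SoA by copying each field into its own array.
ParticleSoA toSoA(const std::vector<Particle>& aos) {
    ParticleSoA soa;
    soa.x.reserve(aos.size());
    soa.y.reserve(aos.size());
    soa.z.reserve(aos.size());
    soa.energy.reserve(aos.size());
    for (const Particle& p : aos) {
        soa.x.push_back(p.x);
        soa.y.push_back(p.y);
        soa.z.push_back(p.z);
        soa.energy.push_back(p.energy);
    }
    return soa;
}

// Convert SoA -> AoS by reassembling one record per index.
std::vector<Particle> toAoS(const ParticleSoA& soa) {
    std::vector<Particle> aos;
    aos.reserve(soa.size());
    for (std::size_t i = 0; i < soa.size(); ++i) {
        aos.push_back(Particle{soa.x[i], soa.y[i], soa.z[i], soa.energy[i]});
    }
    return aos;
}
```

Writing this by hand for every data type is tedious and error-prone, since each new field must be added in four places; that repetition is what makes generated code attractive here.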