Status meeting April 19th, 2012
Update from Soon.
With the code now complete on both CPU and GPU: when shooting a 100Ge ion, with 100,000s of secondary tracks, I see 50% of the tracks need more than one RK (Runge-Kutta) try.
In the case of 100,000s of tracks, the speed-up is 37 times excluding upload/download time and 35 times including upload/download.
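As a rough reading of these two numbers, and assuming both speed-ups are measured against the same CPU time, the upload/download time relative to the GPU compute time follows from their ratio. The helper name `transferOverhead` is illustrative only and is not taken from the actual code.

```cpp
#include <cassert>

// Illustrative helper (hypothetical, not part of the actual code base).
// With S_excl = T_cpu / T_gpu and S_incl = T_cpu / (T_gpu + T_transfer),
// the transfer time relative to the GPU compute time is S_excl / S_incl - 1.
double transferOverhead(double speedupExcl, double speedupIncl) {
    assert(speedupIncl > 0.0 && speedupExcl >= speedupIncl);
    return speedupExcl / speedupIncl - 1.0;
}
```

For the figures quoted above (37 and 35) this gives about 0.057, i.e. upload/download costs roughly 6% of the GPU compute time under that assumption.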
In the case of p > 100 MeV, where almost no track needs more than one try, the speed-up is 90 times.
Also studied the effect of photons: converting 15% of the photons into 'electrons' gains around 20% (so the speed-up is 45 times including upload/download).
Soon started looking at the code that calculates the step size, which in the end needs to calculate the distance from the current point to the next edge/surface.
Question: Should we investigate the Bulirsch-Stoer integration method, which is slower but more accurate than RK and does not require retries (i.e. better for the GPU since there are no conditional statements/loops)?
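To make the retry issue concrete, here is a minimal sketch of an RK step with retries, assuming a scalar ODE and a simple step-doubling error estimate. This is not the Geant4 stepper; the names `rk4Step`, `adaptiveStep` and `StepResult`, and the cap of 50 tries, are hypothetical choices for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

using Deriv = std::function<double(double, double)>;

// One classical 4th-order Runge-Kutta step of size h for y' = f(t, y).
double rk4Step(const Deriv& f, double t, double y, double h) {
    double k1 = f(t, y);
    double k2 = f(t + 0.5 * h, y + 0.5 * h * k1);
    double k3 = f(t + 0.5 * h, y + 0.5 * h * k2);
    double k4 = f(t + h, y + h * k3);
    return y + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0;
}

struct StepResult {
    double y;     // accepted value at t + hUsed
    double hUsed; // step size that was finally accepted
    int tries;    // number of attempts (1 means no retry was needed)
};

// Step-doubling error control: compare one full step with two half steps
// and halve h until the difference is below tol (or 50 tries are used up).
StepResult adaptiveStep(const Deriv& f, double t, double y, double h, double tol) {
    assert(h > 0.0 && tol > 0.0);
    for (int tries = 1; ; ++tries) {
        double full = rk4Step(f, t, y, h);
        double mid  = rk4Step(f, t, y, 0.5 * h);
        double half = rk4Step(f, t + 0.5 * h, mid, 0.5 * h);
        if (std::fabs(full - half) <= tol || tries >= 50) {
            return {half, h, tries};
        }
        h *= 0.5; // data-dependent retry: the trip count differs per track
    }
}
```

The number of loop iterations depends on the input, so tracks processed side by side on the GPU can need different numbers of tries; that is the conditional behaviour the question refers to.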
Question: Should we pursue implementing the linear transportation of photons, which requires information about the geometry? Maybe we can focus on just the CMS crystals (i.e. reduce the geometry). The only question is then what the distance to the 'next' geometry boundary is. Geant4 has several geometry modes (voxel, parametrized, etc.), and there are still many shapes.
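For the simplest possible shape, the distance-to-boundary calculation for a straight-line track can be sketched as below. This assumes an axis-aligned box centred at the origin and a starting point inside it; real crystal shapes and the Geant4 navigation are more involved. `Vec3` and `distanceToOut` are illustrative names, not existing code.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

struct Vec3 { double x, y, z; };

// Distance from point p, inside a box centred at the origin with half-lengths
// `half`, to the box surface along the unit direction d (straight line).
double distanceToOut(const Vec3& p, const Vec3& d, const Vec3& half) {
    const double pos[3] = {p.x, p.y, p.z};
    const double dir[3] = {d.x, d.y, d.z};
    const double hl[3]  = {half.x, half.y, half.z};
    double dist = std::numeric_limits<double>::infinity();
    for (int i = 0; i < 3; ++i) {
        assert(std::fabs(pos[i]) <= hl[i]); // the point must start inside the box
        if (dir[i] > 0.0) {
            dist = std::fmin(dist, (hl[i] - pos[i]) / dir[i]);   // +face on axis i
        } else if (dir[i] < 0.0) {
            dist = std::fmin(dist, (-hl[i] - pos[i]) / dir[i]);  // -face on axis i
        }
        // dir[i] == 0: the track never reaches the faces along this axis
    }
    return dist;
}
```

The result is the smallest of the per-axis distances to the face the track is heading towards; a direction with all components zero returns infinity.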
Update from Philippe
Implemented the use of CUDA streams, but not yet able to assess the performance gain (the gain seems to be in the same range as the accuracy of the timings).
Update on MagErrorStepper CUDA implementation (Philippe).
One additional discovery about the use of GPU texture memory: most of the reported gain seems to be due to memory latency (i.e. a factor of two compared to doing the explicit calculation using an array in main GPU memory). This shows up clearly in the standalone test: the factor of 2 exists if the input is random; if instead the input is ordered, the difference is negligible.
This was noticed because, in the example using the full MagErrorStepper::Stepper and 'real' data, there was no noticeable difference between using the texture and not using it. This is seemingly because the data is somewhat ordered and MagErrorStepper::Stepper explicitly does a dozen very close-by lookups.