Dec 2012 NQS Session


  1. Make some progress on parallel i/o: use HDF5 to read and write event data, ignoring as much metadata as we can.
  2. Make some progress on parallel versions of the Hough algorithm: CUDA and TBB.

We are not planning on doing an OpenMP solution for the Hough algorithm.

We are ignoring the MPI i/o facilities of HDF5.

We will still need ROOT dictionaries for whatever is used in the product-finding code.

Data products

We have both SOA and AOS versions of the data products for this task. We have the products:

  1. Hit and Hits
  2. Track and Tracks
  3. Found2DTrack and Found2DTracks
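The AOS/SOA distinction above can be sketched as follows; the Hit fields (x, y, t) are illustrative, since the notes don't specify the actual layout of the products:

```cpp
#include <cstddef>
#include <vector>

// AOS: one struct per hit; Hits is simply a vector of Hit.
struct Hit {
  double x, y, t;  // position and time -- illustrative fields only
};
using Hits = std::vector<Hit>;

// SOA: one vector per field; HitsSOA holds parallel arrays,
// which maps naturally onto per-column HDF5 datasets.
struct HitsSOA {
  std::vector<double> x, y, t;
  std::size_t size() const { return x.size(); }
  void push_back(double xi, double yi, double ti) {
    x.push_back(xi);
    y.push_back(yi);
    t.push_back(ti);
  }
};

// Convert AOS -> SOA, e.g. to stage data for columnar writes.
inline HitsSOA to_soa(Hits const& hits) {
  HitsSOA s;
  for (Hit const& h : hits) s.push_back(h.x, h.y, h.t);
  return s;
}
```

The SOA form is also the friendlier layout for the CUDA and TBB versions of the Hough code, since each field is contiguous in memory.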

HDF5 tasks

We can hand-code the i/o code for the current data products. We should also watch for (or consider writing) a tool that generates the read/write code from a data definition language (DDL); one candidate DDL is Google's protocol buffers.

Note what dependencies we have upon ROOT for our event queries. Also note any use of Reflex, because it has gone away.

  1. Deal with persisting the SOA version of data first
  2. Use the Source class template to write an input module that reads HDF5
  3. Write an output module that writes HDF5. We will only have one active output module (at least at a time); we don't care about parallel i/o now. Eventually we will want to have two module instances writing to the same file from different threads. We don't need to do this now!
  4. We only have to have getByLabel working; no other event-searching functions need to work (probably can skip all history information)
  5. Use HDF5 to save the ParameterSet used to configure the program (top-level object, and all its contents)

Hough algorithm tasks

We can identify several steps to the algorithm.

  1. Filtering of hits:
    1. Clustering of hits in time (may be done by histogram binning)
    2. Identification of the subset of clusters that are interesting (signal clusters vs. noise clusters)
    3. Tagging the hit data to indicate which hits belong to each cluster; possibly extract the clusters or hits as data products (which may or may not be event data products)
  2. Hough transform:
    1. Generate all pairs of hits within each cluster
    2. Calculate Hough parameters for each pair of hits
    3. Fill the Hough space histogram
    4. Identify peaks in the Hough space histogram
    5. Identify the hits which contribute to each peak; these are our track candidates
    6. Calculate the track parameters from the hits
    7. Calculate residuals for hits from the found track
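Steps 2.1 through 2.4 can be sketched as below. The (theta, rho) line parameterization and the flat-histogram binning are assumed choices for illustration; the real code will need binning tuned to the detector geometry:

```cpp
#include <cmath>
#include <utility>
#include <vector>

struct Pt { double x, y; };  // illustrative 2-D hit

const double kPi = 3.14159265358979323846;

// Step 2.2: (theta, rho) of the line through hits a and b, with theta
// the angle of the line's normal, folded into [0, pi).
inline void hough_params(Pt a, Pt b, double& theta, double& rho) {
  theta = std::atan2(b.x - a.x, a.y - b.y);
  if (theta < 0) theta += kPi;
  rho = a.x * std::cos(theta) + a.y * std::sin(theta);
}

// Steps 2.1, 2.3, 2.4: loop over all hit pairs, fill a (theta, rho)
// histogram, and return the center of the single highest bin.
// rho_max bounds |rho|; nt x nr is the (assumed flat) binning.
inline std::pair<double, double> hough_peak(std::vector<Pt> const& hits,
                                            double rho_max, int nt, int nr) {
  std::vector<int> histo(nt * nr, 0);
  for (std::size_t i = 0; i < hits.size(); ++i)
    for (std::size_t j = i + 1; j < hits.size(); ++j) {
      double theta, rho;
      hough_params(hits[i], hits[j], theta, rho);
      int it = std::min(nt - 1, int(theta / kPi * nt));
      int ir = int((rho + rho_max) / (2 * rho_max) * nr);
      if (ir >= 0 && ir < nr) ++histo[it * nr + ir];
    }
  int best = 0;
  for (int k = 1; k < nt * nr; ++k)
    if (histo[k] > histo[best]) best = k;
  double theta_c = (best / nr + 0.5) * kPi / nt;
  double rho_c = (best % nr + 0.5) * 2 * rho_max / nr - rho_max;
  return {theta_c, rho_c};
}
```

The per-pair loop body is independent of all others, which is what makes steps 2.1-2.3 the natural target for the CUDA and TBB versions; steps 2.5-2.7 then run over only the hits attached to each peak.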

Our immediate goal is an adequate, not optimal, solution, engineered sufficiently well that it can be upgraded.
We'll implement an algorithm first in C++, and then in CUDA and using TBB.
Comparison of the found tracks to the MC truth information will be done externally to this code.