Rob's Outline

Introduction

I envisage that a person using the workbook will

  1. Go to the workbook web/wiki page.
  2. Log in to a GPCF machine.
  3. Follow the instructions to establish the working environment for the exercises - probably just a ups setup command.
  4. Follow the instructions on the web/wiki page to check out the workbook code from git.
  5. Follow the instructions to build and execute Exercise 1; do the suggested exercises for Exercise 1.
  6. Repeat the previous step for the remaining Exercises.
  7. Follow the links to more complete documentation whenever that makes sense.

Files prepared for use in the early exercises.

The first few exercises will read in existing files of event-data from the toy detector; these will contain generated, simulated and reconstructed data products.
Part of setting up the environment for the exercises is to ensure that these files are on disk and visible to the person doing the exercises.

The plan is for these files to be present:
  1. 10 events, from run 1, subrun 1.
  2. 20 events; 10 from run 2, subrun 1 plus 10 from run, subrun 1
  3. 2000 events; run 3 - whatever subrun structure makes sense.

The reason for 3 files will become apparent in Exercise 1.

input files

  • inputFiles/input01_data.root - run 1, subrun 0, 10 events
  • inputFiles/input02_data.root - run 2, subrun 0, 10 events
  • inputFiles/input03_data.root - run 3, subrun 0-2, 15 events
  • inputFiles/input04_data.root - run 4, subrun 0, 1000 events

One of the later exercises will be to make these files.

List of Exercises

In the list below the first 6 exercises correspond to what is actually in the art-workbook git repository.

The rest of the list is motivated by the Mu2e Software Workbook, which contains art and Mu2e documentation mixed together. I learned a lot from doing that workshop, so the exercises here are shorter and in a different order than those in the Mu2e workbook; as a result, the numbering scheme below has diverged from the scheme in the Mu2e workbook.

The logic behind this outline is to identify the pieces of this documentation that are really generic art documentation and to move them to the art Workbook. I will then refactor the Mu2e workbook to reference into the art Workbook and art Reference Manual.

Exercise 1

  1. Example code:
    1. A module that prints the eventId for every event. Cookbook instructions on how to build and run.
    2. A .fcl file to run the example.
  2. Activities:
    1. art --help
    2. The .fcl comment character is #, in any column.
    3. Change the number of events to run; via the command line and via the fhicl file.
    4. Change the input file name; via the command line and via the fhicl file.
    5. Concatenate two, then three, input files together; via the command line and via the fhicl file.
    6. Skip the first 5 events.
    7. Skip to a given run/subrun/event.
    8. Change the input filename to the name of a file that does not exist or is in the wrong format; observe the error message.
    9. Make a new module: copy Ex01_module.cc to Ex01a_module.cc; make some modification to mark it as yours; build it; modify the .fcl file; run it.
    10. Make another new module using the artmod script; repeat the exercise.
    11. The source for art::Event is in $ART_INC/art/Framework/Principal/Event.h
    12. Look at it by: less $ART_INC/art/Framework/Principal/Event.h
      1. less is a pager, like more.
      2. Move up and down by typing lower case u and d; exit by typing a lower case q.
    13. Or look at it in a code browser: Event.h
    14. See that the method id() returns an object of type art::EventID.
    15. The source for art::EventID is in $ART_INC/art/Persistency/Provenance/EventID.h
    16. Or look at it with a code browser: EventID.h
    17. Looking at EventID.h, see how to access each of the run, subrun and event numbers.
    18. Modify the code in Ex01_module.cc to print each part of the eventId separately. See the answer in Ex01Activity01_module.cc and ex01Activity01.fcl.
    19. Use an incorrect name for the .fcl file; observe the error message.
    20. Forget the -c on the command line; observe the error message.
    21. In the .fcl file, the string "hello" is called a module label; a lot more about that in Ex05. It is an almost arbitrary string that must match in the two places that it is used; it may contain any letters and numbers and case is significant; the "almost" is that it must not contain any underscore ( _ ) characters.
    22. Change the module label in both places; see that it works; change it in one place and observe the error message.
    23. Change the module_type to something that does not exist and observe the error message.
    24. The name of the path, e1, is an arbitrary string. It must match in the path definition and inside the end_paths list. Change it in both places and observe; change it in one place and observe the error message.
  3. Ideas to discuss
    1. process name: not important for now - just an (almost) arbitrary name; must not contain underscore characters; may contain letters and numbers; case sensitive.
    2. art::Event and art::EventID
    3. Introduce the idea of the three part eventID.
    4. Find the source code for the class EventID; with code browser; using $ART_INC.
    5. What is a module?
    6. Not all C++ classes are modules.
    7. What makes a class a module? Need to mention inheritance but not describe it.
    8. namespaces
    9. Analyzer modules are not permitted to modify the contents of the event; there are other types, to be introduced later, that may.
    10. The source block in the FHiCL file.
    11. The analyzer block in the FHiCL file.
    12. Names that must match: file name XXX_module.cc; class name inside file XXX; module_type in .fcl file is XXX.
    13. The module label is an (almost) arbitrary string unrelated to the XXX in the previous bullet; the name in the module definition and in the path must match.
    14. A module label must not contain underscore characters.
    15. It is possible to cut down ( or increase ) the verbiage produced by art. We recommend that you keep the settings used here; if you reduce it, sooner or later you will wish you had not.
    16. Forward reference to override, but do not describe it.
    17. Look at Ex01/CMakeLists.txt. This file is read by the build system and it tells the build system what to do. It says to find all of the files whose names end in _module.cc and turn them into .so libraries in the directory lib/ in the build area. It uses standard compiler and linker flags (for where to find them and how to modify them see ... ?). The 7 lines between the first and the last are the list of libraries against which the .so will be linked. The link flags require that all external references be resolved at link time.
    18. Now, in the build area, do ls lib/*Ex01*. See that there are files named lib/libartwb_Ex01_Ex01_module.so and lib/libartwb_Ex01_Ex01Activity01_module.so. The first was built from Ex01/Ex01_module.cc and the second from Ex01/Ex01Activity01_module.cc.
    19. At run time, art looks for libraries named *_module.so in all locations in $LD_LIBRARY_PATH. The first element in LD_LIBRARY_PATH is the local lib directory.
    20. Note the mangled names of the .so files. Don't explain. Link to a reference.
    21. In the c'tor, there is no name following the ParameterSet argument. This tells the compiler that the argument needs to be there but that we don't plan to use it. If the name were present, and unused, the compiler would issue a warning and stop compilation. We have the compiler switches set to be maximally picky. An unused argument could signal a real error, for example if we made a typo somewhere in the body of the function and misspelled the name of the argument.
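A minimal sketch of two of the ideas above: the three part event ID and the unnamed c'tor argument. It is plain C++ that does not use art; ToyEventID, ToyParameterSet and ToyModule are hypothetical stand-ins, and the accessors of the real art::EventID should be read from EventID.h.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for art::EventID (not the real class).
class ToyEventID {
public:
  ToyEventID(unsigned run, unsigned subRun, unsigned event)
    : _run(run), _subRun(subRun), _event(event) {}

  unsigned run()    const { return _run; }
  unsigned subRun() const { return _subRun; }
  unsigned event()  const { return _event; }

private:
  unsigned _run;
  unsigned _subRun;
  unsigned _event;
};

// Hypothetical stand-in for a parameter set; it carries no data here.
struct ToyParameterSet {};

class ToyModule {
public:
  // The argument is deliberately unnamed: it must be supplied by the caller,
  // but the body does not use it, so no unused-parameter warning is issued.
  explicit ToyModule(ToyParameterSet const&) {}

  // Format each part of the ID separately.
  std::string describe(ToyEventID const& id) const {
    std::ostringstream os;
    os << "run: " << id.run()
       << " subRun: " << id.subRun()
       << " event: " << id.event();
    return os.str();
  }
};
```

Giving the c'tor argument a name without using it would trigger the warning described above when the compiler is set to be picky about unused parameters.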

Exercise 2

  1. Example code
    1. Ex01 but provide the beginJob, beginRun, beginSubRun with "Hello World" type printout.
  2. Activities
    1. Add the endJob, endRun, endSubRun; solution provided in Ex02Activity01_module.cc and ex02Activity01.fcl
  3. Ideas to discuss
    1. Optional methods of a module; refer to the User's Guide for the file open and file close methods.
    2. art::Run and art::SubRun and their ID classes.
    3. The CMakeLists.txt file is the same as for Ex01.
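A minimal sketch of the idea of optional methods and the order in which they are called. It does not use art: ToyAnalyzer, Recorder and runToyJob are hypothetical names, and the toy driver handles only one run with one subrun.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an analyzer base class: every callback except
// analyze() has an empty default, so a module overrides only what it needs.
class ToyAnalyzer {
public:
  virtual ~ToyAnalyzer() = default;
  virtual void beginJob() {}
  virtual void beginRun(int /*run*/) {}
  virtual void beginSubRun(int /*run*/, int /*subRun*/) {}
  virtual void analyze(int event) = 0;
  virtual void endSubRun(int /*run*/, int /*subRun*/) {}
  virtual void endRun(int /*run*/) {}
  virtual void endJob() {}
};

// A module that records, in order, which callbacks were made.
class Recorder : public ToyAnalyzer {
public:
  std::vector<std::string> calls;

  void beginJob() override            { calls.push_back("beginJob"); }
  void beginRun(int) override         { calls.push_back("beginRun"); }
  void beginSubRun(int, int) override { calls.push_back("beginSubRun"); }
  void analyze(int) override          { calls.push_back("analyze"); }
  void endSubRun(int, int) override   { calls.push_back("endSubRun"); }
  void endRun(int) override           { calls.push_back("endRun"); }
  void endJob() override              { calls.push_back("endJob"); }
};

// Toy driver: run 1, subrun 0, then nEvents events.
inline void runToyJob(ToyAnalyzer& module, int nEvents) {
  module.beginJob();
  module.beginRun(1);
  module.beginSubRun(1, 0);
  for (int event = 1; event <= nEvents; ++event) {
    module.analyze(event);
  }
  module.endSubRun(1, 0);
  module.endRun(1);
  module.endJob();
}
```

Removing any of the overrides from Recorder, other than analyze, still compiles; that is what optional means here.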

Exercise 3

  1. Example code
    1. Get a data product from the event (GenParticleCollection) and print out the size of the data product (number of generated tracks).
    2. Hard code the module label of the creator module.
  2. Activities
    1. Handles can be used as pointers. See Ex03Activity01_module.cc and ex03Activity01.fcl.
    2. ex03Activity02.fcl runs the event dumper. Look at the names of the products in the event. Notice the run scope data products.
    3. Remove the const. Observe the error message.
    4. Misspell the name of the module that created the data product. Observe the error message.
    5. Rewrite the exercise to use the handle as a pointer.
    6. In the source area, art-workbook/artwb, do: diff Ex01/CMakeLists.txt Ex03/CMakeLists.txt. Note that one line has been added, the name of the library that contains the data products.
  3. Ideas to discuss
    1. What is a data product?
    2. Find the source for GenParticleCollection.h and GenParticle.h.
    3. GenParticleCollection is a typedef. Under the covers it is just a std::vector<GenParticle>. Explain why we used a typedef.
    4. art::Handle
    5. const correctness in product access.
    6. Explain the overhead incurred every time that you dereference a handle. Use judgement about when to use which style of access.
    7. Explain the naming scheme of data products - forward reference for the instance name part.
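A minimal sketch of the two styles of access through a handle. It does not use art: ToyHandle, ToyGenParticle and ToyGenParticleCollection are hypothetical stand-ins for art::Handle, GenParticle and GenParticleCollection, and the validity check is only an assumption about what a handle might do on each dereference.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct ToyGenParticle {
  int pdgId;
};

// A typedef gives the collection a name that says what it holds; client code
// need not change if the underlying container is ever replaced.
typedef std::vector<ToyGenParticle> ToyGenParticleCollection;

// Hypothetical stand-in for a handle: a read-only view of a data product.
template <class T>
class ToyHandle {
public:
  explicit ToyHandle(T const* product) : _product(product) {}

  bool isValid() const { return _product != nullptr; }

  // Each dereference repeats the validity check; that is the overhead.
  T const& operator*()  const { assert(isValid()); return *_product; }
  T const* operator->() const { assert(isValid()); return _product; }

private:
  T const* _product;
};

// Style 1: use the handle as a pointer every time.
inline std::size_t sizeViaPointer(ToyHandle<ToyGenParticleCollection> const& handle) {
  return handle->size();
}

// Style 2: dereference once and keep a const reference to the product.
inline std::size_t sizeViaReference(ToyHandle<ToyGenParticleCollection> const& handle) {
  ToyGenParticleCollection const& gens = *handle;
  return gens.size();
}
```

Because the handle only exposes T const, an attempt to bind the product to a non-const reference fails to compile, which is the const correctness point above.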

Exercise 4

  1. Example code
    1. Exercise 3 but get the module label of the product creator from the parameter set
  2. Activities
    1. In the .fcl file, misspell "hitMakerModuleLabel". Observe the error message.
    2. In the .fcl file, remove the line that defines "hitMakerModuleLabel". Observe the error message.
    3. ex04Activity01.fcl defines lots of additional parameters for the module. Extend Ex04_module.cc to read these parameters and print out their values. The answer is given in Ex04Activity01_module.cc.
    4. The files ex04Activity02.fcl and Ex04Activity02_module.cc introduce the idea of optional parameters. Run this example and observe the output. Comment out the definitions of debugLevel and efficency in the .fcl file; rerun and observe the changed printout. Add a line to read parameter b as an int; observe the run-time error.
  3. Ideas to discuss
    1. Use of member data to connect the c'tor and the analyze method.
    2. Leading _ syntax for data members ( most other options OK. Do not use double underscore or underscore capital ).
    3. Initializer syntax for the c'tor.
    4. fhicl::ParameterSet
    5. key : value pairs; value types: atomic, sequence, table
    6. art parameter set = fhicl table
    7. When are quotes required in values? ( embedded spaces or special characters )
    8. Required vs optional parameters; default values.
    9. Why do we want to use parameter sets?
    10. $FHICL_INC and source code browser.
    11. Links to FHiCL documentation
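A minimal sketch of required vs optional parameters, initializer syntax and leading underscore data members. ToyParameterSet is a hypothetical stand-in that stores every value as text; it is not fhicl::ParameterSet, although the one-argument and two-argument get calls are meant to mirror the required and optional forms. The value "makeHits" used in the usage note is made up.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a parameter set: key : value pairs held as text.
class ToyParameterSet {
public:
  void put(std::string const& key, std::string const& value) {
    _values[key] = value;
  }

  // Required parameter: throws if the key is missing.
  template <class T>
  T get(std::string const& key) const {
    auto it = _values.find(key);
    if (it == _values.end()) {
      throw std::runtime_error("missing parameter: " + key);
    }
    return convert<T>(key, it->second);
  }

  // Optional parameter: falls back to the supplied default.
  template <class T>
  T get(std::string const& key, T const& defaultValue) const {
    auto it = _values.find(key);
    return it == _values.end() ? defaultValue : convert<T>(key, it->second);
  }

private:
  // Throws if the text cannot be read as a T (e.g. "hello" as an int).
  template <class T>
  static T convert(std::string const& key, std::string const& text) {
    std::istringstream is(text);
    T value;
    if (!(is >> value)) {
      throw std::runtime_error("cannot convert parameter: " + key);
    }
    return value;
  }

  std::map<std::string, std::string> _values;
};

class ToyModule {
public:
  // Initializer syntax: member data is set before the c'tor body runs.
  explicit ToyModule(ToyParameterSet const& pset)
    : _hitMakerModuleLabel(pset.get<std::string>("hitMakerModuleLabel")),
      _debugLevel(pset.get<int>("debugLevel", 0)) {}

  std::string const& hitMakerModuleLabel() const { return _hitMakerModuleLabel; }
  int debugLevel() const { return _debugLevel; }

private:
  // Member data connects the c'tor to the other member functions.
  std::string _hitMakerModuleLabel;
  int _debugLevel;
};
```

With only hitMakerModuleLabel set to "makeHits", the module is constructed with debugLevel 0; with an empty parameter set the c'tor throws, which is the toy analogue of the error seen when the line is removed from the .fcl file.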

Exercise 5

  1. Example code
    1. This exercise is a small aside to reinforce the idea of module labels. It runs a job in which the same module is run three times, configured differently each time.
    2. ex05.fcl Ex05_module.cc
  2. Activities
    1. Read the .cc and .fcl files. Run the .fcl file; observe the printout and understand why it does what it does.
    2. Reorder the modules in the path e1. Watch the order of printout change.
    3. Change the path to [ label1, label2, label2 ]; i.e. repeat a module label; rerun the job. The duplicated module label is only run once! Why? art presumes that modules do not have side effects and it removes redundant calculations; this may seem weird but it is a feature when your experiment's software suite becomes large and its workflow grows many paths.
    4. Run ex05Activity01.fcl. Read the code. Understand the scope and lifetime of the newly introduced variables. Understand how the static member function works.
    5. ex05Activity02.fcl. This is really not for beginners. ( Skip it on your first pass through the workbook? ) Run the .fcl. Read the code. Note the difference with Activity01. You will see the method shown in Activity02. We are telling you about it so that you will recognize it. We recommend against it. In this very simple and controlled example it works correctly. As we consider more complex examples using static member functions, there are race conditions to consider. If these are not dealt with correctly, the result may be subtle, difficult to diagnose, incorrect behaviour. The simplest solution
  3. Ideas to discuss
    1. There is one accident here. art does not actually guarantee the order in which modules in an endpath are run.
    2. Why? art requires that all Analyzer modules be runnable in arbitrary order and that when the order is shuffled they produce the same result ( up to trivial reordering of printout ).
    3. On the other hand, for producer modules, which can add information to the event, and for filter modules, which can change the flow of event processing, the order in which modules run is important. We discuss this later.
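A toy model of why the path [ label1, label2, label2 ] runs label2 only once. This is only bookkeeping on a list of labels, under the assumption that a repeated label adds no new work; it says nothing about how art actually schedules modules, and uniqueLabels is a hypothetical name.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Return the labels in path order, keeping only the first occurrence of each.
inline std::vector<std::string> uniqueLabels(std::vector<std::string> const& path) {
  std::set<std::string> seen;
  std::vector<std::string> result;
  for (std::string const& label : path) {
    // insert().second is true only the first time a label is seen.
    if (seen.insert(label).second) {
      result.push_back(label);
    }
  }
  return result;
}
```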

Exercise 6

  1. Example code
    1. A copy of Ex04 but remove the printout of the number of hits per event and replace it with a histogram.
    2. Run the example with: art -c ex06.fcl
    3. After this, run root to view the histogram and to make the .pdf file: root -l ../art-workbook/artwb/Ex06/ex06.cint ( can we shorten the path to the cint file? ). After you see the block, click on the shell window from which you ran root and type ".q" ( without the quotes, with the dot ) to exit root.
    4. View the pdf file: display ex06.pdf ( get acroread or kpdf installed? )
    5. Print the pdf file ( flpr -q xxxxx output/ex06.pdf ), where xxxxx is the name of a print queue. There is a poster near each printer with its name.
  2. Activities
    1. Browse the output root file interactively:
      1. This uses a helper script distributed with the workbook.
      2. Type the command: browse output/ex06.root
      3. This will open a root browser window. In the left hand panel there will be a root file icon named ex06.root. Double click on this.
      4. A subdirectory icon named "hello" will appear. Double click on this.
      5. A histogram icon named nHits will appear; double click on this. The right panel will fill with a view of the histogram.
      6. To exit root, return to the shell window and, at the root prompt, type .q ( a dot character followed by the letter q ).
    2. If you really want to use raw root, then:
      1. At the shell prompt: root -l
      2. At the root prompt: TBrowser b;
      3. This will open a root browser window. Navigate the left hand panel to find the icon for the file ex06.root.
      4. Double click on this icon and follow the directions above.
    3. In the .fcl file, change the name of the root output file. Rerun. Browse the new file. Modify the cint script to use the new file and make a new .pdf file.
    4. In the .fcl file, change the module label "hello" to some other string ( in both places ). Rerun. Browse the file ex06.root interactively. Note that the subdirectory named "hello" has changed to the new module label that you used.
    5. Print the TCanvas using the pull down menu.
  3. Ideas to discuss
    1. ROOT
    2. Histograms
    3. TFileService
    4. How to browse histograms using interactive root.
    5. For each module that uses TFileService, art creates a subdirectory in the root output file. If you follow the recommended procedures, any histograms created by your module will appear in this subdirectory. Note that the subdirectories are named after the module label, not the module class name; so, if you have several instances of the same module class, each gets its own subdirectory.
    6. ROOT convention: low edge is in the bin and high edge is not. So read the high bin as 30, not 31.
    7. browse command
    8. Look at the CMakeLists.txt file. Diff against any of Ex03...Ex05. There are four new lines; these are libraries needed for making histograms and for using GenParticles.
    9. Postpone the discussion of "what is a service" until Ex08.
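A minimal sketch of the bin edge convention: the low edge belongs to the bin and the high edge does not. It is plain C++ that does not call ROOT; toyFindBin is a hypothetical helper for equal-width bins that follows ROOT's numbering, in which bin 0 is the underflow, bins 1..nBins are in range and bin nBins+1 is the overflow.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Bin number for x in a histogram with nBins equal-width bins on [low, high).
// Returns 0 for underflow, 1..nBins in range, nBins + 1 for overflow.
inline int toyFindBin(double x, int nBins, double low, double high) {
  assert(nBins > 0 && high > low);
  if (x < low) return 0;            // below the low edge: underflow
  if (x >= high) return nBins + 1;  // the high edge itself is NOT in range
  int const bin =
    1 + static_cast<int>(std::floor((x - low) * nBins / (high - low)));
  return std::min(bin, nBins);      // guard against floating point round-up
}
```

For a histogram with 30 bins on the (made up) range 0 to 30, the value 0 lands in bin 1, the value 29.5 in bin 30, and the value 30 in the overflow bin.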

Exercise 7

  1. Example code
    1. Ex07_module.cc ex07.fcl
    2. art -c Ex07/ex07.fcl
    3. root -l ../art-workbook/artwb/Ex07/ex07.cint
  2. Suggested Activities
    1. Two ways to look at source code:
      1. less $ART_WORKBOOK_BASE_INC/artwb/MCDataProducts/GenParticleCollection.h
      2. less $ART_WORKBOOK_BASE_INC/artwb/MCDataProducts/GenParticle.h
      3. Or via online code browser: GenParticleCollection.h and GenParticle.h
    2. CLHEP, a Class Library for HEP. "Home page":
    3. For CLHEP, find files at
      1. $CLHEP_INC/CLHEP/Vector/ThreeVector.h, $CLHEP_INC/CLHEP/Vector/ThreeVector.icc
      2. $CLHEP_INC/CLHEP/Vector/LorentzVector.h, $CLHEP_INC/CLHEP/Vector/LorentzVector.icc
      3. Weird conventions: file names do not match class names.
      4. Explain the .icc convention; we do not recommend it (CLHEP is old).
      5. Or code browser: Hep3Vector.h, HepLorentzVector
    4. Look at the printout and understand it.
      1. The ordinal number ( index in the collection ).
      2. Particle Data Group (pdg) id code of the particle ( see ex08 ). 333 is a Phi, +/- 321 are K+ and K-.
      3. Position at creation ( always (0,0,0) ).
      4. 4 Momentum at creation.
      5. Status code ( 0 = particle was not decayed in the generator; 1 = particle was decayed in the generator ). Meaning will be clear later.
      6. Indexing information for parent and for children. If not present then the word "none" appears.
      7. In the parent printout, for example "1:2 0". The 1:2 is a data product ID ( defined later - basically it says that the parent is in the same GenParticleCollection as the child ) and the 0 is the index into the GenParticleCollection. So both particles 1 and 2 have a parent of 0, while 0 has no parent.
      8. In the child printout the same info is present but is formatted differently ( ooops ); the index is separated from the product ID by a dot (.) rather than whitespace.
    5. Both HepLorentzVector and Hep3Vector have a member function named mag. They do different things. On a Hep3Vector it returns the magnitude of the momentum. On a HepLorentzVector it returns the invariant mass. This does make sense in that it mimics the standard mathematical notation; but it is a common mistake to use the mag function expecting to get a momentum and instead get a mass.
    6. Run ex07Activity01.fcl
      1. This breaks up a long one-liner in Ex07_module.cc into smaller steps. This might make it easier to understand. The const& are very important or else you make needless and wasteful copies.
      2. Also add a few new histograms to illustrate properties of CLHEP::Hep3Vector and CLHEP::HepLorentzVector.
      3. Needed to add the header <cmath> to get the constant M_PI.
      4. root -l ../art-workbook/artwb/Ex07/ex07Activity01.cint
    7. art -c Ex07/ex07Activity02.fcl
      1. Many different ways to write the same loop as was in Ex07_module.cc. We recommend the version in Ex07_module.cc but you will see the others.
  3. Ideas
    1. Looping in many different ways
    2. Attributes of a GenParticle
    3. CLHEP
    4. CLHEP::Hep3Vector
    5. CLHEP::HepLorentzVector
    6. <cmath>
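A minimal sketch of the mag pitfall: the same function name gives a momentum magnitude on a 3-vector and an invariant mass on a 4-vector. ToyVec3 and ToyLorentzVec are hypothetical stand-ins written for this note; they are not the CLHEP classes, and the treatment of a negative mass squared is a choice made in this sketch.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a 3-vector.
struct ToyVec3 {
  double x, y, z;

  // Magnitude of the 3-vector: sqrt(x^2 + y^2 + z^2).
  double mag() const { return std::sqrt(x * x + y * y + z * z); }
};

// Hypothetical stand-in for a 4-vector: 3-momentum p and energy e.
struct ToyLorentzVec {
  ToyVec3 p;
  double e;

  // Invariant mass: sqrt(e^2 - |p|^2). In this sketch a negative mass
  // squared is reported as a negative number instead of a NaN.
  double mag() const {
    double const m2 = e * e - (p.x * p.x + p.y * p.y + p.z * p.z);
    return m2 < 0.0 ? -std::sqrt(-m2) : std::sqrt(m2);
  }
};
```

For a momentum of (3, 0, 4) and an energy of 13, p.mag() gives 5 while mag() on the 4-vector gives 12; asking the 4-vector when you wanted the momentum is the mistake described above.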

Exercise 8

  1. Example code
    1. art -c Ex08/ex08.fcl It only runs one event.
  2. Activities
    1. Inspect the printout and relate it to the classes:
      1. $ART_WORKBOOK_BASE_INC/artwb/Geometry/Geometry.h
      2. $ART_WORKBOOK_BASE_INC/artwb/Geometry/Tracker.h
      3. $ART_WORKBOOK_BASE_INC/artwb/Geometry/Shell.h
    2. Also relate it to the file that defines the geometry: $ART_WORKBOOK_BASE_DIR/databaseFiles/geom01.fcl
      1. In a mature, real experiment this information would live in a database, not a file, but we don't need that complexity for illustration purposes.
      2. This file is written in fhicl and is parsed by $ART_WORKBOOK_BASE_DIR/source/artwb/Geometry/Geometry_service.cc
  3. Ideas
    1. Big picture: The Geometry service is user written code that should know about the intervals of validity of geometry information. It should recognize boundaries of validity intervals and update information as boundaries are crossed. The end user (module code) just asks the geometry service for information and does not need to be aware of the behind the scenes work that ensures validity. This is all a per-experiment responsibility, not an art responsibility.
    2. In the code used for this example, the Geometry service assumes that all information is valid from the start of the first beginRun call until the end of the job. That is enough to illustrate the big ideas without getting bogged down in details.
    3. The idea of a user written service; in this case the geometry service.
    4. A user written service is just a C++ class that obeys some conventions defined by art.
    5. A service may arrange to be called at the start of each run, at the start of each event, before any module is called and at many other times. The details are different than for a module but the big picture is much the same.
    6. A module gets access to a service by asking art to give it a ServiceHandle to the requested service. A ServiceHandle is a form of smart pointer.
    7. We need to distinguish two notions of validity. A ServiceHandle may be valid in the sense that it points to a properly instantiated service object. But a service may be invalid internally. For example the geometry may be run dependent so it may not be legal to ask the Geometry service for information when no run is current; but it is still legal to ask the framework for a ServiceHandle to the Geometry service. The validity of the insides of a service is an experiment dependent idea. Whether or not the ServiceHandle is valid is an art idea.
    8. The service handle is guaranteed to be valid before the constructor of any module is called. It is guaranteed to remain valid until after the calls to each module's endJob method. (It may even be valid during the call to the d'tor ?).
    9. It is moderately expensive to get a service handle since it involves traversing a std::map whose key is a long string.
    10. Therefore we typically get service handles in the c'tor and hold them as member data; we may use them anywhere within the module code.
    11. We always need to be aware of when the internals of the service are valid; a well written service will issue an error if someone asks for invalid information; but this is an experiment concern, not an art concern.
    12. The header for ServiceHandle is at: $ART_INC/art/Framework/Services/Registry/ServiceHandle.h
    13. The header for the Geometry class is at $ART_WORKBOOK_BASE_INC/artwb/Geometry/Geometry.h. This file tells you what information is available from the service. For this example, the information does not change as a function of run number but it is not defined until the first run is encountered. So modules may use it in their beginRun, beginSubRun and analyze methods. They may not use it in their constructors or beginJob methods ( even though the service handle is valid at those times ).
    14. The service itself lives in $ART_WORKBOOK_BASE_LIB/libartwb_Geometry_Geometry_service.so.
    15. To use the service we had to:
      1. Add two lines to the link list in CMakeLists.txt
      2. Provide a parameter set for the service in the .fcl file.
    16. In the .fcl file, the services parameter has a sub parameter set named user. This is an art convention that mocks up namespaces. The parameter sets to initialize the art supplied services must live in the services parameter set but outside the user block; those for user defined services must live inside the user block.
    17. In geom01.fcl the sigma parameters are ignored. They have been superseded by those in the Conditions service; they will be removed in the next release.
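A minimal sketch of the two notions of validity. It does not use art: ToyGeometry, ToyGeometryHandle and ToyModule are hypothetical stand-ins, and preBeginRun with its nShells argument is a made up way for a toy framework to tell the service that a run has started.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a user written geometry service.
class ToyGeometry {
public:
  // Called by the (toy) framework at the start of a run.
  void preBeginRun(int nShells) {
    _nShells = nShells;
    _valid = true;
  }

  bool isValid() const { return _valid; }

  // A well written service issues an error if asked for invalid information.
  int nShells() const {
    if (!_valid) {
      throw std::logic_error("geometry requested before the first run");
    }
    return _nShells;
  }

private:
  bool _valid = false;
  int _nShells = 0;
};

// Hypothetical stand-in for a service handle: it is valid (points at a
// service object) as soon as it is made, whatever the state of the service.
class ToyGeometryHandle {
public:
  explicit ToyGeometryHandle(ToyGeometry const& service) : _service(&service) {}
  ToyGeometry const* operator->() const { return _service; }

private:
  ToyGeometry const* _service;
};

class ToyModule {
public:
  // Get the handle once, in the c'tor, and hold it as member data.
  explicit ToyModule(ToyGeometry const& service) : _geom(service) {}

  // Safe to call from beginRun onward, not from the c'tor or beginJob.
  int shellCount() const { return _geom->nShells(); }

private:
  ToyGeometryHandle _geom;
};
```

Calling shellCount before preBeginRun throws even though the handle itself is fine; after preBeginRun the same call succeeds.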

Exercise 9

  1. Example code
    1. art -c Ex09/ex09.fcl
  2. Activities
    1. Inspect printout and correlate with the code.
  3. Ideas
    1. The Conditions service holds two entities: the particle data table and the conditions information for each tracker shell ( efficiency and resolution ).
    2. The particle data information is defined as soon as the Conditions service is constructed and can be used at any time thereafter.
    3. The information for the tracker shells is invalid until the first run has been encountered.
    4. Each GenParticle holds an integer code that gives its PDGCode. Information such as mass, charge and name can be found in the particle data table by using the PDGCode.
    5. The particle data codes can be found in: $ART_WORKBOOK_BASE_INC/artwb/DataProducts/PDGCode.h
    6. The Conditions service header is at: $ART_WORKBOOK_BASE_INC/artwb/Conditions/Conditions.h
    7. The particle data table header is at: $ART_WORKBOOK_BASE_INC/artwb/Conditions/PDT.h
    8. The particle data table is very incomplete. It only holds a few particles relevant for this exercise. It is defined in: $ART_WORKBOOK_BASE_DIR/source/artwb/Conditions/PDT.cc
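A minimal sketch of looking up particle properties by PDG code. ToyPDT and ToyParticleInfo are hypothetical; they are not the classes in PDT.h. The masses are approximate values in MeV, quoted only for illustration.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

struct ToyParticleInfo {
  std::string name;
  double mass;   // MeV, approximate
  int charge;    // in units of the proton charge
};

// Hypothetical, very incomplete particle data table keyed by PDG code.
class ToyPDT {
public:
  ToyPDT() {
    _table[333]  = ToyParticleInfo{"phi", 1019.461, 0};
    _table[321]  = ToyParticleInfo{"K+", 493.677, +1};
    _table[-321] = ToyParticleInfo{"K-", 493.677, -1};
  }

  // Throws if the code is not in the table.
  ToyParticleInfo const& info(int pdgCode) const {
    auto it = _table.find(pdgCode);
    if (it == _table.end()) {
      throw std::out_of_range("unknown PDG code");
    }
    return it->second;
  }

private:
  std::map<int, ToyParticleInfo> _table;
};
```

Because the table is filled in the c'tor, it can be used as soon as the object exists, which mirrors the point that the particle data information is usable before any run is seen.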

Rough notes for future exercises

From here down the exercise numbers are just placeholders.

Exercise 6

  1. Example code
    1. Ex05 but rewrite the .fcl file to introduce PROLOG, #include, and the dot notation for parameters.
    2. Overriding .fcl parameters using the dot notation.
  2. Activities
    1. Use ART_DEBUG_CONFIG
    2. Compare the canonical form with and without the PROLOG markers; observe the cruft.
    3. Put the .fcl file in the wrong subdirectory and observe the error message.
  3. Ideas to discuss
    1. PROLOG
    2. FHICL_FILE_PATH
    3. Copy the material from Mu2e Workbook example 3

Exercise 7

  1. Example code
    1. Ex06 but add a loop over the hits and plot the pulse height of each hit.
    2. cint script to make multipage pdf files and provide some examples of other ROOT features.
  2. Activities
    1. Add a histogram of the time of each hit.
    2. Add this histogram to the pdf output of the cint script.
    3. Add an ntuple containing all of the information for each hit.
    4. Rewrite the loop over hits using the old style methods: do it twice - once using integer indexing and once using iterators.
  3. Ideas to discuss
    1. Where to find the header and implementation for the Hit class.
    2. Looping constructs in C++; prefer the C++11 syntax.
    3. Recommend that showing overflows is the default.
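A minimal sketch of the three looping styles mentioned above, applied to the same sum. ToyHit and its pulseHeight member are hypothetical stand-ins for the workbook's Hit class, whose real interface should be read from its header.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the Hit class.
struct ToyHit {
  double pulseHeight;
};

// Old style 1: integer indexing.
inline double sumByIndex(std::vector<ToyHit> const& hits) {
  double sum = 0.0;
  for (std::size_t i = 0; i < hits.size(); ++i) {
    sum += hits[i].pulseHeight;
  }
  return sum;
}

// Old style 2: iterators.
inline double sumByIterator(std::vector<ToyHit> const& hits) {
  double sum = 0.0;
  for (std::vector<ToyHit>::const_iterator it = hits.begin();
       it != hits.end(); ++it) {
    sum += it->pulseHeight;
  }
  return sum;
}

// C++11 style: range-based for. The const& avoids copying each hit.
inline double sumByRange(std::vector<ToyHit> const& hits) {
  double sum = 0.0;
  for (ToyHit const& hit : hits) {
    sum += hit.pulseHeight;
  }
  return sum;
}
```

All three visit the hits in the same order; the range-based form has the least room for off-by-one or iterator type mistakes, which is one reason to prefer it.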

Exercise 8

  1. Example code
    1. .fcl file to read in two input files and write one output file
  2. Activities
    1. Use Ex01 code; point it at the new file and verify that it contains the events that it should.
  3. Ideas to discuss
    1. How to configure an output module.

Exercise 9

  1. Example code
    1. Filter module that selects either odd numbered events or even numbered events. # .fcl to write two output files.
  2. Activities
    1. Use Ex01 to read back the events from the files you wrote and to verify that each file has the correct information.
  3. Ideas to discuss
    1. filter module
    2. trigger_paths
    3. How to specify SelectEvents for output modules
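A minimal sketch of the decision a filter that selects odd or even numbered events has to make. ToyParityFilter is a hypothetical plain C++ class; it does not derive from any art base class and keepOdd is a made up configuration parameter.

```cpp
#include <cassert>

// Hypothetical parity filter: keeps either odd or even numbered events.
class ToyParityFilter {
public:
  explicit ToyParityFilter(bool keepOdd) : _keepOdd(keepOdd) {}

  // Return true to keep the event, false to reject it.
  bool filter(unsigned eventNumber) const {
    bool const isOdd = (eventNumber % 2u) == 1u;
    return isOdd == _keepOdd;
  }

private:
  bool _keepOdd;
};
```

Two instances, one configured each way, split any list of events into two disjoint sets, which is what writing two output files needs.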

Exercise 10

Rough notes:
  1. Run the event generator
  2. Make histograms to inspect the data products it creates.
  3. Change the random number seed and re-run.
  4. Write to a file.

Exercise 11

Rough notes:
  1. Read from the file written in Ex10 and run the simulation code.
  2. Write to a file.
  3. Make histograms of the data products.
  4. Introduce art::Ptr.
  5. Write code to navigate the mother daughter chain.
  6. Use the dump data products tool to look at the output file. Observe that the data products have different ProcessName fields.
  7. Repeat but run the event generator and simulation code all in one job.
  8. Use the dump data products tool to look at this output. Observe that the data products have the same ProcessName fields.
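A minimal sketch of navigating the mother daughter chain. It uses a plain integer index for the parent link instead of art::Ptr, assumes every particle lives in one collection, and assumes the links contain no cycles; ToyParticle and ancestry are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical particle: the parent is an index into the same collection,
// or -1 if the particle has no parent.
struct ToyParticle {
  int parent;
};

// Indices visited when walking from particle `index` up to its oldest
// ancestor, starting with `index` itself.
inline std::vector<int> ancestry(std::vector<ToyParticle> const& particles,
                                 int index) {
  std::vector<int> chain;
  while (index >= 0) {
    assert(static_cast<std::size_t>(index) < particles.size());
    chain.push_back(index);
    index = particles[index].parent;
  }
  return chain;
}
```

With particles 1 and 2 both having parent 0, and particle 3 having parent 1, the chain for particle 3 is 3, 1, 0.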

Exercise 12

Rough notes
  1. Run the full simulation and reco chain
  2. Make histograms of the data products in the reco chain

Exercise 13

Rough notes.
  1. Introduce art::Assns

Exercise 14

Rough notes
  1. Make your own data product and add it to the event.

Exercise 15

Rough notes.
  1. Introduce cet::map_vector. Not used by all experiments.

Exercise 16

Rough notes
  1. Event mixing.

Brainstorming list

  1. Timebomb module to experiment with exceptions - both ones that are fatal and ones that are not.
  2. Write out all events that throw exceptions
  3. Open and parse your own .fcl file, outside of the parameter set system.
  4. ParameterSet helpers
  5. Message logging system.
  6. Services calling services: getting a handle to the other service must precede the registration.
  7. Copy stuff from Mu2e sandbox.
  8. scan art issue tracker for errors in usage that people think are bugs.
  9. Illustrate saving seeds and rerunning from mid-job.
  10. Illustrate saving seeds and read back from file.
  11. Find a way to exercise Views.
  12. Find a way to exercise Assns with many to 1; many to many; 1 to many.
  13. Exercise all 3 ways of looping over an Assns.