TFileService

The TFileService is an art facility that enables physicists to use ROOT classes easily within art. Users do not need to worry about TDirectory management, which is handled internally by art. In addition, the TFileService places histograms in TFile directories that have the same names as the corresponding module labels, thus separating histograms that belong to different instances of the same module (see the "TFile directory structure" section below for more details).

Service configuration

To use the TFileService, include the following in your FHiCL file:

services : {
   TFileService: { .... }
}

The allowed configuration is printed by using the art --print-description command:

art --print-description TFileService
# ...

TFileService : {

  closeFileFast : true  # default

  fileName : <string>

  tmpDir : "<parent-path-of-filename>"  # default

  # As of art 2.10, the 'fileProperties' table is supported 
  # for configuring output-file switching
  fileProperties: {

    maxEvents: 4294967295  # default

    maxSubRuns: 4294967295  # default

    maxRuns: 4294967295  # default

    maxInputFiles: 4294967295  # default

    ## Maximum size of file (in KiB)

    maxSize: 2130706432  # default

    ## Maximum age of output file (in seconds)

    maxAge: 4294967295  # default

    ## The 'granularity' parameter specifies the level at which
    ## a file may be closed, and thereby the granularity of the file.
    ## The following values are possible:
    ##
    ##     Value        Meaning
    ##    =======================================================
    ##    "Event"       Allow file switch at next Event
    ##    "SubRun"      Allow file switch at next SubRun
    ##    "Run"         Allow file switch at next Run
    ##    "InputFile"   Allow file switch at next InputFile
    ##    "Job"         File closes at the end of Job
    ##
    ## For example, if a granularity of "SubRun" is specified, but the
    ## file has reached the maximum events written to disk (as specified
    ## by the 'maxEvents' parameter), switching to a new file will NOT
    ## happen until a new SubRun has been reached (or there are no more
    ## Events/SubRuns/Runs to process).

    granularity: "Event"  # default
  }

}
  • closeFileFast is a parameter that can improve the speed of closing the ROOT file. It is almost always correct for this to be set to true. However, if you find yourself assigning the same histogram to two directories (which you should not do), then it will need to be false.
  • fileName is a required parameter; its value typically ends with the suffix '.root'.
  • tmpDir specifies a directory to which temporary files should be written before they are renamed to match the value of fileName (see issue #6843 for more details).
  • fileProperties is used to specify the conditions under which a TFileService output-file switch should occur.
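
The interplay between the thresholds (maxEvents, maxSize, and so on) and the granularity parameter can be illustrated with a small toy model. This is only a sketch and not art's implementation: the Boundary enumeration and the maySwitchFile function are hypothetical names, and the sketch assumes that crossing a coarser boundary (e.g. a new Run) also counts as crossing every finer one (e.g. a new SubRun).

```cpp
#include <cassert>

// Toy model only: these names are hypothetical and are not part of art.
// Boundaries are ordered from finest to coarsest.
enum class Boundary { Event, SubRun, Run, InputFile, Job };

// Returns true if a file switch may happen at the boundary just crossed.
//   granularity  - the configured 'granularity' value
//   crossed      - the coarsest boundary just reached (Event..InputFile)
//   limitReached - true once any threshold (maxEvents, maxSize, ...) is met
bool maySwitchFile(Boundary granularity, Boundary crossed, bool limitReached)
{
  // The end of the job is a file close, not a switch, so it is not a
  // valid value for 'crossed' in this model.
  assert(crossed != Boundary::Job);

  // With "Job" granularity the file closes only at the end of the job,
  // so no mid-job switch is allowed.
  if (granularity == Boundary::Job) {
    return false;
  }
  // A switch needs both a reached threshold and a boundary at least as
  // coarse as the configured granularity.
  return limitReached && crossed >= granularity;
}
```

In this model, a job with granularity "SubRun" that has already reached maxEvents does not switch at the next Event boundary, but does switch at the next SubRun (or Run) boundary, which matches the behavior described in the configuration comments above.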

Examples of use in modules

Consider the following analyzer module:

#include "art/Framework/Services/Optional/TFileService.h" 
// other headers ...

class MyAnalyzer : public art::EDAnalyzer {
public:
  struct Config {
    // defined fhiclcpp configuration parameters
  };
  using Parameters = art::EDAnalyzer::Table<Config>;
  MyAnalyzer(Parameters const&);

private:
  void analyze(art::Event const&) override;

  TH1F* h1_;
  TH2F* h2_;
  TGraph* g1_;
  TGraphPolar* g2_;  
};

Notice that the pointers are bare pointers. Although this is not preferred code design, it is a constraint that is imposed by ROOT's default behavior. The TH1F, TH2F and TGraph(Polar) pointers can be assigned in MyAnalyzer's constructor by creating an art::ServiceHandle<art::TFileService> instance:

MyAnalyzer::MyAnalyzer(Parameters const& p) 
  : art::EDAnalyzer{p} 
{
  art::ServiceHandle<art::TFileService> fs;

  art::TFileDirectory dir1 = fs->mkdir("dir1");
  h1_ = dir1.make<TH1F>("test1", "test histogram #1", 100, 0., 100.);

  art::TFileDirectory dir2 = fs->mkdir("dir2");
  h2_ = dir2.make<TH2F>("test2", "test histogram #2", 100, 0., 100., 10, 0., 20.);

  g1_ = fs->makeAndRegister<TGraph>("graphAtAnalyzerLevel", "graph at top level", 10);
  g2_ = dir2.makeAndRegister<TGraphPolar>("graphInSubdirectory", "graph in subdirectory", 20);
}

Note that both TFileService and TFileDirectory objects may call mkdir, make<T> and makeAndRegister<T>. In addition to creating the object, makeAndRegister also appends the object to ROOT's gDirectory registry. For histograms, make is sufficient as histograms are automatically registered. However, for other objects (like TGraph), it is necessary to add them to the gDirectory so that ROOT can manage the memory appropriately. At the moment, it is the user's responsibility to know which classes need to be registered in this way.

Using TFileService in multi-threaded art jobs

In order to guarantee no data races, TFileService can be used only in legacy and shared modules. For shared modules that use TFileService, users must make the following serialize call:

MyAnalyzer::MyAnalyzer(Parameters const& p)
  : SharedAnalyzer{p}
  // , other data members ...
{
  serialize(TFileService::resource_name());
}

This call is implicitly made for all legacy modules.

Preparing modules for file-switching (art 2.10 and newer)

If you want your module to be usable whenever a user enables TFileService file-switching, then it must provide a file-switch callback [1]. For example:

TestTFileService::TestTFileService(Parameters const& p)
  : EDAnalyzer{p}
{
  ServiceHandle<TFileService> fs;
  fs->registerFileSwitchCallback(this, &TestTFileService::setRootObjects);
  setRootObjects();
}

void
TestTFileService::setRootObjects()
{
  ServiceHandle<TFileService> fs;
  h1_ = fs->make<TH1F>("test1", "test histogram #1", 100, 0., 100.);
}

In this example, we have provided the 'void setRootObjects()' function as a callback to the TFileService. After a file-switch has occurred, the callback will be invoked, resetting h1_ to a new histogram.

The callbacks are invoked only after a file switch has occurred; they are not invoked after the constructor of the module has been called. This is why setRootObjects is called in the constructor as well as provided as a callback to the TFileService.
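
The register-then-call pattern can be seen in isolation with a toy stand-in that has no dependency on art or ROOT. This is a minimal sketch under stated assumptions: ToyFileService and ToyModule are hypothetical classes, and an integer file index plays the role of the ROOT object pointer that must always refer to the currently open file.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Toy stand-in for the service; the class and its members are hypothetical.
class ToyFileService {
public:
  // Stores the callback but does NOT invoke it.
  void registerFileSwitchCallback(std::function<void()> cb)
  {
    assert(cb != nullptr); // an empty callback would be a usage error
    callbacks_.push_back(std::move(cb));
  }

  // Plays the role of the framework switching to a new output file.
  void switchFile()
  {
    ++fileIndex_;
    for (auto const& cb : callbacks_) {
      cb();
    }
  }

  int fileIndex() const { return fileIndex_; }

private:
  std::vector<std::function<void()>> callbacks_;
  int fileIndex_{0};
};

// Toy module: 'boundFile_' stands in for a histogram pointer.
class ToyModule {
public:
  explicit ToyModule(ToyFileService& fs) : fs_{fs}
  {
    fs_.registerFileSwitchCallback([this] { setRootObjects(); });
    setRootObjects(); // registration alone would leave boundFile_ unset
  }

  // The callback captures 'this', so copying is disabled.
  ToyModule(ToyModule const&) = delete;
  ToyModule& operator=(ToyModule const&) = delete;

  int boundFile() const { return boundFile_; }

private:
  void setRootObjects() { boundFile_ = fs_.fileIndex(); }

  ToyFileService& fs_;
  int boundFile_{-1};
};
```

Constructing a ToyModule binds it to file 0 only because the constructor calls setRootObjects directly; after a call to switchFile, the registered callback rebinds it to file 1.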

If the job is configured to switch TFileService files and one of the modules does not register a callback, you will see an error like the following:

---- Configuration BEGIN
  A TFileService error occured while attempting to make a directory or ROOT object.
  File-switching has been enabled for TFileService.  All modules must register
  a callback function to be invoked whenever a file switch occurs.  The callback
  must ensure that any pointers to ROOT objects have been updated.

    No callback has been registered for module 'a1'.

  Contact artists@fnal.gov for guidance.
---- Configuration END

[1] TFileService file-switching is not allowed for multi-schedule and multi-threaded art jobs. An exception will be thrown at job start-up if a value other than 1 is specified for the number of threads and the number of schedules.
 

An unfortunate consequence

Because art does not directly interact with any objects created via the TFileService, art cannot be smart about when to open a ROOT file; it must be cautious. To that end, a ROOT file is created (a) whenever the TFileService is constructed, and (b) immediately after a file has been closed during a file switch. This means that in some cases, it is possible to get an extra file that is not necessarily desired. For example:

  • TFileService is configured to switch after 10 events
  • The job is configured to process only 10 events
  • After the 10th event is processed, TFileService closes one file and opens a new one
  • The endSubRun and endRun calls are made to all the modules
  • The newly opened file is then closed, even though the histograms may be empty if none of them were filled during endSubRun or endRun.

This consequence is largely unavoidable. It may be possible to improve the situation, but there is no apparent way to do so right now without imposing additional restrictions on TFileService usage.
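
The extra-file scenario above can be reproduced with a short toy model. The eventsPerFile function is a hypothetical helper, not art code; it assumes "Event" granularity, a single maxEvents threshold, and that a new file is opened immediately after a full one is closed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model (hypothetical, not art code): returns the number of events
// written to each output file for a job that processes 'nEvents' events.
std::vector<std::size_t> eventsPerFile(std::size_t nEvents, std::size_t maxEvents)
{
  assert(maxEvents > 0);

  // One file is opened as soon as the service is constructed.
  std::vector<std::size_t> files{0};

  for (std::size_t i = 0; i != nEvents; ++i) {
    ++files.back();
    if (files.back() == maxEvents) {
      // Close the full file and immediately open a new, empty one.
      files.push_back(0);
    }
  }

  // The last file is closed at the end of the job, even if it is empty.
  return files;
}
```

In this model, eventsPerFile(10, 10) yields two files containing 10 and 0 events, whereas eventsPerFile(25, 10) yields three files containing 10, 10 and 5 events. The empty trailing file appears only when the number of processed events is an exact multiple of the threshold.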

TFile directory structure

Suppose two instances of MyAnalyzer (defined above) are configured in a FHiCL file:

physics : { 
   analyzers : {
      analyzer1: { module_type : MyAnalyzer ... }
      analyzer2: { module_type : MyAnalyzer ... }
   }
   e1 : [ analyzer1, analyzer2 ]
}

When the TFileService ROOT file is written, the directory structure will look like:

file
 │
 ├── analyzer1
 │      │
 │      ├──── dir1
 │      │     │
 │      │     └─── test1 (TH1F)
 │      │
 │      ├──── dir2
 │      │     │
 │      │     ├─── test2 (TH2F)
 │      │     └─── graphInSubdirectory (TGraphPolar)
 │      │
 │      └──── graphAtAnalyzerLevel (TGraph)
 │
 └── analyzer2
        │
        ├──── dir1
        │     │
        │     └─── test1 (TH1F)
        │
        ├──── dir2
        │     │
        │     ├─── test2 (TH2F)
        │     └─── graphInSubdirectory (TGraphPolar)
        │
        └──── graphAtAnalyzerLevel (TGraph)

where the histograms and graphs are stored in the directories that correspond to the mkdir calls made in the MyAnalyzer constructor.
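
When reading the output file back, it can help to build the in-file path of an object from its parts. The following sketch uses a hypothetical helper, objectPath, which is not part of art or ROOT; it assumes the layout described in this section, in which the module label is the top-level directory and any mkdir name follows it.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of art or ROOT): builds the in-file path of
// an object from the module label, an optional subdirectory, and the name
// that was passed to make<T> or makeAndRegister<T>.
std::string objectPath(std::string const& moduleLabel,
                       std::string const& subdir,
                       std::string const& name)
{
  assert(!moduleLabel.empty() && !name.empty());

  std::string path = moduleLabel + "/";
  if (!subdir.empty()) {
    path += subdir + "/";
  }
  return path + name;
}
```

For the configuration above, objectPath("analyzer1", "dir1", "test1") gives "analyzer1/dir1/test1", and objectPath("analyzer2", "", "graphAtAnalyzerLevel") gives "analyzer2/graphAtAnalyzerLevel". A path of this form can then be handed to ROOT's usual object-lookup facilities.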

Filling objects

The ROOT objects (e.g. histograms, graphs, TTrees) can be filled using standard ROOT functionality at any of the framework entry points, including:

  • beginJob
  • beginRun
  • beginSubRun
  • produce
  • filter
  • analyze
  • endSubRun
  • endRun
  • endJob

Writing, closing, and deleting TFileService objects

Users of TFileService should not explicitly:
  • call any of the Write functions (e.g. hist_->Write();),
  • call any of the Close functions (e.g. file_->Close();), or
  • use a delete statement to deallocate TObject memory (e.g. delete hist_;).

Each of these management tasks is handled implicitly, either by art or by the default ROOT behavior. Users wishing to use ROOT independently of the TFileService, yet still within art, must handle all aspects of directory and object management themselves. Although such an approach is in no way forbidden, the state of the TDirectory objects must be handled carefully.