art series 2.01


This release of art includes significant changes, both in the underlying implementation and in the philosophy of how Runs and SubRuns are perceived. We have made every effort to minimize code-breaking changes for users. When moving to art 2.01.00 from 2.00.02, no changes to C++ code are required. However, it is to the user's benefit (and in some cases a necessity) to take advantage of the new interface for putting Run and SubRun products.

 New features

A flexible art/ROOT output-file system

Users are now able to switch to a new output file if a certain criterion (or set of criteria) is met:

  • A new input file, new run, or new subrun is being processed.
  • The number of events written to an output file has reached a user-specified maximum.
  • The size of the art/ROOT file has reached a user-specified maximum.
  • The age of the file has reached a user-specified maximum.
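
The criteria above combine as a logical OR: the file is switched as soon as any configured limit is reached. The following is a minimal sketch of that decision, not art's implementation. The names FileSwitchLimits, OutputFileState, and shouldSwitch are hypothetical, and so is the convention that a limit of zero means "no limit". The real RootOutput parameters are described on the page linked below.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical types for illustration only; none of these names are art API.
// Assumption: a limit of 0 means "no limit configured".
struct FileSwitchLimits {
  std::size_t maxEvents = 0;      // events written to the file
  std::uint64_t maxSizeBytes = 0; // size of the file
  std::uint64_t maxAgeSeconds = 0; // time since the file was opened
  bool switchOnBoundary = false;  // new input file, run, or subrun
};

struct OutputFileState {
  std::size_t eventsWritten = 0;
  std::uint64_t sizeBytes = 0;
  std::uint64_t ageSeconds = 0;
  bool atBoundary = false; // a new input file, run, or subrun is starting
};

// Returns true if at least one configured criterion has been met.
inline bool shouldSwitch(FileSwitchLimits const& limits, OutputFileState const& state)
{
  if (limits.switchOnBoundary && state.atBoundary) return true;
  if (limits.maxEvents != 0 && state.eventsWritten >= limits.maxEvents) return true;
  if (limits.maxSizeBytes != 0 && state.sizeBytes >= limits.maxSizeBytes) return true;
  if (limits.maxAgeSeconds != 0 && state.ageSeconds >= limits.maxAgeSeconds) return true;
  return false;
}
```

In this sketch, a limit of 100 events leaves the file open at 99 events written and requests a switch at 100, whatever the other fields hold.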

Supported RootOutput configurations and details are presented here. (Resolves feature requests #896, #1470, #3708, #3984, #7566, #8593.)

Run and SubRun product contributors

With art 2.01.00, Run and SubRun products now have a range-of-validity semantic associated with them. The range of validity specifies the sub-runs and events that were processed when creating the product. Each art/ROOT file also has a range of validity that corresponds to the runs, sub-runs, and events that were processed when writing that file. Keeping track of this information is essential for ensuring consistency of products in the context of a more flexible output-file handling system. Run and SubRun products are also now automatically aggregated according to the assigned semantic. For general information, see here; for product aggregation, see here. (Resolves feature requests #11415, #11416, #11417, #11894.)
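
To make the idea of a range of validity concrete, the following toy model records which events of which sub-runs contributed to a product, and merges the ranges of two fragments. It is only a sketch. The EventRange struct, the half-open [begin, end) convention, and the mergeRanges function are assumptions made for illustration and do not reproduce art's own classes.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Toy model only: hypothetical types, not art's own classes.
struct EventRange {
  std::uint32_t subRun;
  std::uint32_t begin; // first event covered
  std::uint32_t end;   // one past the last event covered (half-open, assumed)
};

// Sort the ranges, then merge those in the same sub-run that touch or overlap.
inline std::vector<EventRange> mergeRanges(std::vector<EventRange> ranges)
{
  std::sort(ranges.begin(), ranges.end(),
            [](EventRange const& a, EventRange const& b) {
              return a.subRun != b.subRun ? a.subRun < b.subRun : a.begin < b.begin;
            });
  std::vector<EventRange> merged;
  for (auto const& r : ranges) {
    if (!merged.empty() && merged.back().subRun == r.subRun &&
        r.begin <= merged.back().end) {
      // Contiguous or overlapping with the previous range: extend it.
      merged.back().end = std::max(merged.back().end, r.end);
    } else {
      merged.push_back(r);
    }
  }
  return merged;
}
```

In this model, two fragments of sub-run 1 covering events [1, 11) and [11, 21) merge into a single range [1, 21), while a range from sub-run 2 stays separate.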

file_info_dumper utility

This new utility allows users to do the following:

  • print the range-of-validity of the input file
  • print the event-list of the input file (not necessarily the same as the range-of-validity)
  • save the internal RootFileDB SQLite database to an external file

See file_info_dumper --help for allowed configuration options.

Other new features

  • MemoryTracker enhancements: The MemoryTracker service now reports the VmHWM (high-water mark or maximum RSS) quantity, which is what is used to determine memory usage for grid jobs. The MemoryTracker has also been made more configurable to allow for SQLite database overwriting (as in the TimeTracker). (Resolves feature requests #11944 and #11952.)
  • More command-line options: The allowed art program options are now grouped according to common purpose, and displayed in an easier-to-read format. In addition, new command-line options are available:
    • --timing: enables the TimeTracker service
    • --timing-db: enables the TimeTracker service and stores the data in an SQLite database with the specified filename
    • --version: prints the art version
  • Redundant SelectEvents clause no longer required: For historical reasons, a 'SelectEvents: [p1, ... ]' sequence had to be wrapped in a 'SelectEvents: { ... }' table. This redundancy is no longer required; only the 'SelectEvents: [p1, ...]' clause needs to be specified for output modules and/or analyzers if event filtering is desired. (See here for event-filtering details. Resolves feature request #6228.)
  • sam_metadata_dumper adjustment: The printed file-name in the JSON printing mode now contains only the basename. (Resolves feature request #8987.)
  • Expanded streaming for fhicl::Table<T>: A fhicl::Table<T> can now be streamed to any object that has an operator<< function. (Resolves feature request #12356.)

Although art 2.01.00 was designed so that no user changes are required, it is highly recommended that users creating Run and SubRun products specify the semantic corresponding to the product, as explained here. For users who do not provide a semantic, art will assign an implicit one, which may not be correct since art cannot know a priori how a product will be used. This could result in an exception being thrown in a downstream process. Note that explicitly specifying a semantic may require adding a function to the C++ type of the (Sub)Run product in question. See here for details (the same link as above).
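
As an illustration of the kind of function such a product type might need, the sketch below gives a hypothetical SubRun product a member that combines two fragments. The type HitCount, its fields, and the helper combineFragments are invented for this example. The member name aggregate and its signature are assumptions as well; the linked aggregation page is the authority on what art actually requires.

```cpp
#include <vector>

// Hypothetical (Sub)Run product; type and field names are illustrative only.
struct HitCount {
  unsigned long nHits = 0;
  unsigned long nEvents = 0;

  // Assumed aggregation hook: fold another fragment of the same (sub)run
  // into this one. Counts are additive, so aggregation is a simple sum.
  void aggregate(HitCount const& other)
  {
    nHits += other.nHits;
    nEvents += other.nEvents;
  }
};

// Hypothetical helper showing how fragments would combine into one product.
inline HitCount combineFragments(std::vector<HitCount> const& fragments)
{
  HitCount total;
  for (auto const& f : fragments) {
    total.aggregate(f);
  }
  return total;
}
```

Summing is appropriate here only because the quantities are counts. A product holding, say, a calibration constant would need a different combination rule, which is why the semantic has to come from the user.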

 Behavior changes

The major features above have induced several behavior changes:

  • Change in number of calls to begin/end(Sub)Run: To facilitate flexible output-file handling, begin(Sub)Run is called every time an input or output file is opened. Similarly, end(Sub)Run is called every time an input or output file is closed. This change is necessary to allow users to retrieve products from newly opened files, and to write products to files that are about to close. It is also one of the sources of fragmentation that necessitates the concept of a Run fragment and SubRun fragment. The aggregation facilities introduced earlier are a means for handling this fragmentation.
  • Calling respondTo{Open,Close}OutputFiles: Previous versions of art unconditionally called respondTo{Open,Close}OutputFiles at the beginning and end of a process, regardless of whether any output modules were configured. This has been fixed so that the respondTo*OutputFiles callbacks are called only when at least one output module has been included in an end path.
    N.B. This could break user code if the user was relying on these hooks to be called unconditionally. If so, the proper solution is to use the beginJob and endJob interface. If access to the FileBlock is required, then consider the respondTo{Open,Close}InputFile interface instead.
  • Output-file renaming: It was always the intention that if a process was configured to write to a file that already existed, a new file with an index would be created instead of overwriting the existing file. This facility did not function properly in previous versions of art and has been fixed in art 2.01.00. For example:
    out.root     # First process
    out_1.root   # Repeat of same process, forgetting to change output-file name
    out_1_1.root # Repeat of same process, forgetting to change output-file name
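
The sequence of names above is consistent with a rule that inserts "_1" before the file extension until the name no longer collides with an existing file. The sketch below implements that inferred rule. It is a guess based only on the example sequence, not a description of art's implementation, and the function name uniqueOutputName is hypothetical.

```cpp
#include <set>
#include <string>

// Hypothetical helper, not art's implementation. Assumed rule: insert "_1"
// before the extension until the name is not among the existing files.
inline std::string uniqueOutputName(std::string name,
                                    std::set<std::string> const& existing)
{
  while (existing.count(name) != 0) {
    std::string::size_type const dot = name.rfind('.');
    if (dot == std::string::npos) {
      name += "_1"; // no extension: append to the end
    } else {
      name.insert(dot, "_1"); // keep the extension last
    }
  }
  return name;
}
```

With no existing files the requested name out.root is returned unchanged. With out.root present the result is out_1.root, and with both present it is out_1_1.root.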

 Deprecated configurations

A few configurations have been deprecated as of art 2.01.00:
  • services.scheduler.fileMode (details).
  • services.scheduler.MemoryTracker.filename (details).
  • SelectEvents.SelectEvents (details).

 Backwards compatibility error

art versions 2.01.00 through 2.02.02 have inadequate backwards compatibility for Run and SubRun products. Specifically, art versions 2.01.00 and newer must assign a semantic to each Run and SubRun product produced in or propagated from a previous process. For products written to disk by versions older than art 2.01.00, the full-(sub)run semantic is applied to the product when a newer version of art (2.01.00 or later) reads the file. However, whereas newer files store extra information that tells art how to assign semantics for future Run and SubRun products, this extra information is not available in older art/ROOT files. This error was fixed in art 2.03.00.
