Parallelism in user code

In some circumstances, it may be desirable to invoke parallel algorithms within user code. For example, if a data product is a large collection of items and it is prohibitively expensive to have more than one such product in flight at a time, a user could configure an art job to use only one schedule but invoke a parallel algorithm in their module to process the data collection. There are a number of ways to do this, and we discuss a few here. If you plan to use parallelism within your own code, the efficiency and, possibly, success of your job will depend on following the guidance listed here.

Invoking TBB algorithms

art uses Intel's Threading Building Blocks (TBB) to support concurrent processing of events. In addition, art users are allowed to explicitly invoke TBB's parallel algorithms within their own modules. Because art's TBB scheduler is initialized by the time a module is created, any TBB invocations within a module will use the same scheduler, ensuring that the tasks to be run in parallel are balanced with all other TBB tasks managed by that scheduler.

Please note that because TBB decides when to execute its tasks (based on a complicated but unambiguous procedure), the tasks of a parallel algorithm may be interleaved with all other TBB tasks. This is not a problem, per se, but it is another source of unpredictable task-execution orderings.

ROOT's TBB scheduler

The ROOT library creates its own TBB scheduler whenever implicit multi-threading is enabled. For art 3.00.00, the framework calls ROOT::EnableThreadSafety(), but it does not call ROOT::EnableImplicitMT(). Any user who wishes to enable implicit multi-threading in their own code should be aware of its consequences. If the ROOT::EnableImplicitMT() call is made after art's TBB scheduler has been initialized, ROOT's scheduler will be initialized to the same settings as art's scheduler. If implicit multi-threading is enabled before art's scheduler is initialized, art's multi-threading settings, determined from your configuration, will be ignored.

All modules, services, and the input source are created after art's TBB scheduler is initialized. It is therefore safe to enable implicit multi-threading in the constructor of any of these plugins.

Note that there is a known bug observed when using ROOT's fit routines with implicit multi-threading enabled.

Using the C++ standard thread library

Although users are not forbidden from creating std::thread objects or from using the C++ standard thread library's parallel algorithms introduced in C++17, there is no automatic way to coordinate TBB's task scheduling with the standard thread library's facilities. This means that any std::thread objects created (either explicitly, or implicitly via the parallel algorithms) will likely contend with the threads TBB creates to execute its tasks. Having two threading libraries operating independently generally causes inefficiencies.

In addition, whereas art constrains TBB to use only the number of threads specified by the user (or by the batch system), no such constraint can be automatically imposed for users of the standard thread library. It is, therefore, easy for more threads to be created by a user (explicitly or implicitly) than the system can handle, which is an especially important consideration when executing an art job on a grid node.
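To illustrate the burden this places on the user, here is a minimal sketch of code that must limit its own thread count when std::thread cannot be avoided. The function capped_sum and its max_threads parameter are hypothetical; the caller would have to obtain the limit (for example, from the job configuration), since nothing supplies or enforces it automatically. The threads created here still run independently of TBB's threads.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Sums 'values' using at most 'max_threads' std::thread workers.
// 'max_threads' is a hypothetical, caller-supplied limit.
double capped_sum(std::vector<double> const& values, std::size_t max_threads)
{
  if (values.empty()) return 0.0;

  // Use at least one worker, and never more workers than elements.
  std::size_t const n_workers =
    std::max<std::size_t>(1, std::min(max_threads, values.size()));
  // Ceiling division so that every element falls into some chunk.
  std::size_t const chunk = (values.size() + n_workers - 1) / n_workers;

  std::vector<double> partial(n_workers, 0.0);
  std::vector<std::thread> workers;
  workers.reserve(n_workers);
  for (std::size_t w = 0; w != n_workers; ++w) {
    std::size_t const begin = std::min(w * chunk, values.size());
    std::size_t const end = std::min(begin + chunk, values.size());
    // Each worker writes only to its own slot of 'partial'.
    workers.emplace_back([&values, &partial, w, begin, end] {
      partial[w] =
        std::accumulate(values.data() + begin, values.data() + end, 0.0);
    });
  }
  for (auto& t : workers) t.join();
  return std::accumulate(partial.begin(), partial.end(), 0.0);
}
```

All of the bookkeeping above (choosing the worker count, partitioning the data, joining the threads) is handled by the library when a TBB algorithm is used instead, and the thread limit that art has configured is respected without extra code.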

Guidance: Prefer using TBB's parallel facilities over explicitly creating std::thread objects or invoking the standard library's parallel algorithms.