Concatenating CAFs

When a dataset is slow to process because it is split across a large number of files, a concatenated version is the solution. The Production group maintains concatenated versions of all completed datasets in the directories /nova/prod/concat/ and /pnfs/nova/persistent/production/concat/.
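Before producing your own, it is worth checking whether a concatenated copy already exists. The following is a minimal sketch, not an official tool: `find_concat` is a hypothetical helper, and it assumes (this page does not confirm it) that concatenated files carry the dataset name in their filename.

```shell
# Hypothetical helper: list .root files whose names contain the dataset name.
# Assumption: concatenated files are named after their dataset.
# Usage: find_concat DATASET_NAME [DIR ...]
find_concat() {
    dataset="$1"
    shift
    # With no directories given, fall back to the two listed on this page.
    if [ "$#" -eq 0 ]; then
        set -- /nova/prod/concat /pnfs/nova/persistent/production/concat
    fi
    for dir in "$@"; do
        # Skip directories that are not mounted or not visible on this node.
        [ -d "$dir" ] || continue
        find "$dir" -name "*${dataset}*.root" 2>/dev/null
    done
}
```

Calling `find_concat $DATASET_NAME` prints any matching paths; empty output suggests you will need to make the concatenated file yourself.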

In case you need to produce your own, here's how:

For a small number of input files, you can do this interactively:

cafe -bq $NOVAGRIDUTILS_DIR/bin/concat_dataset.C $DATASET_NAME

This loops over all the files in the dataset and makes one combined output file called ${DATASET_NAME}.root.

For longer jobs, you can submit to the grid:

submit_concat_project.sh $DATASET_NAME $OUTPUT_DIR $RELEASE

If the output will be less than 20GB, this will create a single file. For larger inputs, it will attempt to make multiple output files of around 2GB each, with names like ${DATASET_NAME}.${N}_of_${M}.root
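When the output is split, it can be handy to confirm that every piece arrived. The sketch below relies only on the ${DATASET_NAME}.${N}_of_${M}.root naming pattern described above; `check_concat_pieces` is a hypothetical helper name, not part of NovaGridUtils.

```shell
# Hypothetical helper: count split concat outputs and compare with M.
# Usage: check_concat_pieces OUTPUT_DIR DATASET_NAME
check_concat_pieces() {
    dir="$1"
    dataset="$2"
    found=0
    total=""
    for f in "$dir/$dataset".*_of_*.root; do
        [ -e "$f" ] || continue      # the glob matched nothing
        found=$((found + 1))
        m="${f##*_of_}"              # drop everything up to the last "_of_"
        total="${m%.root}"           # drop the ".root" suffix, leaving M
    done
    if [ "$found" -eq 0 ]; then
        echo "no split outputs found"
        return 1
    fi
    echo "$found of $total pieces present"
    # Succeed only if the number of files equals M.
    [ "$found" -eq "$total" ]
}
```

For example, `check_concat_pieces $OUTPUT_DIR $DATASET_NAME` returns a non-zero status if any piece is missing, so it can gate a later step in a script. It does not check the single-file case, whose output name on the grid is not stated here.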