Super Nova PUBS Monitoring
Please see the DM expert documentation page for DM responsibilities relating to SN data management.
Link: DM - Expert Documentation
The supernova data stream (SNS) is a readout stream that has been integrated into the MicroBooNE DAQ since Dec. 9, 2016. Independent of the triggered data stream, the SNS reads out TPC and optical data continuously. Data management (DM) for the SNS is handled by a PUBS daemon running on the ws02 machine. A psql database called procdb_sn, independent of the production database, houses the SNS projects. The PUBS projects developed for SNS DM are described below, following a talk available at https://microboone-docdb.fnal.gov/cgi-bin/private/ShowDocument?docid=13034.
The SNS data consist of ~1.5 GB binary fragments in "ubdaq" format. The total data size per run varies from SEB to SEB, so the number of subruns produced by the SNS differs between SEBs. Some SEBs are connected to wires that produce more or less data, depending on various factors controlled by the firmware. Each SEB has approximately 13 TB of space in its own independent /datalocal/supernova volume, where SNS fragments are stored. Each of these directories is managed by a PUBS daemon on ws02.
There are three projects which control the first phase of SNS DM:
- Binary fragment run and subrun registration
- Binary fragment filename extraction
- Binary fragment monitoring
There are 10 projects in the PUBS DB, one per SEB, which register new incoming fragments from the SNS. Each unique run and subrun number pair is stored in a run table specific to that SEB.
Next, a set of projects records the unique file location into a table for further DM processing.
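The registration and filename-extraction steps above can be pictured with a small sketch. It uses an in-memory SQLite database as a stand-in for procdb_sn, and the SEB names, table names, and columns are hypothetical, chosen only for illustration. It also folds the file location into the same table, which may differ from the actual PUBS schema.

```python
import sqlite3

# Hypothetical names for the 10 SEBs; the real identifiers may differ.
SEBS = [f"seb{i:02d}" for i in range(1, 11)]


def _table(seb):
    """Return the (hypothetical) per-SEB run table name, rejecting unknown SEBs."""
    if seb not in SEBS:
        raise ValueError(f"unknown SEB: {seb}")
    return f"runtable_{seb}"


def create_tables(conn):
    """Create one run table per SEB, keyed on (run, subrun)."""
    for seb in SEBS:
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS {_table(seb)} ("
            "run INTEGER, subrun INTEGER, path TEXT, "
            "PRIMARY KEY (run, subrun))"
        )


def register_fragment(conn, seb, run, subrun):
    """Register a fragment once; return True only if it was not yet known."""
    cur = conn.execute(
        f"INSERT OR IGNORE INTO {_table(seb)} (run, subrun) VALUES (?, ?)",
        (run, subrun),
    )
    return cur.rowcount == 1


def record_location(conn, seb, run, subrun, path):
    """Attach the file location to an already registered fragment."""
    conn.execute(
        f"UPDATE {_table(seb)} SET path = ? WHERE run = ? AND subrun = ?",
        (path, run, subrun),
    )
```

Keeping one table per SEB mirrors the text: the same run can have a different number of subruns on each SEB, so the (run, subrun) pairs are tracked separately.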
The bulk of the SNS DM comes from a standalone project called monitor_snova. There is no run table for this project. This project performs two functions.
Its first function is to maintain the occupancy of each SEB's /datalocal/supernova volume. When a datalocal volume fills past 80% on any of the SEBs, the monitor project deletes SNS fragments for unique run numbers across all SEBs until the datalocal volume drops below 70%. This keeps the same set of run numbers on all SEBs simultaneously. An example of file deletion is shown below. The X axis is time. The Y axis is disk free percentage for SEB 06.
There is one particular failure mode for the monitoring PUBS project. To keep the load on ws02 down we have been using a CPU-limiting program. This program can sometimes cause the project to hang in "T" (stopped) status according to the Linux job scheduler. When this happens you will get a call from the shifter telling you the disk occupancy for one of the SEBs is in alarm. To clear this error, please restart the daemon on ws02; the alarm should go away after ~15 mins.
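The occupancy rule described above (start deleting once any SEB is past 80%, stop once it is back below 70%) can be sketched as follows. This is a minimal illustration and not the monitor_snova code: the function name and data layout are hypothetical, and the choice to delete the lowest run numbers first is an assumption, since the text does not say in which order runs are removed.

```python
from typing import Dict, List

HIGH_WATER = 80.0  # percent used that triggers cleanup (from the text)
LOW_WATER = 70.0   # percent used at which cleanup stops (from the text)


def select_runs_to_delete(
    used_bytes: Dict[str, int],
    capacity_bytes: Dict[str, int],
    run_sizes: Dict[int, Dict[str, int]],
) -> List[int]:
    """Return the runs to delete, or [] if no SEB is past HIGH_WATER.

    used_bytes and capacity_bytes are keyed by SEB name; run_sizes maps
    run number -> {SEB name -> bytes that run occupies on that SEB}.
    """
    used = dict(used_bytes)  # work on a copy; nothing is deleted here

    def percent_used(seb: str) -> float:
        return 100.0 * used[seb] / capacity_bytes[seb]

    triggered = {seb for seb in used if percent_used(seb) > HIGH_WATER}
    if not triggered:
        return []

    doomed: List[int] = []
    for run in sorted(run_sizes):  # assumption: lowest run number is oldest
        if all(percent_used(seb) < LOW_WATER for seb in triggered):
            break
        # Remove the run from every SEB so all SEBs keep the same run set.
        for seb, size in run_sizes[run].items():
            used[seb] -= size
        doomed.append(run)
    return doomed
```

The gap between the two thresholds means one cleanup pass frees a sizeable block of space, so the project does not delete a little on every cycle while a volume hovers near 80%.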
The second function of the monitoring script is to move SNS data from the SEBs onto the Fermilab tape archive. This feature is not in active use. In the uboonepro home directory you will find a .snova.lock file. This file contains the list of runs to be copied by the monitor script from the SEBs to tape. Creating this file and filling it out as shown in the example above stops the monitoring script from deleting files and switches it to file retrieval.
The monitoring script will internally copy the frozen list of runs to a separate run table in the PUBS DB.
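The switch between deletion and retrieval can be illustrated with a short sketch. The lock file format assumed here (one run number per line, with optional "#" comments) is hypothetical, as are the function names; consult the actual .snova.lock example before relying on any of it.

```python
from pathlib import Path
from typing import List, Optional


def read_frozen_runs(lock_path: Path) -> Optional[List[int]]:
    """Return the run list from the lock file, or None if the file is absent."""
    if not lock_path.exists():
        return None
    runs: List[int] = []
    for line in lock_path.read_text().splitlines():
        line = line.split("#", 1)[0].strip()  # hypothetical comment syntax
        if line:
            runs.append(int(line))
    return runs


def monitor_mode(lock_path: Path) -> str:
    """Return 'retrieve' when a lock file is present, otherwise 'delete'."""
    return "delete" if read_frozen_runs(lock_path) is None else "retrieve"
```

In this picture the presence of the file alone selects the mode, and the parsed list is what would be copied into the separate run table.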
Once the data are copied onto PNFS scratch space, four projects handle the SNS binary fragments on their way to the Fermilab tape system.
The first project calculates the checksum of the file.
The second project fills out a samweb metadata file.
The third project registers the file with SAM, giving the file a unique name. This unique name contains the SEB name.
Finally, the fourth project moves the binary fragment from scratch to the super nova dropbox location to be moved into the tape system.
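As an illustration of the first of the four steps above, the sketch below computes a checksum of a large file in fixed-size chunks, so a ~1.5 GB fragment never has to be held in memory at once. The text does not say which checksum algorithm the project uses; Adler-32 is assumed here purely as an example, and the function name is hypothetical.

```python
import zlib
from pathlib import Path


def adler32_of_file(path: Path, chunk_size: int = 1 << 20) -> int:
    """Compute the Adler-32 checksum of a file, reading chunk_size bytes at a time."""
    checksum = 1  # Adler-32 starting value
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            # Feed each chunk into the running checksum.
            checksum = zlib.adler32(chunk, checksum)
    return checksum & 0xFFFFFFFF
```

The result is independent of the chunk size, so the chunk size can be tuned to the I/O characteristics of the scratch area without changing the recorded value.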