Super Nova PUBS Monitoring » History » Version 5

Victor Genty, 05/31/2018 09:13 AM

h1. Super Nova PUBS Monitoring

h2. Please see the DM expert documentation page for DM responsibilities relating to SN data management
Link: [[DM - Expert Documentation]]

h2. Overview

The supernova data stream (SNS) is a readout stream that has been integrated into the MicroBooNE DAQ since Dec. 9, 2016. Independent of the triggered data stream, the SNS reads out TPC and optical data continuously. Data management (DM) for the SNS is handled by the PUBS daemon located on the ws02 machine. A PostgreSQL database called procdb_sn, which is independent of the production database, houses the SNS projects. The PUBS projects developed for SNS DM are described below, based on a talk available at https://microboone-docdb.fnal.gov/cgi-bin/private/ShowDocument?docid=13034.

!{width:700px}00.png!

The SNS data consist of ~1.5 GB binary fragments in "ubdaq" format. The total data size per run varies from SEB to SEB, so the number of subruns produced by the SNS differs between SEBs. Some SEBs are connected to wires that produce more or less data, depending on various factors controlled by the firmware. Each SEB has approximately 13 TB of space in its own /datalocal/supernova volume, where SNS fragments are stored. Each of these directories is managed by a PUBS daemon on ws02.

!{width:300px}01.png!
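
As a rough sense of scale, the two figures quoted above (~1.5 GB per fragment, ~13 TB per volume) imply that one SEB volume holds on the order of 8,600 fragments when completely full. The short calculation below is only a back-of-the-envelope sketch; the decimal unit convention and the function name are assumptions, not part of PUBS.

```python
# Figures quoted in the text; decimal units (1 TB = 1000 GB) are an assumption.
FRAGMENT_SIZE_GB = 1.5   # approximate size of one ubdaq fragment
VOLUME_SIZE_TB = 13.0    # approximate size of one /datalocal/supernova volume


def fragments_per_volume(volume_tb=VOLUME_SIZE_TB, fragment_gb=FRAGMENT_SIZE_GB):
    """Whole number of fragments that fit on one completely full volume."""
    return int(volume_tb * 1000 // fragment_gb)


print(fragments_per_volume())
```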

There are three projects that control the first phase of SNS DM:
# Binary fragment run and subrun registration
# Binary fragment filename extraction
# Binary fragment monitoring

!{width:700px}02.png!

There are 10 projects in the PUBS DB, one per SEB, which register new incoming fragments from the SNS. Each unique run and subrun number is stored in a run table that is unique to that SEB.

!{width:700px}03.png!

Next, a set of projects records the unique file location of each fragment in a table for further DM processing.

!{width:700px}04.png!
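
The registration and filename extraction steps can be sketched as follows. This is a minimal illustration, not the PUBS implementation: the fragment naming pattern, the function names, and the in-memory dictionaries standing in for the per-SEB run tables in procdb_sn are all assumptions.

```python
import os
import re
from collections import defaultdict

# Hypothetical naming scheme; the real ubdaq fragment names may differ.
FRAGMENT_RE = re.compile(r"run(?P<run>\d+)_subrun(?P<subrun>\d+)\.ubdaq$")


def parse_fragment(path):
    """Return (run, subrun) parsed from a fragment path, or None if it does not match."""
    match = FRAGMENT_RE.search(os.path.basename(path))
    if match is None:
        return None
    return int(match.group("run")), int(match.group("subrun"))


def register_fragments(run_tables, seb, paths):
    """Record unseen (run, subrun) pairs and their file location in the SEB's table.

    Returns the list of newly registered (run, subrun) pairs.
    """
    table = run_tables[seb]
    newly_registered = []
    for path in paths:
        key = parse_fragment(path)
        if key is None or key in table:
            continue  # not a fragment, or already registered
        table[key] = path
        newly_registered.append(key)
    return newly_registered


# One table per SEB, standing in for the per-SEB run tables.
run_tables = defaultdict(dict)
```

Keeping one table per SEB mirrors the point made above: the same run can produce a different number of subruns on different SEBs, so the tables cannot be shared.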

The bulk of the SNS DM is done by a standalone project called monitor_snova. There is no run table for this project. This project performs two functions.

Its first function is to maintain the occupancy of each SEB's /datalocal/supernova volume. When a datalocal volume fills past 80% on any of the SEBs, the monitor project deletes SNS fragments for unique run numbers across all SEBs until the datalocal volume drops below 70%. This keeps the set of run numbers consistent across all SEBs simultaneously.

There is one particular failure mode for the monitoring PUBS project. To keep the load on ws02 down we have been using a CPU-limiting program. This program can sometimes cause the project to hang in "T" (stopped) status as reported by the Linux process status. When this happens you will get a call from the shifter telling you that the disk occupancy for one of the SEBs is in alarm. To clear this error, restart the daemon on ws02; the alarm should go away after ~15 minutes.

An example of file deletion is shown below. The X axis is time and the Y axis is the disk free percentage for SEB 06.

!{width:700px}scrot.png!
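
The 80% / 70% deletion policy described above behaves like a high-water / low-water mark. The sketch below illustrates that logic only; it is not the monitor_snova code. The data structures, the function name, and the choice to delete the lowest run numbers first are assumptions.

```python
HIGH_WATER = 80.0  # percent used that triggers deletion (from the text)
LOW_WATER = 70.0   # percent used at which deletion stops (from the text)


def runs_to_delete(used_bytes, capacity_bytes, run_sizes):
    """Pick runs to delete on all SEBs until the triggering volumes drop below LOW_WATER.

    used_bytes:     {seb: bytes currently used on its datalocal volume}
    capacity_bytes: {seb: total bytes on its datalocal volume}
    run_sizes:      {run: {seb: bytes that run occupies on that SEB}}
    Runs are removed lowest run number first (an assumption).
    """
    used = dict(used_bytes)  # work on a copy

    def percent_used(seb):
        return 100.0 * used[seb] / capacity_bytes[seb]

    triggered = [seb for seb in used if percent_used(seb) > HIGH_WATER]
    if not triggered:
        return []

    selected = []
    for run in sorted(run_sizes):
        if all(percent_used(seb) < LOW_WATER for seb in triggered):
            break
        # The same run is removed from every SEB that holds it.
        for seb, size in run_sizes[run].items():
            used[seb] -= size
        selected.append(run)
    return selected
```

The gap between the two thresholds means each deletion pass frees a block of space at once, rather than deleting a run every time the volume crosses a single limit.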

!{width:700px}05.png!

The second function of the monitoring script is to move SNS data from the SEBs onto the Fermilab tape archive. This feature is not in active use. The list of runs to be copied by the monitor script from the SEBs to tape is kept in a .snova.lock file in the uboonepro home directory. Creating this file and filling it out as shown in the example above stops the monitoring script from deleting fragments and switches it to file retrieval.

!{width:700px}06.png!
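
The switch between deletion and retrieval can be pictured as a check on the lock file. The sketch below assumes a hypothetical file format of one run number per line with optional @#@ comments; the real .snova.lock layout and the function names used here are assumptions for illustration only.

```python
import os


def parse_run_list(text):
    """Parse run numbers, one per line; blank lines and '#' comments are ignored."""
    runs = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            runs.append(int(line))
    return runs


def choose_mode(lock_path):
    """Return ("retrieve", runs) if the lock file lists runs, else ("delete", [])."""
    if not os.path.exists(lock_path):
        return "delete", []
    with open(lock_path) as handle:
        runs = parse_run_list(handle.read())
    if runs:
        return "retrieve", runs
    return "delete", []
```

Treating a missing or empty lock file as "delete" keeps the normal disk-occupancy maintenance as the default behavior.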

The monitoring script internally copies the frozen list of runs to a separate run table in the PUBS DB.

!{width:300px}07.png!

Once the data are copied onto PNFS scratch space, four projects handle the SNS binary fragments on their way to the Fermilab tape system.

!{width:700px}08.png!

The first project calculates the checksum of the file.

!{width:700px}09.png!
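
A checksum of a ~1.5 GB fragment is best computed in chunks so the whole file never has to sit in memory. The sketch below uses Adler-32 from the Python standard library; which checksum algorithm the PUBS project actually uses is not stated here, so the choice of Adler-32 and the function names are assumptions.

```python
import zlib


def adler32_of_stream(stream, chunk_size=1 << 20):
    """Compute the Adler-32 checksum of a binary stream, reading it in chunks."""
    value = 1  # Adler-32 starting value
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        value = zlib.adler32(chunk, value)
    return value & 0xFFFFFFFF


def adler32_of_file(path, chunk_size=1 << 20):
    """Convenience wrapper that opens the file at path in binary mode."""
    with open(path, "rb") as handle:
        return adler32_of_stream(handle, chunk_size)
```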

The second project fills out a samweb metadata file.

!{width:700px}10.png!
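
A metadata file of this kind is typically a small JSON document describing the fragment. The sketch below shows the general shape only; the exact set of fields, the placeholder values ("data", "raw", "physics"), and the checksum string format are assumptions and should be checked against the metadata the project really writes.

```python
import json


def build_metadata(file_name, file_size, checksum, run, subrun):
    """Assemble an illustrative metadata record for one binary fragment."""
    return {
        "file_name": file_name,
        "file_size": file_size,
        "file_type": "data",                      # placeholder value
        "data_tier": "raw",                       # placeholder value
        "runs": [[run, subrun, "physics"]],       # run type is a placeholder
        "checksum": [f"adler32:{checksum:08x}"],  # string format is an assumption
    }


def write_metadata(metadata, path):
    """Write the metadata record to path as JSON."""
    with open(path, "w") as handle:
        json.dump(metadata, handle, indent=2)
```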

The third project registers the file with SAM, giving the file a unique name that contains the SEB name.

!{width:700px}11.png!
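
Because every SEB writes fragments for the same run and subrun, including the SEB name is what keeps the registered names from colliding. A minimal sketch of such a naming rule is shown below; the actual name format used by the project is not given here, so this pattern and the function name are hypothetical.

```python
import os


def unique_sam_name(fragment_path, seb):
    """Build a file name that embeds the SEB name (hypothetical format)."""
    stem, ext = os.path.splitext(os.path.basename(fragment_path))
    return f"{stem}-{seb}{ext}"
```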

Finally, the fourth project moves the binary fragment from scratch to the supernova dropbox location, from which it is moved into the tape system.