Summary

The online monitoring software consists of two halves. The first half is the "producer" and is the workhorse of the whole package. It connects to the event dispatcher, takes in and unpacks the raw data, and produces a long list of histograms. The other half is the "viewer" and, as the name implies, it is the GUI that pretties up and plots the histograms that the producer makes. The two halves communicate through a block of shared memory that is managed by a separate bit of code. The organization of the OnlineMonitoring package reflects this and is divided into three sub-directories:

  1. producer
  2. viewer
  3. util

What code resides in the "producer" and the "viewer" directories should be obvious. The "util" directory contains code which must be shared by the producer and the viewer. This includes the code for the shared memory management, the hardware mapping from pixel coordinates to FEB and DCM number and vice versa, the spreadsheet with all of the information about what histograms to make, etc.

Running the producer

The producer can take its input either from raw files on disk or from the event dispatcher. To set up your own dispatcher, see the section on "How to set up a test connection to the event dispatcher."

How to run the producer over raw files on disk

  • In your test release on any machine with NOvASoft installed, add the OnlineMonitoring package.
    ~/my_test_release/% addpkg_svn -h OnlineMonitoring
    
  • The default OnMon job (OnlineMonitoring/producer/onmonjob.fcl) is already set up to take raw files as input. If you are reading in raw files from either ND or NDOS data, then you will need to change the detector type in the first few lines of OnlineMonitoring/producer/OnMonProd.fcl to read:
    std_onmonprod:
    {
    module_type: OnMonProd
    #=====================
    CSVFile:    "onmon-histos.csv"   # location/name of CSV file with histo info
    Detector:   "ND"                 # NDOS, ND, or FD
    SHMHandle:  "RAW0"               # 4-character handle for shared memory segment
    }
    

    You can also change the shared memory handle here to be something unique if you desire. NOTE: If you are running on a machine where there is already a shared memory segment in use with the name "RAW0" then you MUST change the SHMHandle to something else.
  • Build the OnMon package
    ~/my_test_release/OnlineMonitoring/% gmake clean
    ~/my_test_release/OnlineMonitoring/% gmake
    

    You will see some errors flash by at the beginning complaining about things like:
    ERROR: ButtonBank inherits from TObject but does not have its own ClassDef
    

    Ignore this. Everything will compile just fine.
  • You can now run the producer as a normal NOvA job.
    ~/my_test_release/% nova -c job/onmonjob.fcl [path to raw files]/*.raw
    
  • If running correctly, you should see output like this scrolling past:
    onmon_prod: run/sub/evt=11974/1/48656  type:size=0:1969
    onmon_prod: run/sub/evt=11974/1/48666  type:size=0:1660
    onmon_prod: run/sub/evt=11974/1/48676  type:size=0:1936
    onmon_prod: run/sub/evt=11974/1/48686  type:size=0:2752
    onmon_prod: run/sub/evt=11974/1/48696  type:size=0:2032
    onmon_prod: run/sub/evt=11974/1/48706  type:size=0:1945
    onmon_prod: run/sub/evt=11974/1/48716  type:size=0:2086
    onmon_prod: run/sub/evt=11974/1/48726  type:size=0:1702
    onmon_prod: run/sub/evt=11974/1/48736  type:size=0:2326
    onmon_prod: run/sub/evt=11974/1/48746  type:size=0:1840
    
  • For instructions on how to connect the viewer to this producer, see connecting the viewer to shared memory below.
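
Putting the steps above together, a typical session over raw files looks something like the following (the release name and raw file path are placeholders to adjust for your own setup):

    ~/my_test_release/% addpkg_svn -h OnlineMonitoring
    ~/my_test_release/% cd OnlineMonitoring
    ~/my_test_release/OnlineMonitoring/% gmake clean && gmake
    ~/my_test_release/OnlineMonitoring/% cd ..
    ~/my_test_release/% nova -c job/onmonjob.fcl [path to raw files]/*.raw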

How to run the producer from the event dispatcher

  • The onmonjob.fcl file must be properly configured to define the dispatcher as its input source. If you wish to connect to the far dispatcher for partitions 0 or 1, then job files already exist for these specific configurations. Simply execute:
    ~/somewhere/% nova -c job/fd-pX-onmonjob.fcl
    

    where X is the partition you want to connect to (either 0 or 1). The same is true for the near detector.
  • In the event that something changes (e.g., we have switched from datadisk-01 to datadisk-02, or the dispatcher port number is no longer 9121), you will have to edit the OnlineMonitoring/producer/onmonjob.fcl file before running the producer. Make the following changes:
    #
    #  To read from dispatcher use the following:
    # 
      module_type: NOvASocketInput
      fileNames:  [ "XXXX.fnal.gov:YYYY" ]
    #
    # To read from .raw files use the following:
    #
    #  module_type: NOvARawInputSource
    #  fileNames:   [""]
    #
    

    where XXXX is the machine name (e.g. novadaq-far-datadisk-03) and YYYY is the port number (e.g. 9121). A sketch of the edited source block is shown at the end of this section.
  • Then recompile the producer in your test release and fire it up like a normal NOvA job:
    ~/my_test_release/% nova -c job/onmonjob.fcl
    

This will start the producer and create a shared memory block to be used in communicating with the viewer.
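
For reference, after making the edit described above, the source block in onmonjob.fcl would look something like this (a sketch assuming the standard art source block layout; the actual job file may contain additional source parameters):

    source:
    {
      module_type: NOvASocketInput
      fileNames:   [ "novadaq-far-datadisk-03.fnal.gov:9121" ]
    }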

Cleaning up shared memory blocks

To clean up after yourself, you can check to see if there is a shared memory block allocated to you and remove it using ipcs and ipcrm:

% ipcs

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status      
0x32100122 4227079    heisenberg 666        503316640  0                       
0x4d454d53 9404433    anorman    666        32768000   0                       
0x00000000 6946834    bohr69     777        40000      2          dest         
0x4d454d54 9469980    godzilla   666        20971520   0                       
0x4b52414d 13402152   mbaird42   666        32000000   0   

% ipcrm -m 13402152
% ipcs

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status      
0x32100122 4227079    heisenberg 666        503316640  0                       
0x4d454d53 9404433    anorman    666        32768000   0                       
0x00000000 6946834    bohr69     777        40000      2          dest         
0x4d454d54 9469980    godzilla   666        20971520   0                                          

------ Semaphore Arrays --------
key        semid      owner      perms      nsems     

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages 
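
If several stale segments of your own have piled up, a one-liner like the following can remove them all at once (a sketch using standard ipcs/ipcrm options and GNU xargs; double-check the owner column first, and never remove segments belonging to a running job or to another user):

% ipcs -m | awk -v u="$USER" '$3 == u {print $2}' | xargs -r -n1 ipcrm -m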

Running the viewer

Like the producer, the viewer has the ability to read input from different sources. It can read histograms that the producer is generating through the shared memory block or it can read from a root file on disk.

How to set up the viewer to read from a shared memory block

  • First, you must be working on a machine where the producer is already running.
  • Next determine the name of the shared memory block. This name is specified in the file OnlineMonitoring/producer/OnMonProd.fcl as the parameter "SHMHandle". For example:
    BEGIN_PROLOG
    
    std_onmonprod:
    {
    module_type: OnMonProd
    #=====================
    CSVFile:    "onmon-histos.csv"   # location/name of CSV file with histo info
    Detector:   "NDOS"               # NDOS, ND, or FD
    SHMHandle:  "NDOS"               # 4-character handle for shared memory segment
    }
    
    END_PROLOG
    

    For this example, the name of the shared memory block is "NDOS".
  • Now run the viewer specifying the name of the shared memory block:
    ~/my_test_release/% onmon_viewer -f [shared memory block name].shm -d [detector type]
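
    For the NDOS example above (SHMHandle = "NDOS"), the command would be:
    ~/my_test_release/% onmon_viewer -f NDOS.shm -d NDOS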
    

How to set up the viewer to read from a root file that was created by the producer

  • With the online monitoring package checked out and built as outlined above, do the following:
    ~/my_test_release/% onmon_viewer -f [filename].root -d [detector type]
    

    Note that every 30 seconds the viewer requests a refreshed copy of the current histogram from the histogram source (be it shared memory or a root file). So when reading histograms in from a root file, you might want to pause the viewer.
  • The detector type must be either "NDOS", "ND", or "FD".
  • The NDOS onmon root files from the last 30 days of actual data are kept on the novadaq-ctrl-datamon machine in the directory:
    /online_monitor/offline/onmon_temp_root_files/
    Files older than thirty days are deleted. If you need files from an old run, you can regenerate them from the raw files.
  • The FD onmon root files from the last 30 days of actual data are kept on novadaq-far-datamon in the same directory as listed above.
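
  • For example, to look at a recent NDOS file from another machine, you could copy it over and open it locally (assuming you have ssh/scp access to novadaq-ctrl-datamon; the file name below is a placeholder):
    ~/my_test_release/% scp novadaq-ctrl-datamon:/online_monitor/offline/onmon_temp_root_files/[filename].root .
    ~/my_test_release/% onmon_viewer -f [filename].root -d NDOS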

How to use the viewer to look at histograms in a file made by something other than the producer

Motivation: Any detector information that is displayable in the same type of hardware view that OnMon uses (for example, DSO scan data) can be opened and displayed using the OnMon viewer, so that you can take advantage of the fancy OnMon drawing features and other things like the generation of comparison plots. This is not limited to hardware plots; any old 1D or 2D histograms can be viewed as well.

  • First you must organize your data into the histograms you want to view. If you have any 2D histograms binned by hardware coordinates (the coordinates as seen from the catwalk), then you must fill them using the correct hardware coordinates. For help on mapping the hardware coordinates correctly, see the section on hardware address mapping in the development notes. I recommend using the HardwareMapping.cxx code in the OnlineMonitoring/util directory, which is already set up to take in Diblock, DCM, FEB, and pixel numbers and convert them to an (X,Y) pair of hardware coordinates for you to use to fill your plots. You will then have to write your own code to take in all of your input data and fill all of your histograms (a sketch is shown after this list). Do this however you like, so long as the end result is a root file containing the histograms you want to view.
  • Next, you will have to give the viewer some information about the histograms you are trying to view. In typical running (in which the viewer is reading the standard set of OnMon histograms made by the producer) this information comes from the file OnlineMonitoring/util/onmon-histos.csv. You can specify a different .csv file when starting the viewer from the command prompt by executing
    ~/my_test_release/% onmon_viewer -f [histogram file] -d [detector type] -v [csv file]
    

    As noted above, every 30 seconds the viewer requests a refreshed copy of the current histogram from the histogram source (be it shared memory or a root file), so when reading histograms in from a root file, you might want to pause the viewer.
  • To create your own .csv file, fire up your favorite spreadsheet editor (just remember to save the file in .csv format, NOT in .xls format or the like) and add the desired histograms according to the format outlined in the existing OnlineMonitoring/util/onmon-histos.csv file. Make sure that the first line of the file ("Name","Title","Category"...) is identical to the one in onmon-histos.csv, otherwise the viewer will complain that the header in the .csv file is not correct. Next, fill out the .csv file according to your histograms. Here is a description of what each entry means:

"Name" -- name of histogram in the root file
"Title" -- standard root title ("histo title; X axis; Y axis")
"Category" -- name of the folder where the histogram will be put in the viewer's browser, make up whatever categories you want to organize your histos
"Type" -- TH1F, TH2F
"nx, x1, x2, ny, y1, y2" -- defines histo binning (leave ny, y1, y2 blank for a 1D histo)
"Option" -- (see below)
"Detector" -- "NDOS", "near", "far", or "all"
"Reset" -- how often the histo is reset (which only affects the producer, so if you aren't using the producer, just set this to "run")
"Look back" -- how many previous histos are kept when the histo is reset (again, this only affects the producer so you can set it to 0)
"Caption" -- any additional descriptive text you want displayed about the histogram in the viewer window

  • In the "Options" entry of your .csv file, you can specify what plot drawing options you want to use. You can specify more than one option by separating them with a ":" (for example - "colz:logz"). Here is a descriptive list of available options:

"zoomhour" -- for a plot vs. UTC hour, auto zoom on the most recent two hours
"autozoomx", "autozoomy", "autozoomz" -- autozoom on the chosen axis
"hwlbl_det" -- remove all labels on the X and Y axes and draw the full detector, hardware view labels (for 2D, full detector histos like DCMHitMap)
"hwlbl_dcm" -- remove all labels on the X and Y axes and draw the DCM level hardware view labels (for 2D, DCM specific histos like PixelsDCM_XX_XX)
"alert" -- for 2D histos, use the alert histogram color pallet (0 = green, 1 = red)
"logx", "logy", "logz" -- does just what you think it does
"gridx", "gridy" -- draw grid lines in X or Y

  • If you want to take advantage of the "drill-down" feature (for a detector-level 2D histogram, double-clicking on a specific DCM will bring up the appropriate histogram for that DCM), then you must make your histogram titles according to the following scheme: the detector-level histogram must be titled "histonameDET_dd", where DET_dd specifies detector-level drill-down. Your specific DCM titles then need to be "histonameDCM_XX_YY", where "histoname" is the same string as in the DET_dd histogram, XX specifies the diblock number, and YY specifies the DCM number in two-digit format (i.e. "02" instead of "2"). Note that in your .csv file, you do not need a separate row to specify each diblock and DCM histogram of a certain type. You can fill out one row using the wildcard "*" in the title. For example, if you want a DSO scan result histogram for each DCM, then you only need to fill out one row in the .csv file with the title "DSOscanDCM*", which will be applied to all histograms in your root file that fit this format, such as "DSOscanDCM_02_05" and "DSOscanDCM_12_12".
  • NOTE: For each histogram in your root file, there must be a corresponding set of information in your .csv file otherwise the viewer will seg fault.
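
As a concrete illustration of the first two bullets above, here is a sketch of the kind of standalone code that builds such a root file. The toHardwareXY function is a hypothetical stand-in for the mapping provided by HardwareMapping.cxx (check that file for the real class and method names), and the histogram name, binning, and commented-out loop over input data are placeholders to adapt to your own data:

    // make_onmon_input.C -- sketch: fill a hardware-view histogram and write it
    // to a root file that the OnMon viewer can open. Compile against ROOT.
    #include "TFile.h"
    #include "TH2F.h"

    // Hypothetical stand-in for OnlineMonitoring/util/HardwareMapping.cxx:
    // convert a (diblock, dcm, feb, pixel) address into hardware (x,y) coordinates.
    void toHardwareXY(int diblock, int dcm, int feb, int pixel, double& x, double& y);

    void make_onmon_input()
    {
      // Binning here is arbitrary; it should match the row in your .csv file.
      TH2F* hview = new TH2F("MyHardwareView", "My hardware view;X;Y",
                             100, 0., 100., 100, 0., 100.);

      // Loop over your own input data (DSO scan results, etc.) and fill the
      // histogram at the hardware coordinates returned by the mapping, e.g.:
      //   double x, y;
      //   toHardwareXY(diblock, dcm, feb, pixel, x, y);
      //   hview->Fill(x, y, value);

      // Write everything to the file you will hand to onmon_viewer -f.
      TFile fout("my_onmon_input.root", "RECREATE");
      hview->Write();
      fout.Close();
    }

The "MyHardwareView" name here matches the illustrative .csv row given earlier; the viewer would then be started with onmon_viewer -f my_onmon_input.root -d [detector type] -v [your csv file].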

Running OnMon from the control room

  • Look to the machine labeled NOVA-DAQ-2. On the desktop on the bottom-right screen you will see four links to scripts for controlling OnMon (they may be cleverly hidden under other windows).
  • If it is not already running, double-click the green arrows icon to start the producer. This must be done first; otherwise the viewer will complain about not having a shared memory block to read from and quit.
  • Wait about 20-30 seconds until you see the data scrolling by in the producer window, indicating that it is successfully plugged into the dispatcher and making histograms. The output should look like this:
    Event contains ### bytes.
    Event contains ### bytes.
    Event contains ### bytes.
    Event contains ### bytes.
    Event contains ### bytes.
    
  • Now double-click the other green arrows icon to start the viewer. After about 10 seconds, the viewer window should pop up and you will be free to browse the available histograms.
  • To quit all of OnMon (both the producer and the viewer), double-click on the red stop icon labeled "Quit ALL OnMon." This will kill all processes related to both the viewer and the producer. To quit just the viewer, use the "Quit OnMon - Viewer" icon.

Troubleshooting and Errors

"shmget failed" error from the producer

  • This error means that the producer failed to set up the shared memory segment (used for communication with the viewer).
  • This will occur if the shared memory segment you are trying to set up already exists on the computer where you are trying to run the producer. This can happen if someone else is running the producer with the same shared memory segment name, or if the last time the producer quit, the shared memory segment was not properly deleted (which can happen if you hit ctrl-C too many times when quitting the producer).
  • See the above section on cleaning up shared memory to fix the situation. BUT, be EXTRA careful not to delete a shared memory segment for the wrong job or for someone else's job!

"shmget" error from the viewer

  • This error means that the viewer failed to get the shared memory segment with the name that you specified (the viewer cannot create a new shared memory segment; it can only use an existing one set up by the producer).
  • This means you are either trying to run the viewer on a machine where the producer is not running, or you are using the wrong shared memory name.
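
Both errors follow from the System V shared memory calls that the package uses (the same segments that ipcs and ipcrm manipulate above). The sketch below is only an illustration of the pattern: the key, segment size, and exact flags are assumptions, and the real code lives in OnlineMonitoring/util:

    // shm_sketch.cxx -- illustration of why the two shmget errors occur.
    #include <sys/ipc.h>
    #include <sys/shm.h>
    #include <cstdio>

    int main()
    {
      key_t  key    = 0x4f4e4d4e;   // hypothetical key built from the 4-character handle
      size_t nbytes = 32000000;     // hypothetical segment size

      // Producer side: create a brand-new segment. If a segment with this key
      // already exists on the machine, the call fails -> "shmget failed".
      int prodId = shmget(key, nbytes, IPC_CREAT | IPC_EXCL | 0666);
      if (prodId < 0) std::perror("producer shmget");

      // Viewer side: attach to an existing segment only (no IPC_CREAT). If the
      // producer has not created it yet, the call fails -> the viewer's error.
      int viewId = shmget(key, nbytes, 0666);
      if (viewId < 0) std::perror("viewer shmget");

      return 0;
    }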

Problems with resizing the viewer window

  • It is a known bug that the viewer does not redraw properly after resizing the main window.
  • There is currently no fix for this. Kill the viewer and restart it so it will draw itself correctly again.