How to Interpret the Nearline Plots

THIS PAGE IS OUTDATED. Try the new pages listed below (L.Suter 1st Oct 2015)

Near Detector plots: https://cdcvs.fnal.gov/redmine/projects/datacheck/wiki/How_to_Interpret_the_Nearline_Frontage_Plots_Near_Detector
Far Detector plots: https://cdcvs.fnal.gov/redmine/projects/datacheck/wiki/How_to_Interpret_the_Nearline_Frontage_Plots_Far_Detector

General Information

There is a LOT of information on the nearline webpage, and it's difficult to interpret all of it unless you are an expert. This page is a guide that gives some hints as to what a "normal" version of each of the checklist plots is supposed to look like and what some departures from normal can indicate about the detector performance.

The best advice on how to look for oddities in these plots is simply to compare the 24 hour version of each plot with either the week or the month version. If the detector performance has been steady over the past week/month but has changed in the past 24 hours, then the differences should stand out. Sometimes, of course, differences are expected (the detector running conditions change constantly), so if you see something that you don't understand or expect, make a log book entry (with a picture) and investigate it or ask an expert.
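
The comparison described above can also be done numerically. The following is a minimal sketch, not part of the nearline software: the function name, the idea of feeding it plain lists of plot values, and the 3 sigma threshold are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def deviates_from_baseline(recent, baseline, n_sigma=3.0):
    """Return True if the mean of `recent` (e.g. last 24 hours) differs from
    the mean of `baseline` (e.g. last week) by more than `n_sigma` baseline
    standard deviations. The 3 sigma default is an illustrative assumption.
    """
    if len(baseline) < 2 or not recent:
        raise ValueError("need at least two baseline points and one recent point")
    spread = stdev(baseline)
    shift = abs(mean(recent) - mean(baseline))
    if spread == 0:
        # A perfectly flat baseline: any shift at all is a change.
        return shift > 0
    return shift > n_sigma * spread


# Hypothetical values of some per-subrun quantity.
week = [100, 101, 99, 100, 102, 98]
print(deviates_from_baseline([100, 101], week))
print(deviates_from_baseline([110, 111], week))
```

A flag from a check like this only says that something changed; whether the change is expected still has to be judged from the running conditions.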

Nearline Checklist Plots

The following are the plots used for the nearline checklist including descriptions of what kind of odd behavior to look for and examples.

This section is being developed. Please contribute whatever information you can!

Common Features Plot
Number of Active FEBs per Subrun
Low Number of Active FEBs: Plot background will turn RED with white data points and warning message.
Make sure Auto StartDAQ is on.
If there is no maintenance being done on the detector or if you are unaware of any missing FEBs/Diblocks, you may issue an "Enable FEB Data Flow" from the TDUControl.
Let the OnMon and Nearline plots update and look at the EventDisplay.
If the issue still persists and you see large gaps, you may need to contact an expert to issue a GPS sync.

Track Fractions vs Time
Drops in NuMI Trigger: Only expected if Delivered Beam Spill Rate drops!
If NuMI Trigger rate is lower than Beam Spill rate we might be losing data!
Gaps in Cosmic: Only expected if we are not taking data.
Gaps in Delivered Beam Spill Rate: Only expected if beam database is down. Email nova_nearlineonmon_support.
Gaps in NuMI Trigger: Only expected if we are not taking data.
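
As a rough illustration of the "NuMI Trigger rate lower than Beam Spill rate" condition above, here is a minimal sketch. The function name and the 5% tolerance are hypothetical; the nearline plots do not necessarily apply any such threshold.

```python
def numi_trigger_deficit(trigger_rate_hz, spill_rate_hz, tolerance=0.05):
    """Return True if the NuMI trigger rate falls short of the delivered
    beam spill rate by more than `tolerance` (a fraction; 5% is an
    assumption made for this example).
    """
    if spill_rate_hz <= 0:
        # No beam delivered, so a drop in NuMI triggers is expected.
        return False
    return trigger_rate_hz < (1.0 - tolerance) * spill_rate_hz


print(numi_trigger_deficit(0.55, 0.55))  # rates agree
print(numi_trigger_deficit(0.40, 0.55))  # triggers lag behind the spills
```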

Track Fractions vs Time
Gaps: Check other nearline plots for gaps. Was there a run going at the time? Are the nearline Plots Updating?
Instabilities: Has the number of active DCMs changed for this partition? Are there DCM sync issues?
Good Subruns
Gaps: Check other nearline plots for gaps. Was there a run going at the time? Are the nearline Plots Updating?
Partial Detector: Some part of the detector is either missing or bad. You should check the FEBHitRateMapMipADC plot in OnMon to identify where the issue is. The detector configuration plot may also help.
Failure Modes:
Failed Reco: There were too many 2D tracks and/or there were too many/few slices per trigger
Failed Diblock: There were fewer than 2 good consecutive diblocks in the far detector (4 for near detector).
Failed Hit Rate: The median hit rate in the MIP region was too high/low in the detector.
Failed Live Time: The whole run had less than 1 second of live time.
(In the near detector, subruns are required to have at least 1000 NuMI triggers)
Failed Other: The subrun had no activity or there were bad timestamps in the subrun.
Additional cuts for NearDet
Failed Empty Spill Fraction: The ratio of empty spills to total spills is > 3%.
Failed Timing Peak: The timing peak occurs in the wrong position.
Failed Slice Cut: The number of slices (POT scaled) was < 3.5 slices/spill or > 5.5 slices/spill.
If you see a failure mode, check other GoodRuns plots to try to understand the issue.
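
To make some of the thresholds above concrete, the sketch below checks three of the near detector cuts using the numbers quoted in this list (4 consecutive good diblocks, 3% empty spills, 3.5 to 5.5 slices/spill). The function names and inputs are hypothetical, and this is not the GoodRuns code: it returns every failed cut, whereas the plots show each subrun in a single category.

```python
def longest_good_run(diblock_good):
    """Length of the longest run of consecutive good diblocks."""
    longest = current = 0
    for good in diblock_good:
        current = current + 1 if good else 0
        longest = max(longest, current)
    return longest


def near_det_failures(diblock_good, empty_spills, total_spills, slices_per_spill):
    """Return the names of the failed cuts for one near detector subrun.

    diblock_good:     list of booleans, one per diblock, in detector order
    slices_per_spill: POT-scaled number of slices per spill
    """
    failures = []
    if longest_good_run(diblock_good) < 4:
        failures.append("Failed Diblock")
    if total_spills > 0 and empty_spills / total_spills > 0.03:
        failures.append("Failed Empty Spill Fraction")
    if slices_per_spill < 3.5 or slices_per_spill > 5.5:
        failures.append("Failed Slice Cut")
    return failures


print(near_det_failures([True, True, True, True], 1, 100, 4.5))
print(near_det_failures([True, True, False, True], 5, 100, 6.0))
```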
Preliminary
Represents subruns that do not have complete information yet.
Runs that have not finished will show as preliminary in lighter colors.
FarDet:
1) The latest OnMon file in a run is younger than 1h15min (The run hasn't finished).
2) No nearline-Ana file was found for some subrun in the run (The nearline-Ana hasn't finished).
NearDet:
- No nearline-Ana file was found for the subrun (The nearline-Ana hasn't finished).
While no nearline-Ana file is found for a subrun, it will be kept as preliminary until it is 24 hours old.
After 24 hours, the status of a subrun is made permanent.
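
The FarDet preliminary rules above can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: the function name and arguments are hypothetical, and "24 hours old" is taken here to mean the age of the latest OnMon file in the run.

```python
from datetime import datetime, timedelta


def far_det_run_is_preliminary(latest_onmon_time, ana_file_found, now):
    """Decide whether a FarDet run is still shown as preliminary.

    latest_onmon_time: datetime of the newest OnMon file in the run
    ana_file_found:    list of booleans, one per subrun, True if a
                       nearline-Ana file exists for that subrun
    now:               current datetime
    """
    age = now - latest_onmon_time
    if age >= timedelta(hours=24):
        # After 24 hours the status is made permanent.
        return False
    if age < timedelta(hours=1, minutes=15):
        # The run has probably not finished yet.
        return True
    # Otherwise preliminary only while some nearline-Ana file is missing.
    return not all(ana_file_found)


now = datetime(2015, 1, 2, 12, 0)
print(far_det_run_is_preliminary(now - timedelta(minutes=30), [True, True], now))
print(far_det_run_is_preliminary(now - timedelta(hours=2), [True, False], now))
print(far_det_run_is_preliminary(now - timedelta(hours=25), [True, False], now))
```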
Order of cuts
The plots are shown as stacked histograms where the bottom entries are applied first.
The legend can be read in order from bottom right to top left, representing the order of the stacked histograms.
Entries on top have passed all cuts from the previous entries on the bottom.
Preliminary subruns are placed first and so they appear on the bottom.
For example, in the NearDet, first we apply the Timestamp cut, followed by Run Duration, then Timing Peak, etc.
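
One way to picture the ordering: each subrun is counted in the histogram of the first cut it fails, and only subruns that pass everything reach the top. The sketch below illustrates that reading; the function name and the "Good" label are assumptions, and only the first three NearDet cut names come from the text above.

```python
def first_failed_cut(cut_results):
    """Return the name of the first failed cut, or "Good" if all pass.

    cut_results: list of (cut_name, passed) pairs in the order the cuts
                 are applied (bottom of the stack first).
    """
    for name, passed in cut_results:
        if not passed:
            return name
    return "Good"


subrun = [("Timestamp", True), ("Run Duration", False), ("Timing Peak", False)]
print(first_failed_cut(subrun))
```

In this example the subrun would be counted under Run Duration even though it also fails Timing Peak, because Run Duration is applied first.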
Detector Configuration
Gaps: Check other nearline plots for gaps. Was there a run going at the time? Are the nearline Plots Updating?
Red Diblocks: This means at least one DCM was bad in the diblock. Check if the number of active FEBs has dropped at that time. Otherwise this can be a problem with hot/cold channels or maybe missing DCMs.
Note that these plots take longer than other nearline plots to update!
Runs that have not finished will show as preliminary in lighter colors.
Preliminary means that the latest OnMon file in a run is younger than 1h15min. If no nearline-Ana file is found for a subrun, the corresponding run is kept as preliminary until all OnMon files are at least 24 hours old.
[[http://nova-docdb.fnal.gov:8080/cgi-bin/RetrieveFile?docid=11056&filename=GoodRuns_20140404.pdf&version=1]]
OnMon FEB Hit Rate Spectrum vs. Time
First note the time stamp on the bottom left corner of the plot. For the 24 hour plots, this should be within the last 10 minutes (within the last hour for the week plots and within the last day for the month plots). If this is not the case, then this indicates that the web-plot making scripts aren't running (e-mail ).
Next note on the 24 hour plots that the data goes almost to the right edge of the plot. There might be 20-30 minutes worth of white space between the most recent data and the right edge of the plot (which is fine since it takes about 20-30 minutes for new data to make it into the plots). If there is lots of white space on the right side of the plot and the detector has been on for that partition for a while, then this is an indication that the nearline processing has stopped (e-mail ).
The plot shown on the left shows typical behavior for partition 1. Strange features (lots of FEBs going noisy or quiet, or some kind of oscillating behavior) are things that should be reported/investigated. For example, in the week plot shown on the right, the burst of activity around 02/12 is what is expected when the DCMs are recovering from having had their high voltages turned off (recovering from a water leak).
Timing Peak
.
Timing Peak vs Time
.

OnMon FEB Hit Rate Spectrum vs. Time

  • First note the time stamp on the bottom left corner of the plot. For the 24 hour plots, this should be within the last 10 minutes (within the last hour for the week plots and within the last day for the month plots). If this is not the case, then this indicates that the web-plot making scripts aren't running (e-mail && .)
  • Next note on the 24 hour plots that the data goes almost to the right edge of the plot. There might be 20-30 minutes worth of white space between the most recent data and the right edge of the plot (which is fine since it takes about 20-30 minutes for new data to make it into the plots). If there is lots of white space on the right side of the plot and the detector has been on for that partition for a while, then this is an indication that the nearline processing has stopped (e-mail .)
  • The plot shown above shows typical behavior for partition 1. Strange features (lots of FEBs going noisy or quiet or some kind of oscillating behavior) are things that should be reported/investigated. For example, in the week plot shown below, the burst of activity around 02/12 is what is expected when the DCMs are recovering from having had their high voltages turned off (recovering from a water leak.)
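
The two freshness checks in the bullets above can be written down as a small sketch. The limits (10 minutes, 1 hour, 1 day, and roughly 30 minutes of white space) are the ones quoted on this page; the function names, the dictionary keys, and the idea of passing the times in by hand are assumptions for illustration.

```python
from datetime import datetime, timedelta

# Maximum acceptable age of the time stamp in the bottom left corner.
MAX_PLOT_AGE = {
    "24 hour": timedelta(minutes=10),
    "week": timedelta(hours=1),
    "month": timedelta(days=1),
}


def plot_is_stale(plot_kind, plot_timestamp, now):
    """True if the plot was made too long ago (plot scripts may be down)."""
    return now - plot_timestamp > MAX_PLOT_AGE[plot_kind]


def processing_looks_stalled(last_data_time, now, allowed_gap=timedelta(minutes=30)):
    """True if the white space after the latest data point exceeds the
    usual 20-30 minute processing delay. Only meaningful if the detector
    has been running for that partition.
    """
    return now - last_data_time > allowed_gap


now = datetime(2015, 10, 1, 12, 0)
print(plot_is_stale("24 hour", now - timedelta(minutes=20), now))
print(processing_looks_stalled(now - timedelta(minutes=25), now))
```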

Average Number of Hits per Trigger

  • Look for large variations here. For example, is the plot holding steady or slowly increasing?
  • In the above example, there is a large jump around 11:00 am which corresponds to starting a new run with one more diblock than before (so this is fine and expected.) BUT it looks like there is a small sinusoidal variation in the number of hits over the next 4 hours. Is part of the detector not properly being cooled?

Number of Active DCMs per Subrun

  • This is a simple plot and should reflect the number of DCMs in the current run. A decrease in this plot (such as the one that occurs at 10:00) would imply that DCMs have dropped out of the run (unless of course a run was started without certain DCMs on purpose.)

Number of Active FEBs per Subrun

  • The above plot shows typical behavior over a 24 hour period of time.
  • Typically, when a shifter starts a new run, several FEBs drop out soon afterwards (because they are too noisy), but then things should settle into a steady number.
  • Note that in the above plot you see the number of FEBs periodically decrease by one but that a large unexpected drop occurs at 10:00.
  • Do not be alarmed by what appear to be major variations in this plot. Pay attention to the Y-axis scale. Big jumps may only be changes of a few FEBs depending on how the plot is zoomed.

Number of Active Channels per Subrun

  • The above plot shows typical behavior over a 24 hour period of time.
  • Typically, when a shifter starts a new run, several FEBs drop out soon afterwards (because they are too noisy), which causes the number of channels to drop by multiples of 32. Things should settle into a steady number after a few hours. The values in this plot should be about 32 times those in the FEB plot shown above.
  • Do not be alarmed by what appear to be major variations in this plot. Pay attention to the Y-axis scale. Big jumps may only be changes of a few FEBs (+/- 32 pixels = one FEB turning on or off).
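
The factor of 32 between the two plots makes it easy to translate a jump in channels into a number of FEBs. A minimal sketch, with hypothetical function names:

```python
CHANNELS_PER_FEB = 32  # one FEB reads out 32 pixels, as noted above


def expected_channels(n_febs):
    """Approximate number of active channels for a given number of FEBs."""
    return n_febs * CHANNELS_PER_FEB


def febs_changed(channel_delta):
    """Convert a jump in the channel plot into a number of FEBs.

    Negative values mean FEBs dropped out. Raises ValueError if the jump
    is not a whole number of FEBs, which would point to individual
    channels rather than whole FEBs.
    """
    if channel_delta % CHANNELS_PER_FEB != 0:
        raise ValueError("change is not a multiple of 32 channels")
    return channel_delta // CHANNELS_PER_FEB


print(expected_channels(10))
print(febs_changed(-64))
```

So a drop of 64 in the channel plot, which can look large on a zoomed axis, corresponds to only two FEBs.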

Percent Empty Events

  • This plot is pretty straightforward. This should be zero or mostly zero as shown above.
  • Occasionally a bad subrun will sneak in with more empty events than normal. This usually happens when a run ends badly.
  • Shown below is a period of time when the number of empty events was consistently bad for an extended time.

NuMI Trigger Rate

  • The above plot shows typical behavior. When the beam is on, this should be pretty steady around 0.55 Hz.
  • The two parallel band structure seen around 6:00 is normal. It arises because each ~1 minute long subrun contains an integer number of triggers, so the measured rate can only take discrete values.
  • The reduced rate seen around 21:00 can occur if the beam is down or running at a reduced rate. If the beam was running normally during this time, then that is potentially a problem.
  • In the plot shown below, note that there is a doubling of the trigger rate on 02/28. This was due to the fact that there were two trigger processes running at the same time (which was unintentional and needed to be fixed.)
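
The band structure mentioned above can be reproduced with simple arithmetic. The sketch below assumes a subrun of exactly 60 seconds, which is an idealization of the "~1 minute" quoted on this page; the function name is hypothetical.

```python
def trigger_rate_hz(n_triggers, subrun_seconds=60.0):
    """Rate measured in one subrun that contains n_triggers triggers."""
    return n_triggers / subrun_seconds


# Neighboring integer trigger counts around the nominal 0.55 Hz.
for n in (32, 33, 34):
    print(n, round(trigger_rate_hz(n), 3))
```

With these assumptions, 33 triggers in a subrun gives 0.55 Hz, and one trigger more or less moves the point by 1/60 Hz (about 0.017 Hz). Subruns that alternate between two neighboring counts therefore show up as two parallel bands.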

Number of Slices per Trigger

This should be very steady, but the above plot shows many features.
  • First note the giant spike around 6:00pm on 02/12. This corresponds to the time when the FEB hit rates were very high after recovering from a water leak (see the FEB rate spectra plots above.) So, if it is known that the FEB rates are very high and if THAT is expected, then you should expect a spike like this in the number of slices per trigger.
  • Next note the decrease around 2:00am on 02/15. This was NOT expected behavior and in this case was a direct result of the DCMs being out of sync (which causes the reconstruction to fail).
  • Lastly, note the increase around 11:00am on 02/17. This corresponded to starting a run with 3 full diblocks instead of 2 so we expect the number of slices to increase but then hold steady.

PE Distributions

  • The above plot shows the PE distribution for all hits in the noise slice (black), all hits in non-noise slices (red), and all hits on 3D tracks (blue). This is a normal version of this plot.
  • Abnormal variations of this plot would include strange bumps in the middle, dramatic shifts up, down, left, or right in any of the spectra, or major differences in the behavior of either end of any of the spectra.

Other Plots Not on the Checklist

Average Plane/Cell Hit per Slice

This plot shows the average plane and the average cell of all the hits within each non-noise slice. Above is "normal" behavior and was taken from a run that included diblocks 1-3. You will note the vertical stripes down the middle in both views, the horizontal stripes in the middle in the Y view, and hot spots in the corners from "corner clippers." All of this is normal.

  • Below is an abnormal plot. Note the offset vertical stripe in the Y view. This was an indication that there were lots of slices isolated in diblocks 2 and 3. This was caused by a timing offset between diblock 1 and diblocks 2 and 3. The appropriate course of action in this case was to contact Evan Niner and have him do his TDU magic... (There is also a hot FEB in the X view.)

  • Below is another abnormal plot. Note that there are "warmer clumps" in the center of the DCMs in the Y view. This implies that we are making slices out of the hits in individual DCMs instead of slices that crossed the boundaries between multiple DCMs. In this case, the cause turned out to be having started this run with the wrong configuration. The solution was to stop the run and start a new one with the right configuration.