OM - Troubleshooting

The Big Red Box

This indicates a problem with the data! Contact the readout, DAQ, or other system experts as appropriate.

Plots not updating / Age of plots is more than 10 minutes / New run is not showing data

In the run control system, there are two text boxes near the bottom of the screen, showing the Online Monitor and the Dispatcher. Both of these windows should be there and should be scrolling.

- If one has disappeared, then one of the executables crashed. The simplest procedure to follow is to start a new run. This problem seems to happen randomly every few days for unknown reasons.

- If both windows are there and scrolling, see the steps below for a stuck web page.

- If the problem is not solved by the above measures or persists after a restart, call the Online Monitor shift (usually Nathaniel).

Problems seeing the web page (the Lizard)

Follow these steps if the OM GUI is not responding (no data getting loaded, loading icon spinning indefinitely, current run not on the screen, etc.):

(Note that these steps do not resolve the Big Red Box - that indicates a problem with the data, not a problem with the OM.)

1. Reload the web page. (If you are off-site, make sure your SSH tunnel or VPN connection is still good.)

2. Restart the browser. (Close the window, then open "Opera" from the desktop. Note that on Linux, Firefox is not a very fast browser and can be frustrating to use.)

3. Check that the OM is actually integrating the run. Go to this link:
[[http://ubdaq-prod-near2.fnal.gov/Lizard/server/file_browser.cgi?path=/datanear1/om]]

You should see something like:

File    Date  ▾    Size
run_00003963_000.om.root    Dec 9, 2015 10:04    4.55 MB
current.root    Dec 9, 2015 10:04    4.55 MB
current_copy.07.root    Dec 9, 2015 10:04    4.55 MB
current_copy.06.root    Dec 9, 2015 10:03    4.55 MB
current_copy.05.root    Dec 9, 2015 10:03    4.55 MB
current_copy.04.root    Dec 9, 2015 10:03    4.55 MB
current_copy.03.root    Dec 9, 2015 10:03    4.55 MB
current_copy.02.root    Dec 9, 2015 10:02    4.55 MB
current_copy.01.root    Dec 9, 2015 10:02    4.55 MB
current_copy.00.root    Dec 9, 2015 10:02    4.55 MB
current_copy.09.root    Dec 9, 2015 10:01    4.55 MB
current_copy.08.root    Dec 9, 2015 10:01    4.55 MB
run_00003961_000.om.root    Dec 9, 2015 08:30    3.80 MB
run_00003960_000.om.root    Dec 9, 2015 08:26    4.73 MB
run_00003959_000.om.root    Dec 9, 2015 06:12    790.62 MB

"current.root" should exist and be recent, as should "run_xxxxxxxx_000.om.root", where xxxxxxxx is the current run number padded with zeros to 8 digits. There should also be one or more "current_copy.NN.root" files.

If these files don't exist, look at the Console DAQ to see if the OM window is still open. If it's not, the back-end has crashed. Restarting the run should bring it back, but this is serious enough to call an expert. Please make an elog entry and forward it to both the OM and DAQ experts.
You can still label this run as "Good" in RunCat.
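
The file check in step 3 can also be scripted. The following is a minimal sketch, not an official OM tool. It assumes that the OM output directory named in the link above (/datanear1/om) is readable on the machine where you run it, and it borrows the 10-minute plot age from the symptoms above as the freshness limit. The names OM_DIR, MAX_AGE_MIN, and om_files_fresh are hypothetical.

```shell
# Sketch only: OM_DIR being locally readable is an assumption.
OM_DIR="${OM_DIR:-/datanear1/om}"
# Freshness limit in minutes (assumed from the "10 minutes" symptom).
MAX_AGE_MIN="${MAX_AGE_MIN:-10}"

# Usage: om_files_fresh RUN_NUMBER   (run number without leading zeros)
om_files_fresh() {
    run_number="$1"
    # Run files in the listing are named run_<8-digit run number>_NNN.om.root.
    run_pattern=$(printf 'run_%08d_*.om.root' "$run_number")
    rc=0
    for name in current.root "$run_pattern"; do
        # find prints the file only if it exists and was modified recently.
        if [ -z "$(find "$OM_DIR" -maxdepth 1 -name "$name" -mmin "-$MAX_AGE_MIN" 2>/dev/null)" ]; then
            echo "Missing or older than $MAX_AGE_MIN min: $name"
            rc=1
        fi
    done
    # At least one rolling copy (current_copy.NN.root) should also exist.
    if [ -z "$(find "$OM_DIR" -maxdepth 1 -name 'current_copy.*.root' 2>/dev/null)" ]; then
        echo "Missing: current_copy.NN.root"
        rc=1
    fi
    return "$rc"
}
```

For example, "om_files_fresh 3963" prints nothing and returns 0 when all three kinds of file are present and recent. Otherwise it lists what is missing or stale and returns 1.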

4. If the files exist, but you can't load them on the web page, try restarting the ROOT-file-server as follows:

- Log on to a DAQ machine.

  ssh ubdaq-prod-ws01.fnal.gov

- Use this machine to call up the 'reload' URL:
  curl http://ubdaq-prod-near1.fnal.gov/Lizard/server/kill_histserver.cgi

(Note the link is 'near1' NOT 'near2'.)

NOTE: Please do this only as a last resort, and don't spam this link. Restarting the back-end can cause problems if too many people are trying to access the server at the same time.
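
One way to avoid hitting the restart link repeatedly is to wrap the curl call in a small guard. This is only a sketch under stated assumptions: the URL is the one from step 4, but the stamp-file location, the 5-minute cooldown, and the names STAMP_FILE, COOLDOWN_MIN, FETCH, and restart_histserver_once are hypothetical and not part of the OM. The guard is also per-user, so it does not stop other people from calling the link.

```shell
# URL taken from step 4 (note: near1, not near2).
RESTART_URL="http://ubdaq-prod-near1.fnal.gov/Lizard/server/kill_histserver.cgi"
# Assumed values: where to remember the last restart, and how long to wait.
STAMP_FILE="${STAMP_FILE:-${TMPDIR:-/tmp}/om_histserver_restart.stamp}"
COOLDOWN_MIN="${COOLDOWN_MIN:-5}"
# Command used to fetch the URL; left unquoted below so its options split.
FETCH="${FETCH:-curl --silent --show-error --max-time 30}"

restart_histserver_once() {
    # Refuse if a restart was already requested within the cooldown window.
    if [ -n "$(find "$STAMP_FILE" -mmin "-$COOLDOWN_MIN" 2>/dev/null)" ]; then
        echo "Restart already requested in the last $COOLDOWN_MIN min; not repeating." >&2
        return 1
    fi
    # Request the restart; record the time only if the request succeeded.
    $FETCH "$RESTART_URL" && touch "$STAMP_FILE"
}
```

Run "restart_histserver_once" on the DAQ machine in place of the bare curl command. A second call within the cooldown window returns 1 without contacting the server.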

5. If this fails, contact Nathaniel Tagg on the expert call list.

(Updated for clarity - Dec 8 2015 NJT)
(Updated to include first 2 sections - Jan 2017 NJT)