Version 18/29
Erez Cohen, 02/20/2017 08:04 PM


SLC - Troubleshooting

This page lists the issues that can occur and how they should be addressed.
It also lists which problems need help from an expert.

Problems with starting up the GUI

One of the following might happen when you follow the instructions for How to connect to the shared control screen:

  • Restarting everything after a reboot or power outage: see SMC_startup.
  • You cannot ssh to ubdaq-prod-ws01 at all. This is a network problem, or ws01 is down. Contact an expert.
  • ssh works, but vncviewer prints an error message indicating no response from the vncserver.
    • Make sure there was no error message from ssh indicating failure to tunnel port 5902. Sometimes this message scrolls off the screen. If tunneling fails, there may be a defunct ssh process that needs to be killed. You can find such processes using ps ax | grep 'ssh.*-L *5902:'.
    • It is possible the vncserver is not running on ubdaq-prod-ws01. You can simply try restarting the vncserver by executing ~/startVNC.sh as the ubooneshift user on ws01. If this fails, check for a dead vncserver process using the ps and grep commands on ubdaq-prod-ws01, like this: ps auxw|grep 'Xvnc.*:2'.
    • If running the startVNC.sh command says something to the effect of "Warning: ubdaq-prod-ws01.fnal.gov:2 is taken because of /tmp/.X11-unix/X2
      Remove this file if there is no X server ubdaq-prod-ws01.fnal.gov:2"
      and there is no Xvnc process (as established in the previous bullet), then do as the warning suggests: remove the temporary file and try to start the VNC server again.
    • See also SMC_startup.
  • vncviewer starts, but there is no gui. Check the following:
    • Make sure it is not simply minimized or on a different sub-screen. Check the bar at the bottom of the desktop inside the vnc window.
    • If it is really not running, try starting the gui using . ~/setup_SMC_EPICS.sh; run_css from a terminal window inside the vncviewer, or equivalently ~/startCSSGUI.sh.
  • You get an error starting the gui. For example, if you get "Workspace /home/ubooneshift/.ControlSystemStudio/krb5--as-ubooneshift-on-ubdaq-prod-ws01/CSS is in use. Select a different workspace." then you should kill the vncserver and start over. Log on to ws01, issue vncserver -kill :2 and then . startVNC.sh. Reconnect to the vncserver and restart the gui.
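
The process checks in the list above can be wrapped in small shell helpers. What follows is a minimal sketch, not an official tool: the function names stale_tunnel_pids, xvnc_display2_pids and lock_file_action are hypothetical, and only the grep patterns come from this page. Each helper reads ps output on stdin, so it can also be tried on saved output.

```shell
# Hypothetical helper: print the PIDs of ssh processes that tunnel port 5902.
# Expects `ps ax` output on stdin, where the PID is the first column.
stale_tunnel_pids() {
    grep -E 'ssh.*-L *5902:' | grep -v grep | awk '{print $1}'
}

# Hypothetical helper: print the PIDs of Xvnc servers on display :2.
# Expects `ps auxw` output on stdin, where the PID is the second column.
xvnc_display2_pids() {
    grep -E 'Xvnc.*:2' | grep -v grep | awk '{print $2}'
}

# Hypothetical helper: decide what to do about the "display :2 is taken" warning.
# Expects `ps auxw` output on stdin. Prints "remove-lock" only when no
# Xvnc process for display :2 is found, otherwise "keep-lock".
lock_file_action() {
    if [ -n "$(xvnc_display2_pids)" ]; then
        echo "keep-lock"
    else
        echo "remove-lock"
    fi
}

# Typical use:
#   ps ax   | stale_tunnel_pids     # on your own machine
#   ps auxw | xvnc_display2_pids    # on ubdaq-prod-ws01
#   ps auxw | lock_file_action      # on ubdaq-prod-ws01
```

The helpers only report; killing a stale ssh process or removing /tmp/.X11-unix/X2 is left as a deliberate manual step.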

Problems with GUI layout: missing something

  • The standard layout is the Alarm "perspective". Select Window -> Perspective -> Alarm or Window -> Perspective -> Other -> Alarm.
  • If that doesn't fix the problem, look for a button/icon labeled "Alarm" in the toolbar at top, usually on the right. It should be selected.
  • If Alarm perspective is selected and the layout is incorrect, right-click on the Alarm perspective button in the toolbar and choose "Reset".
  • If even that doesn't work, it's possible a bad layout has been saved over the Alarm perspective. Try to rearrange the windows as you need them to be, and call an expert if you can't get it fixed.

Problems with disconnected channels / missing channel servers

If channels show up as solid pink boxes with the word "disconnected" in them, then the EPICS channel server that should serve these channels is not running. Since these servers are supposed to start automatically, this is generally an expert problem. However, some simple causes might be:

  • Network problem preventing contact to the channel server. E.g., lost contact to a "slowmoncon box" (for rack status, Glassman HV, and impedance monitor), or to the PMT HV controller (for the PMT HV). If it seems the device is really running correctly (heartbeat or activity LEDs flashing), then this is probably a problem for a network support expert. It wouldn't hurt to call a slowmoncon expert first.
  • A power failure to one of these devices. If the power is really out, then this is a problem for an electrical support expert.
  • Power is on and network connected, but device is not providing EPICS data.
    • You can always contact a slow control expert.
    • Glomation "slowmoncon box" rack monitor: If you are on-site and have ODH training and a buddy, you can try power cycling using the front panel switch. Or, you can try logging in to it via ssh as the uboonedaq user. E.g., for Glomation #4, ssh uboonedaq@192.168.144.204. Contact a slow control expert for the password. Once logged in, you can try restarting the IOC using the start_ioc.sh script in the home directory.
    • PMT HV controller: If PMT HV is on, contact a PMT expert and a slow control expert. If PMT HV is off and you are on-site and have ODH training and a buddy, it should be safe to try power cycling using the front panel switch. Note that ubdaq-prod-smc must be online in order for the controller to boot. Alternatively, you can try logging in to the controller through its serial port via the Glomation in the Trigger/PMT rack (currently Glomation #16). First ssh uboonedaq@192.168.144.216. (Contact a slow control expert for the password.) From there, run sudo bash and then screen -r to open the serial port connection screen. (If that fails with a "no session to restore" message, try screen /dev/ttyS1 15200,cs8.) If you see the "Boot" prompt, use "@" to reboot. This may be needed after a power outage because power can be restored to the PMT HV controller before ubdaq-prod-smc is online. If it is instead at the "->" prompt, it should already be working, so call a slow control expert.

If none of the above applies, contact a slow monitor/control (slowmoncon) expert.
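
As a rough aid for the network-versus-device question above, the sketch below checks whether a Glomation box answers a ping before you try ssh. It is an illustration under stated assumptions: the function names glomation_ip and box_reachable are hypothetical, only the two addresses given on this page are listed, and the ping options assume Linux iputils ping (where -W is the reply timeout in seconds).

```shell
# Hypothetical helper: map a Glomation box number to its address.
# Only the two boxes named on this page are listed; other numbers are
# rejected rather than guessed.
glomation_ip() {
    case "$1" in
        4)  echo "192.168.144.204" ;;
        16) echo "192.168.144.216" ;;
        *)  echo "unknown Glomation box: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: report whether a box answers a single ping.
# Assumes Linux iputils ping: -c 1 sends one packet, -W 2 waits up to 2 s.
box_reachable() {
    ip=$(glomation_ip "$1") || return 1
    if ping -c 1 -W 2 "$ip" >/dev/null 2>&1; then
        echo "Glomation #$1 ($ip) answers ping; try ssh uboonedaq@$ip"
    else
        echo "Glomation #$1 ($ip) does not answer; suspect network or power"
        return 1
    fi
}

# Typical use, from a machine on the same private network:
#   box_reachable 4
```

A box that answers ping but serves no EPICS data matches the "power is on and network connected" case above; a box that does not answer points to the network or power cases.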

Problems with alarms

  • Alarm ranges: there is an expert procedure for changing them. Contact the appropriate subsystem expert to get approval for changes.
  • Alarms not functioning at all, or "server not found": The alarm system should start and restart itself as needed; currently, there is no non-expert procedure for diagnosing or restarting the alarm system. One might be created if this becomes a problem, but for now, please let a slowmoncon expert know.

Problems with archiving

Press F11 to enter or exit full-screen mode.
In this mode, the only visible window is the SMC, and the archiver is not visible.
So if the archiver is not visible, check whether you are in full-screen mode and, if so, exit it.