Rebooting the Mountain Top Computers

Only authorized CTIO personnel should reboot the mountain top computer cluster. They know the correct procedures and, in particular, the order in which the machines have to be restarted. If the reboot is not done correctly, you will not be able to restart SISPI.

Unfortunately, recovery after the computers restart is not quite as smooth as we would like. Here is a list of issues we have encountered. We do not yet understand what causes these particular failures. Eventually all of these items should get fixed.

  1. Some computers fail to restart. They hang with the console displaying "Restarting". If you encounter this situation, please ask telops to reset the machine(s).
  2. Some of the DECam network devices fail to connect after a reboot. For the FCM and the BCAM system it might be necessary to power-cycle the hardware controllers if the connection problem persists. The BCAM (lwdaq) driver box is in the Cass Cage. The FCM controller is in the back of the Barrel but can be power-cycled from the bottom of the HexaPod Rack.
  3. Some of the vncservers might have to be restarted manually: on readout2 (guider GUI, display 15) and on system1 (system control, display 2).
  4. QuickReduce needs to be restarted on quick1. Follow the instructions or contact Angelo.
  5. We had serious connectivity problems between SISPI and the TCS after the reboot. This could be unrelated to the reboot, but we didn't find any problem and couldn't reproduce the issue after the TCS was restarted. If you run into this, try restarting the TCS.
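
Several of the items above come down to "is this service reachable again after the reboot?". The following is a minimal sketch of a check for that, not part of SISPI. It tries to open a TCP connection to each target and reports the ones that do not answer. The function names are made up for illustration. The VNC ports assume the usual convention of 5900 plus the display number, and the host names are simply the machine names from the list, so adjust both to the real configuration.

```python
import socket


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name lookup failures.
        return False


def unreachable(targets, timeout=3.0):
    """Return the names of all targets that did not accept a connection."""
    return [
        name
        for name, (host, port) in targets.items()
        if not is_reachable(host, port, timeout)
    ]


if __name__ == "__main__":
    # Assumed targets: VNC displays 15 and 2 mapped to ports 5915 and 5902.
    targets = {
        "vncserver on readout2 (guider GUI, display 15)": ("readout2", 5915),
        "vncserver on system1 (system control, display 2)": ("system1", 5902),
    }
    for name in unreachable(targets):
        print("not reachable:", name)
```

A target that is reported as not reachable only tells you where to look. The recovery steps are still the manual ones listed above.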