Older, Resolved Issues

  1. Architect Clean-Up (10/31/2012)
    The architect has been observed to leave some processes behind when the instance is shut down. We are working on a fix. Until then, in the afternoon when you prepare the system for the night, log in to ics1 as sispi and check for old shutter processes. For example:
      ps aux | grep bin/Shutter
    If the instance is running there should be only one such process (none if the instance is down). If you find a shutter process when the instance is down, stop it with:
      kill -9 <pid>
    This is now addressed by the cleanup_processes script and the -k option of the architect.
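The manual check described above can be sketched in Python. This is only an illustration, not the cleanup_processes script itself: the function names are made up for this example, and it assumes the usual 11-column layout of `ps aux` output on Linux.

```python
import subprocess


def find_shutter_pids(ps_output, pattern="bin/Shutter"):
    """Return the PIDs of processes whose command line contains `pattern`.

    `ps_output` is the text printed by `ps aux`. The header row and any
    `grep` process that happens to match the pattern are ignored.
    """
    pids = []
    for line in ps_output.splitlines():
        # ps aux prints 11 columns; the last one is the full command line.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header row or malformed line
        command = fields[10]
        if pattern in command and not command.startswith("grep"):
            pids.append(int(fields[1]))
    return pids


def current_shutter_pids():
    """Run `ps aux` on this machine and return matching PIDs (hypothetical helper)."""
    output = subprocess.run(
        ["ps", "aux"], capture_output=True, text=True, check=True
    ).stdout
    return find_shutter_pids(output)
```

With the instance running, `current_shutter_pids()` should return one PID; with the instance down it should return an empty list. Anything left over when the instance is down is a candidate for `kill -9 <pid>`.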
  2. PML error: Server too busy (11/18/2012)
    This is the first time we have ever seen this error. The Hexapod was refusing PML connections with an error about "Server too busy". The error comes from Pyro. If it comes up again, we can increase Pyro's maximum number of allowed connections by setting PYRO_MAXCONNECTIONS, in the same manner we set other Pyro settings (Pyro reads settings from a file or from environment variables). After joining an instance, you can check Pyro's current configuration with "python -m Pyro.configuration". Right now, the maximum is set to 200 connections.
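As a rough sketch of the environment-variable route, the helper below builds an environment with PYRO_MAXCONNECTIONS set, which could then be passed to a child process. The helper name and the value 400 are illustrative assumptions. How SISPI actually sets its Pyro settings is not shown here.

```python
import os


def env_with_pyro_maxconnections(limit, base_env=None):
    """Return a copy of the environment with PYRO_MAXCONNECTIONS set to `limit`.

    `base_env` defaults to the current process environment; the original
    mapping is never modified.
    """
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    env = dict(os.environ if base_env is None else base_env)
    env["PYRO_MAXCONNECTIONS"] = str(limit)
    return env


# Example (not executed here): print the configuration a child process would see.
#   import subprocess
#   subprocess.run(["python", "-m", "Pyro.configuration"],
#                  env=env_with_pyro_maxconnections(400))
```

Because the setting is read when Pyro is loaded, it has to be in place before the affected component starts.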
  3. Shutter stuck open in GUIs (11/11/2012)
    We have identified a problem with the shutter code that prevents the shutter displays in the GUIs from closing. We verified, using dome flats and comparing counts, that the shutter is closed and that this is just a software/GUI issue. Until this is fixed, please use the following recipe to correct the display. If it seems weird to you, well, it is, but it works: on the Architect Console select the Shutter and enter the command configure.
    Submit and repeat this 5 times (or more) until you see the shutter image in the GUIs closing. No, you don't have to stand on one foot when doing this. After that you will have to RESET on the observer console.
  Failed Exposures are Lost from Queue (10/02/2012)
    This behaviour is as expected, but can be frustrating when you're running a script. If one exposure fails, as with the intermittent TCSInterface error, that exposure will not be retried and you'll be missing exposures from the intended set.
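Since failed exposures are not retried, one way to see what is missing after a script has run is to compare the intended set with what actually completed. The sketch below assumes each exposure can be identified by a plain string label; that labelling scheme and the function name are hypothetical, not part of SISPI.

```python
def missing_exposures(intended, completed):
    """Return the intended exposures that never completed, in their original order."""
    done = set(completed)
    return [exposure for exposure in intended if exposure not in done]
```

For example, `missing_exposures(["g-1", "r-1", "i-1"], ["g-1", "i-1"])` gives `["r-1"]`, which is the exposure to queue again by hand.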
  4. Runaway Hexapod (10/09/2012)
    In the course of testing, something may happen where the hexapod gets some ridiculous number and starts moving way too far. To fix this, go to the Console app in the vnc and type "HEXAPOD stop", or go to System Control in the Architect Console GUI and select component ICS, device HEXAPOD, command stop. You will get back a message that this failed because the hexapod is busy moving, but it should stop anyway. Then be sure to turn off the buggy component, reconfigure, and give the hexapod a reasonable value to go back to (for instance, by setting the focus to a known reasonable value for the next exposure). The OCS might time out as the hexapod makes a long move; try again once it has arrived.
  5. Don't Forget to Reset Things after a Configure (10/04/2012)
    If the instance needs a reset, the Hexapod will go back to the default settings from the ini file. If you had changed anything there (like which LUTs are being consulted), be sure to reset them after a configure.
  6. Observer1 startup/cleanup scripts (10/8/2012)
    The DECamObserver account has a few scripts in ~/bin to make managing all the SISPI windows easier. The start_sispi_windows script starts a bunch of browser windows, vnc for the GuiderGUI, and Skype. The organize_sispi_windows script spreads those windows out neatly across all 8 monitors. The observer_setup script runs the first script, sleeps a bit to let all the windows get their title bars, then runs the second script. Note that observer_setup follows the other scripts with "&"; there were issues with the first script not releasing the terminal and the second script never running. There is also an observer_cleanup script that kills all chrome, skype, and vncviewer processes. The observer_setup script has a shortcut on the desktop (which works now, unlike before). The observer_setup and observer_cleanup scripts are also available as drop-down icons from the menu bar; this is particularly useful for running the cleanup script when the desktop and all your xterms are buried under a pile of other windows.
  7. GUI Timeout and Freezes (9/28/2012)
    We have a known problem (but no solution yet) with the GUIs. Eventually they run out of resources on the observer1 machine (most often memory) and they crash or become sluggish. The most dangerous situation is a stale display: the GUIs look fine but they are not updating. It is recommended to refresh each GUI occasionally. Note that it is safe to restart all GUIs without interrupting the SISPI instance.
    Kevin points out that Chrome gets really slow when using 1.4G of RAM. By using top and sorting by memory use (press "F" to select the sorting column, and "n" to select memory), you can find the Chrome windows that are slowing everything down and kill them. The remaining Chrome windows become much more responsive after this cleanup.
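The same hunt for memory-hungry Chrome processes can be sketched without top. The example below parses the output of `ps -eo pid,rss,comm` (resident memory in kilobytes on Linux) and lists Chrome processes from largest to smallest. The function name is illustrative, and matching on the substring "chrome" is an assumption about how the processes are named on observer1.

```python
def chrome_by_memory(ps_output, name="chrome"):
    """Return (pid, rss_kb) pairs for processes matching `name`, largest first.

    `ps_output` is the text printed by `ps -eo pid,rss,comm`.
    """
    rows = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        # Skip the header row and anything that is not "pid rss command".
        if len(fields) < 3 or not fields[0].isdigit() or not fields[1].isdigit():
            continue
        pid, rss_kb, command = int(fields[0]), int(fields[1]), fields[2]
        if name in command.lower():
            rows.append((pid, rss_kb))
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

The first entries in the returned list are the processes to look at first. Killing them is safe for the SISPI instance, since the GUIs can be restarted without interrupting it.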
  8. Starfinder Timeouts (9/28/2012)
    With the default catalog (nomad_catalog pipeline6), SISPI (GCS and Donut) times out in prepareGCS and prepareDonut and breaks an interlock.
    Solution: exclude Guider and Donut, either on the observer console or by setting the appropriate configuration variable (lookup_guidestar, for example). Using one of Kevin's reduced catalogs also works.
  9. GCS does not stop Guider (9/28/2012)
    The GCS/Guider complex is the least tested part of SISPI. We have noticed that once in a while GCS fails to stop the Guider at the end of the exposure. The "sync_with_shutter" feature is designed to take care of this. A patch has been applied to the OCS to force the "stop_guiding" call. The effectiveness of this fix needs to be monitored.
  10. TCSInterface (9/28/2012)
    Less frequently than in the past, we still observe that the TCSInterface breaks an interlock when it loses the connection to the TCS. In most cases we could trace this to issues or activities on the TCS side, but the effect is the same: you need to reset. Check the interlock viewer; if the TCSINTERFACE is back in the READY state, a simple RESET on the observer console GUI is sufficient. If not, you need to configure.
  11. TCS Slewing Issues (10/02/2012)
    During a long slew, the OCS timed out even though there were no errors from the TCS. It just had a long way to go. Also, there is a chance that some of the interlock issues are due to the dome moving slower than the telescope can slew; do we check for that? The OCS now waits for 10 minutes (i.e. basically forever); you can abort this by pressing "Abort TCS Command" in the TCS GUI in the ICS display. The TCS now waits for the dome to be in position before telling SISPI that it is ready.
  12. FCM Crashes (10/06/2012)
    When you get an error message that the FCM has failed (the interlock breaks as well), you need to restart this application from the architect console GUI.
    Select FCM (it runs on ics1) in the left view. Then click the restart button. It is best to double check that you really have FCM selected; you don't want to restart other components.
    Watch the log messages. When the FCM is back, reconfigure SISPI.
  13. Microphone Input for Skype (10/06/2012)
    There is a microphone attached to observer1 (I think it's part of the webcam, which connects via USB) to use for making Skype calls. If this is not working, you may have the wrong audio input device selected. To check, go to the menu bar and right click on the volume icon. Choose "Sound Preferences". The 3rd tab in this window is "Input". Choose the device "081d Analog Mono" (not the internal audio analog stereo). If you make some noise, you should then see the input level bars dance around. If not, check the "Input Volume" slider.