Status 2012-05-02

Current state

  • Basic Architecture in place
  • Some usage
  • Threshold alarming

Batch monitoring

  • Joe has prototype up on rexgpvm01
  • Login/permissions are the current issue.

Plans

Put Jeremy to work this summer:

  • Downtime db
    • downtime/change info
    • what it affects
    • provide graph annotations
  • JavaScript graph utility
    • standard wrapper
    • include downtime info
    • list items/colors, etc.
  • Threshold info in browser page
  • Updates to Joe's batch pages?
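
As a starting point for the downtime db item, here is a minimal sketch of what the tables and the graph-annotation lookup might look like. Everything in it is an assumption for discussion, not an agreed design: the table and column names, the use of SQLite, epoch-second timestamps, and the helper names open_db, add_downtime, and annotations_for are all hypothetical.

```python
import sqlite3

# Hypothetical schema: one row per downtime or change,
# plus one row per item (service, host, ...) that it affects.
SCHEMA = """
CREATE TABLE downtime (
    id          INTEGER PRIMARY KEY,
    start_ts    INTEGER NOT NULL,   -- epoch seconds
    end_ts      INTEGER NOT NULL,   -- epoch seconds
    kind        TEXT NOT NULL,      -- 'downtime' or 'change'
    description TEXT NOT NULL
);
CREATE TABLE affects (
    downtime_id INTEGER NOT NULL REFERENCES downtime(id),
    item        TEXT NOT NULL       -- name of the affected service or host
);
"""


def open_db(path=":memory:"):
    """Open (and initialise) the downtime database."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_downtime(conn, start_ts, end_ts, kind, description, items):
    """Record one downtime/change and the items it affects."""
    cur = conn.execute(
        "INSERT INTO downtime (start_ts, end_ts, kind, description) "
        "VALUES (?, ?, ?, ?)",
        (start_ts, end_ts, kind, description),
    )
    conn.executemany(
        "INSERT INTO affects (downtime_id, item) VALUES (?, ?)",
        [(cur.lastrowid, item) for item in items],
    )
    conn.commit()
    return cur.lastrowid


def annotations_for(conn, item, window_start, window_end):
    """Return annotations for one item, clipped to the graph's time window."""
    rows = conn.execute(
        "SELECT d.start_ts, d.end_ts, d.kind, d.description "
        "FROM downtime d JOIN affects a ON a.downtime_id = d.id "
        "WHERE a.item = ? AND d.start_ts < ? AND d.end_ts > ? "
        "ORDER BY d.start_ts",
        (item, window_end, window_start),
    ).fetchall()
    return [
        {
            "from": max(start, window_start),
            "to": min(end, window_end),
            "kind": kind,
            "label": description,
        }
        for start, end, kind, description in rows
    ]
```

The list returned by annotations_for could be serialised to JSON and handed to the graph wrapper, which would then shade each from/to range and show the label. Keeping "what it affects" in its own table lets one downtime annotate several graphs.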

Questions

  • What do we need to monitor that we aren't?
    • Enstore storage (per experiment?)
    • Enstore usage (per experiment?)
    • SAM station dump info?
  • What displays do we need?
  • Job state monitoring?