MonitoringServiceThoughts

Proposal: Central monitoring repository

i.e., "what we wanted MCAS to be"

Components:

  • Central webservice:
    • drop-off service for rrd data additions, with defined hierarchy
      (web rrdcreate/rrdupdate service)
    • fetch data (via rrdxport
      http://www.mrtg.org/rrdtool/doc/rrdxport.en.html)
    • dropoff/fetch for mappings (jobs->users, etc.)
      • in JSON format(?) for javascript pages
  • snmp gateway
    • polls systems/routers, updates rrd files
    • net-snmp package in kits for linux systems, or via rpms
  • Web area for monitoring pages
    • javascript tools for
      • graphing data using the rrdxport service
    • form to generate custom mixed monitors using the rrdxport service
      • shows graph
      • shows javascript for putting in pages
  • Alarm service?
    • configure thresholds and actions for values in rrd files
    • configure scheduled downtimes to squelch alarms
    • script to poll each configured value and cut a helpdesk ticket
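
The alarm service bullets above could be sketched roughly as follows. This
is only an illustration under stated assumptions: the names Threshold,
in_downtime and evaluate are hypothetical, the threshold is treated as a
simple upper limit, and reading the latest value out of the rrd file is
left to the caller.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class Threshold:
    """One configured alarm (hypothetical structure, not from the proposal)."""
    rrd_path: str    # which rrd file the value comes from
    maximum: float   # assumed semantics: alarm when the value exceeds this
    action: str      # assumed action label, e.g. "helpdesk-ticket"


# A scheduled downtime is a (start, end) pair; end is exclusive.
Downtime = Tuple[datetime, datetime]


def in_downtime(now: datetime, downtimes: List[Downtime]) -> bool:
    """True if 'now' falls inside any scheduled downtime window."""
    return any(start <= now < end for start, end in downtimes)


def evaluate(threshold: Threshold, value: Optional[float],
             now: datetime, downtimes: List[Downtime]) -> Optional[str]:
    """Return the action to take, or None if nothing should happen."""
    if value is None:                 # no data point available
        return None
    if value <= threshold.maximum:    # within limits
        return None
    if in_downtime(now, downtimes):   # squelched by scheduled downtime
        return None
    return threshold.action
```

A polling script would call evaluate() once per configured threshold and
cut a helpdesk ticket whenever it returns an action.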

Storage Hierarchy

     rrd/
         net/
             r/router.rrd
         hosts/
             h/hostname.rrd
         jobs/
             cluster1/
                 byuser/
                     username/
                         jobid.rrd
                 bygroup/
                     groupname/
                         username -> ../../byuser/username
         systems/
             dcache/
                 stats.rrd
             sam/
                 station/
                     stats.rrd
     maps/
         hosts/
             h/host_dev_mount.json
         jobs/
             job_state.json
             job_user.json
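
A minimal sketch of how the drop-off service might map names onto the
hierarchy above. The helper names are hypothetical, and it assumes that the
single-letter directory (h/, r/) is the first character of the host or
router name, which the layout suggests but does not state. Path components
are checked so that a submitted name cannot escape the tree.

```python
from pathlib import PurePosixPath
from typing import Tuple


def _check(name: str) -> str:
    """Reject empty names and names that could escape the hierarchy."""
    if not name or "/" in name or name.startswith("."):
        raise ValueError(f"unsafe path component: {name!r}")
    return name


def host_rrd(hostname: str) -> PurePosixPath:
    """rrd/hosts/<first letter>/<hostname>.rrd (first-letter rule is assumed)."""
    _check(hostname)
    return PurePosixPath("rrd", "hosts", hostname[0], hostname + ".rrd")


def job_rrd(cluster: str, username: str, jobid: str) -> PurePosixPath:
    """rrd/jobs/<cluster>/byuser/<username>/<jobid>.rrd"""
    return PurePosixPath("rrd", "jobs", _check(cluster), "byuser",
                         _check(username), _check(jobid) + ".rrd")


def group_link(cluster: str, groupname: str,
               username: str) -> Tuple[PurePosixPath, PurePosixPath]:
    """Return (link path, relative link target) for the bygroup symlink."""
    link = PurePosixPath("rrd", "jobs", _check(cluster), "bygroup",
                         _check(groupname), _check(username))
    target = PurePosixPath("..", "..", "byuser", username)
    return link, target
```

For example, job_rrd("cluster1", "alice", "1234") gives
rrd/jobs/cluster1/byuser/alice/1234.rrd, and group_link("cluster1",
"physics", "alice") gives the link rrd/jobs/cluster1/bygroup/physics/alice
pointing at ../../byuser/alice.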