Project

General

Profile

Monitoring

DfMon

  • The monitor page shows disk space usage (in %) over the last 24 hours for critical offline MINOS machines. The page is updated every 10 minutes. On request, a warning email can be sent to a list of users every N hours once disk usage exceeds a threshold X.
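The "warn every N hours once usage exceeds X" rule described above can be sketched as follows. This is a minimal illustration, not the code in monDf.py or monUtils.py: the class name ThrottledWarning and its parameters are hypothetical, and the state is kept in memory only.

```python
import time


class ThrottledWarning:
    """Allow a warning at most once per interval while usage is above a threshold.

    Hypothetical helper: threshold_pct plays the role of "X" and
    interval_s the role of "N hours" (expressed in seconds).
    """

    def __init__(self, threshold_pct, interval_s):
        self.threshold_pct = threshold_pct
        self.interval_s = interval_s
        self.last_sent = None  # time of the last warning, None if never sent

    def should_send(self, usage_pct, now=None):
        """Return True if a warning should be sent for this measurement."""
        now = time.time() if now is None else now
        if usage_pct <= self.threshold_pct:
            return False  # usage has not exceeded the threshold
        if self.last_sent is not None and now - self.last_sent < self.interval_s:
            return False  # a warning was already sent within the interval
        self.last_sent = now
        return True


# Example: threshold 90%, at most one warning per 6 hours.
warn = ThrottledWarning(threshold_pct=90.0, interval_s=6 * 3600)
if warn.should_send(usage_pct=93.5):
    print("disk usage above threshold, send warning email")
```

Since the real job is started from cron every 10 minutes, each run is a new process, so a script working this way would presumably have to persist the time of the last warning (for example in a small file) between runs. That detail is an assumption and is not shown here.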

Web links:

http://nusoft.fnal.gov/minos/prodmon/
http://nusoft.fnal.gov/minos/prodmon/df.html

Cron job

#lena@minos27.fnal.gov
*/10 * * * * /usr/krb5/bin/kcron "ssh -l mindata minos27.fnal.gov bash  /scratch/minos/rrdtest/dfMon.cron >& /dev/null" >& /dev/null

Scripts

mindata@minos27.fnal.gov
/scratch/minos/rrdtest/*

#setup script that calls monDf.py:
/scratch/minos/rrdtest/dfMon.cron

#driving script: contains the hardcoded list of disks to monitor and the list of users who receive the warning email
/scratch/minos/rrdtest/monDf.py

#monitoring utilities including: 

   - df remotely by ssh
   - df of local disk
   - df afs
   - rrdtool plot by percentage
   - rrdtool plot by value
   - mail to a list of users
   - mail to a list of users once in N seconds on condition

/scratch/minos/rrdtest/monUtils.py
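As an illustration of the "df of local disk" and "df remotely by ssh" utilities listed above, here is a minimal sketch. It is not the contents of monUtils.py: the function names parse_df, df_local and df_remote are hypothetical, and the sketch assumes that `df -P` (POSIX output format) is available on the target machine and that ssh works without a password prompt (for example with a Kerberos ticket).

```python
import subprocess


def parse_df(text):
    """Map mount point -> used percentage from `df -P` output."""
    usage = {}
    for line in text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue  # not a data row
        # POSIX columns: Filesystem, blocks, Used, Available, Capacity, Mounted on
        mount = " ".join(fields[5:])
        usage[mount] = float(fields[4].rstrip("%"))
    return usage


def df_local(path):
    """Run df on a local path and return {mount point: percent used}."""
    result = subprocess.run(["df", "-P", path],
                            capture_output=True, text=True, check=True)
    return parse_df(result.stdout)


def df_remote(host, path, user=None):
    """Run df on a remote host over ssh and return {mount point: percent used}."""
    target = f"{user}@{host}" if user else host
    result = subprocess.run(["ssh", target, "df", "-P", path],
                            capture_output=True, text=True, check=True)
    return parse_df(result.stdout)
```

A call such as df_local("/scratch") would return a dictionary with one entry for the file system that holds that path. The percentage could then be stored with rrdtool and compared against the warning threshold.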

rrd output and logs: /scratch/minos/rrdtest/rrd

configuration: /scratch/minos/rrdtest/rrdDef.py

(output files location: /nusoft/app/web/htdoc/minos/prodmon/* )

Web area

Output files are written to /nusoft/app/web/htdoc/minos/prodmon/* and served at:
http://nusoft.fnal.gov/minos/prodmon/df.html
http://nusoft.fnal.gov/minos/prodmon/

Contacts

Databases updates - timing study

Plots

BFLDDBICOILSTATEVLD

BFLDDBICOILSTATEVLD, timestart less than 5 years before

BFLDDBICOILSTATEVLD, timestart last 5 years

BFLDDBICOILSTATEVLD, timestart last 5 years, close-up

BFLDDBICOILSTATEVLD, timestart last 5 years, overflow

SPILLTIMENDVLD

Scripts

Open questions: where to put it (as minsoft):
/minos/app/monitoring?
Should it go under the minosdb account?

lena@minos27.fnal.gov
/scratch/minos/lena/timing_study/

histogram.py - plots the INSERTDATE difference between FD and ND. It creates
TABLE.eps files from the FD and ND DB dumps.
Plot entries are for SEQNOs that are present in both DB dumps, so
they do not include values that are still in transition. LOW and HIGH
are adjusted to leave out abnormal entries.

DB queries:
select SEQNO,UNIX_TIMESTAMP(TIMESTART),UNIX_TIMESTAMP(TIMEEND),UNIX_TIMESTAMP(CREATIONDATE),UNIX_TIMESTAMP(INSERTDATE) from SPILLTIMENDVLD order by SEQNO;

select SEQNO,UNIX_TIMESTAMP(TIMESTART),UNIX_TIMESTAMP(TIMEEND),UNIX_TIMESTAMP(CREATIONDATE),UNIX_TIMESTAMP(INSERTDATE) from BFLDDBICOILSTATEVLD order by SEQNO;

The queries are run on minos-db.minos-soudan.org and minos-db1.fnal.gov.

The output is then copied to /scratch/minos/lena/timing_study/
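The comparison that histogram.py is described as doing (INSERTDATE difference for SEQNOs present in both dumps, with abnormal entries left out) can be sketched as below. This is not the actual histogram.py. The function names are hypothetical, and several points are assumptions: that each dump line holds the five whitespace-separated columns of the queries above, that the difference is taken as ND minus FD, and that LOW and HIGH act as cuts on that difference in seconds.

```python
def load_insertdates(lines):
    """Map SEQNO -> INSERTDATE (Unix seconds) from the lines of a DB dump.

    Assumed column order, as in the queries:
    SEQNO, TIMESTART, TIMEEND, CREATIONDATE, INSERTDATE.
    """
    dates = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 5:
            continue  # blank or malformed line
        try:
            dates[int(fields[0])] = int(fields[4])
        except ValueError:
            continue  # header row or NULL value
    return dates


def insertdate_differences(fd, nd, low, high):
    """Return ND - FD INSERTDATE differences for SEQNOs present in both dumps.

    Differences outside [low, high] are dropped as abnormal entries.
    """
    common = sorted(fd.keys() & nd.keys())
    diffs = [nd[seqno] - fd[seqno] for seqno in common]
    return [d for d in diffs if low <= d <= high]
```

With two dump files (the names fd_dump.txt and nd_dump.txt are placeholders), usage would look like `insertdate_differences(load_insertdates(open("fd_dump.txt")), load_insertdates(open("nd_dump.txt")), low, high)`, and the resulting list is what would be histogrammed. Taking only the common SEQNOs is what excludes rows that have reached one database but not yet the other.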

The *.eps files are copied to /nusoft/app/web/htdoc/minos/prodmon/