Task #9779

Enhance SAMMonitor sensor on cmssrv205 so that it reports as several individual checks

Added by Gerard Bernabeu Altayo almost 4 years ago. Updated over 3 years ago.

Status: Closed
Priority: Normal
Start date: 07/31/2015
Due date:
% Done: 0%
Estimated time: 15.00 h
Spent time:
Duration:

Description

Hi Nick,

Tony Tiradani developed the following sensor:

[root@cmssrv205 ~]# /usr/share/check-mk-agent/local/cms_site_monitoring
1 SAMMonitor - Flavor: OSG-CE, Hostname: cmsosgce3.fnal.gov, Metric: org.sam.CONDOR-JobSubmit (/cms/Role_lcgadmin), Status: UNKNOWN, Age: 0 # Flavor: OSG-CE, Hostname: cmsosgce3.fnal.gov, Metric: org.sam.CONDOR-JobSubmit (/cms/Role_production), Status: UNKNOWN, Age: 0
[root@cmssrv205 ~]#

The output is good but I'd like to be able to ack/ignore individual SAM checks.

The script should be modified so that it prints one line per SAM check, each in the proper check_mk local check format. I think this will let a single script report as many separate sensors, each of which can be dealt with individually :)
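A minimal sketch of what "one line per SAM check" could look like, assuming the standard check_mk local check line layout of state, service name (no spaces), performance data ("-" for none) and free text. The function names, the "SAM_" service naming scheme and the status-to-state mapping are all hypothetical; they are not taken from Tony's script.

```python
import re


def service_name(host, metric, role):
    """Build a service name without spaces (hypothetical naming scheme)."""
    short_host = host.split(".")[0]
    raw = "SAM_%s_%s_%s" % (short_host, metric, role.strip("/"))
    # Replace anything outside a conservative character set with "_".
    return re.sub(r"[^A-Za-z0-9_.-]", "_", raw)


def state_for(status):
    """Assumption: OK maps to 0; anything else is reported as 1 (WARN),
    which matches what the current single SAMMonitor line does for UNKNOWN."""
    return 0 if status == "OK" else 1


def local_check_line(flavor, host, metric, role, status, age):
    """Format one result as '<state> <service_name> <perfdata> <text>'."""
    detail = ("Flavor: %s, Hostname: %s, Metric: %s (%s), Status: %s, Age: %s"
              % (flavor, host, metric, role, status, age))
    # "-" means no performance data is attached to this service.
    return "%d %s - %s" % (state_for(status),
                           service_name(host, metric, role), detail)


if __name__ == "__main__":
    # The two results currently folded into the single SAMMonitor line.
    results = [
        ("OSG-CE", "cmsosgce3.fnal.gov", "org.sam.CONDOR-JobSubmit",
         "/cms/Role_lcgadmin", "UNKNOWN", 0),
        ("OSG-CE", "cmsosgce3.fnal.gov", "org.sam.CONDOR-JobSubmit",
         "/cms/Role_production", "UNKNOWN", 0),
    ]
    for result in results:
        print(local_check_line(*result))
```

With something along these lines, each metric would show up as its own service after a re-inventory of cmssrv205, so it could be acknowledged or ignored on its own.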

WARN cmssrv205 SAMMonitor WARN - Flavor: OSG-CE, Hostname: cmsosgce3.fnal.gov, Metric: org.sam.CONDOR-JobSubmit (/cms/Role_lcgadmin), Status: UNKNOWN, Age: 0 # Flavor: OSG-CE, Hostname: cmsosgce3.fnal.gov, Metric: org.sam.CONDOR-JobSubmit (/cms/Role_production), Status: UNKNOWN, Age: 0

Once this is done we'll be able to clear the check_mk entries :)

Gerard

History

#1 Updated by Nicholas Peregonow almost 4 years ago

Taking an initial look at this script now. The current script displays the following warning:

1 SAMMonitor - Flavor: OSG-CE, Hostname: cmsosgce3.fnal.gov, Metric: org.sam.CONDOR-JobSubmit (/cms/Role_lcgadmin), Status: UNKNOWN, Age: 0 # Flavor: OSG-CE, Hostname: cmsosgce3.fnal.gov, Metric: org.sam.CONDOR-JobSubmit (/cms/Role_production), Status: UNKNOWN, Age: 0

Looks like I can pass the --verbose option to this script and see that all sites are good except the two reported.

Clarification: are we looking to have each one of the sites listed below as an individual check in check_mk?

[njp@workbench python]$ ./cms_site_monitoring --verbose
Found 'T1_US_FNAL'
Flavor: OSG-SRMv2
Host: cmssrm.fnal.gov - Host Status: OK
Metric: org.cms.SRM-GetPFNFromTFC (/cms/Role_production) - status: OK
Metric: org.cms.SRM-VOPut (/cms/Role_production) - status: OK
Metric: org.cms.SRM-VOGet (/cms/Role_production) - status: OK
Host: cmssrmdisk.fnal.gov - Host Status: MISSING
Flavor: OSG-CE
Host: cmsosgce.fnal.gov - Host Status: OK
Metric: org.cms.glexec.WN-gLExec (/cms/Role_pilot) - status: OK
Metric: org.cms.WN-squid (/cms/Role_lcgadmin) - status: OK
Metric: org.cms.WN-mc (/cms/Role_production) - status: OK
Metric: org.cms.WN-basic (/cms/Role_lcgadmin) - status: OK
Metric: org.cms.WN-xrootd-access (/cms/Role_lcgadmin) - status: OK
Metric: org.cms.WN-swinst (/cms/Role_lcgadmin) - status: OK
Metric: org.cms.WN-frontier (/cms/Role_lcgadmin) - status: OK
Metric: org.cms.WN-env (/cms/Role_lcgadmin) - status: OK
Metric: org.cms.WN-xrootd-fallback (/cms/Role_lcgadmin) - status: OK
Metric: org.cms.WN-analysis (/cms/Role_lcgadmin) - status: OK
Metric: org.sam.CONDOR-JobSubmit (/cms/Role_lcgadmin) - status: OK
Metric: org.sam.CONDOR-JobSubmit (/cms/Role_production) - status: OK
Host: cmsosgce2.fnal.gov - Host Status: OK
Metric: org.cms.glexec.WN-gLExec (/cms/Role_pilot) - status: OK
Metric: org.cms.WN-squid (/cms/Role_lcgadmin) - status: OK
Metric: org.cms.WN-mc (/cms/Role_production) - status: OK
Metric: org.cms.WN-basic (/cms/Role_lcgadmin) - status: OK
Metric: org.cms.WN-xrootd-access (/cms/Role_lcgadmin) - status: OK
Metric: org.cms.WN-swinst (/cms/Role_lcgadmin) - status: OK
Metric: org.cms.WN-frontier (/cms/Role_lcgadmin) - status: OK
Metric: org.cms.WN-env (/cms/Role_lcgadmin) - status: OK
Metric: org.cms.WN-xrootd-fallback (/cms/Role_lcgadmin) - status: OK
Metric: org.cms.WN-analysis (/cms/Role_lcgadmin) - status: OK
Metric: org.sam.CONDOR-JobSubmit (/cms/Role_lcgadmin) - status: OK
Metric: org.sam.CONDOR-JobSubmit (/cms/Role_production) - status: OK
Host: cmsosgce3.fnal.gov - Host Status: UNKNOWN
Metric: org.sam.CONDOR-JobSubmit (/cms/Role_lcgadmin) - status: UNKNOWN
Metric: org.sam.CONDOR-JobSubmit (/cms/Role_production) - status: UNKNOWN
Host: cmsosgce4.fnal.gov - Host Status: OK
Metric: org.cms.glexec.WN-gLExec (/cms/Role_pilot) - status: OK
Metric: org.cms.WN-squid (/cms/Role_lcgadmin) - status: OK
Metric: org.cms.WN-mc (/cms/Role_production) - status: OK
Metric: org.cms.WN-basic (/cms/Role_lcgadmin) - status: OK
Metric: org.cms.WN-xrootd-access (/cms/Role_lcgadmin) - status: OK
Metric: org.cms.WN-swinst (/cms/Role_lcgadmin) - status: OK
Metric: org.cms.WN-frontier (/cms/Role_lcgadmin) - status: OK
Metric: org.cms.WN-env (/cms/Role_lcgadmin) - status: OK
Metric: org.cms.WN-xrootd-fallback (/cms/Role_lcgadmin) - status: OK
Metric: org.cms.WN-analysis (/cms/Role_lcgadmin) - status: OK
Metric: org.sam.CONDOR-JobSubmit (/cms/Role_lcgadmin) - status: OK
Metric: org.sam.CONDOR-JobSubmit (/cms/Role_production) - status: OK
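To get a feel for how many individual checks that would be, the verbose listing can be split into per-host and per-metric records. This is only a sketch: it assumes the line layout is exactly as pasted above (any indentation is ignored), and parse_verbose and SAMPLE are hypothetical names, not part of cms_site_monitoring.

```python
import re

# Patterns for the two record types seen in the verbose listing.
HOST_RE = re.compile(r"^Host: (\S+) - Host Status: (\S+)$")
METRIC_RE = re.compile(r"^Metric: (\S+) \((\S+)\) - status: (\S+)$")


def parse_verbose(text):
    """Return ({host: host_status}, [(flavor, host, metric, role, status)])."""
    hosts = {}
    metrics = []
    flavor = host = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Flavor: "):
            flavor = line[len("Flavor: "):]
            continue
        match = HOST_RE.match(line)
        if match:
            host = match.group(1)
            hosts[host] = match.group(2)
            continue
        match = METRIC_RE.match(line)
        if match and host is not None:
            # A metric belongs to the most recent Flavor and Host lines.
            metrics.append((flavor, host) + match.groups())
    return hosts, metrics


# Short excerpt of the listing above, used only for illustration.
SAMPLE = """\
Found 'T1_US_FNAL'
Flavor: OSG-SRMv2
Host: cmssrmdisk.fnal.gov - Host Status: MISSING
Flavor: OSG-CE
Host: cmsosgce3.fnal.gov - Host Status: UNKNOWN
Metric: org.sam.CONDOR-JobSubmit (/cms/Role_lcgadmin) - status: UNKNOWN
Metric: org.sam.CONDOR-JobSubmit (/cms/Role_production) - status: UNKNOWN
"""

if __name__ == "__main__":
    hosts, metrics = parse_verbose(SAMPLE)
    print("%d hosts, %d metrics" % (len(hosts), len(metrics)))
    for flavor, host, metric, role, status in metrics:
        if status != "OK":
            print("%s %s %s (%s): %s" % (flavor, host, metric, role, status))
```

Hosts that list no metrics at all, such as cmssrmdisk.fnal.gov, only appear in the host dictionary, so whether they deserve a check of their own would need to be decided separately.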

#2 Updated by Nicholas Peregonow over 3 years ago

  • Status changed from New to Closed
  • Estimated time set to 15.00 h

Completed


