Support #2524

monitoring of MINOS machines w/ top

Added by Robert Hatcher almost 8 years ago. Updated about 3 years ago.

Status: Closed
Priority: Normal
Start date: 03/01/2012
Due date:
% Done: 100%
Estimated time:
Spent time:
Duration:

Description

A proposal was made to keep a running log of "top" on various machines in order to have an additional handle on causes for machine overloads. This would be useful when the machine becomes unreachable due to the heavy load.

This might be generalized to all of the IF nodes.

History

#1 Updated by Arthur Kreymer almost 8 years ago

This is now running, using the ifmon account.
Restarted April 12 2012.

View the results at http://nusoft.fnal.gov/ifmon/pslog/

Script: /nusoft/app/home/ifmon/pslog
Added 'export TZ=UTC' to the script.
The old script is pslog.20120304.
Restarted pslog on all systems, after first stopping the running pslog by killing its sleep 600:

set nohup ; { /nusoft/app/home/ifmon/pslog & } ; sleep 2 ; ps xf

grep sshd /nusoft/app/web/htdoc/ifmon/pslog/`hostname -s`/CURRENT.txt | tail
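
The pslog script itself is not shown in this ticket. As a rough illustration of the approach described above (a periodic process snapshot appended to a log, with UTC timestamps and a 600-second sleep), a minimal sketch might look like the following. The function names `pslog_snapshot` and `pslog_loop`, the header format, and the `ps` column selection are all assumptions, not taken from the real script.

```shell
#!/bin/sh
# Hypothetical sketch of a pslog-style logger; the real ifmon script is not shown in this ticket.
export TZ=UTC    # the ticket notes that 'export TZ=UTC' was added to the script

# Append one timestamped process snapshot to the given log file.
pslog_snapshot() {
    logfile=$1
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S %Z') ==="
        # The column selection here is an assumption, not the real script's.
        ps -eo pid,user,pcpu,vsz,etime,args
    } >> "$logfile"
}

# Take a snapshot, then sleep; the 600 s default matches the "sleep 600" mentioned in the ticket.
pslog_loop() {
    logfile=$1
    interval=${2:-600}
    while :; do
        pslog_snapshot "$logfile"
        sleep "$interval"
    done
}
```

Under these assumptions, a call such as `pslog_loop "/nusoft/app/web/htdoc/ifmon/pslog/$(hostname -s)/CURRENT.txt" &` would keep appending to the per-host CURRENT.txt file that the grep above reads.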

#2 Updated by Arthur Kreymer about 6 years ago

  • Status changed from New to Resolved
  • Assignee set to Arthur Kreymer
  • % Done changed from 0 to 100

FEF has deployed up-to-the-minute ps monitoring of all systems.
See files like /var/log/prochistory/20140131_08:53:01_procs.gz
View with zcat.
About a week is retained.
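
For searching across the retained snapshots rather than viewing one file at a time, a small helper along these lines could be used. This is only a sketch: `prochistory_grep` is a hypothetical name, and it assumes each `*_procs.gz` file is gzip-compressed plain text, which the "View with zcat" note suggests but does not spell out.

```shell
# Hypothetical helper: search retained process snapshots for a pattern.
# Usage: prochistory_grep PATTERN [DIR]   (DIR defaults to /var/log/prochistory)
prochistory_grep() {
    pattern=$1
    dir=${2:-/var/log/prochistory}
    for f in "$dir"/*_procs.gz; do
        [ -e "$f" ] || continue    # glob matched nothing
        # gzip -dc is equivalent to zcat; prefix each match with the
        # snapshot file name, which encodes the snapshot time.
        gzip -dc "$f" | grep -- "$pattern" |
            awk -v name="$(basename "$f")" '{ print name ": " $0 }'
    done
}
```

For example, `prochistory_grep sshd` would list every sshd line across the retained snapshots, each tagged with the snapshot it came from.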

#3 Updated by Arthur Kreymer about 3 years ago

  • Status changed from Resolved to Closed

