Task #9895

CMS-T1 downtime Aug 26th for dCache upgrade.

Added by Natalia Ratnikova over 4 years ago. Updated over 4 years ago.

Status: Resolved
Priority: Normal
Start date: 08/25/2015
Due date:
% Done: 0%
Estimated time:
Spent time:
Duration:

Description

Downtime starts at 1 pm and lasts until 4 pm.
Upgrade to dCache 2.2.29 (security patch).
Move to OpenJDK Java.
Run yum update and reboot everything to get the new kernel.

History

#1 Updated by Natalia Ratnikova over 4 years ago

Plan:

Declare downtime in check_mk.

1. Add new dcache rpm to uscmst1 repo.
2. Stop phedex agents.
3. Stop dcache service on cmssrmdisk.
4. Uninstall snapshot version.
5. Proceed with the upgrade, reboot, and checks (see Gerard's commands).
6. Fix any remaining issues.
7. Start phedex agents.
8. Check phedex download logs.
9. Declare the end of downtime.

#2 Updated by Natalia Ratnikova over 4 years ago

From 1 pm to 3 pm on 08-26-2015, i.e. ending one hour before the official end of the downtime, so that we see any alarms coming in.

#3 Updated by Natalia Ratnikova over 4 years ago

On cmsadmin1:

[root@cmsadmin1 Aug-26-2015]# pwd
/root/natalia/Aug-26-2015
[root@cmsadmin1 Aug-26-2015]# wc -l *
191 dcache-disk-pools.list
4 dcache-disk-servers.list
195 total
[root@cmsadmin1 Aug-26-2015]#
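
Since every pssh run below is driven by these two list files, it may be worth checking them before the downtime. A minimal sketch, assuming one hostname per line; `check_hostlist` is a hypothetical helper, not something that was used in this ticket:

```shell
# Hypothetical helper: count the hosts in a pssh host list and flag duplicates.
# Blank lines and lines starting with '#' are ignored.
check_hostlist() {
    awk '
        /^[ \t]*(#|$)/ { next }                # skip blanks and comments
        { n++; if (seen[$1]++ == 1) dup[$1] = 1 }
        END {
            printf "%d hosts\n", n
            for (h in dup) { print "duplicate: " h; bad = 1 }
            exit bad                           # non-zero when a duplicate exists
        }
    ' "$1"
}
```

For example, `check_hostlist dcache-disk-pools.list` would be expected to report 191 hosts for the file shown above.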

#4 Updated by Natalia Ratnikova over 4 years ago

Preparation step: run yum update in advance on all nodes, so that the new kernel rpm is already installed.

On root@cmsadmin1:

cd /root/natalia/Aug-26-2015

[root@cmsadmin1 Aug-26-2015]# pssh -h dcache-disk-servers.list -l root -t 0 -p 50 -o yum_update_srvs.log -e yum_update_srvs.err 'kernelversion=2.6.32-573.3.1.el6.x86_64; yum update -y; rpm -q kernel-${kernelversion}'
[1] 17:57:01 [SUCCESS] cmssrmdisk.fnal.gov
[2] 17:57:02 [SUCCESS] cmschimeradiskbackup.fnal.gov
[3] 17:57:03 [SUCCESS] cmsdcacheadmindisk.fnal.gov
[4] 17:57:10 [SUCCESS] cmschimeradisk.fnal.gov
[root@cmsadmin1 Aug-26-2015]#

[root@cmsadmin1 Aug-26-2015]# time pssh -h dcache-disk-pools.list -l root -t 0 -p 50 -o yum_update_pools.log -e yum_update_pools.err 'kernelversion=2.6.32-573.3.1.el6.x86_64; yum update -y; rpm -q kernel-${kernelversion}'

[1] 17:59:43 [SUCCESS] cmsstor221.fnal.gov
[2] 17:59:52 [SUCCESS] cmsstor201.fnal.gov
[3] 17:59:55 [SUCCESS] cmsstor185.fnal.gov
[4] 17:59:58 [SUCCESS] cmsstor181.fnal.gov
[5] 17:59:58 [SUCCESS] cmsstor216.fnal.gov
[6] 17:59:58 [SUCCESS] cmsstor195.fnal.gov
[7] 17:59:58 [SUCCESS] cmsstor175.fnal.gov
[8] 17:59:58 [SUCCESS] cmsstor203.fnal.gov
[9] 17:59:58 [SUCCESS] cmsstor182.fnal.gov
[10] 17:59:59 [SUCCESS] cmsstor172.fnal.gov
[11] 17:59:59 [SUCCESS] cmsstor177.fnal.gov
[12] 17:59:59 [SUCCESS] cmsstor220.fnal.gov
[13] 17:59:59 [SUCCESS] cmsstor193.fnal.gov
[14] 18:00:00 [SUCCESS] cmsstor173.fnal.gov
[15] 18:00:00 [SUCCESS] cmsstor217.fnal.gov
[16] 18:00:00 [SUCCESS] cmsstor171.fnal.gov
[17] 18:00:00 [SUCCESS] cmsstor183.fnal.gov
[18] 18:00:00 [SUCCESS] cmsstor174.fnal.gov
[19] 18:00:00 [SUCCESS] cmsstor178.fnal.gov
[20] 18:00:00 [SUCCESS] cmsstor209.fnal.gov
[21] 18:00:00 [SUCCESS] cmsstor196.fnal.gov
[22] 18:00:00 [SUCCESS] cmsstor168.fnal.gov
[23] 18:00:01 [SUCCESS] cmsstor176.fnal.gov
[24] 18:00:01 [SUCCESS] cmsstor219.fnal.gov
[25] 18:00:01 [SUCCESS] cmsstor189.fnal.gov
[26] 18:00:01 [SUCCESS] cmsstor214.fnal.gov
[27] 18:00:01 [SUCCESS] cmsstor179.fnal.gov
[28] 18:00:02 [SUCCESS] cmsstor187.fnal.gov
[29] 18:00:03 [SUCCESS] cmsstor169.fnal.gov
[30] 18:00:03 [SUCCESS] cmsstor237.fnal.gov
[31] 18:00:03 [SUCCESS] cmsstor206.fnal.gov
[32] 18:00:03 [SUCCESS] cmsstor186.fnal.gov
[33] 18:00:04 [SUCCESS] cmsstor212.fnal.gov
[34] 18:00:04 [SUCCESS] cmsstor194.fnal.gov
[35] 18:00:04 [SUCCESS] cmsstor210.fnal.gov
[36] 18:00:04 [SUCCESS] cmsstor198.fnal.gov
[37] 18:00:04 [SUCCESS] cmsstor184.fnal.gov
[38] 18:00:04 [SUCCESS] cmsstor215.fnal.gov
[39] 18:00:08 [SUCCESS] cmsstor213.fnal.gov
[40] 18:00:12 [SUCCESS] cmsstor207.fnal.gov
[41] 18:00:15 [SUCCESS] cmsstor199.fnal.gov
[42] 18:00:16 [SUCCESS] cmsstor191.fnal.gov
[43] 18:00:19 [SUCCESS] cmsstor197.fnal.gov
[44] 18:00:20 [SUCCESS] cmsstor188.fnal.gov
[45] 18:00:23 [SUCCESS] cmsstor205.fnal.gov
[46] 18:00:27 [SUCCESS] cmsstor204.fnal.gov
[47] 18:00:35 [SUCCESS] cmsstor208.fnal.gov
[48] 18:01:01 [SUCCESS] cmsstor222.fnal.gov
[49] 18:01:07 [SUCCESS] cmsstor242.fnal.gov
[50] 18:01:08 [SUCCESS] cmsstor223.fnal.gov
[51] 18:01:08 [SUCCESS] cmsstor192.fnal.gov
[52] 18:01:11 [SUCCESS] cmsstor224.fnal.gov
[53] 18:01:12 [SUCCESS] cmsstor227.fnal.gov
[54] 18:01:12 [SUCCESS] cmsstor226.fnal.gov
[55] 18:01:12 [SUCCESS] cmsstor233.fnal.gov
[56] 18:01:13 [SUCCESS] cmsstor234.fnal.gov
[57] 18:01:15 [SUCCESS] cmsstor231.fnal.gov
[58] 18:01:15 [SUCCESS] cmsstor236.fnal.gov
[59] 18:01:15 [SUCCESS] cmsstor244.fnal.gov
[60] 18:01:16 [SUCCESS] cmsstor232.fnal.gov
[61] 18:01:16 [SUCCESS] cmsstor238.fnal.gov
[62] 18:01:16 [SUCCESS] cmsstor239.fnal.gov
[63] 18:01:17 [SUCCESS] cmsstor230.fnal.gov
[64] 18:01:17 [SUCCESS] cmsstor246.fnal.gov
[65] 18:01:17 [SUCCESS] cmsstor250.fnal.gov
[66] 18:01:18 [SUCCESS] cmsstor243.fnal.gov
[67] 18:01:18 [SUCCESS] cmsstor240.fnal.gov
[68] 18:01:18 [SUCCESS] cmsstor245.fnal.gov
[69] 18:01:18 [SUCCESS] cmsstor248.fnal.gov
[70] 18:01:19 [SUCCESS] cmsstor249.fnal.gov
[71] 18:01:19 [SUCCESS] cmsstor247.fnal.gov
[72] 18:01:19 [SUCCESS] cmsstor229.fnal.gov
[73] 18:01:25 [SUCCESS] cmsstor251.fnal.gov
[74] 18:01:27 [SUCCESS] cmsstor235.fnal.gov
[75] 18:01:33 [SUCCESS] cmsstor241.fnal.gov
[76] 18:01:37 [SUCCESS] cmsstor211.fnal.gov
[77] 18:01:38 [SUCCESS] cmsstor228.fnal.gov
[78] 18:01:39 [SUCCESS] cmsstor225.fnal.gov
[79] 18:01:46 [SUCCESS] cmsstor202.fnal.gov
[80] 18:03:04 [SUCCESS] cmsstor264.fnal.gov
[81] 18:03:10 [SUCCESS] cmsstor267.fnal.gov
[82] 18:03:10 [SUCCESS] cmsstor323.fnal.gov
[83] 18:03:14 [SUCCESS] cmsstor266.fnal.gov
[84] 18:03:16 [SUCCESS] cmsstor265.fnal.gov
[85] 18:03:18 [SUCCESS] cmsstor313.fnal.gov
[86] 18:03:19 [SUCCESS] cmsstor218.fnal.gov
[87] 18:03:20 [SUCCESS] cmsstor312.fnal.gov
[88] 18:03:20 [SUCCESS] cmsstor316.fnal.gov
[89] 18:03:20 [SUCCESS] cmsstor309.fnal.gov
[90] 18:03:21 [SUCCESS] cmsstor315.fnal.gov
[91] 18:03:21 [SUCCESS] cmsstor310.fnal.gov
[92] 18:03:21 [SUCCESS] cmsstor314.fnal.gov
[93] 18:03:21 [SUCCESS] cmsstor311.fnal.gov
[94] 18:03:22 [SUCCESS] cmsstor273.fnal.gov
[95] 18:03:23 [SUCCESS] cmsstor261.fnal.gov
[96] 18:03:27 [SUCCESS] cmsstor317.fnal.gov
[97] 18:03:28 [SUCCESS] cmsstor271.fnal.gov
[98] 18:03:31 [SUCCESS] cmsstor318.fnal.gov
[99] 18:03:32 [SUCCESS] cmsstor270.fnal.gov
[100] 18:03:33 [SUCCESS] cmsstor319.fnal.gov
[101] 18:03:35 [SUCCESS] cmsstor275.fnal.gov
[102] 18:03:40 [SUCCESS] cmsstor320.fnal.gov
[103] 18:03:41 [SUCCESS] cmsstor321.fnal.gov
[104] 18:03:42 [SUCCESS] cmsstor322.fnal.gov
[105] 18:03:45 [SUCCESS] cmsstor277.fnal.gov
[106] 18:04:09 [SUCCESS] cmsstor278.fnal.gov
[107] 18:04:11 [SUCCESS] cmsstor263.fnal.gov
[108] 18:04:12 [SUCCESS] cmsstor282.fnal.gov
[109] 18:04:16 [SUCCESS] cmsstor262.fnal.gov
[110] 18:04:18 [SUCCESS] cmsstor287.fnal.gov
[111] 18:04:18 [SUCCESS] cmsstor284.fnal.gov
[112] 18:04:19 [SUCCESS] cmsstor269.fnal.gov
[113] 18:04:19 [SUCCESS] cmsstor286.fnal.gov
[114] 18:04:19 [SUCCESS] cmsstor285.fnal.gov
[115] 18:04:21 [SUCCESS] cmsstor268.fnal.gov
[116] 18:04:21 [SUCCESS] cmsstor293.fnal.gov
[117] 18:04:22 [SUCCESS] cmsstor291.fnal.gov
[118] 18:04:24 [SUCCESS] cmsstor288.fnal.gov
[119] 18:04:27 [SUCCESS] cmsstor272.fnal.gov
[120] 18:04:28 [SUCCESS] cmsstor294.fnal.gov
[121] 18:04:33 [SUCCESS] cmsstor274.fnal.gov
[122] 18:04:33 [SUCCESS] cmsstor292.fnal.gov
[123] 18:04:35 [SUCCESS] cmsstor276.fnal.gov
[124] 18:05:00 [SUCCESS] cmsstor341.fnal.gov
[125] 18:05:14 [SUCCESS] cmsstor279.fnal.gov
[126] 18:05:18 [SUCCESS] cmsstor280.fnal.gov
[127] 18:05:22 [SUCCESS] cmsstor281.fnal.gov
[128] 18:05:24 [SUCCESS] cmsstor283.fnal.gov
[129] 18:05:29 [SUCCESS] cmsstor289.fnal.gov
[130] 18:05:32 [SUCCESS] cmsstor290.fnal.gov
[131] 18:05:42 [SUCCESS] cmsstor329.fnal.gov
[132] 18:06:18 [SUCCESS] cmsstor325.fnal.gov
[133] 18:06:19 [SUCCESS] cmsstor326.fnal.gov
[134] 18:06:31 [SUCCESS] cmsstor334.fnal.gov
[135] 18:06:32 [SUCCESS] cmsstor336.fnal.gov
[136] 18:06:33 [SUCCESS] cmsstor333.fnal.gov
[137] 18:06:37 [SUCCESS] cmsstor344.fnal.gov
[138] 18:06:37 [SUCCESS] cmsstor328.fnal.gov
[139] 18:06:38 [SUCCESS] cmsstor345.fnal.gov
[140] 18:06:43 [SUCCESS] cmsstor342.fnal.gov
[141] 18:06:49 [SUCCESS] cmsstor353.fnal.gov
[142] 18:07:08 [SUCCESS] cmsstor369.fnal.gov
[143] 18:07:18 [SUCCESS] cmsstor370.fnal.gov
[144] 18:07:25 [SUCCESS] cmsstor327.fnal.gov
[145] 18:07:26 [SUCCESS] cmsstor358.fnal.gov
[146] 18:07:29 [SUCCESS] cmsstor324.fnal.gov
[147] 18:07:30 [SUCCESS] cmsstor354.fnal.gov
[148] 18:07:31 [SUCCESS] cmsstor339.fnal.gov
[149] 18:07:31 [SUCCESS] cmsstor330.fnal.gov
[150] 18:07:32 [SUCCESS] cmsstor331.fnal.gov
[151] 18:07:34 [SUCCESS] cmsstor360.fnal.gov
[152] 18:07:35 [SUCCESS] cmsstor364.fnal.gov
[153] 18:07:35 [SUCCESS] cmsstor337.fnal.gov
[154] 18:07:37 [SUCCESS] cmsstor332.fnal.gov
[155] 18:07:37 [SUCCESS] cmsstor338.fnal.gov
[156] 18:07:38 [SUCCESS] cmsstor366.fnal.gov
[157] 18:07:38 [SUCCESS] cmsstor335.fnal.gov
[158] 18:07:39 [SUCCESS] cmsstor340.fnal.gov
[159] 18:07:49 [SUCCESS] cmsstor402.fnal.gov
[160] 18:07:50 [SUCCESS] cmsstor401.fnal.gov
[161] 18:07:55 [SUCCESS] cmsstor347.fnal.gov
[162] 18:07:57 [SUCCESS] cmsstor348.fnal.gov
[163] 18:07:57 [SUCCESS] cmsstor346.fnal.gov
[164] 18:07:59 [SUCCESS] cmsstor409.fnal.gov
[165] 18:07:59 [SUCCESS] cmsstor403.fnal.gov
[166] 18:08:03 [SUCCESS] cmsstor404.fnal.gov
[167] 18:08:04 [SUCCESS] cmsstor343.fnal.gov
[168] 18:08:04 [SUCCESS] cmsstor349.fnal.gov
[169] 18:08:07 [SUCCESS] cmsstor406.fnal.gov
[170] 18:08:09 [SUCCESS] cmsstor405.fnal.gov
[171] 18:08:10 [SUCCESS] cmsstor410.fnal.gov
[172] 18:08:11 [SUCCESS] cmsstor407.fnal.gov
[173] 18:08:14 [SUCCESS] cmsstor408.fnal.gov
[174] 18:08:19 [SUCCESS] cmsstor351.fnal.gov
[175] 18:08:25 [SUCCESS] cmsstor350.fnal.gov
[176] 18:08:27 [SUCCESS] cmsstor352.fnal.gov
[177] 18:08:30 [SUCCESS] cmsstor357.fnal.gov
[178] 18:08:31 [SUCCESS] cmsstor355.fnal.gov
[179] 18:08:33 [SUCCESS] cmsstor356.fnal.gov
[180] 18:08:34 [SUCCESS] cmsstor359.fnal.gov
[181] 18:08:39 [SUCCESS] cmsstor361.fnal.gov
[182] 18:08:39 [SUCCESS] cmsstor363.fnal.gov
[183] 18:08:41 [SUCCESS] cmsstor362.fnal.gov
[184] 18:08:49 [SUCCESS] cmsstor365.fnal.gov
[185] 18:08:55 [SUCCESS] cmsstor372.fnal.gov
[186] 18:08:55 [SUCCESS] cmsstor371.fnal.gov
[187] 18:09:00 [SUCCESS] cmsstor368.fnal.gov
[188] 18:09:05 [SUCCESS] cmsstor374.fnal.gov
[189] 18:09:07 [SUCCESS] cmsstor376.fnal.gov
[190] 18:09:52 [SUCCESS] cmsstor373.fnal.gov
[191] 18:10:06 [SUCCESS] cmsstor375.fnal.gov

real 11m25.432s
user 0m7.971s
sys 0m6.021s
[root@cmsadmin1 Aug-26-2015]#
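
With 191 pools the progress lines are easy to misread by eye. A minimal sketch of a summary filter, assuming the pssh progress output was saved to a file (for example with `tee`, which the commands above did not do); `pssh_summary` is a hypothetical name:

```shell
# Hypothetical helper: summarize pssh progress lines read from stdin.
# Expects lines of the form "[N] HH:MM:SS [SUCCESS|FAILURE] host ...".
pssh_summary() {
    awk '
        $3 == "[SUCCESS]" { ok++ }
        $3 == "[FAILURE]" { fail++; print "FAILED: " $4 }
        END {
            printf "%d succeeded, %d failed\n", ok, fail
            exit (fail > 0)                    # non-zero if any host failed
        }
    '
}
```

Usage would be along the lines of `pssh_summary < pssh_progress.txt`, where the file name is only an illustration.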

#5 Updated by Natalia Ratnikova over 4 years ago

Stop phedex agents on the disk instance:

ssh root@cmsphedex-disk
su - cmsprod
cd siteconfs/siteconf-fnaldisk.4.1.3-comp3-1.0
./pmgr Dev stop
./pmgr Debug stop
./pmgr Prod stop

And on the tape instance:

ssh root@cmssrv228
su - cmsprod
cd /home/cmsprod/siteconf-git/
./pmgr Dev stop
./pmgr Debug stop
./pmgr Prod stop
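
The three pmgr calls per instance can also be written as a loop. A minimal sketch, to be run from the siteconf directory; `phedex_agents` and the `PMGR` override are hypothetical and only there to allow a dry run:

```shell
# Hypothetical wrapper: run one pmgr action for the Dev, Debug and Prod
# instances in turn, stopping at the first failure so it gets noticed.
# PMGR defaults to ./pmgr; set PMGR=echo for a dry run.
phedex_agents() {
    action=$1
    for instance in Dev Debug Prod; do
        "${PMGR:-./pmgr}" "$instance" "$action" || return 1
    done
}
```

`phedex_agents stop` then corresponds to the three stop commands above, and `phedex_agents start` to the matching start commands.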

#6 Updated by Natalia Ratnikova over 4 years ago

Upgrade dCache servers:

ssh root@cmsadmin1

Handle non-standard dcache installation on cmssrmdisk: leave puppet disabled and erase non-standard dcache, as yum thinks its version is newer than the one we are going to install:

ssh cmssrmdisk 
puppet agent --disable
service dcache-server stop
rpm -qi dcache
yum erase dcache
rpm -qi dcache
exit

Now upgrade, reboot, and check all servers:

cd /root/natalia/Aug-26-2015

pssh -h dcache-disk-servers.list -l root -t 0 -p 4 -o upgr_srvs.log -e upgr_srvs.err 'kernelversion=2.6.32-573.3.1.el6.x86_64; extra_rpm_check="dcache-2.2.29-1.noarch"; puppet agent --disable; service dcache-server stop; yum clean all; yum update -y; rpm -q kernel-${kernelversion} ${extra_rpm_check} && puppet agent --enable && reboot'

pssh -h dcache-disk-servers.list -l root -t 60 -p 4 -o check_srvs.log -e check_srvs.err 'kernelversion=2.6.32-573.3.1.el6.x86_64; uptime; uname -a | grep $kernelversion && (service dcache-server status | grep -v DOMAIN | grep -v running ); dcache version;'
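
The key part of the upgrade one-liner is the `&&` chain at the end: puppet is re-enabled and the node rebooted only if `rpm -q` finds both the new kernel and the expected dcache rpm. A minimal sketch of that gating on its own; `gated_reboot` and the `RPM_QUERY`, `PUPPET_ENABLE` and `REBOOT` overrides are hypothetical and exist only so the logic can be tried without touching a node:

```shell
# Hypothetical helper isolating the gate used in the upgrade command.
# $1 = kernel version, $2 = extra rpm that must be installed.
# Defaults are the real commands; override them for a dry run.
gated_reboot() {
    ${RPM_QUERY:-rpm -q} "kernel-$1" "$2" >/dev/null \
        && ${PUPPET_ENABLE:-puppet agent --enable} \
        && ${REBOOT:-reboot}
}
```

If the rpm query fails, the chain stops there, the node is not rebooted, puppet stays disabled, and the non-zero exit status is what pssh reports as a failure.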

#7 Updated by Natalia Ratnikova over 4 years ago

Upgrade, reboot, and check all pools:

pssh -h dcache-disk-pools.list -l root -t 0 -p 50 -o upgr_pools.log -e upgr_pools.err 'kernelversion=2.6.32-573.3.1.el6.x86_64; extra_rpm_check="dcache-2.2.29-1.noarch"; puppet agent --disable; service dcache-server stop; yum clean all; yum update -y; rpm -q kernel-${kernelversion} ${extra_rpm_check} && puppet agent --enable && reboot'

pssh -h dcache-disk-pools.list -l root -t 60 -p 50 -o check_pools.log -e check_pools.err 'kernelversion=2.6.32-573.3.1.el6.x86_64; uptime; uname -a | grep $kernelversion && ((service dcache-server status | grep -v DOMAIN | grep -v running ) || service dcache-server status); dcache version;'
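
The `grep -v DOMAIN | grep -v running` part of the check prints only the domains that are not running, so an empty result means all is well. A minimal sketch of the same filter keyed on the STATUS column instead of matching anywhere in the line; `not_running_domains` is a hypothetical name, and the input is assumed to be the table printed by `service dcache-server status`:

```shell
# Hypothetical filter: print the names of dCache domains whose STATUS column
# is not "running". Reads the status table on stdin and skips the header.
not_running_domains() {
    awk 'NF >= 2 && $1 != "DOMAIN" && $2 != "running" { print $1 }'
}
```

Usage would be `service dcache-server status | not_running_domains`.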

#8 Updated by Natalia Ratnikova over 4 years ago

Start phedex agents on the disk instance:

ssh root@cmsphedex-disk
su - cmsprod
cd siteconfs/siteconf-fnaldisk.4.1.3-comp3-1.0
./pmgr Dev start
./pmgr Debug start
./pmgr Prod start

And on the tape instance:

ssh root@cmssrv228
su - cmsprod
cd /home/cmsprod/siteconf-git/
./pmgr Dev start
./pmgr Debug start
./pmgr Prod start

Check the logs to see whether transfers have resumed and are successfully writing data to dCache.

#9 Updated by Natalia Ratnikova over 4 years ago

Last login: Wed Aug 26 12:02:47 on ttys003
mac-121252:~ natasha$ to-adm 
Last login: Wed Aug 26 11:12:06 2015 from mac-122182.attlocal.net.dhcp.fnal.gov
                              NOTICE TO USERS

       This  is a Federal computer (and/or it is directly connected to a
       Fermilab local network system) that is the property of the United
       States Government.  It is for authorized use only.  Users (autho-
       rized or unauthorized) have no explicit or  implicit  expectation
       of privacy.

       Any  or  all uses of this system and all files on this system may
       be intercepted, monitored, recorded,  copied, audited, inspected,
       and  disclosed  to authorized site, Department of Energy  and law
       enforcement personnel, as  well as authorized officials of  other
       agencies,  both  domestic and foreign.  By using this system, the
       user consents to such interception, monitoring, recording,  copy-
       ing,  auditing,  inspection,  and disclosure at the discretion of
       authorized site or Department of Energy personnel.

       Unauthorized or improper use of this system may result in  admin-
       istrative  disciplinary  action and civil and criminal penalties.
       By continuing to use this system you indicate your  awareness  of
       and  consent to these terms and conditions of use.  LOG OFF IMME-
       DIATELY if you do not agree to  the  conditions  stated  in  this
       warning.

       Fermilab  policy  and  rules for computing, including appropriate
       use, may be found at http://www.fnal.gov/cd/main/cpolicy.html

cmsadmin1.fnal.gov - bastion/production (SLF 6.6)
32-core Opteron 6320 (H8QG6); 62.89 GB RAM, 16.00 GB swap
[root@cmsadmin1 ~]# ssh cmssrmdisk 
puppet agent --disable
service dcache-server stop
rpm -qi dcache
yum erase dcache
rpm -qi dcache
exit
Last login: Wed Aug 26 07:47:43 2015 from cmsadmin1.fnal.gov

cmssrmdisk.fnal.gov - srmdisk/production (SLF 6.6)
32-core Opteron 6320 (H8QG6); 62.89 GB RAM, 16.00 GB swap
[root@cmssrmdisk ~]# puppet agent --disable
[root@cmssrmdisk ~]# service dcache-server stop
Stopping transfermanagersDomain 0 done
Stopping srm-cmssrmdiskDomain 0 1 2 3 done
Stopping gPlazmaDomain 0 done
Stopping utilityDomain 0 done
Stopping xrootdLFNs-cmssrmdiskDomain 0 1 2 3 4 done
Stopping xrootd-cmssrmdiskDomain 0 done
Stopping gsidcap-cmssrmdiskDomain 0 done
Stopping authdcap-cmssrmdiskDomain 0 done
Stopping dcap-cmssrmdiskDomain 0 done
Stopping nfsDomain.v3 0 done
[root@cmssrmdisk ~]# rpm -qi dcache
Name        : dcache                       Relocations: / 
Version     : 2.2.29SNAPSHOT                    Vendor: dCache.org
Release     : 1                             Build Date: Wed 14 Jan 2015 11:51:08 AM CST
Install Date: Wed 11 Feb 2015 08:27:42 AM CST      Build Host: uqbar.fnal.gov
Group       : Applications/System           Source RPM: dcache-2.2.29SNAPSHOT-1.src.rpm
Size        : 71361598                         License: Distributable
Signature   : (none)
Packager    : dCache.org <support@dcache.org>.
Summary     : dCache Server
Description :
dCache is a distributed mass storage system.

This package contains the server components.
[root@cmssrmdisk ~]# yum erase dcache
Loaded plugins: priorities, security
Setting up Remove Process
Resolving Dependencies
--> Running transaction check
---> Package dcache.noarch 0:2.2.29SNAPSHOT-1 will be erased
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package        Arch           Version                  Repository         Size
================================================================================
Removing:
 dcache         noarch         2.2.29SNAPSHOT-1         installed          68 M

Transaction Summary
================================================================================
Remove        1 Package(s)

Installed size: 68 M
Is this ok [y/N]: y
Is this ok [y/N]: Is this ok [y/N]: y
Downloading Packages:
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Erasing    : dcache-2.2.29SNAPSHOT-1.noarch                               1/1 
warning: /etc/dcache/gplazma.conf saved as /etc/dcache/gplazma.conf.rpmsave
warning: /etc/dcache/dcachesrm-gplazma.policy saved as /etc/dcache/dcachesrm-gplazma.policy.rpmsave
warning: /etc/dcache/dcache.conf saved as /etc/dcache/dcache.conf.rpmsave
  Verifying  : dcache-2.2.29SNAPSHOT-1.noarch                               1/1 

Removed:
  dcache.noarch 0:2.2.29SNAPSHOT-1                                              

Complete!
[root@cmssrmdisk ~]# rpm -q dcache
package dcache is not installed
[root@cmssrmdisk ~]# exit
logout
Connection to cmssrmdisk closed.
[root@cmsadmin1 ~]# cd /root/natalia/Aug-26-2015
[root@cmsadmin1 Aug-26-2015]# pssh -h dcache-disk-servers.list -l root -t 0 -p 4 -o upgr_srvs.log -e upgr_srvs.err 'kernelversion=2.6.32-573.3.1.el6.x86_64; extra_rpm_check="dcache-2.2.29-1.noarch"; puppet agent --disable; service dcache-server stop; yum clean all; yum update -y; rpm -q kernel-${kernelversion} ${extra_rpm_check} && puppet agent --enable && reboot'
[1] 13:08:58 [FAILURE] cmssrmdisk.fnal.gov Exited with error code 1
[2] 13:09:09 [SUCCESS] cmschimeradiskbackup.fnal.gov
[3] 13:09:22 [SUCCESS] cmschimeradisk.fnal.gov
[4] 13:09:30 [SUCCESS] cmsdcacheadmindisk.fnal.gov
[root@cmsadmin1 Aug-26-2015]# ssh cmssrmdisk.fnal.gov
Last login: Wed Aug 26 13:05:47 2015 from cmsadmin1.fnal.gov

cmssrmdisk.fnal.gov - srmdisk/production (SLF 6.6)
32-core Opteron 6320 (H8QG6); 62.89 GB RAM, 16.00 GB swap
[root@cmssrmdisk ~]# srpm -q dcache 
-bash: srpm: command not found
[root@cmssrmdisk ~]# rpm -q dcache 
package dcache is not installed
[root@cmssrmdisk ~]# yum install dcache
Loaded plugins: priorities, security
Setting up Install Process
583 packages excluded due to repository priority protections
Resolving Dependencies
--> Running transaction check
---> Package dcache.noarch 0:2.2.29-1 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package          Arch             Version              Repository         Size
================================================================================
Installing:
 dcache           noarch           2.2.29-1             uscmst1            61 M

Transaction Summary
================================================================================
Install       1 Package(s)

Total download size: 61 M
Installed size: 68 M
Is this ok [y/N]: y
Downloading Packages:
dcache-2.2.29-1.noarch.rpm                               |  61 MB     00:00     
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing : dcache-2.2.29-1.noarch                                       1/1 
  Verifying  : dcache-2.2.29-1.noarch                                       1/1 

Installed:
  dcache.noarch 0:2.2.29-1                                                      

Complete!
[root@cmssrmdisk ~]# cat  dcache-disk-servers.list
cat: dcache-disk-servers.list: No such file or directory
[root@cmssrmdisk ~]# exit
logout
Connection to cmssrmdisk.fnal.gov closed.
[root@cmsadmin1 Aug-26-2015]# cat dcache-disk-servers.list
cmsdcacheadmindisk.fnal.gov
cmschimeradisk.fnal.gov
cmssrmdisk.fnal.gov
cmschimeradiskbackup.fnal.gov
[root@cmsadmin1 Aug-26-2015]# grep srm dcache-disk-servers.list > srmdisk.list
[root@cmsadmin1 Aug-26-2015]# pssh -h srmdisk.list -l root -t 0 -p 4 -o upgr_srvs.log -e upgr_srvs.err 'kernelversion=2.6.32-573.3.1.el6.x86_64; extra_rpm_check="dcache-2.2.29-1.noarch"; puppet agent --disable; service dcache-server stop; yum clean all; yum update -y; rpm -q kernel-${kernelversion} ${extra_rpm_check} && puppet agent --enable && reboot'
[1] 13:15:08 [SUCCESS] cmssrmdisk.fnal.gov
[root@cmsadmin1 Aug-26-2015]# 
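
The first run failed on cmssrmdisk, presumably because `yum update` does not install a package that is absent: dcache had just been erased there, so the `rpm -q` check failed until the package was installed by hand. A minimal sketch of a guard for that case; `ensure_installed` and the `RPM_QUERY` and `YUM_INSTALL` overrides are hypothetical and were not part of the procedure used here:

```shell
# Hypothetical helper: install a package only if rpm does not know it yet.
# Defaults are the real commands; override them for a dry run.
ensure_installed() {
    pkg=$1
    if ${RPM_QUERY:-rpm -q} "$pkg" >/dev/null 2>&1; then
        echo "$pkg already installed"
    else
        ${YUM_INSTALL:-yum install -y} "$pkg"
    fi
}
```

Running `ensure_installed dcache` on cmssrmdisk before the pssh upgrade would have covered the erased snapshot package.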

[root@cmsadmin1 Aug-26-2015]# cat check_srvs.log/cmssrmdisk.fnal.gov 
 13:28:48 up 10 min,  0 users,  load average: 2.21, 1.37, 0.67
Linux cmssrmdisk.fnal.gov 2.6.32-573.3.1.el6.x86_64 #1 SMP Thu Aug 13 12:55:33 CDT 2015 x86_64 x86_64 x86_64 GNU/Linux
DOMAIN                      STATUS  PID   USER 
nfsDomain.v3                running 9479  root 
dcap-cmssrmdiskDomain       running 9556  root 
authdcap-cmssrmdiskDomain   running 9622  root 
gsidcap-cmssrmdiskDomain    running 9702  root 
xrootd-cmssrmdiskDomain     running 9772  root 
xrootdLFNs-cmssrmdiskDomain running 9845  root 
utilityDomain               running 9915  root 
gPlazmaDomain               running 9990  root 
srm-cmssrmdiskDomain        running 10069 root 
transfermanagersDomain      running 10139 root 
2.2.29
[root@cmsadmin1 Aug-26-2015]# cat check_srvs.err/cmssrmdisk.fnal.gov 
[root@cmsadmin1 Aug-26-2015]# ls -l  check_srvs.err/
total 0
-rw-r--r-- 1 root root 0 Aug 26 13:18 cmschimeradiskbackup.fnal.gov
-rw-r--r-- 1 root root 0 Aug 26 13:18 cmschimeradisk.fnal.gov
-rw-r--r-- 1 root root 0 Aug 26 13:18 cmsdcacheadmindisk.fnal.gov
-rw-r--r-- 1 root root 0 Aug 26 13:28 cmssrmdisk.fnal.gov
[root@cmsadmin1 Aug-26-2015]# 
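
For the 191 pools, reading an `ls -l` listing for zero-size files does not scale well. A minimal sketch that lists only the per-host stderr files with content, assuming the directory layout written by `pssh -e`; `nonempty_err_files` is a hypothetical name:

```shell
# Hypothetical helper: list per-host stderr files that are not empty.
# $1 = directory given to pssh -e. Returns 1 if any such file exists.
nonempty_err_files() {
    found=$(find "$1" -type f -size +0 | sort)
    if [ -z "$found" ]; then
        return 0
    fi
    printf '%s\n' "$found"
    return 1
}
```

For example, `nonempty_err_files check_srvs.err` would print nothing for the listing above, since all four files are empty.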
[root@cmsadmin1 Aug-26-2015]# pssh -h dcache-disk-pools.list -l root -t 0 -p 50 -o upgr_pools.log -e upgr_pools.err 'kernelversion=2.6.32-573.3.1.el6.x86_64; extra_rpm_check="dcache-2.2.29-1.noarch"; puppet agent --disable; service dcache-server stop; yum clean all; yum update -y; rpm -q kernel-${kernelversion} ${extra_rpm_check} && puppet agent --enable && reboot'
[1] 13:32:05 [SUCCESS] cmsstor201.fnal.gov
[2] 13:32:07 [SUCCESS] cmsstor221.fnal.gov
[3] 13:32:10 [SUCCESS] cmsstor189.fnal.gov
[4] 13:32:11 [SUCCESS] cmsstor195.fnal.gov
[5] 13:32:11 [SUCCESS] cmsstor176.fnal.gov
[6] 13:32:12 [SUCCESS] cmsstor193.fnal.gov
[7] 13:32:12 [SUCCESS] cmsstor210.fnal.gov
[8] 13:32:13 [SUCCESS] cmsstor209.fnal.gov
[9] 13:32:13 [SUCCESS] cmsstor185.fnal.gov
[10] 13:32:13 [SUCCESS] cmsstor212.fnal.gov
[11] 13:32:13 [SUCCESS] cmsstor183.fnal.gov
[12] 13:32:13 [SUCCESS] cmsstor173.fnal.gov
[13] 13:32:14 [SUCCESS] cmsstor217.fnal.gov
[14] 13:32:14 [SUCCESS] cmsstor175.fnal.gov
[15] 13:32:14 [SUCCESS] cmsstor214.fnal.gov
[16] 13:32:14 [SUCCESS] cmsstor187.fnal.gov
[17] 13:32:15 [SUCCESS] cmsstor220.fnal.gov
[18] 13:32:15 [SUCCESS] cmsstor216.fnal.gov
[19] 13:32:15 [SUCCESS] cmsstor213.fnal.gov
[20] 13:32:15 [SUCCESS] cmsstor219.fnal.gov
[21] 13:32:15 [SUCCESS] cmsstor172.fnal.gov
[22] 13:32:15 [SUCCESS] cmsstor184.fnal.gov
[23] 13:32:15 [SUCCESS] cmsstor174.fnal.gov
[24] 13:32:15 [SUCCESS] cmsstor178.fnal.gov
[25] 13:32:15 [SUCCESS] cmsstor182.fnal.gov
[26] 13:32:16 [SUCCESS] cmsstor169.fnal.gov
[27] 13:32:16 [SUCCESS] cmsstor194.fnal.gov
[28] 13:32:16 [SUCCESS] cmsstor168.fnal.gov
[29] 13:32:16 [SUCCESS] cmsstor181.fnal.gov
[30] 13:32:17 [SUCCESS] cmsstor171.fnal.gov
[31] 13:32:17 [SUCCESS] cmsstor215.fnal.gov
[32] 13:32:17 [SUCCESS] cmsstor186.fnal.gov
[33] 13:32:17 [SUCCESS] cmsstor177.fnal.gov
[34] 13:32:18 [SUCCESS] cmsstor203.fnal.gov
[35] 13:32:18 [SUCCESS] cmsstor198.fnal.gov
[36] 13:32:21 [SUCCESS] cmsstor196.fnal.gov
[37] 13:32:24 [SUCCESS] cmsstor191.fnal.gov
[38] 13:32:25 [SUCCESS] cmsstor199.fnal.gov
[39] 13:32:25 [SUCCESS] cmsstor206.fnal.gov
[40] 13:32:26 [SUCCESS] cmsstor179.fnal.gov
[41] 13:32:26 [SUCCESS] cmsstor207.fnal.gov
[42] 13:32:26 [SUCCESS] cmsstor205.fnal.gov
[43] 13:32:30 [SUCCESS] cmsstor197.fnal.gov
[44] 13:32:30 [SUCCESS] cmsstor188.fnal.gov
[45] 13:32:42 [SUCCESS] cmsstor204.fnal.gov
[46] 13:32:51 [SUCCESS] cmsstor208.fnal.gov
[47] 13:33:01 [SUCCESS] cmsstor242.fnal.gov
[48] 13:33:02 [SUCCESS] cmsstor223.fnal.gov
[49] 13:33:04 [SUCCESS] cmsstor237.fnal.gov
[50] 13:33:04 [SUCCESS] cmsstor224.fnal.gov
[51] 13:33:04 [SUCCESS] cmsstor233.fnal.gov
[52] 13:33:04 [SUCCESS] cmsstor227.fnal.gov
[53] 13:33:05 [SUCCESS] cmsstor222.fnal.gov
[54] 13:33:06 [SUCCESS] cmsstor236.fnal.gov
[55] 13:33:07 [SUCCESS] cmsstor230.fnal.gov
[56] 13:33:07 [SUCCESS] cmsstor243.fnal.gov
[57] 13:33:07 [SUCCESS] cmsstor250.fnal.gov
[58] 13:33:07 [SUCCESS] cmsstor226.fnal.gov
[59] 13:33:08 [SUCCESS] cmsstor232.fnal.gov
[60] 13:33:08 [SUCCESS] cmsstor229.fnal.gov
[61] 13:33:08 [SUCCESS] cmsstor244.fnal.gov
[62] 13:33:08 [SUCCESS] cmsstor238.fnal.gov
[63] 13:33:08 [SUCCESS] cmsstor234.fnal.gov
[64] 13:33:09 [SUCCESS] cmsstor248.fnal.gov
[65] 13:33:09 [SUCCESS] cmsstor192.fnal.gov
[66] 13:33:09 [SUCCESS] cmsstor240.fnal.gov
[67] 13:33:09 [SUCCESS] cmsstor239.fnal.gov
[68] 13:33:10 [SUCCESS] cmsstor245.fnal.gov
[69] 13:33:10 [SUCCESS] cmsstor249.fnal.gov
[70] 13:33:10 [SUCCESS] cmsstor231.fnal.gov
[71] 13:33:11 [SUCCESS] cmsstor246.fnal.gov
[72] 13:33:13 [SUCCESS] cmsstor251.fnal.gov
[73] 13:33:14 [SUCCESS] cmsstor264.fnal.gov
[74] 13:33:15 [SUCCESS] cmsstor228.fnal.gov
[75] 13:33:16 [SUCCESS] cmsstor247.fnal.gov
[76] 13:33:16 [SUCCESS] cmsstor261.fnal.gov
[77] 13:33:19 [SUCCESS] cmsstor235.fnal.gov
[78] 13:33:19 [SUCCESS] cmsstor211.fnal.gov
[79] 13:33:22 [SUCCESS] cmsstor266.fnal.gov
[80] 13:33:22 [SUCCESS] cmsstor265.fnal.gov
[81] 13:33:23 [SUCCESS] cmsstor272.fnal.gov
[82] 13:33:26 [SUCCESS] cmsstor270.fnal.gov
[83] 13:33:29 [SUCCESS] cmsstor273.fnal.gov
[84] 13:33:29 [SUCCESS] cmsstor267.fnal.gov
[85] 13:33:29 [SUCCESS] cmsstor274.fnal.gov
[86] 13:33:29 [SUCCESS] cmsstor225.fnal.gov
[87] 13:33:31 [SUCCESS] cmsstor241.fnal.gov
[88] 13:33:41 [SUCCESS] cmsstor275.fnal.gov
[89] 13:33:41 [SUCCESS] cmsstor263.fnal.gov
[90] 13:33:46 [SUCCESS] cmsstor262.fnal.gov
[91] 13:33:49 [SUCCESS] cmsstor269.fnal.gov
[92] 13:33:49 [SUCCESS] cmsstor202.fnal.gov
[93] 13:33:54 [SUCCESS] cmsstor271.fnal.gov
[94] 13:33:58 [SUCCESS] cmsstor268.fnal.gov
[95] 13:33:58 [SUCCESS] cmsstor315.fnal.gov
[96] 13:33:58 [SUCCESS] cmsstor309.fnal.gov
[97] 13:34:01 [SUCCESS] cmsstor312.fnal.gov
[98] 13:34:01 [SUCCESS] cmsstor313.fnal.gov
[99] 13:34:02 [SUCCESS] cmsstor314.fnal.gov
[100] 13:34:03 [SUCCESS] cmsstor311.fnal.gov
[101] 13:34:04 [SUCCESS] cmsstor286.fnal.gov
[102] 13:34:05 [SUCCESS] cmsstor317.fnal.gov
[103] 13:34:05 [SUCCESS] cmsstor319.fnal.gov
[104] 13:34:05 [SUCCESS] cmsstor281.fnal.gov
[105] 13:34:06 [SUCCESS] cmsstor285.fnal.gov
[106] 13:34:06 [SUCCESS] cmsstor283.fnal.gov
[107] 13:34:06 [SUCCESS] cmsstor320.fnal.gov
[108] 13:34:06 [SUCCESS] cmsstor289.fnal.gov
[109] 13:34:06 [SUCCESS] cmsstor310.fnal.gov
[110] 13:34:07 [SUCCESS] cmsstor316.fnal.gov
[111] 13:34:08 [SUCCESS] cmsstor321.fnal.gov
[112] 13:34:08 [SUCCESS] cmsstor278.fnal.gov
[113] 13:34:08 [SUCCESS] cmsstor318.fnal.gov
[114] 13:34:09 [SUCCESS] cmsstor279.fnal.gov
[115] 13:34:09 [SUCCESS] cmsstor322.fnal.gov
[116] 13:34:11 [SUCCESS] cmsstor280.fnal.gov
[117] 13:34:17 [SUCCESS] cmsstor329.fnal.gov
[118] 13:34:18 [SUCCESS] cmsstor276.fnal.gov
[119] 13:34:22 [SUCCESS] cmsstor323.fnal.gov
[120] 13:34:27 [SUCCESS] cmsstor325.fnal.gov
[121] 13:34:29 [SUCCESS] cmsstor277.fnal.gov
[122] 13:34:30 [SUCCESS] cmsstor331.fnal.gov
[123] 13:34:32 [SUCCESS] cmsstor328.fnal.gov
[124] 13:34:34 [SUCCESS] cmsstor294.fnal.gov
[125] 13:34:34 [SUCCESS] cmsstor291.fnal.gov
[126] 13:34:36 [SUCCESS] cmsstor287.fnal.gov
[127] 13:34:36 [SUCCESS] cmsstor218.fnal.gov
[128] 13:34:36 [SUCCESS] cmsstor290.fnal.gov
[129] 13:34:36 [SUCCESS] cmsstor293.fnal.gov
[130] 13:34:36 [SUCCESS] cmsstor284.fnal.gov
[131] 13:34:37 [SUCCESS] cmsstor282.fnal.gov
[132] 13:34:38 [SUCCESS] cmsstor288.fnal.gov
[133] 13:34:40 [SUCCESS] cmsstor292.fnal.gov
[134] 13:34:51 [SUCCESS] cmsstor324.fnal.gov
[135] 13:34:51 [SUCCESS] cmsstor336.fnal.gov
[136] 13:34:53 [SUCCESS] cmsstor337.fnal.gov
[137] 13:34:55 [SUCCESS] cmsstor326.fnal.gov
[138] 13:34:57 [SUCCESS] cmsstor335.fnal.gov
[139] 13:34:58 [SUCCESS] cmsstor338.fnal.gov
[140] 13:35:01 [SUCCESS] cmsstor327.fnal.gov
[141] 13:35:02 [SUCCESS] cmsstor343.fnal.gov
[142] 13:35:03 [SUCCESS] cmsstor330.fnal.gov
[143] 13:35:05 [SUCCESS] cmsstor350.fnal.gov
[144] 13:35:08 [SUCCESS] cmsstor351.fnal.gov
[145] 13:35:09 [SUCCESS] cmsstor359.fnal.gov
[146] 13:35:10 [SUCCESS] cmsstor353.fnal.gov
[147] 13:35:10 [SUCCESS] cmsstor332.fnal.gov
[148] 13:35:12 [SUCCESS] cmsstor345.fnal.gov
[149] 13:35:14 [SUCCESS] cmsstor349.fnal.gov
[150] 13:35:16 [SUCCESS] cmsstor334.fnal.gov
[151] 13:35:17 [SUCCESS] cmsstor333.fnal.gov
[152] 13:35:23 [SUCCESS] cmsstor362.fnal.gov
[153] 13:35:23 [SUCCESS] cmsstor339.fnal.gov
[154] 13:35:26 [SUCCESS] cmsstor370.fnal.gov
[155] 13:35:27 [SUCCESS] cmsstor342.fnal.gov
[156] 13:35:28 [SUCCESS] cmsstor341.fnal.gov
[157] 13:35:29 [SUCCESS] cmsstor369.fnal.gov
[158] 13:35:31 [SUCCESS] cmsstor364.fnal.gov
[159] 13:35:31 [SUCCESS] cmsstor340.fnal.gov
[160] 13:35:31 [SUCCESS] cmsstor401.fnal.gov
[161] 13:35:32 [SUCCESS] cmsstor346.fnal.gov
[162] 13:35:32 [SUCCESS] cmsstor365.fnal.gov
[163] 13:35:33 [SUCCESS] cmsstor402.fnal.gov
[164] 13:35:33 [SUCCESS] cmsstor348.fnal.gov
[165] 13:35:33 [SUCCESS] cmsstor355.fnal.gov
[166] 13:35:34 [SUCCESS] cmsstor352.fnal.gov
[167] 13:35:35 [SUCCESS] cmsstor356.fnal.gov
[168] 13:35:36 [SUCCESS] cmsstor368.fnal.gov
[169] 13:35:36 [SUCCESS] cmsstor374.fnal.gov
[170] 13:35:37 [SUCCESS] cmsstor347.fnal.gov
[171] 13:35:37 [SUCCESS] cmsstor357.fnal.gov
[172] 13:35:39 [SUCCESS] cmsstor360.fnal.gov
[173] 13:35:40 [SUCCESS] cmsstor354.fnal.gov
[174] 13:35:42 [SUCCESS] cmsstor404.fnal.gov
[175] 13:35:43 [SUCCESS] cmsstor403.fnal.gov
[176] 13:35:43 [SUCCESS] cmsstor344.fnal.gov
[177] 13:35:43 [SUCCESS] cmsstor405.fnal.gov
[178] 13:35:43 [SUCCESS] cmsstor358.fnal.gov
[179] 13:35:44 [SUCCESS] cmsstor406.fnal.gov
[180] 13:35:45 [SUCCESS] cmsstor361.fnal.gov
[181] 13:35:45 [SUCCESS] cmsstor372.fnal.gov
[182] 13:35:48 [SUCCESS] cmsstor407.fnal.gov
[183] 13:35:49 [SUCCESS] cmsstor410.fnal.gov
[184] 13:35:49 [SUCCESS] cmsstor363.fnal.gov
[185] 13:35:50 [SUCCESS] cmsstor408.fnal.gov
[186] 13:35:51 [SUCCESS] cmsstor409.fnal.gov
[187] 13:35:53 [SUCCESS] cmsstor366.fnal.gov
[188] 13:36:04 [SUCCESS] cmsstor375.fnal.gov
[189] 13:36:06 [SUCCESS] cmsstor373.fnal.gov
[190] 13:36:07 [SUCCESS] cmsstor376.fnal.gov
[191] 13:36:07 [SUCCESS] cmsstor371.fnal.gov
[root@cmsadmin1 Aug-26-2015]# pssh -h dcache-disk-pools.list -l root -t 60 -p 50 -o check_pools.log -e check_pools.err 'kernelversion=2.6.32-573.3.1.el6.x86_64; uptime; uname -a | grep $kernelversion && ((service dcache-server status | grep -v DOMAIN | grep -v running ) || service dcache-server status); dcache version;'
[1] 13:40:50 [FAILURE] cmsstor169.fnal.gov Exited with error code 255
[2] 13:40:51 [SUCCESS] cmsstor178.fnal.gov
[3] 13:40:51 [SUCCESS] cmsstor171.fnal.gov
[4] 13:40:51 [SUCCESS] cmsstor172.fnal.gov
[5] 13:40:51 [SUCCESS] cmsstor173.fnal.gov
[6] 13:40:51 [SUCCESS] cmsstor168.fnal.gov
[7] 13:40:51 [SUCCESS] cmsstor194.fnal.gov
[8] 13:40:51 [SUCCESS] cmsstor174.fnal.gov
[9] 13:40:51 [SUCCESS] cmsstor193.fnal.gov
[10] 13:40:51 [SUCCESS] cmsstor175.fnal.gov
[11] 13:40:51 [SUCCESS] cmsstor179.fnal.gov
[12] 13:40:51 [SUCCESS] cmsstor187.fnal.gov
[13] 13:40:51 [SUCCESS] cmsstor181.fnal.gov
[14] 13:40:51 [SUCCESS] cmsstor177.fnal.gov
[15] 13:40:51 [SUCCESS] cmsstor182.fnal.gov
[16] 13:40:51 [SUCCESS] cmsstor189.fnal.gov
[17] 13:40:51 [SUCCESS] cmsstor183.fnal.gov
[18] 13:40:52 [SUCCESS] cmsstor186.fnal.gov
[19] 13:40:52 [SUCCESS] cmsstor176.fnal.gov
[20] 13:40:52 [SUCCESS] cmsstor221.fnal.gov
[21] 13:40:52 [SUCCESS] cmsstor206.fnal.gov
[22] 13:40:52 [SUCCESS] cmsstor197.fnal.gov
[23] 13:40:52 [SUCCESS] cmsstor198.fnal.gov
[24] 13:40:52 [SUCCESS] cmsstor203.fnal.gov
[25] 13:40:52 [SUCCESS] cmsstor196.fnal.gov
[26] 13:40:52 [SUCCESS] cmsstor209.fnal.gov
[27] 13:40:52 [SUCCESS] cmsstor191.fnal.gov
[28] 13:40:52 [SUCCESS] cmsstor185.fnal.gov
[29] 13:40:52 [SUCCESS] cmsstor199.fnal.gov
[30] 13:40:52 [SUCCESS] cmsstor201.fnal.gov
[31] 13:40:52 [SUCCESS] cmsstor216.fnal.gov
[32] 13:40:52 [SUCCESS] cmsstor188.fnal.gov
[33] 13:40:52 [SUCCESS] cmsstor205.fnal.gov
[34] 13:40:52 [SUCCESS] cmsstor215.fnal.gov
[35] 13:40:52 [SUCCESS] cmsstor217.fnal.gov
[36] 13:40:52 [SUCCESS] cmsstor219.fnal.gov
[37] 13:40:52 [SUCCESS] cmsstor214.fnal.gov
[38] 13:40:52 [SUCCESS] cmsstor212.fnal.gov
[39] 13:40:52 [SUCCESS] cmsstor195.fnal.gov
[40] 13:40:52 [SUCCESS] cmsstor208.fnal.gov
[41] 13:40:52 [SUCCESS] cmsstor211.fnal.gov
[42] 13:40:52 [SUCCESS] cmsstor220.fnal.gov
[43] 13:40:52 [SUCCESS] cmsstor207.fnal.gov
[44] 13:40:52 [SUCCESS] cmsstor213.fnal.gov
[45] 13:40:52 [SUCCESS] cmsstor210.fnal.gov
[46] 13:40:52 [SUCCESS] cmsstor204.fnal.gov
[47] 13:40:52 [SUCCESS] cmsstor192.fnal.gov
[48] 13:40:52 [SUCCESS] cmsstor184.fnal.gov
[49] 13:40:52 [SUCCESS] cmsstor218.fnal.gov
[50] 13:40:52 [SUCCESS] cmsstor202.fnal.gov
[51] 13:40:55 [SUCCESS] cmsstor222.fnal.gov
[52] 13:40:56 [SUCCESS] cmsstor223.fnal.gov
[53] 13:40:56 [SUCCESS] cmsstor225.fnal.gov
[54] 13:40:56 [SUCCESS] cmsstor242.fnal.gov
[55] 13:40:56 [SUCCESS] cmsstor226.fnal.gov
[56] 13:40:56 [SUCCESS] cmsstor224.fnal.gov
[57] 13:40:56 [SUCCESS] cmsstor230.fnal.gov
[58] 13:40:56 [SUCCESS] cmsstor240.fnal.gov
[59] 13:40:56 [SUCCESS] cmsstor232.fnal.gov
[60] 13:40:56 [SUCCESS] cmsstor234.fnal.gov
[61] 13:40:56 [SUCCESS] cmsstor231.fnal.gov
[62] 13:40:56 [SUCCESS] cmsstor248.fnal.gov
[63] 13:40:56 [SUCCESS] cmsstor238.fnal.gov
[64] 13:40:56 [SUCCESS] cmsstor229.fnal.gov
[65] 13:40:56 [SUCCESS] cmsstor235.fnal.gov
[66] 13:40:56 [SUCCESS] cmsstor244.fnal.gov
[67] 13:40:56 [SUCCESS] cmsstor246.fnal.gov
[68] 13:40:56 [SUCCESS] cmsstor243.fnal.gov
[69] 13:40:56 [SUCCESS] cmsstor251.fnal.gov
[70] 13:40:56 [SUCCESS] cmsstor249.fnal.gov
[71] 13:40:56 [SUCCESS] cmsstor236.fnal.gov
[72] 13:40:56 [SUCCESS] cmsstor245.fnal.gov
[73] 13:40:56 [SUCCESS] cmsstor233.fnal.gov
[74] 13:40:56 [SUCCESS] cmsstor241.fnal.gov
[75] 13:40:56 [SUCCESS] cmsstor247.fnal.gov
[76] 13:40:56 [SUCCESS] cmsstor239.fnal.gov
[77] 13:40:56 [SUCCESS] cmsstor237.fnal.gov
[78] 13:40:56 [SUCCESS] cmsstor250.fnal.gov
[79] 13:40:56 [SUCCESS] cmsstor228.fnal.gov
[80] 13:40:57 [SUCCESS] cmsstor227.fnal.gov
[81] 13:41:00 [SUCCESS] cmsstor262.fnal.gov
[82] 13:41:00 [SUCCESS] cmsstor273.fnal.gov
[83] 13:41:00 [SUCCESS] cmsstor269.fnal.gov
[84] 13:41:00 [SUCCESS] cmsstor265.fnal.gov
[85] 13:41:00 [SUCCESS] cmsstor261.fnal.gov
[86] 13:41:00 [SUCCESS] cmsstor270.fnal.gov
[87] 13:41:00 [SUCCESS] cmsstor276.fnal.gov
[88] 13:41:00 [SUCCESS] cmsstor263.fnal.gov
[89] 13:41:00 [SUCCESS] cmsstor275.fnal.gov
[90] 13:41:00 [SUCCESS] cmsstor268.fnal.gov
[91] 13:41:00 [SUCCESS] cmsstor271.fnal.gov
[92] 13:41:00 [SUCCESS] cmsstor267.fnal.gov
[93] 13:41:00 [SUCCESS] cmsstor274.fnal.gov
[94] 13:41:00 [SUCCESS] cmsstor272.fnal.gov
[95] 13:41:00 [SUCCESS] cmsstor266.fnal.gov
[96] 13:41:00 [SUCCESS] cmsstor277.fnal.gov
[97] 13:41:00 [SUCCESS] cmsstor278.fnal.gov
[98] 13:41:00 [SUCCESS] cmsstor280.fnal.gov
[99] 13:41:00 [SUCCESS] cmsstor279.fnal.gov
[100] 13:41:01 [SUCCESS] cmsstor311.fnal.gov
[101] 13:41:01 [SUCCESS] cmsstor309.fnal.gov
[102] 13:41:01 [SUCCESS] cmsstor264.fnal.gov
[103] 13:41:01 [SUCCESS] cmsstor317.fnal.gov
[104] 13:41:01 [SUCCESS] cmsstor313.fnal.gov
[105] 13:41:01 [SUCCESS] cmsstor310.fnal.gov
[106] 13:41:01 [SUCCESS] cmsstor319.fnal.gov
[107] 13:41:01 [SUCCESS] cmsstor315.fnal.gov
[108] 13:41:01 [SUCCESS] cmsstor312.fnal.gov
[109] 13:41:01 [SUCCESS] cmsstor314.fnal.gov
[110] 13:41:01 [SUCCESS] cmsstor318.fnal.gov
[111] 13:41:01 [SUCCESS] cmsstor320.fnal.gov
[112] 13:41:01 [SUCCESS] cmsstor316.fnal.gov
[113] 13:41:01 [SUCCESS] cmsstor321.fnal.gov
[114] 13:41:01 [SUCCESS] cmsstor322.fnal.gov
[115] 13:41:03 [SUCCESS] cmsstor281.fnal.gov
[116] 13:41:04 [SUCCESS] cmsstor283.fnal.gov
[117] 13:41:04 [SUCCESS] cmsstor285.fnal.gov
[118] 13:41:04 [SUCCESS] cmsstor284.fnal.gov
[119] 13:41:04 [SUCCESS] cmsstor288.fnal.gov
[120] 13:41:04 [SUCCESS] cmsstor290.fnal.gov
[121] 13:41:04 [SUCCESS] cmsstor289.fnal.gov
[122] 13:41:04 [SUCCESS] cmsstor286.fnal.gov
[123] 13:41:04 [SUCCESS] cmsstor282.fnal.gov
[124] 13:41:04 [SUCCESS] cmsstor292.fnal.gov
[125] 13:41:04 [SUCCESS] cmsstor287.fnal.gov
[126] 13:41:04 [SUCCESS] cmsstor294.fnal.gov
[127] 13:41:04 [SUCCESS] cmsstor291.fnal.gov
[128] 13:41:05 [SUCCESS] cmsstor293.fnal.gov
[129] 13:41:05 [SUCCESS] cmsstor323.fnal.gov
[130] 13:41:05 [SUCCESS] cmsstor324.fnal.gov
[131] 13:41:07 [SUCCESS] cmsstor329.fnal.gov
[132] 13:41:08 [SUCCESS] cmsstor326.fnal.gov
[133] 13:41:08 [SUCCESS] cmsstor327.fnal.gov
[134] 13:41:08 [SUCCESS] cmsstor325.fnal.gov
[135] 13:41:08 [SUCCESS] cmsstor337.fnal.gov
[136] 13:41:08 [SUCCESS] cmsstor333.fnal.gov
[137] 13:41:08 [SUCCESS] cmsstor332.fnal.gov
[138] 13:41:08 [SUCCESS] cmsstor328.fnal.gov
[139] 13:41:08 [SUCCESS] cmsstor338.fnal.gov
[140] 13:41:08 [SUCCESS] cmsstor331.fnal.gov
[141] 13:41:08 [SUCCESS] cmsstor336.fnal.gov
[142] 13:41:08 [SUCCESS] cmsstor330.fnal.gov
[143] 13:41:08 [SUCCESS] cmsstor334.fnal.gov
[144] 13:41:08 [SUCCESS] cmsstor339.fnal.gov
[145] 13:41:08 [SUCCESS] cmsstor335.fnal.gov
[146] 13:41:08 [SUCCESS] cmsstor340.fnal.gov
[147] 13:41:08 [SUCCESS] cmsstor341.fnal.gov
[148] 13:41:09 [SUCCESS] cmsstor342.fnal.gov
[149] 13:41:09 [SUCCESS] cmsstor343.fnal.gov
[150] 13:41:09 [SUCCESS] cmsstor345.fnal.gov
[151] 13:41:09 [SUCCESS] cmsstor344.fnal.gov
[152] 13:41:09 [SUCCESS] cmsstor346.fnal.gov
[153] 13:41:09 [SUCCESS] cmsstor349.fnal.gov
[154] 13:41:09 [SUCCESS] cmsstor350.fnal.gov
[155] 13:41:09 [SUCCESS] cmsstor348.fnal.gov
[156] 13:41:09 [SUCCESS] cmsstor369.fnal.gov
[157] 13:41:09 [SUCCESS] cmsstor354.fnal.gov
[158] 13:41:09 [SUCCESS] cmsstor351.fnal.gov
[159] 13:41:09 [SUCCESS] cmsstor352.fnal.gov
[160] 13:41:09 [SUCCESS] cmsstor353.fnal.gov
[161] 13:41:09 [SUCCESS] cmsstor347.fnal.gov
[162] 13:41:09 [SUCCESS] cmsstor370.fnal.gov
[163] 13:41:09 [SUCCESS] cmsstor356.fnal.gov
[164] 13:41:09 [SUCCESS] cmsstor355.fnal.gov
[165] 13:41:09 [SUCCESS] cmsstor357.fnal.gov
[166] 13:41:10 [SUCCESS] cmsstor358.fnal.gov
[167] 13:41:11 [SUCCESS] cmsstor359.fnal.gov
[168] 13:41:12 [SUCCESS] cmsstor368.fnal.gov
[169] 13:41:12 [SUCCESS] cmsstor364.fnal.gov
[170] 13:41:12 [SUCCESS] cmsstor362.fnal.gov
[171] 13:41:12 [SUCCESS] cmsstor363.fnal.gov
[172] 13:41:12 [SUCCESS] cmsstor360.fnal.gov
[173] 13:41:12 [SUCCESS] cmsstor361.fnal.gov
[174] 13:41:13 [SUCCESS] cmsstor365.fnal.gov
[175] 13:41:13 [SUCCESS] cmsstor372.fnal.gov
[176] 13:41:13 [SUCCESS] cmsstor366.fnal.gov
[177] 13:41:13 [SUCCESS] cmsstor371.fnal.gov
[178] 13:41:13 [SUCCESS] cmsstor373.fnal.gov
[179] 13:41:13 [SUCCESS] cmsstor374.fnal.gov
[180] 13:41:14 [SUCCESS] cmsstor375.fnal.gov
[181] 13:41:14 [SUCCESS] cmsstor402.fnal.gov
[182] 13:41:14 [SUCCESS] cmsstor401.fnal.gov
[183] 13:41:14 [SUCCESS] cmsstor404.fnal.gov
[184] 13:41:14 [SUCCESS] cmsstor410.fnal.gov
[185] 13:41:14 [SUCCESS] cmsstor406.fnal.gov
[186] 13:41:14 [SUCCESS] cmsstor409.fnal.gov
[187] 13:41:15 [SUCCESS] cmsstor403.fnal.gov
[188] 13:41:15 [SUCCESS] cmsstor408.fnal.gov
[189] 13:41:15 [SUCCESS] cmsstor405.fnal.gov
[190] 13:41:15 [SUCCESS] cmsstor407.fnal.gov
[191] 13:41:15 [SUCCESS] cmsstor376.fnal.gov
[root@cmsadmin1 Aug-26-2015]# 
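
With 191 progress lines, a single [FAILURE] entry is easy to overlook. A minimal sketch for pulling the failed hosts out of saved pssh progress output, assuming the output was captured to a file; the function name failed_hosts and the file name check_pools.out are hypothetical and not part of the original session:

```shell
# failed_hosts [FILE...]: print the hostname from every pssh progress line
# marked [FAILURE]. Reads standard input when no file is given.
# Expected line shape (as in the output above):
#   [1] 13:40:50 [FAILURE] cmsstor169.fnal.gov Exited with error code 255
failed_hosts() {
    # Field 3 is the status tag, field 4 is the hostname.
    awk '$3 == "[FAILURE]" { print $4 }' "$@"
}

# Hypothetical usage, assuming the pssh output was saved with "| tee check_pools.out":
#   failed_hosts check_pools.out
```

For the run above this would be expected to report only cmsstor169.fnal.gov, the one host listed with error code 255.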

[root@cmsadmin1 Aug-26-2015]# ls -l check_pools.err/ | grep -v "0 Aug" 
total 40
-rw-r--r-- 1 root root 68 Aug 26 13:40 cmsstor169.fnal.gov
-rw-r--r-- 1 root root 76 Aug 26 13:40 cmsstor214.fnal.gov
-rw-r--r-- 1 root root 76 Aug 26 13:40 cmsstor217.fnal.gov
-rw-r--r-- 1 root root 76 Aug 26 13:40 cmsstor227.fnal.gov
-rw-r--r-- 1 root root 76 Aug 26 13:40 cmsstor245.fnal.gov
-rw-r--r-- 1 root root 76 Aug 26 13:40 cmsstor274.fnal.gov
-rw-r--r-- 1 root root 76 Aug 26 13:40 cmsstor284.fnal.gov
-rw-r--r-- 1 root root 76 Aug 26 13:40 cmsstor318.fnal.gov
-rw-r--r-- 1 root root 76 Aug 26 13:40 cmsstor321.fnal.gov
-rw-r--r-- 1 root root 76 Aug 26 13:41 cmsstor347.fnal.gov
[root@cmsadmin1 Aug-26-2015]# ls -l check_pools.err/ | grep -v "0 Aug" > non-zero-error.pools
[root@cmsadmin1 Aug-26-2015]# ls -l check_pools.err/ | grep -v "0 Aug" | awk '{print $NF}'> non-zero-error.pools 
[root@cmsadmin1 Aug-26-2015]# cat check_pools.err/cmsstor227.fnal.gov 
Warning: No xauth data; using fake authentication data for X11 forwarding.
[root@cmsadmin1 Aug-26-2015]# cat check_pools.err/cmsstor169.fnal.gov 
ssh: connect to host cmsstor169.fnal.gov port 22: No route to host
[root@cmsadmin1 Aug-26-2015]# for f in check_pools.log/*; do tail -1 $f; done | sort -u 
2.2.29
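
The `grep -v "0 Aug"` filter above depends on the `ls -l` column layout, and most of the files it finds contain only the X11 forwarding warning. A possible alternative, sketched under the assumption that this warning is the only harmless message; the function name real_errors is hypothetical:

```shell
# real_errors DIR: list the hosts whose stderr file in DIR is non-empty and
# contains at least one line other than the harmless X11 forwarding warning.
real_errors() {
    dir=$1
    # -size +0 selects non-empty files regardless of how ls would format them.
    find "$dir" -type f -size +0 | sort | while read -r f; do
        # grep -v selects the lines that are NOT the warning; -q only sets the
        # exit status, which is 0 when at least one such line exists.
        if grep -qv 'No xauth data; using fake authentication data for X11 forwarding' "$f"; then
            basename "$f"
        fi
    done
}

# Hypothetical usage against the directory written by pssh -e:
#   real_errors check_pools.err
```

Given the two files shown above, this would list cmsstor169.fnal.gov (ssh could not connect) and skip cmsstor227.fnal.gov (warning only).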

#10 Updated by Natalia Ratnikova over 4 years ago

Power-cycled cmsstor169, which came up in maintenance mode after the reboot.

#11 Updated by Natalia Ratnikova over 4 years ago

Started agents. Sent email to cms-t1.
Clearing any remaining alarms in check_mk for the dcache disk group.

Natalia Ratnikova:
I see CRIT errors in check_mk for some pools:
CRIT - 32 CRIT messages (Last worst: "Aug 26 13:36:53 cmsstor261 kernel: ACPI Error: No handler for Region [SACS] (ffff8810252c3420) [PCI_Config] (20090903/evregion-331)")
and some warnings about wrong eth speed
Gerard Bernabeu:
that’s a ‘discardable’ error that this thing complains about after each reboot... You can clean them all
the network thing needs some more attention
I’m fixing the same on some tape pools by doing ifdown; ifup

#12 Updated by Natalia Ratnikova over 4 years ago

Tested the fix suggested by Gerard on cmsstor168 (ifdown and ifup need the bond0 interface as an argument).
Reran the check to make sure the alarm is gone.

Got the list of pool servers with warnings from check_mk, manually:

root@cmsadmin1:/root/natalia/Aug-26-2015/wrong_speed.list

Construct pssh for those pools

[root@cmsadmin1 Aug-26-2015]# pssh -h wrong_speed.list -l root -t0 -p 12 -o fix_speed.log -e fix_speed.err 'ifdown bond0; ifup bond0'
[1] 16:23:46 [SUCCESS] cmsstor168.fnal.gov
[2] 16:23:47 [SUCCESS] cmsstor182.fnal.gov
[3] 16:23:47 [SUCCESS] cmsstor210.fnal.gov
[4] 16:23:47 [SUCCESS] cmsstor234.fnal.gov
[5] 16:23:47 [SUCCESS] cmsstor208.fnal.gov
[6] 16:23:47 [SUCCESS] cmsstor220.fnal.gov
[7] 16:23:48 [SUCCESS] cmsstor223.fnal.gov
[8] 16:23:48 [SUCCESS] cmsstor243.fnal.gov
[9] 16:23:48 [SUCCESS] cmsstor226.fnal.gov
[10] 16:23:48 [SUCCESS] cmsstor247.fnal.gov
[11] 16:23:48 [SUCCESS] cmsstor230.fnal.gov
[12] 16:23:49 [SUCCESS] cmsstor207.fnal.gov
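
A [SUCCESS] here only means that `ifdown bond0; ifup bond0` exited cleanly. One way to confirm the link speed afterwards without waiting for check_mk is to read the value the Linux kernel exposes in sysfs. This is a sketch, not what was run in this ticket; the function names and the expected speed passed to speed_ok are assumptions:

```shell
# link_speed IFACE [SYSFS_ROOT]: print the speed in Mb/s that the kernel
# reports for IFACE. SYSFS_ROOT defaults to /sys/class/net.
link_speed() {
    cat "${2:-/sys/class/net}/$1/speed"
}

# speed_ok IFACE EXPECTED_MBPS [SYSFS_ROOT]: succeed only when the reported
# speed equals EXPECTED_MBPS. The expected value is site-specific.
speed_ok() {
    [ "$(link_speed "$1" "$3" 2>/dev/null)" -eq "$2" ] 2>/dev/null
}

# Hypothetical usage on one pool node (20000 is a placeholder value):
#   speed_ok bond0 20000 || echo "bond0 speed is still wrong"
```

To check the same set of hosts at once, the file could be read remotely with something like `pssh -h wrong_speed.list -l root -i 'cat /sys/class/net/bond0/speed'`.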

#13 Updated by Natalia Ratnikova over 4 years ago

  • Status changed from Assigned to Resolved

