Task #9895
CMS-T1 downtime Aug 26th for dCache upgrade.
Description
Downtime starts at 1pm and lasts until 4pm.
Upgrade to dCache 2.2.29 - security patch
Move to openjdk java
yum update and reboot everything to get new kernel.
History
#1 Updated by Natalia Ratnikova over 5 years ago
Plan:
Declare downtime in check_mk.
1. Add new dcache rpm to uscmst1 repo.
2. Stop phedex agents
3. Stop dcache service on cmssrmdisk.
4. Uninstall snapshot version.
5. Proceed with upgrade, reboot, and checks; see Gerard's commands
6. Fix any remaining issues
7. Start phedex agents
8. Check phedex download logs
9. Declare the end of downtime
#2 Updated by Natalia Ratnikova over 5 years ago
Downtime declared from 1pm to 3pm on 08-26-2015, ending one hour before the official end, so we see any alarms coming.
#3 Updated by Natalia Ratnikova over 5 years ago
on cmsadmin1:
[root@cmsadmin1 Aug-26-2015]# pwd
/root/natalia/Aug-26-2015
[root@cmsadmin1 Aug-26-2015]# wc -l *
191 dcache-disk-pools.list
4 dcache-disk-servers.list
195 total
[root@cmsadmin1 Aug-26-2015]#
#4 Updated by Natalia Ratnikova over 5 years ago
Preparation step: run yum update in advance on all nodes, so the new kernel rpm is already installed
on root@cmsadmin1
cd /root/natalia/Aug-26-2015
[root@cmsadmin1 Aug-26-2015]# pssh -h dcache-disk-servers.list -l root -t 0 -p 50 -o yum_update_srvs.log -e yum_update_srvs.err 'kernelversion=2.6.32-573.3.1.el6.x86_64; yum update -y; rpm -q kernel-${kernelversion}'
[1] 17:57:01 [SUCCESS] cmssrmdisk.fnal.gov
[2] 17:57:02 [SUCCESS] cmschimeradiskbackup.fnal.gov
[3] 17:57:03 [SUCCESS] cmsdcacheadmindisk.fnal.gov
[4] 17:57:10 [SUCCESS] cmschimeradisk.fnal.gov
[root@cmsadmin1 Aug-26-2015]#
[root@cmsadmin1 Aug-26-2015]# time pssh -h dcache-disk-pools.list -l root -t 0 -p 50 -o yum_update_pools.log -e yum_update_pools.err 'kernelversion=2.6.32-573.3.1.el6.x86_64; yum update -y; rpm -q kernel-${kernelversion}'
[1] 17:59:43 [SUCCESS] cmsstor221.fnal.gov
[2] 17:59:52 [SUCCESS] cmsstor201.fnal.gov
[3] 17:59:55 [SUCCESS] cmsstor185.fnal.gov
[4] 17:59:58 [SUCCESS] cmsstor181.fnal.gov
[5] 17:59:58 [SUCCESS] cmsstor216.fnal.gov
[6] 17:59:58 [SUCCESS] cmsstor195.fnal.gov
[7] 17:59:58 [SUCCESS] cmsstor175.fnal.gov
[8] 17:59:58 [SUCCESS] cmsstor203.fnal.gov
[9] 17:59:58 [SUCCESS] cmsstor182.fnal.gov
[10] 17:59:59 [SUCCESS] cmsstor172.fnal.gov
[11] 17:59:59 [SUCCESS] cmsstor177.fnal.gov
[12] 17:59:59 [SUCCESS] cmsstor220.fnal.gov
[13] 17:59:59 [SUCCESS] cmsstor193.fnal.gov
[14] 18:00:00 [SUCCESS] cmsstor173.fnal.gov
[15] 18:00:00 [SUCCESS] cmsstor217.fnal.gov
[16] 18:00:00 [SUCCESS] cmsstor171.fnal.gov
[17] 18:00:00 [SUCCESS] cmsstor183.fnal.gov
[18] 18:00:00 [SUCCESS] cmsstor174.fnal.gov
[19] 18:00:00 [SUCCESS] cmsstor178.fnal.gov
[20] 18:00:00 [SUCCESS] cmsstor209.fnal.gov
[21] 18:00:00 [SUCCESS] cmsstor196.fnal.gov
[22] 18:00:00 [SUCCESS] cmsstor168.fnal.gov
[23] 18:00:01 [SUCCESS] cmsstor176.fnal.gov
[24] 18:00:01 [SUCCESS] cmsstor219.fnal.gov
[25] 18:00:01 [SUCCESS] cmsstor189.fnal.gov
[26] 18:00:01 [SUCCESS] cmsstor214.fnal.gov
[27] 18:00:01 [SUCCESS] cmsstor179.fnal.gov
[28] 18:00:02 [SUCCESS] cmsstor187.fnal.gov
[29] 18:00:03 [SUCCESS] cmsstor169.fnal.gov
[30] 18:00:03 [SUCCESS] cmsstor237.fnal.gov
[31] 18:00:03 [SUCCESS] cmsstor206.fnal.gov
[32] 18:00:03 [SUCCESS] cmsstor186.fnal.gov
[33] 18:00:04 [SUCCESS] cmsstor212.fnal.gov
[34] 18:00:04 [SUCCESS] cmsstor194.fnal.gov
[35] 18:00:04 [SUCCESS] cmsstor210.fnal.gov
[36] 18:00:04 [SUCCESS] cmsstor198.fnal.gov
[37] 18:00:04 [SUCCESS] cmsstor184.fnal.gov
[38] 18:00:04 [SUCCESS] cmsstor215.fnal.gov
[39] 18:00:08 [SUCCESS] cmsstor213.fnal.gov
[40] 18:00:12 [SUCCESS] cmsstor207.fnal.gov
[41] 18:00:15 [SUCCESS] cmsstor199.fnal.gov
[42] 18:00:16 [SUCCESS] cmsstor191.fnal.gov
[43] 18:00:19 [SUCCESS] cmsstor197.fnal.gov
[44] 18:00:20 [SUCCESS] cmsstor188.fnal.gov
[45] 18:00:23 [SUCCESS] cmsstor205.fnal.gov
[46] 18:00:27 [SUCCESS] cmsstor204.fnal.gov
[47] 18:00:35 [SUCCESS] cmsstor208.fnal.gov
[48] 18:01:01 [SUCCESS] cmsstor222.fnal.gov
[49] 18:01:07 [SUCCESS] cmsstor242.fnal.gov
[50] 18:01:08 [SUCCESS] cmsstor223.fnal.gov
[51] 18:01:08 [SUCCESS] cmsstor192.fnal.gov
[52] 18:01:11 [SUCCESS] cmsstor224.fnal.gov
[53] 18:01:12 [SUCCESS] cmsstor227.fnal.gov
[54] 18:01:12 [SUCCESS] cmsstor226.fnal.gov
[55] 18:01:12 [SUCCESS] cmsstor233.fnal.gov
[56] 18:01:13 [SUCCESS] cmsstor234.fnal.gov
[57] 18:01:15 [SUCCESS] cmsstor231.fnal.gov
[58] 18:01:15 [SUCCESS] cmsstor236.fnal.gov
[59] 18:01:15 [SUCCESS] cmsstor244.fnal.gov
[60] 18:01:16 [SUCCESS] cmsstor232.fnal.gov
[61] 18:01:16 [SUCCESS] cmsstor238.fnal.gov
[62] 18:01:16 [SUCCESS] cmsstor239.fnal.gov
[63] 18:01:17 [SUCCESS] cmsstor230.fnal.gov
[64] 18:01:17 [SUCCESS] cmsstor246.fnal.gov
[65] 18:01:17 [SUCCESS] cmsstor250.fnal.gov
[66] 18:01:18 [SUCCESS] cmsstor243.fnal.gov
[67] 18:01:18 [SUCCESS] cmsstor240.fnal.gov
[68] 18:01:18 [SUCCESS] cmsstor245.fnal.gov
[69] 18:01:18 [SUCCESS] cmsstor248.fnal.gov
[70] 18:01:19 [SUCCESS] cmsstor249.fnal.gov
[71] 18:01:19 [SUCCESS] cmsstor247.fnal.gov
[72] 18:01:19 [SUCCESS] cmsstor229.fnal.gov
[73] 18:01:25 [SUCCESS] cmsstor251.fnal.gov
[74] 18:01:27 [SUCCESS] cmsstor235.fnal.gov
[75] 18:01:33 [SUCCESS] cmsstor241.fnal.gov
[76] 18:01:37 [SUCCESS] cmsstor211.fnal.gov
[77] 18:01:38 [SUCCESS] cmsstor228.fnal.gov
[78] 18:01:39 [SUCCESS] cmsstor225.fnal.gov
[79] 18:01:46 [SUCCESS] cmsstor202.fnal.gov
[80] 18:03:04 [SUCCESS] cmsstor264.fnal.gov
[81] 18:03:10 [SUCCESS] cmsstor267.fnal.gov
[82] 18:03:10 [SUCCESS] cmsstor323.fnal.gov
[83] 18:03:14 [SUCCESS] cmsstor266.fnal.gov
[84] 18:03:16 [SUCCESS] cmsstor265.fnal.gov
[85] 18:03:18 [SUCCESS] cmsstor313.fnal.gov
[86] 18:03:19 [SUCCESS] cmsstor218.fnal.gov
[87] 18:03:20 [SUCCESS] cmsstor312.fnal.gov
[88] 18:03:20 [SUCCESS] cmsstor316.fnal.gov
[89] 18:03:20 [SUCCESS] cmsstor309.fnal.gov
[90] 18:03:21 [SUCCESS] cmsstor315.fnal.gov
[91] 18:03:21 [SUCCESS] cmsstor310.fnal.gov
[92] 18:03:21 [SUCCESS] cmsstor314.fnal.gov
[93] 18:03:21 [SUCCESS] cmsstor311.fnal.gov
[94] 18:03:22 [SUCCESS] cmsstor273.fnal.gov
[95] 18:03:23 [SUCCESS] cmsstor261.fnal.gov
[96] 18:03:27 [SUCCESS] cmsstor317.fnal.gov
[97] 18:03:28 [SUCCESS] cmsstor271.fnal.gov
[98] 18:03:31 [SUCCESS] cmsstor318.fnal.gov
[99] 18:03:32 [SUCCESS] cmsstor270.fnal.gov
[100] 18:03:33 [SUCCESS] cmsstor319.fnal.gov
[101] 18:03:35 [SUCCESS] cmsstor275.fnal.gov
[102] 18:03:40 [SUCCESS] cmsstor320.fnal.gov
[103] 18:03:41 [SUCCESS] cmsstor321.fnal.gov
[104] 18:03:42 [SUCCESS] cmsstor322.fnal.gov
[105] 18:03:45 [SUCCESS] cmsstor277.fnal.gov
[106] 18:04:09 [SUCCESS] cmsstor278.fnal.gov
[107] 18:04:11 [SUCCESS] cmsstor263.fnal.gov
[108] 18:04:12 [SUCCESS] cmsstor282.fnal.gov
[109] 18:04:16 [SUCCESS] cmsstor262.fnal.gov
[110] 18:04:18 [SUCCESS] cmsstor287.fnal.gov
[111] 18:04:18 [SUCCESS] cmsstor284.fnal.gov
[112] 18:04:19 [SUCCESS] cmsstor269.fnal.gov
[113] 18:04:19 [SUCCESS] cmsstor286.fnal.gov
[114] 18:04:19 [SUCCESS] cmsstor285.fnal.gov
[115] 18:04:21 [SUCCESS] cmsstor268.fnal.gov
[116] 18:04:21 [SUCCESS] cmsstor293.fnal.gov
[117] 18:04:22 [SUCCESS] cmsstor291.fnal.gov
[118] 18:04:24 [SUCCESS] cmsstor288.fnal.gov
[119] 18:04:27 [SUCCESS] cmsstor272.fnal.gov
[120] 18:04:28 [SUCCESS] cmsstor294.fnal.gov
[121] 18:04:33 [SUCCESS] cmsstor274.fnal.gov
[122] 18:04:33 [SUCCESS] cmsstor292.fnal.gov
[123] 18:04:35 [SUCCESS] cmsstor276.fnal.gov
[124] 18:05:00 [SUCCESS] cmsstor341.fnal.gov
[125] 18:05:14 [SUCCESS] cmsstor279.fnal.gov
[126] 18:05:18 [SUCCESS] cmsstor280.fnal.gov
[127] 18:05:22 [SUCCESS] cmsstor281.fnal.gov
[128] 18:05:24 [SUCCESS] cmsstor283.fnal.gov
[129] 18:05:29 [SUCCESS] cmsstor289.fnal.gov
[130] 18:05:32 [SUCCESS] cmsstor290.fnal.gov
[131] 18:05:42 [SUCCESS] cmsstor329.fnal.gov
[132] 18:06:18 [SUCCESS] cmsstor325.fnal.gov
[133] 18:06:19 [SUCCESS] cmsstor326.fnal.gov
[134] 18:06:31 [SUCCESS] cmsstor334.fnal.gov
[135] 18:06:32 [SUCCESS] cmsstor336.fnal.gov
[136] 18:06:33 [SUCCESS] cmsstor333.fnal.gov
[137] 18:06:37 [SUCCESS] cmsstor344.fnal.gov
[138] 18:06:37 [SUCCESS] cmsstor328.fnal.gov
[139] 18:06:38 [SUCCESS] cmsstor345.fnal.gov
[140] 18:06:43 [SUCCESS] cmsstor342.fnal.gov
[141] 18:06:49 [SUCCESS] cmsstor353.fnal.gov
[142] 18:07:08 [SUCCESS] cmsstor369.fnal.gov
[143] 18:07:18 [SUCCESS] cmsstor370.fnal.gov
[144] 18:07:25 [SUCCESS] cmsstor327.fnal.gov
[145] 18:07:26 [SUCCESS] cmsstor358.fnal.gov
[146] 18:07:29 [SUCCESS] cmsstor324.fnal.gov
[147] 18:07:30 [SUCCESS] cmsstor354.fnal.gov
[148] 18:07:31 [SUCCESS] cmsstor339.fnal.gov
[149] 18:07:31 [SUCCESS] cmsstor330.fnal.gov
[150] 18:07:32 [SUCCESS] cmsstor331.fnal.gov
[151] 18:07:34 [SUCCESS] cmsstor360.fnal.gov
[152] 18:07:35 [SUCCESS] cmsstor364.fnal.gov
[153] 18:07:35 [SUCCESS] cmsstor337.fnal.gov
[154] 18:07:37 [SUCCESS] cmsstor332.fnal.gov
[155] 18:07:37 [SUCCESS] cmsstor338.fnal.gov
[156] 18:07:38 [SUCCESS] cmsstor366.fnal.gov
[157] 18:07:38 [SUCCESS] cmsstor335.fnal.gov
[158] 18:07:39 [SUCCESS] cmsstor340.fnal.gov
[159] 18:07:49 [SUCCESS] cmsstor402.fnal.gov
[160] 18:07:50 [SUCCESS] cmsstor401.fnal.gov
[161] 18:07:55 [SUCCESS] cmsstor347.fnal.gov
[162] 18:07:57 [SUCCESS] cmsstor348.fnal.gov
[163] 18:07:57 [SUCCESS] cmsstor346.fnal.gov
[164] 18:07:59 [SUCCESS] cmsstor409.fnal.gov
[165] 18:07:59 [SUCCESS] cmsstor403.fnal.gov
[166] 18:08:03 [SUCCESS] cmsstor404.fnal.gov
[167] 18:08:04 [SUCCESS] cmsstor343.fnal.gov
[168] 18:08:04 [SUCCESS] cmsstor349.fnal.gov
[169] 18:08:07 [SUCCESS] cmsstor406.fnal.gov
[170] 18:08:09 [SUCCESS] cmsstor405.fnal.gov
[171] 18:08:10 [SUCCESS] cmsstor410.fnal.gov
[172] 18:08:11 [SUCCESS] cmsstor407.fnal.gov
[173] 18:08:14 [SUCCESS] cmsstor408.fnal.gov
[174] 18:08:19 [SUCCESS] cmsstor351.fnal.gov
[175] 18:08:25 [SUCCESS] cmsstor350.fnal.gov
[176] 18:08:27 [SUCCESS] cmsstor352.fnal.gov
[177] 18:08:30 [SUCCESS] cmsstor357.fnal.gov
[178] 18:08:31 [SUCCESS] cmsstor355.fnal.gov
[179] 18:08:33 [SUCCESS] cmsstor356.fnal.gov
[180] 18:08:34 [SUCCESS] cmsstor359.fnal.gov
[181] 18:08:39 [SUCCESS] cmsstor361.fnal.gov
[182] 18:08:39 [SUCCESS] cmsstor363.fnal.gov
[183] 18:08:41 [SUCCESS] cmsstor362.fnal.gov
[184] 18:08:49 [SUCCESS] cmsstor365.fnal.gov
[185] 18:08:55 [SUCCESS] cmsstor372.fnal.gov
[186] 18:08:55 [SUCCESS] cmsstor371.fnal.gov
[187] 18:09:00 [SUCCESS] cmsstor368.fnal.gov
[188] 18:09:05 [SUCCESS] cmsstor374.fnal.gov
[189] 18:09:07 [SUCCESS] cmsstor376.fnal.gov
[190] 18:09:52 [SUCCESS] cmsstor373.fnal.gov
[191] 18:10:06 [SUCCESS] cmsstor375.fnal.gov
real 11m25.432s
user 0m7.971s
sys 0m6.021s
[root@cmsadmin1 Aug-26-2015]#
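The per-host files that pssh saves under the -o directory can be scanned to confirm that the new kernel rpm really is present on every node before the downtime. A minimal sketch, assuming each file in that directory is named after its host and contains the line printed by rpm -q; the function name hosts_missing_kernel is hypothetical and not part of the original procedure:

```shell
# Hypothetical helper (not from the ticket): list hosts whose saved pssh
# output does not contain the expected kernel package line.
# Usage: hosts_missing_kernel <pssh-output-dir> <kernel-version>
hosts_missing_kernel() {
    local outdir=$1 kernelversion=$2 f
    for f in "$outdir"/*; do
        [ -f "$f" ] || continue
        # rpm -q prints exactly "kernel-<version>" when the package is installed
        if ! grep -qxF -- "kernel-${kernelversion}" "$f"; then
            basename "$f"
        fi
    done
}
```

For example, hosts_missing_kernel yum_update_pools.log 2.6.32-573.3.1.el6.x86_64 would print nothing if every pool node reported the package.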
#5 Updated by Natalia Ratnikova over 5 years ago
Stop phedex agents on the disk instance:
ssh root@cmsphedex-disk
su - cmsprod
cd siteconfs/siteconf-fnaldisk.4.1.3-comp3-1.0
./pmgr Dev stop
./pmgr Debug stop
./pmgr Prod stop
And on the tape instance:
ssh root@cmssrv228
su - cmsprod
cd /home/cmsprod/siteconf-git/
./pmgr Dev stop
./pmgr Debug stop
./pmgr Prod stop
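The three pmgr calls per instance can be wrapped in a small loop so that the same helper serves both the stop and the later start step. This is only a sketch: the function name pmgr_all and the PMGR override variable are hypothetical, and it assumes pmgr is invoked as pmgr INSTANCE ACTION from the siteconf directory, as in the commands above.

```shell
# Hypothetical wrapper (not from the ticket): run pmgr with the same action
# for the Dev, Debug and Prod instances. PMGR defaults to ./pmgr.
pmgr_all() {
    local action=$1 instance rc=0
    for instance in Dev Debug Prod; do
        # Keep going if one instance fails, but remember the failure
        "${PMGR:-./pmgr}" "$instance" "$action" || {
            echo "pmgr $instance $action failed" >&2
            rc=1
        }
    done
    return $rc
}
```

Run as pmgr_all stop before the upgrade and pmgr_all start afterwards, from the siteconf directory of each PhEDEx host.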
#6 Updated by Natalia Ratnikova over 5 years ago
Upgrade dCache servers:
ssh root@cmsadmin1
Handle non-standard dcache installation on cmssrmdisk: leave puppet disabled and erase non-standard dcache, as yum thinks its version is newer than the one we are going to install:
ssh cmssrmdisk
puppet agent --disable
service dcache-server stop
rpm -qi dcache
yum erase dcache
rpm -qi dcache
exit
Now upgrade, reboot, and check all servers:
cd /root/natalia/Aug-26-2015
pssh -h dcache-disk-servers.list -l root -t 0 -p 4 -o upgr_srvs.log -e upgr_srvs.err 'kernelversion=2.6.32-573.3.1.el6.x86_64; extra_rpm_check="dcache-2.2.29-1.noarch"; puppet agent --disable; service dcache-server stop; yum clean all; yum update -y; rpm -q kernel-${kernelversion} ${extra_rpm_check} && puppet agent --enable && reboot'
pssh -h dcache-disk-servers.list -l root -t 60 -p 4 -o check_srvs.log -e check_srvs.err 'kernelversion=2.6.32-573.3.1.el6.x86_64; uptime; uname -a | grep $kernelversion && (service dcache-server status | grep -v DOMAIN | grep -v running ); dcache version;'
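Because the upgrade command only re-enables puppet and reboots when rpm -q finds both packages, a host where the check fails is reported by pssh as [FAILURE] and needs a second pass. A minimal sketch for building a retry host list from the pssh summary lines, assuming they have the form "[n] HH:MM:SS [FAILURE] host ..."; the function name pssh_failed_hosts and the file names in the usage note are hypothetical:

```shell
# Hypothetical helper (not from the ticket): read pssh summary lines on stdin
# and print the host name of every entry marked [FAILURE], one per line.
pssh_failed_hosts() {
    # Fields: $1="[n]"  $2=time  $3=status  $4=host
    awk '$3 == "[FAILURE]" { print $4 }'
}
```

If the pssh summary was saved to a file, say upgr_srvs.summary, then pssh_failed_hosts < upgr_srvs.summary > retry.list gives a host file that can be passed to pssh -h for the rerun.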
#7 Updated by Natalia Ratnikova over 5 years ago
Upgrade, reboot, and check all pools:
pssh -h dcache-disk-pools.list -l root -t 0 -p 50 -o upgr_pools.log -e upgr_pools.err 'kernelversion=2.6.32-573.3.1.el6.x86_64; extra_rpm_check="dcache-2.2.29-1.noarch"; puppet agent --disable; service dcache-server stop; yum clean all; yum update -y; rpm -q kernel-${kernelversion} ${extra_rpm_check} && puppet agent --enable && reboot'
pssh -h dcache-disk-pools.list -l root -t 60 -p 50 -o check_pools.log -e check_pools.err 'kernelversion=2.6.32-573.3.1.el6.x86_64; uptime; uname -a | grep $kernelversion && ((service dcache-server status | grep -v DOMAIN | grep -v running ) || service dcache-server status); dcache version;'
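The grep -v DOMAIN | grep -v running filter in the check command leaves only the domains that are not running. The same idea can be written as a column-aware filter that prints just the domain names. This is a sketch, assuming the status output has a DOMAIN STATUS PID USER header followed by one line per domain; the function name dcache_not_running is hypothetical, and the "stopped" status value used in the usage note is an assumption:

```shell
# Hypothetical helper (not from the ticket): read `service dcache-server status`
# output on stdin and print the names of domains whose STATUS is not "running".
dcache_not_running() {
    # Skip the header line; require at least a DOMAIN and a STATUS column
    awk '$1 != "DOMAIN" && NF >= 2 && $2 != "running" { print $1 }'
}
```

For example, service dcache-server status | dcache_not_running prints nothing when every domain is running, and would print gPlazmaDomain for a line such as "gPlazmaDomain stopped".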
#8 Updated by Natalia Ratnikova over 5 years ago
Start phedex agents on the disk instance:
ssh root@cmsphedex-disk su - cmsprod cd siteconfs/siteconf-fnaldisk.4.1.3-comp3-1.0 ./pmgr Dev start ./pmgr Debug start ./pmgr Prod start
And on the tape instance:
ssh root@cmssrv228 su - cmsprod cd /home/cmsprod/siteconf-git/ ./pmgr Dev start ./pmgr Debug start ./pmgr Prod start
Check the logs to see whether transfers have resumed and are successfully writing data to dCache.
#9 Updated by Natalia Ratnikova over 5 years ago
Last login: Wed Aug 26 12:02:47 on ttys003
mac-121252:~ natasha$ to-adm
Last login: Wed Aug 26 11:12:06 2015 from mac-122182.attlocal.net.dhcp.fnal.gov
NOTICE TO USERS
This is a Federal computer (and/or it is directly connected to a Fermilab local network system) that is the property of the United States Government. It is for authorized use only. Users (authorized or unauthorized) have no explicit or implicit expectation of privacy.
Any or all uses of this system and all files on this system may be intercepted, monitored, recorded, copied, audited, inspected, and disclosed to authorized site, Department of Energy and law enforcement personnel, as well as authorized officials of other agencies, both domestic and foreign. By using this system, the user consents to such interception, monitoring, recording, copying, auditing, inspection, and disclosure at the discretion of authorized site or Department of Energy personnel.
Unauthorized or improper use of this system may result in administrative disciplinary action and civil and criminal penalties. By continuing to use this system you indicate your awareness of and consent to these terms and conditions of use. LOG OFF IMMEDIATELY if you do not agree to the conditions stated in this warning.
Fermilab policy and rules for computing, including appropriate use, may be found at http://www.fnal.gov/cd/main/cpolicy.html
cmsadmin1.fnal.gov - bastion/production (SLF 6.6) 32-core Opteron 6320 (H8QG6); 62.89 GB RAM, 16.00 GB swap
[root@cmsadmin1 ~]# ssh cmssrmdisk
puppet agent --disable
service dcache-server stop
rpm -qi dcache
yum erase dcache
rpm -qi dcache
exit
Last login: Wed Aug 26 07:47:43 2015 from cmsadmin1.fnal.gov
cmssrmdisk.fnal.gov - srmdisk/production (SLF 6.6) 32-core Opteron 6320 (H8QG6); 62.89 GB RAM, 16.00 GB swap
[root@cmssrmdisk ~]# puppet agent --disable
[root@cmssrmdisk ~]# service dcache-server stop
Stopping transfermanagersDomain 0 done
Stopping srm-cmssrmdiskDomain 0 1 2 3 done
Stopping gPlazmaDomain 0 done
Stopping utilityDomain 0 done
Stopping xrootdLFNs-cmssrmdiskDomain 0 1 2 3 4 done
Stopping xrootd-cmssrmdiskDomain 0 done
Stopping gsidcap-cmssrmdiskDomain 0 done
Stopping authdcap-cmssrmdiskDomain 0 done
Stopping dcap-cmssrmdiskDomain 0 done
Stopping nfsDomain.v3 0 done
[root@cmssrmdisk ~]# rpm -qi dcache
Name        : dcache                          Relocations: /
Version     : 2.2.29SNAPSHOT                  Vendor: dCache.org
Release     : 1                               Build Date: Wed 14 Jan 2015 11:51:08 AM CST
Install Date: Wed 11 Feb 2015 08:27:42 AM CST Build Host: uqbar.fnal.gov
Group       : Applications/System             Source RPM: dcache-2.2.29SNAPSHOT-1.src.rpm
Size        : 71361598                        License: Distributable
Signature   : (none)
Packager    : dCache.org <support@dcache.org>.
Summary     : dCache Server
Description :
dCache is a distributed mass storage system. This package contains the server components.
[root@cmssrmdisk ~]# yum erase dcache
Loaded plugins: priorities, security
Setting up Remove Process
Resolving Dependencies
--> Running transaction check
---> Package dcache.noarch 0:2.2.29SNAPSHOT-1 will be erased
--> Finished Dependency Resolution
Dependencies Resolved
================================================================================
 Package      Arch      Version             Repository      Size
================================================================================
Removing:
 dcache       noarch    2.2.29SNAPSHOT-1    installed       68 M
Transaction Summary
================================================================================
Remove 1 Package(s)
Installed size: 68 M
Is this ok [y/N]: y
Is this ok [y/N]:
Is this ok [y/N]: y
Downloading Packages:
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Erasing    : dcache-2.2.29SNAPSHOT-1.noarch 1/1
warning: /etc/dcache/gplazma.conf saved as /etc/dcache/gplazma.conf.rpmsave
warning: /etc/dcache/dcachesrm-gplazma.policy saved as /etc/dcache/dcachesrm-gplazma.policy.rpmsave
warning: /etc/dcache/dcache.conf saved as /etc/dcache/dcache.conf.rpmsave
  Verifying  : dcache-2.2.29SNAPSHOT-1.noarch 1/1
Removed:
  dcache.noarch 0:2.2.29SNAPSHOT-1
Complete!
[root@cmssrmdisk ~]# rpm -q dcache
package dcache is not installed
[root@cmssrmdisk ~]# exit
logout
Connection to cmssrmdisk closed.
[root@cmsadmin1 ~]# cd /root/natalia/Aug-26-2015
[root@cmsadmin1 Aug-26-2015]# pssh -h dcache-disk-servers.list -l root -t 0 -p 4 -o upgr_srvs.log -e upgr_srvs.err 'kernelversion=2.6.32-573.3.1.el6.x86_64; extra_rpm_check="dcache-2.2.29-1.noarch"; puppet agent --disable; service dcache-server stop; yum clean all; yum update -y; rpm -q kernel-${kernelversion} ${extra_rpm_check} && puppet agent --enable && reboot'
[1] 13:08:58 [FAILURE] cmssrmdisk.fnal.gov Exited with error code 1
[2] 13:09:09 [SUCCESS] cmschimeradiskbackup.fnal.gov
[3] 13:09:22 [SUCCESS] cmschimeradisk.fnal.gov
[4] 13:09:30 [SUCCESS] cmsdcacheadmindisk.fnal.gov
[root@cmsadmin1 Aug-26-2015]# ssh cmssrmdisk.fnal.gov
Last login: Wed Aug 26 13:05:47 2015 from cmsadmin1.fnal.gov
cmssrmdisk.fnal.gov - srmdisk/production (SLF 6.6) 32-core Opteron 6320 (H8QG6); 62.89 GB RAM, 16.00 GB swap
[root@cmssrmdisk ~]# srpm -q dcache
-bash: srpm: command not found
[root@cmssrmdisk ~]# rpm -q dcache
package dcache is not installed
[root@cmssrmdisk ~]# yum install dcache
Loaded plugins: priorities, security
Setting up Install Process
583 packages excluded due to repository priority protections
Resolving Dependencies
--> Running transaction check
---> Package dcache.noarch 0:2.2.29-1 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
================================================================================
 Package      Arch      Version      Repository      Size
================================================================================
Installing:
 dcache       noarch    2.2.29-1     uscmst1         61 M
Transaction Summary
================================================================================
Install 1 Package(s)
Total download size: 61 M
Installed size: 68 M
Is this ok [y/N]: y
Downloading Packages:
dcache-2.2.29-1.noarch.rpm | 61 MB 00:00
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing : dcache-2.2.29-1.noarch 1/1
  Verifying  : dcache-2.2.29-1.noarch 1/1
Installed:
  dcache.noarch 0:2.2.29-1
Complete!
[root@cmssrmdisk ~]# cat dcache-disk-servers.list
cat: dcache-disk-servers.list: No such file or directory
[root@cmssrmdisk ~]# exit
logout
Connection to cmssrmdisk.fnal.gov closed.
[root@cmsadmin1 Aug-26-2015]# cat dcache-disk-servers.list
cmsdcacheadmindisk.fnal.gov
cmschimeradisk.fnal.gov
cmssrmdisk.fnal.gov
cmschimeradiskbackup.fnal.gov
[root@cmsadmin1 Aug-26-2015]# grep srm dcache-disk-servers.list > srmdisk.list
[root@cmsadmin1 Aug-26-2015]# pssh -h srmdisk.list -l root -t 0 -p 4 -o upgr_srvs.log -e upgr_srvs.err 'kernelversion=2.6.32-573.3.1.el6.x86_64; extra_rpm_check="dcache-2.2.29-1.noarch"; puppet agent --disable; service dcache-server stop; yum clean all; yum update -y; rpm -q kernel-${kernelversion} ${extra_rpm_check} && puppet agent --enable && reboot'
[1] 13:15:08 [SUCCESS] cmssrmdisk.fnal.gov
[root@cmsadmin1 Aug-26-2015]#
[root@cmsadmin1 Aug-26-2015]# cat check_srvs.log/cmssrmdisk.fnal.gov
13:28:48 up 10 min, 0 users, load average: 2.21, 1.37, 0.67
Linux cmssrmdisk.fnal.gov 2.6.32-573.3.1.el6.x86_64 #1 SMP Thu Aug 13 12:55:33 CDT 2015 x86_64 x86_64 x86_64 GNU/Linux
DOMAIN STATUS PID USER
nfsDomain.v3 running 9479 root
dcap-cmssrmdiskDomain running 9556 root
authdcap-cmssrmdiskDomain running 9622 root
gsidcap-cmssrmdiskDomain running 9702 root
xrootd-cmssrmdiskDomain running 9772 root
xrootdLFNs-cmssrmdiskDomain running 9845 root
utilityDomain running 9915 root
gPlazmaDomain running 9990 root
srm-cmssrmdiskDomain running 10069 root
transfermanagersDomain running 10139 root
2.2.29
[root@cmsadmin1 Aug-26-2015]# cat check_srvs.err/cmssrmdisk.fnal.gov
[root@cmsadmin1 Aug-26-2015]# ls -l check_srvs.err/
total 0
-rw-r--r-- 1 root root 0 Aug 26 13:18 cmschimeradiskbackup.fnal.gov
-rw-r--r-- 1 root root 0 Aug 26 13:18 cmschimeradisk.fnal.gov
-rw-r--r-- 1 root root 0 Aug 26 13:18 cmsdcacheadmindisk.fnal.gov
-rw-r--r-- 1 root root 0 Aug 26 13:28 cmssrmdisk.fnal.gov
[root@cmsadmin1 Aug-26-2015]# pssh -h dcache-disk-pools.list -l root -t 0 -p 50 -o upgr_pools.log -e upgr_pools.err 'kernelversion=2.6.32-573.3.1.el6.x86_64; extra_rpm_check="dcache-2.2.29-1.noarch"; puppet agent --disable; service dcache-server stop; yum clean all; yum update -y; rpm -q kernel-${kernelversion} ${extra_rpm_check} && puppet agent --enable && reboot'
[1] 13:32:05 [SUCCESS] cmsstor201.fnal.gov
[2] 13:32:07 [SUCCESS] cmsstor221.fnal.gov
[3] 13:32:10 [SUCCESS] cmsstor189.fnal.gov
[4] 13:32:11 [SUCCESS] cmsstor195.fnal.gov
[5] 13:32:11 [SUCCESS] cmsstor176.fnal.gov
[6] 13:32:12 [SUCCESS] cmsstor193.fnal.gov
[7] 13:32:12 [SUCCESS] cmsstor210.fnal.gov
[8] 13:32:13 [SUCCESS] cmsstor209.fnal.gov
[9] 13:32:13 [SUCCESS] cmsstor185.fnal.gov
[10] 13:32:13 [SUCCESS] cmsstor212.fnal.gov
[11] 13:32:13 [SUCCESS] cmsstor183.fnal.gov
[12] 13:32:13 [SUCCESS] cmsstor173.fnal.gov
[13] 13:32:14 [SUCCESS] cmsstor217.fnal.gov
[14] 13:32:14 [SUCCESS] cmsstor175.fnal.gov
[15] 13:32:14 [SUCCESS] cmsstor214.fnal.gov
[16] 13:32:14 [SUCCESS] cmsstor187.fnal.gov
[17] 13:32:15 [SUCCESS] cmsstor220.fnal.gov
[18] 13:32:15 [SUCCESS] cmsstor216.fnal.gov
[19] 13:32:15 [SUCCESS] cmsstor213.fnal.gov
[20] 13:32:15 [SUCCESS] cmsstor219.fnal.gov
[21] 13:32:15 [SUCCESS] cmsstor172.fnal.gov
[22] 13:32:15 [SUCCESS] cmsstor184.fnal.gov
[23] 13:32:15 [SUCCESS] cmsstor174.fnal.gov
[24] 13:32:15 [SUCCESS] cmsstor178.fnal.gov
[25] 13:32:15 [SUCCESS] cmsstor182.fnal.gov
[26] 13:32:16 [SUCCESS] cmsstor169.fnal.gov
[27] 13:32:16 [SUCCESS] cmsstor194.fnal.gov
[28] 13:32:16 [SUCCESS] cmsstor168.fnal.gov
[29] 13:32:16 [SUCCESS] cmsstor181.fnal.gov
[30] 13:32:17 [SUCCESS] cmsstor171.fnal.gov
[31] 13:32:17 [SUCCESS] cmsstor215.fnal.gov
[32] 13:32:17 [SUCCESS] cmsstor186.fnal.gov
[33] 13:32:17 [SUCCESS] cmsstor177.fnal.gov
[34] 13:32:18 [SUCCESS] cmsstor203.fnal.gov
[35] 13:32:18 [SUCCESS] cmsstor198.fnal.gov
[36] 13:32:21 [SUCCESS] cmsstor196.fnal.gov
[37] 13:32:24 [SUCCESS] cmsstor191.fnal.gov
[38] 13:32:25 [SUCCESS] cmsstor199.fnal.gov
[39] 13:32:25 [SUCCESS] cmsstor206.fnal.gov
[40] 13:32:26 [SUCCESS] cmsstor179.fnal.gov
[41] 13:32:26 [SUCCESS] cmsstor207.fnal.gov
[42] 13:32:26 [SUCCESS] cmsstor205.fnal.gov
[43] 13:32:30 [SUCCESS] cmsstor197.fnal.gov
[44] 13:32:30 [SUCCESS] cmsstor188.fnal.gov
[45] 13:32:42 [SUCCESS] cmsstor204.fnal.gov
[46] 13:32:51 [SUCCESS] cmsstor208.fnal.gov
[47] 13:33:01 [SUCCESS] cmsstor242.fnal.gov
[48] 13:33:02 [SUCCESS] cmsstor223.fnal.gov
[49] 13:33:04 [SUCCESS] cmsstor237.fnal.gov
[50] 13:33:04 [SUCCESS] cmsstor224.fnal.gov
[51] 13:33:04 [SUCCESS] cmsstor233.fnal.gov
[52] 13:33:04 [SUCCESS] cmsstor227.fnal.gov
[53] 13:33:05 [SUCCESS] cmsstor222.fnal.gov
[54] 13:33:06 [SUCCESS] cmsstor236.fnal.gov
[55] 13:33:07 [SUCCESS] cmsstor230.fnal.gov
[56] 13:33:07 [SUCCESS] cmsstor243.fnal.gov
[57] 13:33:07 [SUCCESS] cmsstor250.fnal.gov
[58] 13:33:07 [SUCCESS] cmsstor226.fnal.gov
[59] 13:33:08 [SUCCESS] cmsstor232.fnal.gov
[60] 13:33:08 [SUCCESS] cmsstor229.fnal.gov
[61] 13:33:08 [SUCCESS] cmsstor244.fnal.gov
[62] 13:33:08 [SUCCESS] cmsstor238.fnal.gov
[63] 13:33:08 [SUCCESS] cmsstor234.fnal.gov
[64] 13:33:09 [SUCCESS] cmsstor248.fnal.gov
[65] 13:33:09 [SUCCESS] cmsstor192.fnal.gov
[66] 13:33:09 [SUCCESS] cmsstor240.fnal.gov
[67] 13:33:09 [SUCCESS] cmsstor239.fnal.gov
[68] 13:33:10 [SUCCESS] cmsstor245.fnal.gov
[69] 13:33:10 [SUCCESS] cmsstor249.fnal.gov
[70] 13:33:10 [SUCCESS] cmsstor231.fnal.gov
[71] 13:33:11 [SUCCESS] cmsstor246.fnal.gov
[72] 13:33:13 [SUCCESS] cmsstor251.fnal.gov
[73] 13:33:14 [SUCCESS] cmsstor264.fnal.gov
[74] 13:33:15 [SUCCESS] cmsstor228.fnal.gov
[75] 13:33:16 [SUCCESS] cmsstor247.fnal.gov
[76] 13:33:16 [SUCCESS] cmsstor261.fnal.gov
[77] 13:33:19 [SUCCESS] cmsstor235.fnal.gov
[78] 13:33:19 [SUCCESS] cmsstor211.fnal.gov
[79] 13:33:22 [SUCCESS] cmsstor266.fnal.gov
[80] 13:33:22 [SUCCESS] cmsstor265.fnal.gov
[81] 13:33:23 [SUCCESS] cmsstor272.fnal.gov
[82] 13:33:26 [SUCCESS] cmsstor270.fnal.gov
[83] 13:33:29 [SUCCESS] cmsstor273.fnal.gov
[84] 13:33:29 [SUCCESS] cmsstor267.fnal.gov
[85] 13:33:29 [SUCCESS] cmsstor274.fnal.gov
[86] 13:33:29 [SUCCESS] cmsstor225.fnal.gov
[87] 13:33:31 [SUCCESS] cmsstor241.fnal.gov
[88] 13:33:41 [SUCCESS] cmsstor275.fnal.gov
[89] 13:33:41 [SUCCESS] cmsstor263.fnal.gov
[90] 13:33:46 [SUCCESS] cmsstor262.fnal.gov
[91] 13:33:49 [SUCCESS] cmsstor269.fnal.gov
[92] 13:33:49 [SUCCESS] cmsstor202.fnal.gov
[93] 13:33:54 [SUCCESS] cmsstor271.fnal.gov
[94] 13:33:58 [SUCCESS] cmsstor268.fnal.gov
[95] 13:33:58 [SUCCESS] cmsstor315.fnal.gov
[96] 13:33:58 [SUCCESS] cmsstor309.fnal.gov
[97] 13:34:01 [SUCCESS] cmsstor312.fnal.gov
[98] 13:34:01 [SUCCESS] cmsstor313.fnal.gov
[99] 13:34:02 [SUCCESS] cmsstor314.fnal.gov
[100] 13:34:03 [SUCCESS] cmsstor311.fnal.gov
[101] 13:34:04 [SUCCESS] cmsstor286.fnal.gov
[102] 13:34:05 [SUCCESS] cmsstor317.fnal.gov
[103] 13:34:05 [SUCCESS] cmsstor319.fnal.gov
[104] 13:34:05 [SUCCESS] cmsstor281.fnal.gov
[105] 13:34:06 [SUCCESS] cmsstor285.fnal.gov
[106] 13:34:06 [SUCCESS] cmsstor283.fnal.gov
[107] 13:34:06 [SUCCESS] cmsstor320.fnal.gov
[108] 13:34:06 [SUCCESS] cmsstor289.fnal.gov
[109] 13:34:06 [SUCCESS] cmsstor310.fnal.gov
[110] 13:34:07 [SUCCESS] cmsstor316.fnal.gov
[111] 13:34:08 [SUCCESS] cmsstor321.fnal.gov
[112] 13:34:08 [SUCCESS] cmsstor278.fnal.gov
[113] 13:34:08 [SUCCESS] cmsstor318.fnal.gov
[114] 13:34:09 [SUCCESS] cmsstor279.fnal.gov
[115] 13:34:09 [SUCCESS] cmsstor322.fnal.gov
[116] 13:34:11 [SUCCESS] cmsstor280.fnal.gov
[117] 13:34:17 [SUCCESS] cmsstor329.fnal.gov
[118] 13:34:18 [SUCCESS] cmsstor276.fnal.gov
[119] 13:34:22 [SUCCESS] cmsstor323.fnal.gov
[120] 13:34:27 [SUCCESS] cmsstor325.fnal.gov
[121] 13:34:29 [SUCCESS] cmsstor277.fnal.gov
[122] 13:34:30 [SUCCESS] cmsstor331.fnal.gov
[123] 13:34:32 [SUCCESS] cmsstor328.fnal.gov
[124] 13:34:34 [SUCCESS] cmsstor294.fnal.gov
[125] 13:34:34 [SUCCESS] cmsstor291.fnal.gov
[126] 13:34:36 [SUCCESS] cmsstor287.fnal.gov
[127] 13:34:36 [SUCCESS] cmsstor218.fnal.gov
[128] 13:34:36 [SUCCESS] cmsstor290.fnal.gov
[129] 13:34:36 [SUCCESS] cmsstor293.fnal.gov
[130] 13:34:36 [SUCCESS] cmsstor284.fnal.gov
[131] 13:34:37 [SUCCESS] cmsstor282.fnal.gov
[132] 13:34:38 [SUCCESS] cmsstor288.fnal.gov
[133] 13:34:40 [SUCCESS] cmsstor292.fnal.gov
[134] 13:34:51 [SUCCESS] cmsstor324.fnal.gov
[135] 13:34:51 [SUCCESS] cmsstor336.fnal.gov
[136] 13:34:53 [SUCCESS] cmsstor337.fnal.gov
[137] 13:34:55 [SUCCESS] cmsstor326.fnal.gov
[138] 13:34:57 [SUCCESS] cmsstor335.fnal.gov
[139] 13:34:58 [SUCCESS] cmsstor338.fnal.gov
[140] 13:35:01 [SUCCESS] cmsstor327.fnal.gov
[141] 13:35:02 [SUCCESS] cmsstor343.fnal.gov
[142] 13:35:03 [SUCCESS] cmsstor330.fnal.gov
[143] 13:35:05 [SUCCESS] cmsstor350.fnal.gov
[144] 13:35:08 [SUCCESS] cmsstor351.fnal.gov
[145] 13:35:09 [SUCCESS] cmsstor359.fnal.gov
[146] 13:35:10 [SUCCESS] cmsstor353.fnal.gov
[147] 13:35:10 [SUCCESS] cmsstor332.fnal.gov
[148] 13:35:12 [SUCCESS] cmsstor345.fnal.gov
[149] 13:35:14 [SUCCESS] cmsstor349.fnal.gov
[150] 13:35:16 [SUCCESS] cmsstor334.fnal.gov
[151] 13:35:17 [SUCCESS] cmsstor333.fnal.gov
[152] 13:35:23 [SUCCESS] cmsstor362.fnal.gov
[153] 13:35:23 [SUCCESS] cmsstor339.fnal.gov
[154] 13:35:26 [SUCCESS] cmsstor370.fnal.gov
[155] 13:35:27 [SUCCESS] cmsstor342.fnal.gov
[156] 13:35:28 [SUCCESS] cmsstor341.fnal.gov
[157] 13:35:29 [SUCCESS] cmsstor369.fnal.gov
[158] 13:35:31 [SUCCESS] cmsstor364.fnal.gov
[159] 13:35:31 [SUCCESS] cmsstor340.fnal.gov
[160] 13:35:31 [SUCCESS] cmsstor401.fnal.gov
[161] 13:35:32 [SUCCESS] cmsstor346.fnal.gov
[162] 13:35:32 [SUCCESS] cmsstor365.fnal.gov
[163] 13:35:33 [SUCCESS] cmsstor402.fnal.gov
[164] 13:35:33 [SUCCESS] cmsstor348.fnal.gov
[165] 13:35:33 [SUCCESS] cmsstor355.fnal.gov
[166] 13:35:34 [SUCCESS] cmsstor352.fnal.gov
[167] 13:35:35 [SUCCESS] cmsstor356.fnal.gov
[168] 13:35:36 [SUCCESS] cmsstor368.fnal.gov
[169] 13:35:36 [SUCCESS] cmsstor374.fnal.gov
[170] 13:35:37 [SUCCESS] cmsstor347.fnal.gov
[171] 13:35:37 [SUCCESS] cmsstor357.fnal.gov
[172] 13:35:39 [SUCCESS] cmsstor360.fnal.gov
[173] 13:35:40 [SUCCESS] cmsstor354.fnal.gov
[174] 13:35:42 [SUCCESS] cmsstor404.fnal.gov
[175] 13:35:43 [SUCCESS] cmsstor403.fnal.gov
[176] 13:35:43 [SUCCESS] cmsstor344.fnal.gov
[177] 13:35:43 [SUCCESS] cmsstor405.fnal.gov
[178] 13:35:43 [SUCCESS] cmsstor358.fnal.gov
[179] 13:35:44 [SUCCESS] cmsstor406.fnal.gov
[180] 13:35:45 [SUCCESS] cmsstor361.fnal.gov
[181] 13:35:45 [SUCCESS] cmsstor372.fnal.gov
[182] 13:35:48 [SUCCESS] cmsstor407.fnal.gov
[183] 13:35:49 [SUCCESS] cmsstor410.fnal.gov
[184] 13:35:49 [SUCCESS] cmsstor363.fnal.gov
[185] 13:35:50 [SUCCESS] cmsstor408.fnal.gov
[186] 13:35:51 [SUCCESS] cmsstor409.fnal.gov
[187] 13:35:53 [SUCCESS] cmsstor366.fnal.gov
[188] 13:36:04 [SUCCESS] cmsstor375.fnal.gov
[189] 13:36:06 [SUCCESS] cmsstor373.fnal.gov
[190] 13:36:07 [SUCCESS] cmsstor376.fnal.gov
[191] 13:36:07 [SUCCESS] cmsstor371.fnal.gov
[root@cmsadmin1 Aug-26-2015]# pssh -h dcache-disk-pools.list -l root -t 60 -p 50 -o check_pools.log -e check_pools.err 'kernelversion=2.6.32-573.3.1.el6.x86_64; uptime; uname -a | grep $kernelversion && ((service dcache-server status | grep -v DOMAIN | grep -v running ) || service dcache-server status); dcache version;'
[1] 13:40:50 [FAILURE] cmsstor169.fnal.gov Exited with error code 255
[2] 13:40:51 [SUCCESS] cmsstor178.fnal.gov
[3] 13:40:51 [SUCCESS] cmsstor171.fnal.gov
[4] 13:40:51 [SUCCESS] cmsstor172.fnal.gov
[5] 13:40:51 [SUCCESS] cmsstor173.fnal.gov
[6] 13:40:51 [SUCCESS] cmsstor168.fnal.gov
[7] 13:40:51 [SUCCESS] cmsstor194.fnal.gov
[8] 13:40:51 [SUCCESS] cmsstor174.fnal.gov
[9] 13:40:51 [SUCCESS] cmsstor193.fnal.gov
[10] 13:40:51 [SUCCESS] cmsstor175.fnal.gov
[11] 13:40:51 [SUCCESS] cmsstor179.fnal.gov
[12] 13:40:51 [SUCCESS] cmsstor187.fnal.gov
[13] 13:40:51 [SUCCESS] cmsstor181.fnal.gov
[14] 13:40:51 [SUCCESS] cmsstor177.fnal.gov
[15] 13:40:51 [SUCCESS] cmsstor182.fnal.gov
[16] 13:40:51 [SUCCESS] cmsstor189.fnal.gov
[17] 13:40:51 [SUCCESS] cmsstor183.fnal.gov
[18] 13:40:52 [SUCCESS] cmsstor186.fnal.gov
[19] 13:40:52 [SUCCESS] cmsstor176.fnal.gov
[20] 13:40:52 [SUCCESS] cmsstor221.fnal.gov
[21] 13:40:52 [SUCCESS] cmsstor206.fnal.gov
[22] 13:40:52 [SUCCESS] cmsstor197.fnal.gov
[23] 13:40:52 [SUCCESS] cmsstor198.fnal.gov
[24] 13:40:52 [SUCCESS] cmsstor203.fnal.gov
[25] 13:40:52 [SUCCESS] cmsstor196.fnal.gov
[26] 13:40:52 [SUCCESS] cmsstor209.fnal.gov
[27] 13:40:52 [SUCCESS] cmsstor191.fnal.gov
[28] 13:40:52 [SUCCESS] cmsstor185.fnal.gov
[29] 13:40:52 [SUCCESS] cmsstor199.fnal.gov
[30] 13:40:52 [SUCCESS] cmsstor201.fnal.gov
[31] 13:40:52 [SUCCESS] cmsstor216.fnal.gov
[32] 13:40:52 [SUCCESS] cmsstor188.fnal.gov
[33] 13:40:52 [SUCCESS] cmsstor205.fnal.gov
[34] 13:40:52 [SUCCESS] cmsstor215.fnal.gov
[35] 13:40:52 [SUCCESS] cmsstor217.fnal.gov
[36] 13:40:52 [SUCCESS] cmsstor219.fnal.gov
[37] 13:40:52 [SUCCESS] cmsstor214.fnal.gov
[38] 13:40:52 [SUCCESS] cmsstor212.fnal.gov
[39] 13:40:52 [SUCCESS] cmsstor195.fnal.gov
[40] 13:40:52 [SUCCESS] cmsstor208.fnal.gov
[41] 13:40:52 [SUCCESS] cmsstor211.fnal.gov
[42] 13:40:52 [SUCCESS] cmsstor220.fnal.gov
[43] 13:40:52 [SUCCESS] cmsstor207.fnal.gov
[44] 13:40:52 [SUCCESS] cmsstor213.fnal.gov
[45] 13:40:52 [SUCCESS] cmsstor210.fnal.gov
[46] 13:40:52 [SUCCESS] cmsstor204.fnal.gov
[47] 13:40:52 [SUCCESS] cmsstor192.fnal.gov
[48] 13:40:52 [SUCCESS] cmsstor184.fnal.gov
[49] 13:40:52 [SUCCESS] cmsstor218.fnal.gov
[50] 13:40:52 [SUCCESS] cmsstor202.fnal.gov
[51] 13:40:55 [SUCCESS] cmsstor222.fnal.gov
[52] 13:40:56 [SUCCESS] cmsstor223.fnal.gov
[53] 13:40:56 [SUCCESS] cmsstor225.fnal.gov
[54] 13:40:56 [SUCCESS] cmsstor242.fnal.gov
[55] 13:40:56 [SUCCESS] cmsstor226.fnal.gov
[56] 13:40:56 [SUCCESS] cmsstor224.fnal.gov
[57] 13:40:56 [SUCCESS] cmsstor230.fnal.gov
[58] 13:40:56 [SUCCESS] cmsstor240.fnal.gov
[59] 13:40:56 [SUCCESS] cmsstor232.fnal.gov
[60] 13:40:56 [SUCCESS] cmsstor234.fnal.gov
[61] 13:40:56 [SUCCESS] cmsstor231.fnal.gov
[62] 13:40:56 [SUCCESS] cmsstor248.fnal.gov
[63] 13:40:56 [SUCCESS] cmsstor238.fnal.gov
[64] 13:40:56 [SUCCESS] cmsstor229.fnal.gov
[65] 13:40:56 [SUCCESS] cmsstor235.fnal.gov
[66] 13:40:56 [SUCCESS] cmsstor244.fnal.gov
[67] 13:40:56 [SUCCESS] cmsstor246.fnal.gov
[68] 13:40:56 [SUCCESS] cmsstor243.fnal.gov
[69] 13:40:56 [SUCCESS] cmsstor251.fnal.gov
[70] 13:40:56 [SUCCESS] cmsstor249.fnal.gov
[71] 13:40:56 [SUCCESS] cmsstor236.fnal.gov
[72] 13:40:56 [SUCCESS] cmsstor245.fnal.gov
[73] 13:40:56 [SUCCESS] cmsstor233.fnal.gov
[74] 13:40:56 [SUCCESS] cmsstor241.fnal.gov
[75] 13:40:56 [SUCCESS] cmsstor247.fnal.gov
[76] 13:40:56 [SUCCESS] cmsstor239.fnal.gov
[77] 13:40:56 [SUCCESS] cmsstor237.fnal.gov
[78] 13:40:56 [SUCCESS] cmsstor250.fnal.gov
[79] 13:40:56 [SUCCESS] cmsstor228.fnal.gov
[80] 13:40:57 [SUCCESS] cmsstor227.fnal.gov
[81] 13:41:00 [SUCCESS] cmsstor262.fnal.gov
[82] 13:41:00 [SUCCESS] cmsstor273.fnal.gov
[83] 13:41:00 [SUCCESS] cmsstor269.fnal.gov
[84] 13:41:00 [SUCCESS] cmsstor265.fnal.gov
[85] 13:41:00 [SUCCESS] cmsstor261.fnal.gov
[86] 13:41:00 [SUCCESS] cmsstor270.fnal.gov
[87] 13:41:00 [SUCCESS] cmsstor276.fnal.gov
[88] 13:41:00 [SUCCESS] cmsstor263.fnal.gov
[89] 13:41:00 [SUCCESS] cmsstor275.fnal.gov
[90] 13:41:00 [SUCCESS] cmsstor268.fnal.gov
[91] 13:41:00 [SUCCESS] cmsstor271.fnal.gov [92] 13:41:00 [SUCCESS] cmsstor267.fnal.gov [93] 13:41:00 [SUCCESS] cmsstor274.fnal.gov [94] 13:41:00 [SUCCESS] cmsstor272.fnal.gov [95] 13:41:00 [SUCCESS] cmsstor266.fnal.gov [96] 13:41:00 [SUCCESS] cmsstor277.fnal.gov [97] 13:41:00 [SUCCESS] cmsstor278.fnal.gov [98] 13:41:00 [SUCCESS] cmsstor280.fnal.gov [99] 13:41:00 [SUCCESS] cmsstor279.fnal.gov [100] 13:41:01 [SUCCESS] cmsstor311.fnal.gov [101] 13:41:01 [SUCCESS] cmsstor309.fnal.gov [102] 13:41:01 [SUCCESS] cmsstor264.fnal.gov [103] 13:41:01 [SUCCESS] cmsstor317.fnal.gov [104] 13:41:01 [SUCCESS] cmsstor313.fnal.gov [105] 13:41:01 [SUCCESS] cmsstor310.fnal.gov [106] 13:41:01 [SUCCESS] cmsstor319.fnal.gov [107] 13:41:01 [SUCCESS] cmsstor315.fnal.gov [108] 13:41:01 [SUCCESS] cmsstor312.fnal.gov [109] 13:41:01 [SUCCESS] cmsstor314.fnal.gov [110] 13:41:01 [SUCCESS] cmsstor318.fnal.gov [111] 13:41:01 [SUCCESS] cmsstor320.fnal.gov [112] 13:41:01 [SUCCESS] cmsstor316.fnal.gov [113] 13:41:01 [SUCCESS] cmsstor321.fnal.gov [114] 13:41:01 [SUCCESS] cmsstor322.fnal.gov [115] 13:41:03 [SUCCESS] cmsstor281.fnal.gov [116] 13:41:04 [SUCCESS] cmsstor283.fnal.gov [117] 13:41:04 [SUCCESS] cmsstor285.fnal.gov [118] 13:41:04 [SUCCESS] cmsstor284.fnal.gov [119] 13:41:04 [SUCCESS] cmsstor288.fnal.gov [120] 13:41:04 [SUCCESS] cmsstor290.fnal.gov [121] 13:41:04 [SUCCESS] cmsstor289.fnal.gov [122] 13:41:04 [SUCCESS] cmsstor286.fnal.gov [123] 13:41:04 [SUCCESS] cmsstor282.fnal.gov [124] 13:41:04 [SUCCESS] cmsstor292.fnal.gov [125] 13:41:04 [SUCCESS] cmsstor287.fnal.gov [126] 13:41:04 [SUCCESS] cmsstor294.fnal.gov [127] 13:41:04 [SUCCESS] cmsstor291.fnal.gov [128] 13:41:05 [SUCCESS] cmsstor293.fnal.gov [129] 13:41:05 [SUCCESS] cmsstor323.fnal.gov [130] 13:41:05 [SUCCESS] cmsstor324.fnal.gov [131] 13:41:07 [SUCCESS] cmsstor329.fnal.gov [132] 13:41:08 [SUCCESS] cmsstor326.fnal.gov [133] 13:41:08 [SUCCESS] cmsstor327.fnal.gov [134] 13:41:08 [SUCCESS] cmsstor325.fnal.gov [135] 13:41:08 [SUCCESS] 
cmsstor337.fnal.gov [136] 13:41:08 [SUCCESS] cmsstor333.fnal.gov [137] 13:41:08 [SUCCESS] cmsstor332.fnal.gov [138] 13:41:08 [SUCCESS] cmsstor328.fnal.gov [139] 13:41:08 [SUCCESS] cmsstor338.fnal.gov [140] 13:41:08 [SUCCESS] cmsstor331.fnal.gov [141] 13:41:08 [SUCCESS] cmsstor336.fnal.gov [142] 13:41:08 [SUCCESS] cmsstor330.fnal.gov [143] 13:41:08 [SUCCESS] cmsstor334.fnal.gov [144] 13:41:08 [SUCCESS] cmsstor339.fnal.gov [145] 13:41:08 [SUCCESS] cmsstor335.fnal.gov [146] 13:41:08 [SUCCESS] cmsstor340.fnal.gov [147] 13:41:08 [SUCCESS] cmsstor341.fnal.gov [148] 13:41:09 [SUCCESS] cmsstor342.fnal.gov [149] 13:41:09 [SUCCESS] cmsstor343.fnal.gov [150] 13:41:09 [SUCCESS] cmsstor345.fnal.gov [151] 13:41:09 [SUCCESS] cmsstor344.fnal.gov [152] 13:41:09 [SUCCESS] cmsstor346.fnal.gov [153] 13:41:09 [SUCCESS] cmsstor349.fnal.gov [154] 13:41:09 [SUCCESS] cmsstor350.fnal.gov [155] 13:41:09 [SUCCESS] cmsstor348.fnal.gov [156] 13:41:09 [SUCCESS] cmsstor369.fnal.gov [157] 13:41:09 [SUCCESS] cmsstor354.fnal.gov [158] 13:41:09 [SUCCESS] cmsstor351.fnal.gov [159] 13:41:09 [SUCCESS] cmsstor352.fnal.gov [160] 13:41:09 [SUCCESS] cmsstor353.fnal.gov [161] 13:41:09 [SUCCESS] cmsstor347.fnal.gov [162] 13:41:09 [SUCCESS] cmsstor370.fnal.gov [163] 13:41:09 [SUCCESS] cmsstor356.fnal.gov [164] 13:41:09 [SUCCESS] cmsstor355.fnal.gov [165] 13:41:09 [SUCCESS] cmsstor357.fnal.gov [166] 13:41:10 [SUCCESS] cmsstor358.fnal.gov [167] 13:41:11 [SUCCESS] cmsstor359.fnal.gov [168] 13:41:12 [SUCCESS] cmsstor368.fnal.gov [169] 13:41:12 [SUCCESS] cmsstor364.fnal.gov [170] 13:41:12 [SUCCESS] cmsstor362.fnal.gov [171] 13:41:12 [SUCCESS] cmsstor363.fnal.gov [172] 13:41:12 [SUCCESS] cmsstor360.fnal.gov [173] 13:41:12 [SUCCESS] cmsstor361.fnal.gov [174] 13:41:13 [SUCCESS] cmsstor365.fnal.gov [175] 13:41:13 [SUCCESS] cmsstor372.fnal.gov [176] 13:41:13 [SUCCESS] cmsstor366.fnal.gov [177] 13:41:13 [SUCCESS] cmsstor371.fnal.gov [178] 13:41:13 [SUCCESS] cmsstor373.fnal.gov [179] 13:41:13 [SUCCESS] cmsstor374.fnal.gov 
[180] 13:41:14 [SUCCESS] cmsstor375.fnal.gov
[181] 13:41:14 [SUCCESS] cmsstor402.fnal.gov
[182] 13:41:14 [SUCCESS] cmsstor401.fnal.gov
[183] 13:41:14 [SUCCESS] cmsstor404.fnal.gov
[184] 13:41:14 [SUCCESS] cmsstor410.fnal.gov
[185] 13:41:14 [SUCCESS] cmsstor406.fnal.gov
[186] 13:41:14 [SUCCESS] cmsstor409.fnal.gov
[187] 13:41:15 [SUCCESS] cmsstor403.fnal.gov
[188] 13:41:15 [SUCCESS] cmsstor408.fnal.gov
[189] 13:41:15 [SUCCESS] cmsstor405.fnal.gov
[190] 13:41:15 [SUCCESS] cmsstor407.fnal.gov
[191] 13:41:15 [SUCCESS] cmsstor376.fnal.gov
[root@cmsadmin1 Aug-26-2015]#
[root@cmsadmin1 Aug-26-2015]# ls -l check_pools.err/ | grep -v "0 Aug"
total 40
-rw-r--r-- 1 root root 68 Aug 26 13:40 cmsstor169.fnal.gov
-rw-r--r-- 1 root root 76 Aug 26 13:40 cmsstor214.fnal.gov
-rw-r--r-- 1 root root 76 Aug 26 13:40 cmsstor217.fnal.gov
-rw-r--r-- 1 root root 76 Aug 26 13:40 cmsstor227.fnal.gov
-rw-r--r-- 1 root root 76 Aug 26 13:40 cmsstor245.fnal.gov
-rw-r--r-- 1 root root 76 Aug 26 13:40 cmsstor274.fnal.gov
-rw-r--r-- 1 root root 76 Aug 26 13:40 cmsstor284.fnal.gov
-rw-r--r-- 1 root root 76 Aug 26 13:40 cmsstor318.fnal.gov
-rw-r--r-- 1 root root 76 Aug 26 13:40 cmsstor321.fnal.gov
-rw-r--r-- 1 root root 76 Aug 26 13:41 cmsstor347.fnal.gov
[root@cmsadmin1 Aug-26-2015]# ls -l check_pools.err/ | grep -v "0 Aug" > non-zero-error.pools
[root@cmsadmin1 Aug-26-2015]# ls -l check_pools.err/ | grep -v "0 Aug" | awk '{print $NF}'> non-zero-error.pools
[root@cmsadmin1 Aug-26-2015]# cat check_pools.err/cmsstor227.fnal.gov
Warning: No xauth data; using fake authentication data for X11 forwarding.
[root@cmsadmin1 Aug-26-2015]# cat check_pools.err/cmsstor169.fnal.gov
ssh: connect to host cmsstor169.fnal.gov port 22: No route to host
[root@cmsadmin1 Aug-26-2015]# for f in check_pools.log/*; do tail -1 $f; done | sort -u
2.2.29
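The `ls -l | grep -v "0 Aug"` filter above works but depends on the listing format, and it also reports hosts whose only stderr output was the harmless xauth warning. A minimal sketch of a sturdier filter follows. The function name `nonzero_error_hosts` is hypothetical, and skipping the xauth warning is an assumption about what counts as a real error here, not something decided in this ticket.

```shell
#!/bin/bash
# Hypothetical helper: print the names of pssh stderr files that contain
# at least one line other than the X11 "No xauth data" warning.
nonzero_error_hosts() {
    local errdir="$1"
    local f
    for f in "$errdir"/*; do
        # Skip empty files (and the unexpanded glob if the directory is empty).
        [ -s "$f" ] || continue
        # grep -v selects lines that do NOT match; -q only sets the exit status.
        if grep -qv 'No xauth data' "$f"; then
            basename "$f"
        fi
    done
}
```

Run against the directory given to `pssh -e`, for example `nonzero_error_hosts check_pools.err > non-zero-error.pools`, this would have left only cmsstor169 in the list for the run above, judging by the two error files shown.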
#10 Updated by Natalia Ratnikova over 5 years ago
Power-cycled cmsstor169, which came up in maintenance mode after the reboot.
#11 Updated by Natalia Ratnikova over 5 years ago
Started agents. Sent email to cms-t1.
Clearing any remaining alarms in check_mk for the dcache disk group.
Natalia Ratnikova:
I see CRIT errors in check_mk for some pools:
CRIT - 32 CRIT messages (Last worst: "Aug 26 13:36:53 cmsstor261 kernel: ACPI Error: No handler for Region [SACS] (ffff8810252c3420) [PCI_Config] (20090903/evregion-331)")
and some warnings about wrong eth speed
Gerard Bernabeu:
that’s a ‘discardable’ error that this thing complains about after each reboot... You can clean them all
the network thing needs some more attention
I’m fixing the same on some tape pools by doing ifdown; ifup
#12 Updated by Natalia Ratnikova over 5 years ago
Test the fix suggested by Gerard on cmsstor168 (need to give the bond0 interface as an argument).
Rerun check to make sure alarm is gone.
Get the list of pool servers with warnings from check_mk manually:
root@cmsadmin1:/root/natalia/Aug-26-2015/wrong_speed.list
Construct pssh for those pools
[root@cmsadmin1 Aug-26-2015]# pssh -h wrong_speed.list -l root -t0 -p 12 -o fix_speed.log -e fix_speed.err 'ifdown bond0; ifup bond0'
[1] 16:23:46 [SUCCESS] cmsstor168.fnal.gov
[2] 16:23:47 [SUCCESS] cmsstor182.fnal.gov
[3] 16:23:47 [SUCCESS] cmsstor210.fnal.gov
[4] 16:23:47 [SUCCESS] cmsstor234.fnal.gov
[5] 16:23:47 [SUCCESS] cmsstor208.fnal.gov
[6] 16:23:47 [SUCCESS] cmsstor220.fnal.gov
[7] 16:23:48 [SUCCESS] cmsstor223.fnal.gov
[8] 16:23:48 [SUCCESS] cmsstor243.fnal.gov
[9] 16:23:48 [SUCCESS] cmsstor226.fnal.gov
[10] 16:23:48 [SUCCESS] cmsstor247.fnal.gov
[11] 16:23:48 [SUCCESS] cmsstor230.fnal.gov
[12] 16:23:49 [SUCCESS] cmsstor207.fnal.gov
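A [SUCCESS] from pssh only means `ifdown`/`ifup` exited cleanly; it does not show that the link came back at the right speed. A minimal sketch of a follow-up check is below. The helper names `link_speed_mbps` and `check_speed` are hypothetical, and the expected speed is a parameter because the ticket does not state what the correct value is.

```shell
#!/bin/bash
# Hypothetical helper: read `ethtool <iface>` output on stdin and print the
# numeric link speed in Mb/s. Prints nothing if the speed is unknown.
link_speed_mbps() {
    sed -n 's/^[[:space:]]*Speed:[[:space:]]*\([0-9][0-9]*\)Mb\/s.*/\1/p'
}

# Hypothetical wrapper: succeed only if the interface reports the expected
# speed. Both arguments are supplied by the caller.
check_speed() {
    local iface="$1" expected="$2" actual
    actual=$(ethtool "$iface" | link_speed_mbps)
    [ "$actual" = "$expected" ]
}
```

On a pool node this could be called as `check_speed bond0 <expected Mb/s>`, with the expected value taken from whatever check_mk considers correct for that host. The same check could be pushed out with pssh against wrong_speed.list before clearing the alarms.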
#13 Updated by Natalia Ratnikova over 5 years ago
- Status changed from Assigned to Resolved