Task #10596

Perform burn-in and add cmsstor411-420 in dCache-disk

Added by Gerard Bernabeu Altayo about 4 years ago. Updated about 4 years ago.

Status: Resolved
Priority: Normal
Start date: 10/20/2015
Due date: 10/27/2015
% Done: 100%
Estimated time: 8.00 h
Spent time:
Duration: 8

Description

Hi Chih-Hao,

I'd like you to follow the procedure described at https://cmsweb.fnal.gov/bin/view/Storage/PoolAdd to add the 10 nodes cmsstor411-420 as dCache pools in dCache-disk.

This will still leave 12 nodes out of production (probably for our new dCache version testbed).

Please make sure that you run the pool burn-in tests before adding these nodes to production.

If you want, feel free to improve the procedures as you follow them (e.g., to make them more copy-and-paste friendly).

Thanks,
Gerard

History

#1 Updated by Chih-Hao Huang about 4 years ago

  • Tracker changed from Bug to Task
  • Status changed from New to Assigned
  • % Done changed from 0 to 10

Reshooting cmsstor411 - cmsstor420 to disk-itb.
Encountered a problem: disk-itb configures the system assuming three disks (three pools), while these nodes have only two disks (two pools).
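
A mismatch like this can be spotted on a node by comparing the mount points listed in fstab with the directories that actually exist. The following is only a minimal sketch: the function name is hypothetical and not part of the PoolAdd procedure, and it assumes fstab-style input on stdin with pool filesystems mounted under a common prefix such as /storage.

```shell
#!/bin/bash
# Hypothetical helper: read fstab-style lines from stdin and print every
# mount point under the given prefix that does not exist as a directory.
missing_mount_points() {
    local prefix=$1 dev mnt rest
    while read -r dev mnt rest; do
        case $dev in ''|'#'*) continue ;; esac   # skip blank lines and comments
        case $mnt in
            "$prefix"/*) [ -d "$mnt" ] || echo "$mnt" ;;
        esac
    done
}
```

Usage on a pool node would be something like `missing_mount_points /storage < /etc/fstab`.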

#2 Updated by Chih-Hao Huang about 4 years ago

Made cmsstor411.fnal.gov.yaml a link to cmsstor401.fnal.gov.yaml and reshot cmsstor411.
It took two attempts to get it right.
Will do the rest the same way.
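
Repeating the link step for the remaining nodes could be scripted roughly as below. This is a sketch under assumptions: the function name is hypothetical, the directory holding the host YAML files is passed in by the caller, and `ln -sfn` will replace an existing file of the same name, so it should be run only on a checkout where that is acceptable.

```shell
#!/bin/bash
# Hypothetical helper: in directory $1, make cmsstor<N>.fnal.gov.yaml a
# symlink to the template file $2 for every N from $3 to $4 (inclusive).
link_host_yaml() {
    local dir=$1 template=$2 first=$3 last=$4 n
    for n in $(seq "$first" "$last"); do
        # The link target is relative, so it resolves inside $dir.
        ln -sfn "$template" "$dir/cmsstor$n.fnal.gov.yaml"
    done
}
```

For example, `link_host_yaml hosts cmsstor401.fnal.gov.yaml 412 420` would cover the nodes after cmsstor411.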

#3 Updated by Chih-Hao Huang about 4 years ago

  • % Done changed from 10 to 20

Applied for Grid Admin access.
Created host certificates for them.

#4 Updated by Chih-Hao Huang about 4 years ago

  • % Done changed from 20 to 70

[1] Reconfigured them again using disk_itb/production and set the correct check_mk roles.
[2] Created a branch and put cmsstor152 on it, so that all of cmsstor411 - cmsstor420 are in burninPools.
[3] Started the burn-in test.

Perhaps we should have a semi-permanent test branch that does not go into itb and is not affected by other procedural constraints.

#5 Updated by Chih-Hao Huang about 4 years ago

  • % Done changed from 70 to 80

Did a migration copy from cmsstor155 to all of cmsstor411-420.
5 TB at about 240 MB/s took about 6 hours to finish, with no errors.
Now running migration copies from each of the pools to the others.
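
As a sanity check on the figures above, the expected duration can be worked out from the volume and the rate. This is a small sketch, assuming decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes); the function name is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: hours needed to move $1 terabytes at $2 megabytes
# per second, assuming decimal units.
transfer_hours() {
    awk -v tb="$1" -v mbps="$2" \
        'BEGIN { printf "%.1f\n", tb * 1e12 / (mbps * 1e6) / 3600 }'
}

transfer_hours 5 240   # prints 5.8
```

The result of roughly 5.8 hours agrees with the "about 6 hours" observed.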

#6 Updated by Gerard Bernabeu Altayo about 4 years ago

Hi,

I wanted to run some performance tests on one of the machines and I'm finding very weird things on cmsstor411; it looks like something is not set up properly. Did you have to do anything that was not in the procedure? Do you have a log of the exact commands anywhere?

I'm finding potential disk label corruption:

mount: special device LABEL=dcache-disk1 does not exist
mount: special device LABEL=dcache-disk1 does not exist
mount: special device LABEL=dcache-disk1 does not exist
mount: special device LABEL=dcache-disk1 does not exist
[root@cmsstor411 ~]# xfs_admin -l /dev/sdb
warning: AG 1 label differs
warning: AG 2 label differs
warning: AG 3 label differs
warning: AG 4 label differs
warning: AG 5 label differs
warning: AG 6 label differs
warning: AG 7 label differs
warning: AG 8 label differs
warning: AG 9 label differs
warning: AG 10 label differs
warning: AG 11 label differs
warning: AG 12 label differs
warning: AG 13 label differs
warning: AG 14 label differs
warning: AG 15 label differs
warning: AG 16 label differs
warning: AG 17 label differs
warning: AG 18 label differs
warning: AG 19 label differs
warning: AG 20 label differs
warning: AG 21 label differs
warning: AG 22 label differs
warning: AG 23 label differs
warning: AG 24 label differs
warning: AG 25 label differs
warning: AG 26 label differs
warning: AG 27 label differs
warning: AG 28 label differs
warning: AG 29 label differs
warning: AG 30 label differs
warning: AG 31 label differs
warning: AG 32 label differs
warning: AG 33 label differs
warning: AG 34 label differs
warning: AG 35 label differs
warning: AG 36 label differs
warning: AG 37 label differs
warning: AG 38 label differs
warning: AG 39 label differs
warning: AG 40 label differs
warning: AG 41 label differs
warning: AG 42 label differs
warning: AG 43 label differs
warning: AG 44 label differs
warning: AG 45 label differs
warning: AG 46 label differs
warning: AG 47 label differs
warning: AG 48 label differs
warning: AG 49 label differs
warning: AG 50 label differs
warning: AG 51 label differs
warning: AG 52 label differs
warning: AG 53 label differs
warning: AG 54 label differs
warning: AG 55 label differs
warning: AG 56 label differs
warning: AG 57 label differs
warning: AG 58 label differs
warning: AG 59 label differs
warning: AG 60 label differs
warning: AG 61 label differs
warning: AG 62 label differs
warning: AG 63 label differs
warning: AG 64 label differs
warning: AG 65 label differs
warning: AG 66 label differs
warning: AG 67 label differs
warning: AG 68 label differs
warning: AG 69 label differs
warning: AG 70 label differs
label = "dcache-disk1"

This could be an issue similar to last time (overlapping LUNs), or maybe the filesystems were created incorrectly. We must sit down and look at this before continuing. I see the same on cmsstor412.
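
To compare nodes quickly, the warnings in the `xfs_admin -l` output (AG 1 through AG 70 in the listing above) can be counted rather than read. A minimal sketch, assuming the warning text has exactly the form shown above; the function name is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: count "AG <n> label differs" warnings in
# xfs_admin -l output read from stdin.
count_label_warnings() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps the
    # function usable under "set -e" while still printing the count.
    grep -c '^warning: AG [0-9][0-9]* label differs$' || true
}
```

On a node this could be used as `xfs_admin -l /dev/sdb 2>&1 | count_label_warnings`; a count of 0 would mean no allocation group disagreed with the primary label.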

And the setup is not quite right:
[root@cmsstor411 ~]# ll /storage/data2/write-pool/
total 16
drwxr-xr-x 2 root root 12288 Oct 30 11:26 data
drwxr-xr-x 2 root root 70 Oct 30 11:26 meta
lrwxrwxrwx 1 root root 10 Oct 26 17:39 setup -> setup-disk

The 'setup' file should NOT be a link, let alone a link to a non-existent file!
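
A dangling link like this one can be detected with two standard shell file tests. The sketch below is illustrative only; the function name is an assumption, and the path in the usage line is the one from the listing above.

```shell
#!/bin/bash
# Hypothetical helper: succeed if $1 is a symbolic link whose target
# does not exist.
is_dangling_link() {
    # -L is true for any symlink; -e follows the link and fails if the
    # target is missing.
    [ -L "$1" ] && [ ! -e "$1" ]
}
```

For example: `is_dangling_link /storage/data2/write-pool/setup && echo "setup is a broken link"`.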

#7 Updated by Chih-Hao Huang about 4 years ago

On cmsadmin1

sh-4.1$ cat newpools
cmsstor411
cmsstor412
cmsstor413
cmsstor414
cmsstor415
cmsstor416
cmsstor417
cmsstor418
cmsstor419
cmsstor420
sh-4.1$ cat before_reshooting.sh
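#!/bin/bash

# Run the same command string on every host given on the command line.
# WARNING: mkfs.xfs -f wipes /dev/sdb and /dev/sdc on each host.
cmd="puppet agent --disable 'pre reshoot'; service dcache-server stop; umount -a; mkfs.xfs -f -L dcache-disk1 /dev/sdb; mkfs.xfs -f -L dcache-disk2 /dev/sdc; mount -a; df -h"

for i in "$@"
do
    echo "$i:"
    echo "ssh -l root $i $cmd"
    ssh -l root "$i" "$cmd"
done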
sh-4.1$
sh-4.1$ ./before_reshooting.sh `cat newpools` > before_reshooting.log 2>&1
sh-4.1$ cat before_reshooting.log
cmsstor411:
ssh -l root cmsstor411 puppet agent --disable 'pre reshoot'; service dcache-server stop; umount -a; mkfs.xfs -f -L dcache-disk1 /dev/sdb; mkfs.xfs -f -L dcache-disk2 /dev/sdc; mount -a; df -h
umount: /: device is busy.
(In some cases useful info about processes that use
the device is found by lsof(8) or fuser(1))
meta-data=/dev/sdb isize=256 agcount=71, agsize=268435455 blks = sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=18852959008, imaxpct=1 = sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=521728, version=2 = sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
meta-data=/dev/sdc isize=256 agcount=71, agsize=268435455 blks = sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=18852959008, imaxpct=1 = sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=521728, version=2 = sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 901G 2.6G 852G 1% /
/dev/sda1 976M 41M 884M 5% /boot
tmpfs 32G 0 32G 0% /dev/shm
/dev/sdb 71T 35M 71T 1% /storage/data1
/dev/sdc 71T 35M 71T 1% /storage/data2
cmsstor412:
ssh -l root cmsstor412 puppet agent --disable 'pre reshoot'; service dcache-server stop; umount -a; mkfs.xfs -f -L dcache-disk1 /dev/sdb; mkfs.xfs -f -L dcache-disk2 /dev/sdc; mount -a; df -h
Stopping gridftp-cmsstor412Domain 0 done
Stopping w-cmsstor412-disk_itb-disk2Domain 0 1 done
Stopping w-cmsstor412-disk_itb-disk1Domain 0 1 done
umount: /: device is busy.
(In some cases useful info about processes that use
the device is found by lsof(8) or fuser(1))
meta-data=/dev/sdb isize=256 agcount=71, agsize=268435455 blks = sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=18852959008, imaxpct=1 = sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=521728, version=2 = sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
meta-data=/dev/sdc isize=256 agcount=71, agsize=268435455 blks = sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=18852959008, imaxpct=1 = sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=521728, version=2 = sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
mount: mount point /storage/data3 does not exist
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 901G 3.1G 852G 1% /
/dev/sda1 976M 41M 884M 5% /boot
tmpfs 32G 0 32G 0% /dev/shm
/dev/sdb 71T 35M 71T 1% /storage/data1
/dev/sdc 71T 35M 71T 1% /storage/data2
cmsstor413:
ssh -l root cmsstor413 puppet agent --disable 'pre reshoot'; service dcache-server stop; umount -a; mkfs.xfs -f -L dcache-disk1 /dev/sdb; mkfs.xfs -f -L dcache-disk2 /dev/sdc; mount -a; df -h
Stopping gridftp-cmsstor413Domain 0 done
Stopping w-cmsstor413-disk_itb-disk2Domain 0 1 done
Stopping w-cmsstor413-disk_itb-disk1Domain 0 1 done
umount: /: device is busy.
(In some cases useful info about processes that use
the device is found by lsof(8) or fuser(1))
meta-data=/dev/sdb isize=256 agcount=71, agsize=268435455 blks = sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=18852959008, imaxpct=1 = sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=521728, version=2 = sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
meta-data=/dev/sdc isize=256 agcount=71, agsize=268435455 blks = sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=18852959008, imaxpct=1 = sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=521728, version=2 = sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 901G 2.2G 853G 1% /
/dev/sda1 976M 41M 884M 5% /boot
tmpfs 32G 0 32G 0% /dev/shm
/dev/sdb 71T 35M 71T 1% /storage/data1
/dev/sdc 71T 35M 71T 1% /storage/data2
cmsstor414:
ssh -l root cmsstor414 puppet agent --disable 'pre reshoot'; service dcache-server stop; umount -a; mkfs.xfs -f -L dcache-disk1 /dev/sdb; mkfs.xfs -f -L dcache-disk2 /dev/sdc; mount -a; df -h
Stopping gridftp-cmsstor414Domain 0 done
Stopping w-cmsstor414-disk_itb-disk2Domain 0 1 2 3 done
Stopping w-cmsstor414-disk_itb-disk1Domain 0 1 2 3 done
umount: /: device is busy.
(In some cases useful info about processes that use
the device is found by lsof(8) or fuser(1))
meta-data=/dev/sdb isize=256 agcount=71, agsize=268435455 blks = sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=18852959008, imaxpct=1 = sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=521728, version=2 = sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
meta-data=/dev/sdc isize=256 agcount=71, agsize=268435455 blks = sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=18852959008, imaxpct=1 = sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=521728, version=2 = sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
mount: mount point /storage/data3 does not exist
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 901G 3.1G 852G 1% /
/dev/sda1 976M 41M 884M 5% /boot
tmpfs 32G 0 32G 0% /dev/shm
/dev/sdb 71T 35M 71T 1% /storage/data1
/dev/sdc 71T 35M 71T 1% /storage/data2
cmsstor415:
ssh -l root cmsstor415 puppet agent --disable 'pre reshoot'; service dcache-server stop; umount -a; mkfs.xfs -f -L dcache-disk1 /dev/sdb; mkfs.xfs -f -L dcache-disk2 /dev/sdc; mount -a; df -h
Stopping gridftp-cmsstor415Domain 0 1 done
Stopping w-cmsstor415-disk_itb-disk2Domain 0 1 2 3 done
Stopping w-cmsstor415-disk_itb-disk1Domain 0 1 2 3 done
umount: /: device is busy.
(In some cases useful info about processes that use
the device is found by lsof(8) or fuser(1))
meta-data=/dev/sdb isize=256 agcount=71, agsize=268435455 blks = sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=18852959008, imaxpct=1 = sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=521728, version=2 = sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
meta-data=/dev/sdc isize=256 agcount=71, agsize=268435455 blks = sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=18852959008, imaxpct=1 = sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=521728, version=2 = sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
mount: mount point /storage/data3 does not exist
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 901G 3.1G 852G 1% /
/dev/sda1 976M 41M 884M 5% /boot
tmpfs 32G 0 32G 0% /dev/shm
/dev/sdb 71T 35M 71T 1% /storage/data1
/dev/sdc 71T 35M 71T 1% /storage/data2
cmsstor416:
ssh -l root cmsstor416 puppet agent --disable 'pre reshoot'; service dcache-server stop; umount -a; mkfs.xfs -f -L dcache-disk1 /dev/sdb; mkfs.xfs -f -L dcache-disk2 /dev/sdc; mount -a; df -h
Stopping gridftp-cmsstor416Domain 0 done
Stopping w-cmsstor416-disk_itb-disk2Domain 0 1 2 3 done
Stopping w-cmsstor416-disk_itb-disk1Domain 0 1 2 3 done
umount: /: device is busy.
(In some cases useful info about processes that use
the device is found by lsof(8) or fuser(1))
meta-data=/dev/sdb isize=256 agcount=71, agsize=268435455 blks = sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=18852959008, imaxpct=1 = sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=521728, version=2 = sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
meta-data=/dev/sdc isize=256 agcount=71, agsize=268435455 blks = sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=18852959008, imaxpct=1 = sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=521728, version=2 = sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
mount: mount point /storage/data3 does not exist
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 901G 3.1G 852G 1% /
/dev/sda1 976M 41M 884M 5% /boot
tmpfs 32G 0 32G 0% /dev/shm
/dev/sdb 71T 35M 71T 1% /storage/data1
/dev/sdc 71T 35M 71T 1% /storage/data2
cmsstor417:
ssh -l root cmsstor417 puppet agent --disable 'pre reshoot'; service dcache-server stop; umount -a; mkfs.xfs -f -L dcache-disk1 /dev/sdb; mkfs.xfs -f -L dcache-disk2 /dev/sdc; mount -a; df -h
umount: /: device is busy.
(In some cases useful info about processes that use
the device is found by lsof(8) or fuser(1))
meta-data=/dev/sdb isize=256 agcount=71, agsize=268435455 blks = sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=18852959008, imaxpct=1 = sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=521728, version=2 = sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
meta-data=/dev/sdc isize=256 agcount=71, agsize=268435455 blks = sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=18852959008, imaxpct=1 = sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=521728, version=2 = sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
mount: mount point /storage/data3 does not exist
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 901G 3.1G 852G 1% /
/dev/sda1 976M 41M 884M 5% /boot
tmpfs 32G 0 32G 0% /dev/shm
/dev/sdb 71T 35M 71T 1% /storage/data1
/dev/sdc 71T 35M 71T 1% /storage/data2
cmsstor418:
ssh -l root cmsstor418 puppet agent --disable 'pre reshoot'; service dcache-server stop; umount -a; mkfs.xfs -f -L dcache-disk1 /dev/sdb; mkfs.xfs -f -L dcache-disk2 /dev/sdc; mount -a; df -h
Stopping gridftp-cmsstor418Domain 0 1 done
Stopping w-cmsstor418-disk_itb-disk2Domain 0 1 2 3 done
Stopping w-cmsstor418-disk_itb-disk1Domain 0 1 2 3 done
umount: /: device is busy.
(In some cases useful info about processes that use
the device is found by lsof(8) or fuser(1))
meta-data=/dev/sdb isize=256 agcount=71, agsize=268435455 blks = sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=18852959008, imaxpct=1 = sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=521728, version=2 = sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
meta-data=/dev/sdc isize=256 agcount=71, agsize=268435455 blks = sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=18852959008, imaxpct=1 = sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=521728, version=2 = sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
mount: mount point /storage/data3 does not exist
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 901G 3.1G 852G 1% /
/dev/sda1 976M 41M 884M 5% /boot
tmpfs 32G 0 32G 0% /dev/shm
/dev/sdb 71T 35M 71T 1% /storage/data1
/dev/sdc 71T 35M 71T 1% /storage/data2
cmsstor419:
ssh -l root cmsstor419 puppet agent --disable 'pre reshoot'; service dcache-server stop; umount -a; mkfs.xfs -f -L dcache-disk1 /dev/sdb; mkfs.xfs -f -L dcache-disk2 /dev/sdc; mount -a; df -h
Stopping gridftp-cmsstor419Domain 0 1 done
Stopping w-cmsstor419-disk_itb-disk2Domain 0 1 2 3 done
Stopping w-cmsstor419-disk_itb-disk1Domain 0 1 2 3 done
umount: /: device is busy.
(In some cases useful info about processes that use
the device is found by lsof(8) or fuser(1))
meta-data=/dev/sdb isize=256 agcount=71, agsize=268435455 blks = sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=18852959008, imaxpct=1 = sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=521728, version=2 = sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
meta-data=/dev/sdc isize=256 agcount=71, agsize=268435455 blks = sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=18852959008, imaxpct=1 = sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=521728, version=2 = sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
mount: mount point /storage/data3 does not exist
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 901G 3.1G 852G 1% /
/dev/sda1 976M 41M 884M 5% /boot
tmpfs 32G 0 32G 0% /dev/shm
/dev/sdb 71T 35M 71T 1% /storage/data1
/dev/sdc 71T 35M 71T 1% /storage/data2
cmsstor420:
ssh -l root cmsstor420 puppet agent --disable 'pre reshoot'; service dcache-server stop; umount -a; mkfs.xfs -f -L dcache-disk1 /dev/sdb; mkfs.xfs -f -L dcache-disk2 /dev/sdc; mount -a; df -h
Stopping gridftp-cmsstor420Domain 0 1 done
Stopping w-cmsstor420-disk_itb-disk2Domain 0 1 2 3 done
Stopping w-cmsstor420-disk_itb-disk1Domain 0 1 2 3 done
umount: /: device is busy.
(In some cases useful info about processes that use
the device is found by lsof(8) or fuser(1))
meta-data=/dev/sdb isize=256 agcount=71, agsize=268435455 blks = sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=18852959008, imaxpct=1 = sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=521728, version=2 = sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
meta-data=/dev/sdc isize=256 agcount=71, agsize=268435455 blks = sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=18852959008, imaxpct=1 = sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=521728, version=2 = sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
mount: mount point /storage/data3 does not exist
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 901G 3.1G 852G 1% /
/dev/sda1 976M 41M 884M 5% /boot
tmpfs 32G 0 32G 0% /dev/shm
/dev/sdb 71T 35M 71T 1% /storage/data1
/dev/sdc 71T 35M 71T 1% /storage/data2
sh-4.1$

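NOTE: the "umount: /: device is busy" message, though annoying, is OK for this purpose. umount -a also tries to unmount the root filesystem, which is always in use.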
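
Because the command string reformats disks, it can be worth previewing exactly what would be sent to each host before running it. The following is a sketch of a dry-run wrapper, not the script used above: the function name and the DRY_RUN variable are assumptions, and dry-run is the default so that nothing destructive happens by accident.

```shell
#!/bin/bash
# Hypothetical wrapper: run one command string on several hosts.
# With DRY_RUN=1 (the default) it only prints what would be executed.
run_on_hosts() {
    local cmd=$1 host
    shift
    for host in "$@"; do
        echo "$host:"
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "ssh -l root $host $cmd"
        else
            # Quoting "$cmd" sends it as one string to the remote shell,
            # so no globbing or word splitting happens locally.
            ssh -l root "$host" "$cmd"
        fi
    done
}
```

A preview would be `run_on_hosts 'df -h' $(cat newpools)`, and the real run `DRY_RUN=0 run_on_hosts 'df -h' $(cat newpools) > run.log 2>&1`.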

#8 Updated by Chih-Hao Huang about 4 years ago

sh-4.1$ cat newpools
cmsstor411
cmsstor412
cmsstor413
cmsstor414
cmsstor415
cmsstor416
cmsstor417
cmsstor418
cmsstor419
cmsstor420
sh-4.1$ cat check_label.sh
#!/bin/bash

# Print the XFS label of both data disks on every host given on the command line.
cmd="xfs_admin -l /dev/sdb; xfs_admin -l /dev/sdc"

for i in "$@"
do
    echo "$i:"
    echo "ssh -l root $i $cmd"
    ssh -l root "$i" "$cmd"
done
sh-4.1$ ./check_label.sh `cat newpools` > check_label.log 2>&1
sh-4.1$ cat check_label.log
cmsstor411:
ssh -l root cmsstor411 xfs_admin -l /dev/sdb; xfs_admin -l /dev/sdc
label = "dcache-disk1"
label = "dcache-disk2"
cmsstor412:
ssh -l root cmsstor412 xfs_admin -l /dev/sdb; xfs_admin -l /dev/sdc
label = "dcache-disk1"
label = "dcache-disk2"
cmsstor413:
ssh -l root cmsstor413 xfs_admin -l /dev/sdb; xfs_admin -l /dev/sdc
label = "dcache-disk1"
label = "dcache-disk2"
cmsstor414:
ssh -l root cmsstor414 xfs_admin -l /dev/sdb; xfs_admin -l /dev/sdc
label = "dcache-disk1"
label = "dcache-disk2"
cmsstor415:
ssh -l root cmsstor415 xfs_admin -l /dev/sdb; xfs_admin -l /dev/sdc
label = "dcache-disk1"
label = "dcache-disk2"
cmsstor416:
ssh -l root cmsstor416 xfs_admin -l /dev/sdb; xfs_admin -l /dev/sdc
label = "dcache-disk1"
label = "dcache-disk2"
cmsstor417:
ssh -l root cmsstor417 xfs_admin -l /dev/sdb; xfs_admin -l /dev/sdc
label = "dcache-disk1"
label = "dcache-disk2"
cmsstor418:
ssh -l root cmsstor418 xfs_admin -l /dev/sdb; xfs_admin -l /dev/sdc
label = "dcache-disk1"
label = "dcache-disk2"
cmsstor419:
ssh -l root cmsstor419 xfs_admin -l /dev/sdb; xfs_admin -l /dev/sdc
label = "dcache-disk1"
label = "dcache-disk2"
cmsstor420:
ssh -l root cmsstor420 xfs_admin -l /dev/sdb; xfs_admin -l /dev/sdc
label = "dcache-disk1"
label = "dcache-disk2"
sh-4.1$
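
With ten hosts the log is still easy to read by eye, but the same check can be automated. This is a minimal sketch that assumes the log layout shown above (a `host:` line, then one `label = "..."` line per disk); the function name is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read a check_label.log-style log from stdin and
# print every host that does not report exactly one dcache-disk1 label
# and exactly one dcache-disk2 label.
hosts_with_bad_labels() {
    awk '
        function report() { if (d1 != 1 || d2 != 1) print host }
        /^[A-Za-z0-9.-]+:$/ {
            if (host != "") report()
            host = substr($0, 1, length($0) - 1)
            d1 = 0; d2 = 0
            next
        }
        /^label = "dcache-disk1"$/ { d1++ }
        /^label = "dcache-disk2"$/ { d2++ }
        END { if (host != "") report() }
    '
}
```

Running `hosts_with_bad_labels < check_label.log` prints nothing when every host looks like the listing above.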

#9 Updated by Chih-Hao Huang about 4 years ago

sh-4.1$ cat newpools
cmsstor411
cmsstor412
cmsstor413
cmsstor414
cmsstor415
cmsstor416
cmsstor417
cmsstor418
cmsstor419
cmsstor420
sh-4.1$ cat check_empty_disk.sh
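#!/bin/bash

# List everything under /storage on every host given on the command line.
cmd="ls -lR /storage/*"

for i in "$@"
do
    echo "$i:"
    # Quoting keeps /storage/* from being expanded on the local machine.
    echo "ssh -l root $i $cmd"
    ssh -l root "$i" "$cmd"
done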
sh-4.1$ ./check_empty_disk.sh `cat newpools` > check_empty_disk.log 2>&1
sh-4.1$ cat check_empty_disk.log
cmsstor411:
ssh -l root cmsstor411 ls -lR /storage/*
/storage/data1:
total 0

/storage/data2:
total 0
cmsstor412:
ssh -l root cmsstor412 ls -lR /storage/*
/storage/data1:
total 0

/storage/data2:
total 0
cmsstor413:
ssh -l root cmsstor413 ls -lR /storage/*
/storage/data1:
total 0

/storage/data2:
total 0
cmsstor414:
ssh -l root cmsstor414 ls -lR /storage/*
/storage/data1:
total 0

/storage/data2:
total 0
cmsstor415:
ssh -l root cmsstor415 ls -lR /storage/*
/storage/data1:
total 0

/storage/data2:
total 0
cmsstor416:
ssh -l root cmsstor416 ls -lR /storage/*
/storage/data1:
total 0
drwxr-xr-x 4 root root 71 Nov 2 14:22 write-pool

/storage/data1/write-pool:
total 0
drwxr-xr-x 2 root root 38 Nov 2 14:53 data
-rw-r--r-- 1 root root 0 Nov 2 14:22 lock
drwxr-xr-x 2 root root 122 Nov 2 14:53 meta
lrwxrwxrwx 1 root root 10 Nov 2 14:21 setup -> setup-disk

/storage/data1/write-pool/data:
total 0

/storage/data1/write-pool/meta:
total 4
-rw-r--r-- 1 root root 1396 Nov 2 14:22 00000000.jdb
-rw-r--r-- 1 root root 0 Nov 2 14:22 je.info.0
-rw-r--r-- 1 root root 0 Nov 2 14:22 je.info.0.lck
-rw-r--r-- 1 root root 0 Nov 2 14:22 je.lck

/storage/data2:
total 0
drwxr-xr-x 4 root root 71 Nov 2 14:22 write-pool

/storage/data2/write-pool:
total 0
drwxr-xr-x 2 root root 38 Nov 2 14:53 data
-rw-r--r-- 1 root root 0 Nov 2 14:22 lock
drwxr-xr-x 2 root root 122 Nov 2 14:53 meta
lrwxrwxrwx 1 root root 10 Nov 2 14:21 setup -> setup-disk

/storage/data2/write-pool/data:
total 0

/storage/data2/write-pool/meta:
total 4
-rw-r--r-- 1 root root 1396 Nov 2 14:22 00000000.jdb
-rw-r--r-- 1 root root 0 Nov 2 14:22 je.info.0
-rw-r--r-- 1 root root 0 Nov 2 14:22 je.info.0.lck
-rw-r--r-- 1 root root 0 Nov 2 14:22 je.lck
cmsstor417:
ssh -l root cmsstor417 ls -lR /storage/*
/storage/data1:
total 0

/storage/data2:
total 0
cmsstor418:
ssh -l root cmsstor418 ls -lR /storage/*
/storage/data1:
total 0

/storage/data2:
total 0
cmsstor419:
ssh -l root cmsstor419 ls -lR /storage/*
/storage/data1:
total 0

/storage/data2:
total 0
cmsstor420:
ssh -l root cmsstor420 ls -lR /storage/*
/storage/data1:
total 0

/storage/data2:
total 0
sh-4.1$

NOTE: cmsstor416 is NOT clean.
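
The "is it clean?" question can also be answered directly on a node instead of reading the `ls -lR` output. A sketch, assuming a `find` that supports `-mindepth` and `-quit` (GNU find does); the function name is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeed if no mount point directly under $1
# (e.g. /storage/data1, /storage/data2) contains any entry.
storage_is_clean() {
    local root=$1
    # -mindepth 2 skips $root and the mount points themselves;
    # -quit stops at the first entry found.
    [ -z "$(find "$root" -mindepth 2 -print -quit)" ]
}
```

For example: `storage_is_clean /storage || echo "$(hostname) is NOT clean"`.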

#10 Updated by Chih-Hao Huang about 4 years ago

sh-4.1$ before_reshooting.sh cmsstor416 > before_reshooting-cmsstor416.log 2>&1
sh-4.1$ cat before_reshooting-cmsstor416.log
cmsstor416:
ssh -l root cmsstor416 puppet agent --disable 'pre reshoot'; service dcache-server stop; umount -a; mkfs.xfs -f -L dcache-disk1 /dev/sdb; mkfs.xfs -f -L dcache-disk2 /dev/sdc; mount -a; df -h
Stopping gridftp-cmsstor416Domain 0 done
Stopping w-cmsstor416-disk_itb-disk2Domain 0 1 done
Stopping w-cmsstor416-disk_itb-disk1Domain 0 1 done
umount: /: device is busy.
(In some cases useful info about processes that use
the device is found by lsof(8) or fuser(1))
meta-data=/dev/sdb isize=256 agcount=71, agsize=268435455 blks = sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=18852959008, imaxpct=1 = sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=521728, version=2 = sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
meta-data=/dev/sdc isize=256 agcount=71, agsize=268435455 blks = sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=18852959008, imaxpct=1 = sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=521728, version=2 = sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
mount: mount point /storage/data3 does not exist
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 901G 3.1G 852G 1% /
/dev/sda1 976M 41M 884M 5% /boot
tmpfs 32G 0 32G 0% /dev/shm
/dev/sdb 71T 35M 71T 1% /storage/data1
/dev/sdc 71T 35M 71T 1% /storage/data2
sh-4.1$ check_empty_disk.sh cmsstor416 > check_empty_disk-cmsstor416.log 2>&1
sh-4.1$ cat check_empty_disk-cmsstor416.log
cmsstor416:
ssh -l root cmsstor416 ls -lR /storage/*
/storage/data1:
total 0

/storage/data2:
total 0
sh-4.1$

#11 Updated by Chih-Hao Huang about 4 years ago

sh-4.1$ ./check_empty_disk.sh `cat newpools` > check_empty_disk.log 2>&1
sh-4.1$ cat check_empty_disk.log
cmsstor411:
ssh -l root cmsstor411 ls -lR /storage/*
/storage/data1:
total 0

/storage/data2:
total 0
cmsstor412:
ssh -l root cmsstor412 ls -lR /storage/*
/storage/data1:
total 0

/storage/data2:
total 0
cmsstor413:
ssh -l root cmsstor413 ls -lR /storage/*
/storage/data1:
total 0

/storage/data2:
total 0
cmsstor414:
ssh -l root cmsstor414 ls -lR /storage/*
/storage/data1:
total 0

/storage/data2:
total 0
cmsstor415:
ssh -l root cmsstor415 ls -lR /storage/*
/storage/data1:
total 0

/storage/data2:
total 0
cmsstor416:
ssh -l root cmsstor416 ls -lR /storage/*
/storage/data1:
total 0

/storage/data2:
total 0
cmsstor417:
ssh -l root cmsstor417 ls -lR /storage/*
/storage/data1:
total 0

/storage/data2:
total 0
cmsstor418:
ssh -l root cmsstor418 ls -lR /storage/*
/storage/data1:
total 0

/storage/data2:
total 0
cmsstor419:
ssh -l root cmsstor419 ls -lR /storage/*
/storage/data1:
total 0

/storage/data2:
total 0
cmsstor420:
ssh -l root cmsstor420 ls -lR /storage/*
/storage/data1:
total 0

/storage/data2:
total 0
sh-4.1$

Now it is fine.

#12 Updated by Chih-Hao Huang about 4 years ago

In the ENC, put all nodes into role::dcache::pool::disk: / production

sh-3.2$ for i in `seq 411 420`; do git add cmsstor$i.fnal.gov.yaml; done
sh-3.2$ git commit
[master 23fd083] Put them back to role::dcache::pool::disk:
10 files changed, 10 insertions(+), 11 deletions(-)
sh-3.2$ git pull
Already up-to-date.
sh-3.2$ git push
Counting objects: 7, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 366 bytes | 0 bytes/s, done.
Total 3 (delta 2), reused 0 (delta 0)
remote:
remote: diff-tree:
remote: :100644 100644 802911d88f8fafd4deb88e6fe9589832f60564f7 f2dc9bcd9f6aab69749bd7a30b141b8f66be8941 M hosts/cmsstor411.fnal.gov.yaml
remote: :100644 100644 802911d88f8fafd4deb88e6fe9589832f60564f7 f2dc9bcd9f6aab69749bd7a30b141b8f66be8941 M hosts/cmsstor412.fnal.gov.yaml
remote: :100644 100644 802911d88f8fafd4deb88e6fe9589832f60564f7 f2dc9bcd9f6aab69749bd7a30b141b8f66be8941 M hosts/cmsstor413.fnal.gov.yaml
remote: :100644 100644 802911d88f8fafd4deb88e6fe9589832f60564f7 f2dc9bcd9f6aab69749bd7a30b141b8f66be8941 M hosts/cmsstor414.fnal.gov.yaml
remote: :100644 100644 60d301e9adf3197622f2530a4fedf4779660a7c1 f2dc9bcd9f6aab69749bd7a30b141b8f66be8941 M hosts/cmsstor415.fnal.gov.yaml
remote: :100644 100644 802911d88f8fafd4deb88e6fe9589832f60564f7 f2dc9bcd9f6aab69749bd7a30b141b8f66be8941 M hosts/cmsstor416.fnal.gov.yaml
remote: :100644 100644 802911d88f8fafd4deb88e6fe9589832f60564f7 f2dc9bcd9f6aab69749bd7a30b141b8f66be8941 M hosts/cmsstor417.fnal.gov.yaml
remote: :100644 100644 802911d88f8fafd4deb88e6fe9589832f60564f7 f2dc9bcd9f6aab69749bd7a30b141b8f66be8941 M hosts/cmsstor418.fnal.gov.yaml
remote: :100644 100644 802911d88f8fafd4deb88e6fe9589832f60564f7 f2dc9bcd9f6aab69749bd7a30b141b8f66be8941 M hosts/cmsstor419.fnal.gov.yaml
remote: :100644 100644 802911d88f8fafd4deb88e6fe9589832f60564f7 f2dc9bcd9f6aab69749bd7a30b141b8f66be8941 M hosts/cmsstor420.fnal.gov.yaml
remote: omd-host-crud update cmsstor411 --role pool --instance disk --extra 'UNSET'
remote: error on update: Hostname cmsstor411 does not exist
remote: omd-host-crud update cmsstor412 --role pool --instance disk --extra 'UNSET'
remote: error on update: Hostname cmsstor412 does not exist
remote: omd-host-crud update cmsstor413 --role pool --instance disk --extra 'UNSET'
remote: error on update: Hostname cmsstor413 does not exist
remote: omd-host-crud update cmsstor414 --role pool --instance disk --extra 'UNSET'
remote: error on update: Hostname cmsstor414 does not exist
remote: omd-host-crud update cmsstor415 --role pool --instance disk --extra 'UNSET'
remote: error on update: Hostname cmsstor415 does not exist
remote: omd-host-crud update cmsstor416 --role pool --instance disk --extra 'UNSET'
remote: error on update: Hostname cmsstor416 does not exist
remote: omd-host-crud update cmsstor417 --role pool --instance disk --extra 'UNSET'
remote: error on update: Hostname cmsstor417 does not exist
remote: omd-host-crud update cmsstor418 --role pool --instance disk --extra 'UNSET'
remote: error on update: Hostname cmsstor418 does not exist
remote: omd-host-crud update cmsstor419 --role pool --instance disk --extra 'UNSET'
remote: error on update: Hostname cmsstor419 does not exist
remote: omd-host-crud update cmsstor420 --role pool --instance disk --extra 'UNSET'
remote: error on update: Hostname cmsstor420 does not exist
remote: Recieved from stdin:
remote: oldrev: 914b16c47cffb551c64f9d8c63772c2a1601a20d
remote: newrev: 23fd0838c48ac8c555badd7734ac7bb456eb3044
remote: refname: refs/heads/master
remote: Derived Configuration:
remote: REPO: puppet@cms-git:/var/lib/puppet/enc.git
remote: BRANCH: master
remote: BRANCH_DIR: /srv/puppet/enc
remote: PUPPET_SERVERS:
remote: Updating remote branch /srv/puppet/enc/master on
remote: From cms-git:/var/lib/puppet/enc
remote: * branch master -> FETCH_HEAD
remote: Updating 914b16c..23fd083
remote: Fast-forward
remote: hosts/cmsstor411.fnal.gov.yaml | 2 +-
remote: hosts/cmsstor412.fnal.gov.yaml | 2 +-
remote: hosts/cmsstor413.fnal.gov.yaml | 2 +-
remote: hosts/cmsstor414.fnal.gov.yaml | 2 +-
remote: hosts/cmsstor415.fnal.gov.yaml | 3 +--
remote: hosts/cmsstor416.fnal.gov.yaml | 2 +-
remote: hosts/cmsstor417.fnal.gov.yaml | 2 +-
remote: hosts/cmsstor418.fnal.gov.yaml | 2 +-
remote: hosts/cmsstor419.fnal.gov.yaml | 2 +-
remote: hosts/cmsstor420.fnal.gov.yaml | 2 +-
remote: 10 files changed, 10 insertions(+), 11 deletions(-)
remote: Updating remote branch /srv/puppet/enc/master on
remote: From cms-git:/var/lib/puppet/enc
remote: * branch master -> FETCH_HEAD
remote: Updating 914b16c..23fd083
remote: Fast-forward
remote: hosts/cmsstor411.fnal.gov.yaml | 2 +-
remote: hosts/cmsstor412.fnal.gov.yaml | 2 +-
remote: hosts/cmsstor413.fnal.gov.yaml | 2 +-
remote: hosts/cmsstor414.fnal.gov.yaml | 2 +-
remote: hosts/cmsstor415.fnal.gov.yaml | 3 +--
remote: hosts/cmsstor416.fnal.gov.yaml | 2 +-
remote: hosts/cmsstor417.fnal.gov.yaml | 2 +-
remote: hosts/cmsstor418.fnal.gov.yaml | 2 +-
remote: hosts/cmsstor419.fnal.gov.yaml | 2 +-
remote: hosts/cmsstor420.fnal.gov.yaml | 2 +-
remote: 10 files changed, 10 insertions(+), 11 deletions(-)
remote: Updating remote branch /srv/puppet/enc/master on
remote: From cms-git:/var/lib/puppet/enc
remote: * branch master -> FETCH_HEAD
remote: Updating 914b16c..23fd083
remote: Fast-forward
remote: hosts/cmsstor411.fnal.gov.yaml | 2 +-
remote: hosts/cmsstor412.fnal.gov.yaml | 2 +-
remote: hosts/cmsstor413.fnal.gov.yaml | 2 +-
remote: hosts/cmsstor414.fnal.gov.yaml | 2 +-
remote: hosts/cmsstor415.fnal.gov.yaml | 3 +--
remote: hosts/cmsstor416.fnal.gov.yaml | 2 +-
remote: hosts/cmsstor417.fnal.gov.yaml | 2 +-
remote: hosts/cmsstor418.fnal.gov.yaml | 2 +-
remote: hosts/cmsstor419.fnal.gov.yaml | 2 +-
remote: hosts/cmsstor420.fnal.gov.yaml | 2 +-
remote: 10 files changed, 10 insertions(+), 11 deletions(-)
To :/var/lib/puppet/enc.git
914b16c..23fd083 master -> master
sh-3.2$

On cmsadmin1 as root: set reshoot status

[root@cmsadmin1 ADD-POOL]# for i in `cat newpools`; do cms-shoot $i; done | tee set-reshoot.log
removing host from rocks on cmsrocks51, if necessary
cmsstor24.fnal.gov: no host cmsstor411 to remove
Connection to cmsrocks51 closed.
removing host from rocks on cmsrocks52, if necessary
cmssrv26.fnal.gov: no host cmsstor411 to remove
Connection to cmsrocks52 closed.
stopping puppet on cmsstor411, if applicable
telling host to netboot on next boot
cmsstor411: netboot -> True
set 1 hosts to boot
1 system(s) updated
telling cmspuppetca to remove host's cert, if present
cleaning cert for cmsstor411.fnal.gov
Notice: Revoked certificate with serial 3197
Notice: Removing file Puppet::SSL::Certificate cmsstor411.fnal.gov at '/var/lib/puppet/ssl/ca/signed/cmsstor411.fnal.gov.pem'
Notice: Removing file Puppet::SSL::Certificate cmsstor411.fnal.gov at '/var/lib/puppet/ssl/certs/cmsstor411.fnal.gov.pem'
telling cmspuppetca to update autosign information
when you're ready to start, run:
cmspower-powerit --action cycle --comment 'reinstalling' cmsstor411
don't forget to disable zabbix monitoring if applicable
removing host from rocks on cmsrocks51, if necessary
cmsstor24.fnal.gov: no host cmsstor412 to remove
Connection to cmsrocks51 closed.
removing host from rocks on cmsrocks52, if necessary
cmssrv26.fnal.gov: no host cmsstor412 to remove
Connection to cmsrocks52 closed.
stopping puppet on cmsstor412, if applicable
telling host to netboot on next boot
cmsstor412: netboot -> True
set 1 hosts to boot
1 system(s) updated
telling cmspuppetca to remove host's cert, if present
cleaning cert for cmsstor412.fnal.gov
Notice: Revoked certificate with serial 3185
Notice: Removing file Puppet::SSL::Certificate cmsstor412.fnal.gov at '/var/lib/puppet/ssl/ca/signed/cmsstor412.fnal.gov.pem'
Notice: Removing file Puppet::SSL::Certificate cmsstor412.fnal.gov at '/var/lib/puppet/ssl/certs/cmsstor412.fnal.gov.pem'
telling cmspuppetca to update autosign information
when you're ready to start, run:
cmspower-powerit --action cycle --comment 'reinstalling' cmsstor412
don't forget to disable zabbix monitoring if applicable
removing host from rocks on cmsrocks51, if necessary
cmsstor24.fnal.gov: no host cmsstor413 to remove
Connection to cmsrocks51 closed.
removing host from rocks on cmsrocks52, if necessary
cmssrv26.fnal.gov: no host cmsstor413 to remove
Connection to cmsrocks52 closed.
stopping puppet on cmsstor413, if applicable
telling host to netboot on next boot
cmsstor413: netboot -> True
set 1 hosts to boot
1 system(s) updated
telling cmspuppetca to remove host's cert, if present
cleaning cert for cmsstor413.fnal.gov
Notice: Revoked certificate with serial 3202
Notice: Removing file Puppet::SSL::Certificate cmsstor413.fnal.gov at '/var/lib/puppet/ssl/ca/signed/cmsstor413.fnal.gov.pem'
Notice: Removing file Puppet::SSL::Certificate cmsstor413.fnal.gov at '/var/lib/puppet/ssl/certs/cmsstor413.fnal.gov.pem'
telling cmspuppetca to update autosign information
when you're ready to start, run:
cmspower-powerit --action cycle --comment 'reinstalling' cmsstor413
don't forget to disable zabbix monitoring if applicable
removing host from rocks on cmsrocks51, if necessary
cmsstor24.fnal.gov: no host cmsstor414 to remove
Connection to cmsrocks51 closed.
removing host from rocks on cmsrocks52, if necessary
cmssrv26.fnal.gov: no host cmsstor414 to remove
Connection to cmsrocks52 closed.
stopping puppet on cmsstor414, if applicable
telling host to netboot on next boot
cmsstor414: netboot -> True
set 1 hosts to boot
1 system(s) updated
telling cmspuppetca to remove host's cert, if present
cleaning cert for cmsstor414.fnal.gov
Notice: Revoked certificate with serial 3186
Notice: Removing file Puppet::SSL::Certificate cmsstor414.fnal.gov at '/var/lib/puppet/ssl/ca/signed/cmsstor414.fnal.gov.pem'
Notice: Removing file Puppet::SSL::Certificate cmsstor414.fnal.gov at '/var/lib/puppet/ssl/certs/cmsstor414.fnal.gov.pem'
telling cmspuppetca to update autosign information
when you're ready to start, run:
cmspower-powerit --action cycle --comment 'reinstalling' cmsstor414
don't forget to disable zabbix monitoring if applicable
removing host from rocks on cmsrocks51, if necessary
cmsstor24.fnal.gov: no host cmsstor415 to remove
Connection to cmsrocks51 closed.
removing host from rocks on cmsrocks52, if necessary
cmssrv26.fnal.gov: no host cmsstor415 to remove
Connection to cmsrocks52 closed.
stopping puppet on cmsstor415, if applicable
telling host to netboot on next boot
cmsstor415: netboot -> True
set 1 hosts to boot
1 system(s) updated
telling cmspuppetca to remove host's cert, if present
cleaning cert for cmsstor415.fnal.gov
Notice: Revoked certificate with serial 3187
Notice: Removing file Puppet::SSL::Certificate cmsstor415.fnal.gov at '/var/lib/puppet/ssl/ca/signed/cmsstor415.fnal.gov.pem'
Notice: Removing file Puppet::SSL::Certificate cmsstor415.fnal.gov at '/var/lib/puppet/ssl/certs/cmsstor415.fnal.gov.pem'
telling cmspuppetca to update autosign information
when you're ready to start, run:
cmspower-powerit --action cycle --comment 'reinstalling' cmsstor415
don't forget to disable zabbix monitoring if applicable
removing host from rocks on cmsrocks51, if necessary
cmsstor24.fnal.gov: no host cmsstor416 to remove
Connection to cmsrocks51 closed.
removing host from rocks on cmsrocks52, if necessary
cmssrv26.fnal.gov: no host cmsstor416 to remove
Connection to cmsrocks52 closed.
stopping puppet on cmsstor416, if applicable
telling host to netboot on next boot
cmsstor416: netboot -> True
set 1 hosts to boot
1 system(s) updated
telling cmspuppetca to remove host's cert, if present
cleaning cert for cmsstor416.fnal.gov
Notice: Revoked certificate with serial 3188
Notice: Removing file Puppet::SSL::Certificate cmsstor416.fnal.gov at '/var/lib/puppet/ssl/ca/signed/cmsstor416.fnal.gov.pem'
Notice: Removing file Puppet::SSL::Certificate cmsstor416.fnal.gov at '/var/lib/puppet/ssl/certs/cmsstor416.fnal.gov.pem'
telling cmspuppetca to update autosign information
when you're ready to start, run:
cmspower-powerit --action cycle --comment 'reinstalling' cmsstor416
don't forget to disable zabbix monitoring if applicable
removing host from rocks on cmsrocks51, if necessary
cmsstor24.fnal.gov: no host cmsstor417 to remove
Connection to cmsrocks51 closed.
removing host from rocks on cmsrocks52, if necessary
cmssrv26.fnal.gov: no host cmsstor417 to remove
Connection to cmsrocks52 closed.
stopping puppet on cmsstor417, if applicable
telling host to netboot on next boot
cmsstor417: netboot -> True
set 1 hosts to boot
1 system(s) updated
telling cmspuppetca to remove host's cert, if present
cleaning cert for cmsstor417.fnal.gov
Notice: Revoked certificate with serial 3193
Notice: Removing file Puppet::SSL::Certificate cmsstor417.fnal.gov at '/var/lib/puppet/ssl/ca/signed/cmsstor417.fnal.gov.pem'
Notice: Removing file Puppet::SSL::Certificate cmsstor417.fnal.gov at '/var/lib/puppet/ssl/certs/cmsstor417.fnal.gov.pem'
telling cmspuppetca to update autosign information
when you're ready to start, run:
cmspower-powerit --action cycle --comment 'reinstalling' cmsstor417
don't forget to disable zabbix monitoring if applicable
removing host from rocks on cmsrocks51, if necessary
cmsstor24.fnal.gov: no host cmsstor418 to remove
Connection to cmsrocks51 closed.
removing host from rocks on cmsrocks52, if necessary
cmssrv26.fnal.gov: no host cmsstor418 to remove
Connection to cmsrocks52 closed.
stopping puppet on cmsstor418, if applicable
telling host to netboot on next boot
cmsstor418: netboot -> True
set 1 hosts to boot
1 system(s) updated
telling cmspuppetca to remove host's cert, if present
cleaning cert for cmsstor418.fnal.gov
Notice: Revoked certificate with serial 3189
Notice: Removing file Puppet::SSL::Certificate cmsstor418.fnal.gov at '/var/lib/puppet/ssl/ca/signed/cmsstor418.fnal.gov.pem'
Notice: Removing file Puppet::SSL::Certificate cmsstor418.fnal.gov at '/var/lib/puppet/ssl/certs/cmsstor418.fnal.gov.pem'
telling cmspuppetca to update autosign information
when you're ready to start, run:
cmspower-powerit --action cycle --comment 'reinstalling' cmsstor418
don't forget to disable zabbix monitoring if applicable
removing host from rocks on cmsrocks51, if necessary
cmsstor24.fnal.gov: no host cmsstor419 to remove
Connection to cmsrocks51 closed.
removing host from rocks on cmsrocks52, if necessary
cmssrv26.fnal.gov: no host cmsstor419 to remove
Connection to cmsrocks52 closed.
stopping puppet on cmsstor419, if applicable
telling host to netboot on next boot
cmsstor419: netboot -> True
set 1 hosts to boot
1 system(s) updated
telling cmspuppetca to remove host's cert, if present
cleaning cert for cmsstor419.fnal.gov
Notice: Revoked certificate with serial 3190
Notice: Removing file Puppet::SSL::Certificate cmsstor419.fnal.gov at '/var/lib/puppet/ssl/ca/signed/cmsstor419.fnal.gov.pem'
Notice: Removing file Puppet::SSL::Certificate cmsstor419.fnal.gov at '/var/lib/puppet/ssl/certs/cmsstor419.fnal.gov.pem'
telling cmspuppetca to update autosign information
when you're ready to start, run:
cmspower-powerit --action cycle --comment 'reinstalling' cmsstor419
don't forget to disable zabbix monitoring if applicable
removing host from rocks on cmsrocks51, if necessary
cmsstor24.fnal.gov: no host cmsstor420 to remove
Connection to cmsrocks51 closed.
removing host from rocks on cmsrocks52, if necessary
cmssrv26.fnal.gov: no host cmsstor420 to remove
Connection to cmsrocks52 closed.
stopping puppet on cmsstor420, if applicable
telling host to netboot on next boot
cmsstor420: netboot -> True
set 1 hosts to boot
1 system(s) updated
telling cmspuppetca to remove host's cert, if present
cleaning cert for cmsstor420.fnal.gov
Notice: Revoked certificate with serial 3192
Notice: Removing file Puppet::SSL::Certificate cmsstor420.fnal.gov at '/var/lib/puppet/ssl/ca/signed/cmsstor420.fnal.gov.pem'
Notice: Removing file Puppet::SSL::Certificate cmsstor420.fnal.gov at '/var/lib/puppet/ssl/certs/cmsstor420.fnal.gov.pem'
telling cmspuppetca to update autosign information
when you're ready to start, run:
cmspower-powerit --action cycle --comment 'reinstalling' cmsstor420
don't forget to disable zabbix monitoring if applicable
[root@cmsadmin1 ADD-POOL]# grep cmspower set-reshoot.log
cmspower-powerit --action cycle --comment 'reinstalling' cmsstor411
cmspower-powerit --action cycle --comment 'reinstalling' cmsstor412
cmspower-powerit --action cycle --comment 'reinstalling' cmsstor413
cmspower-powerit --action cycle --comment 'reinstalling' cmsstor414
cmspower-powerit --action cycle --comment 'reinstalling' cmsstor415
cmspower-powerit --action cycle --comment 'reinstalling' cmsstor416
cmspower-powerit --action cycle --comment 'reinstalling' cmsstor417
cmspower-powerit --action cycle --comment 'reinstalling' cmsstor418
cmspower-powerit --action cycle --comment 'reinstalling' cmsstor419
cmspower-powerit --action cycle --comment 'reinstalling' cmsstor420
[root@cmsadmin1 ADD-POOL]# sh cmd2
/usr/bin/ssh -l root cmsconsole cmspower-powerit --action cycle --comment \'root: reinstalling\' cmsstor411
Outlet state: OFF
Outlet state: ON === cmsstor411 ===
connecting to APC apccms1015-1, outlet 4
connecting to APC apccms1015-1, outlet 4
/usr/bin/ssh -l root cmsconsole cmspower-powerit --action cycle --comment \'root: reinstalling\' cmsstor412
Outlet state: OFF
Outlet state: ON === cmsstor412 ===
connecting to APC apccms1015-1, outlet 12
connecting to APC apccms1015-1, outlet 12
/usr/bin/ssh -l root cmsconsole cmspower-powerit --action cycle --comment \'root: reinstalling\' cmsstor413
Outlet state: OFF
Outlet state: ON === cmsstor413 ===
connecting to APC apccms1015-1, outlet 2
connecting to APC apccms1015-1, outlet 2
/usr/bin/ssh -l root cmsconsole cmspower-powerit --action cycle --comment \'root: reinstalling\' cmsstor414
Outlet state: OFF
Outlet state: ON === cmsstor414 ===
connecting to APC apccms1015-1, outlet 10
connecting to APC apccms1015-1, outlet 10
/usr/bin/ssh -l root cmsconsole cmspower-powerit --action cycle --comment \'root: reinstalling\' cmsstor415
Outlet state: OFF
Outlet state: ON === cmsstor415 ===
connecting to APC apccms1015-1, outlet 1
connecting to APC apccms1015-1, outlet 1
/usr/bin/ssh -l root cmsconsole cmspower-powerit --action cycle --comment \'root: reinstalling\' cmsstor416
Outlet state: OFF
Outlet state: ON === cmsstor416 ===
connecting to APC apccms1015-1, outlet 9
connecting to APC apccms1015-1, outlet 9
/usr/bin/ssh -l root cmsconsole cmspower-powerit --action cycle --comment \'root: reinstalling\' cmsstor417
Outlet state: OFF
Outlet state: ON === cmsstor417 ===
connecting to APC apccms1015-1, outlet 3
connecting to APC apccms1015-1, outlet 3
/usr/bin/ssh -l root cmsconsole cmspower-powerit --action cycle --comment \'root: reinstalling\' cmsstor418
Outlet state: OFF
Outlet state: ON === cmsstor418 ===
connecting to APC apccms1015-1, outlet 11
connecting to APC apccms1015-1, outlet 11
/usr/bin/ssh -l root cmsconsole cmspower-powerit --action cycle --comment \'root: reinstalling\' cmsstor419
Outlet state: OFF
Outlet state: ON === cmsstor419 ===
connecting to APC apccms1016-1, outlet 4
connecting to APC apccms1016-1, outlet 4
/usr/bin/ssh -l root cmsconsole cmspower-powerit --action cycle --comment \'root: reinstalling\' cmsstor420
Outlet state: OFF
Outlet state: ON === cmsstor420 ===
connecting to APC apccms1016-1, outlet 12
connecting to APC apccms1016-1, outlet 12
[root@cmsadmin1 ADD-POOL]#

Reshooting now!
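
The `cmd2` script run above is not shown in this ticket. As a rough sketch only, assuming it simply replays the `cmspower-powerit` lines that `cms-shoot` printed, a helper like the following could build it from `set-reshoot.log` and refuse to continue unless every host in `newpools` has exactly one command. The function name `build_power_script` and its argument order are made up for illustration.

```shell
#!/bin/sh
# Sketch only: the helper name and argument order are assumptions,
# not part of the PoolAdd procedure.
# Build a power-cycle script from the cms-shoot log and fail unless
# every host in the pool list has exactly one command in it.
build_power_script() {
    hosts_file=$1   # one host name per line, e.g. newpools
    log_file=$2     # output captured with tee, e.g. set-reshoot.log
    out_file=$3     # script to generate, e.g. cmd2

    # Keep only the lines that are power-cycle commands.
    grep '^cmspower-powerit ' "$log_file" > "$out_file" || return 1

    for h in $(cat "$hosts_file"); do
        # The host name is the last word of each command line.
        n=$(awk -v host="$h" '$NF == host { c++ } END { print c + 0 }' "$out_file")
        if [ "$n" -ne 1 ]; then
            echo "expected 1 command for $h, found $n" >&2
            return 1
        fi
    done
}
```

With the file names used in this ticket, this would be called as `build_power_script newpools set-reshoot.log cmd2 && sh cmd2`, so that a host missing from the log stops the run before any outlet is cycled.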

#13 Updated by Chih-Hao Huang about 4 years ago

  • % Done changed from 80 to 90

sh-4.1$ check_empty_disk.sh `cat newpools`
cmsstor411:
ssh -l root cmsstor411 ls -lR /storage/*
/storage/data1:
total 0
drwxr-xr-x 4 root root 71 Nov 2 16:17 write-pool

/storage/data1/write-pool:
total 4
drwxr-xr-x 2 root root 38 Nov 2 17:20 data
-rw-r--r-- 1 root root 0 Nov 2 16:17 lock
drwxr-xr-x 2 root root 122 Nov 2 17:20 meta
-rwxr-xr-x 1 root root 1442 Nov 2 16:17 setup

/storage/data1/write-pool/data:
total 0

/storage/data1/write-pool/meta:
total 4
-rw-r--r-- 1 root root 1396 Nov 2 16:17 00000000.jdb
-rw-r--r-- 1 root root 0 Nov 2 16:17 je.info.0
-rw-r--r-- 1 root root 0 Nov 2 16:17 je.info.0.lck
-rw-r--r-- 1 root root 0 Nov 2 16:17 je.lck

/storage/data2:
total 0
drwxr-xr-x 4 root root 71 Nov 2 16:17 write-pool

/storage/data2/write-pool:
total 4
drwxr-xr-x 2 root root 38 Nov 2 17:20 data
-rw-r--r-- 1 root root 0 Nov 2 16:17 lock
drwxr-xr-x 2 root root 122 Nov 2 17:20 meta
-rwxr-xr-x 1 root root 1442 Nov 2 16:17 setup

/storage/data2/write-pool/data:
total 0

/storage/data2/write-pool/meta:
total 4
-rw-r--r-- 1 root root 1396 Nov 2 16:17 00000000.jdb
-rw-r--r-- 1 root root 0 Nov 2 16:17 je.info.0
-rw-r--r-- 1 root root 0 Nov 2 16:17 je.info.0.lck
-rw-r--r-- 1 root root 0 Nov 2 16:17 je.lck
cmsstor412:
ssh -l root cmsstor412 ls -lR /storage/*
/usr/bin/xauth: creating new authority file /root/.Xauthority
/storage/data1:
total 0
drwxr-xr-x 4 root root 71 Nov 2 16:16 write-pool

/storage/data1/write-pool:
total 4
drwxr-xr-x 2 root root 38 Nov 2 17:19 data
-rw-r--r-- 1 root root 0 Nov 2 16:16 lock
drwxr-xr-x 2 root root 122 Nov 2 17:19 meta
-rwxr-xr-x 1 root root 1442 Nov 2 16:16 setup

/storage/data1/write-pool/data:
total 0

/storage/data1/write-pool/meta:
total 4
-rw-r--r-- 1 root root 1396 Nov 2 16:16 00000000.jdb
-rw-r--r-- 1 root root 0 Nov 2 16:16 je.info.0
-rw-r--r-- 1 root root 0 Nov 2 16:16 je.info.0.lck
-rw-r--r-- 1 root root 0 Nov 2 16:16 je.lck

/storage/data2:
total 0
drwxr-xr-x 4 root root 71 Nov 2 16:16 write-pool

/storage/data2/write-pool:
total 4
drwxr-xr-x 2 root root 38 Nov 2 17:19 data
-rw-r--r-- 1 root root 0 Nov 2 16:16 lock
drwxr-xr-x 2 root root 122 Nov 2 17:19 meta
-rwxr-xr-x 1 root root 1442 Nov 2 16:16 setup

/storage/data2/write-pool/data:
total 0

/storage/data2/write-pool/meta:
total 4
-rw-r--r-- 1 root root 1396 Nov 2 16:16 00000000.jdb
-rw-r--r-- 1 root root 0 Nov 2 16:16 je.info.0
-rw-r--r-- 1 root root 0 Nov 2 16:16 je.info.0.lck
-rw-r--r-- 1 root root 0 Nov 2 16:16 je.lck
cmsstor413:
ssh -l root cmsstor413 ls -lR /storage/*
/usr/bin/xauth: creating new authority file /root/.Xauthority
/storage/data1:
total 0
drwxr-xr-x 4 root root 71 Nov 2 16:14 write-pool

/storage/data1/write-pool:
total 4
drwxr-xr-x 2 root root 38 Nov 2 17:20 data
-rw-r--r-- 1 root root 0 Nov 2 16:14 lock
drwxr-xr-x 2 root root 122 Nov 2 17:20 meta
-rwxr-xr-x 1 root root 1442 Nov 2 16:13 setup

/storage/data1/write-pool/data:
total 0

/storage/data1/write-pool/meta:
total 4
-rw-r--r-- 1 root root 1396 Nov 2 16:14 00000000.jdb
-rw-r--r-- 1 root root 0 Nov 2 16:14 je.info.0
-rw-r--r-- 1 root root 0 Nov 2 16:14 je.info.0.lck
-rw-r--r-- 1 root root 0 Nov 2 16:14 je.lck

/storage/data2:
total 0
drwxr-xr-x 4 root root 71 Nov 2 16:14 write-pool

/storage/data2/write-pool:
total 4
drwxr-xr-x 2 root root 38 Nov 2 17:20 data
-rw-r--r-- 1 root root 0 Nov 2 16:14 lock
drwxr-xr-x 2 root root 122 Nov 2 17:20 meta
-rwxr-xr-x 1 root root 1442 Nov 2 16:13 setup

/storage/data2/write-pool/data:
total 0

/storage/data2/write-pool/meta:
total 4
-rw-r--r-- 1 root root 1396 Nov 2 16:14 00000000.jdb
-rw-r--r-- 1 root root 0 Nov 2 16:14 je.info.0
-rw-r--r-- 1 root root 0 Nov 2 16:14 je.info.0.lck
-rw-r--r-- 1 root root 0 Nov 2 16:14 je.lck
cmsstor414:
ssh -l root cmsstor414 ls -lR /storage/*
/usr/bin/xauth: creating new authority file /root/.Xauthority
/storage/data1:
total 0
drwxr-xr-x 4 root root 71 Nov 2 16:13 write-pool

/storage/data1/write-pool:
total 4
drwxr-xr-x 2 root root 38 Nov 2 17:20 data
-rw-r--r-- 1 root root 0 Nov 2 16:13 lock
drwxr-xr-x 2 root root 122 Nov 2 17:20 meta
-rwxr-xr-x 1 root root 1442 Nov 2 16:13 setup

/storage/data1/write-pool/data:
total 0

/storage/data1/write-pool/meta:
total 4
-rw-r--r-- 1 root root 1396 Nov 2 16:13 00000000.jdb
-rw-r--r-- 1 root root 0 Nov 2 16:13 je.info.0
-rw-r--r-- 1 root root 0 Nov 2 16:13 je.info.0.lck
-rw-r--r-- 1 root root 0 Nov 2 16:13 je.lck

/storage/data2:
total 0
drwxr-xr-x 4 root root 71 Nov 2 16:13 write-pool

/storage/data2/write-pool:
total 4
drwxr-xr-x 2 root root 38 Nov 2 17:20 data
-rw-r--r-- 1 root root 0 Nov 2 16:13 lock
drwxr-xr-x 2 root root 122 Nov 2 17:20 meta
-rwxr-xr-x 1 root root 1442 Nov 2 16:13 setup

/storage/data2/write-pool/data:
total 0

/storage/data2/write-pool/meta:
total 4
-rw-r--r-- 1 root root 1396 Nov 2 16:13 00000000.jdb
-rw-r--r-- 1 root root 0 Nov 2 16:13 je.info.0
-rw-r--r-- 1 root root 0 Nov 2 16:13 je.info.0.lck
-rw-r--r-- 1 root root 0 Nov 2 16:13 je.lck
cmsstor415:
ssh -l root cmsstor415 ls -lR /storage/*
/usr/bin/xauth: creating new authority file /root/.Xauthority
/storage/data1:
total 0
drwxr-xr-x 4 root root 71 Nov 2 16:14 write-pool

/storage/data1/write-pool:
total 4
drwxr-xr-x 2 root root 38 Nov 2 17:20 data
-rw-r--r-- 1 root root 0 Nov 2 16:14 lock
drwxr-xr-x 2 root root 122 Nov 2 17:20 meta
-rwxr-xr-x 1 root root 1442 Nov 2 16:14 setup

/storage/data1/write-pool/data:
total 0

/storage/data1/write-pool/meta:
total 4
-rw-r--r-- 1 root root 1396 Nov 2 16:14 00000000.jdb
-rw-r--r-- 1 root root 0 Nov 2 16:14 je.info.0
-rw-r--r-- 1 root root 0 Nov 2 16:14 je.info.0.lck
-rw-r--r-- 1 root root 0 Nov 2 16:14 je.lck

/storage/data2:
total 0
drwxr-xr-x 4 root root 71 Nov 2 16:14 write-pool

/storage/data2/write-pool:
total 4
drwxr-xr-x 2 root root 38 Nov 2 17:20 data
-rw-r--r-- 1 root root 0 Nov 2 16:14 lock
drwxr-xr-x 2 root root 122 Nov 2 17:20 meta
-rwxr-xr-x 1 root root 1442 Nov 2 16:14 setup

/storage/data2/write-pool/data:
total 0

/storage/data2/write-pool/meta:
total 4
-rw-r--r-- 1 root root 1396 Nov 2 16:14 00000000.jdb
-rw-r--r-- 1 root root 0 Nov 2 16:14 je.info.0
-rw-r--r-- 1 root root 0 Nov 2 16:14 je.info.0.lck
-rw-r--r-- 1 root root 0 Nov 2 16:14 je.lck
cmsstor416:
ssh -l root cmsstor416 ls -lR /storage/*
/usr/bin/xauth: creating new authority file /root/.Xauthority
/storage/data1:
total 0
drwxr-xr-x 4 root root 71 Nov 2 16:16 write-pool

/storage/data1/write-pool:
total 4
drwxr-xr-x 2 root root 38 Nov 2 17:19 data
-rw-r--r-- 1 root root 0 Nov 2 16:16 lock
drwxr-xr-x 2 root root 122 Nov 2 17:19 meta
-rwxr-xr-x 1 root root 1442 Nov 2 16:16 setup

/storage/data1/write-pool/data:
total 0

/storage/data1/write-pool/meta:
total 4
-rw-r--r-- 1 root root 1396 Nov 2 16:16 00000000.jdb
-rw-r--r-- 1 root root 0 Nov 2 16:16 je.info.0
-rw-r--r-- 1 root root 0 Nov 2 16:16 je.info.0.lck
-rw-r--r-- 1 root root 0 Nov 2 16:16 je.lck

/storage/data2:
total 0
drwxr-xr-x 4 root root 71 Nov 2 16:16 write-pool

/storage/data2/write-pool:
total 4
drwxr-xr-x 2 root root 38 Nov 2 17:19 data
-rw-r--r-- 1 root root 0 Nov 2 16:16 lock
drwxr-xr-x 2 root root 122 Nov 2 17:19 meta
-rwxr-xr-x 1 root root 1442 Nov 2 16:16 setup

/storage/data2/write-pool/data:
total 0

/storage/data2/write-pool/meta:
total 4
-rw-r--r-- 1 root root 1396 Nov 2 16:16 00000000.jdb
-rw-r--r-- 1 root root 0 Nov 2 16:16 je.info.0
-rw-r--r-- 1 root root 0 Nov 2 16:16 je.info.0.lck
-rw-r--r-- 1 root root 0 Nov 2 16:16 je.lck
cmsstor417:
ssh -l root cmsstor417 ls -lR /storage/*
/usr/bin/xauth: creating new authority file /root/.Xauthority
/storage/data1:
total 0
drwxr-xr-x 4 root root 71 Nov 2 16:15 write-pool

/storage/data1/write-pool:
total 4
drwxr-xr-x 2 root root 38 Nov 2 17:20 data
-rw-r--r-- 1 root root 0 Nov 2 16:15 lock
drwxr-xr-x 2 root root 122 Nov 2 17:20 meta
-rwxr-xr-x 1 root root 1442 Nov 2 16:15 setup

/storage/data1/write-pool/data:
total 0

/storage/data1/write-pool/meta:
total 4
-rw-r--r-- 1 root root 1396 Nov 2 16:15 00000000.jdb
-rw-r--r-- 1 root root 0 Nov 2 16:15 je.info.0
-rw-r--r-- 1 root root 0 Nov 2 16:15 je.info.0.lck
-rw-r--r-- 1 root root 0 Nov 2 16:15 je.lck

/storage/data2:
total 0
drwxr-xr-x 4 root root 71 Nov 2 16:15 write-pool

/storage/data2/write-pool:
total 4
drwxr-xr-x 2 root root 38 Nov 2 17:20 data
-rw-r--r-- 1 root root 0 Nov 2 16:15 lock
drwxr-xr-x 2 root root 122 Nov 2 17:20 meta
-rwxr-xr-x 1 root root 1442 Nov 2 16:15 setup

/storage/data2/write-pool/data:
total 0

/storage/data2/write-pool/meta:
total 4
-rw-r--r-- 1 root root 1396 Nov 2 16:15 00000000.jdb
-rw-r--r-- 1 root root 0 Nov 2 16:15 je.info.0
-rw-r--r-- 1 root root 0 Nov 2 16:15 je.info.0.lck
-rw-r--r-- 1 root root 0 Nov 2 16:15 je.lck
cmsstor418:
ssh -l root cmsstor418 ls -lR /storage/*
/usr/bin/xauth: creating new authority file /root/.Xauthority
/storage/data1:
total 0
drwxr-xr-x 4 root root 71 Nov 2 16:17 write-pool

/storage/data1/write-pool:
total 4
drwxr-xr-x 2 root root 38 Nov 2 17:19 data
-rw-r--r-- 1 root root 0 Nov 2 16:17 lock
drwxr-xr-x 2 root root 122 Nov 2 17:19 meta
-rwxr-xr-x 1 root root 1442 Nov 2 16:17 setup

/storage/data1/write-pool/data:
total 0

/storage/data1/write-pool/meta:
total 4
-rw-r--r-- 1 root root 1396 Nov 2 16:17 00000000.jdb
-rw-r--r-- 1 root root 0 Nov 2 16:17 je.info.0
-rw-r--r-- 1 root root 0 Nov 2 16:17 je.info.0.lck
-rw-r--r-- 1 root root 0 Nov 2 16:17 je.lck

/storage/data2:
total 0
drwxr-xr-x 4 root root 71 Nov 2 16:17 write-pool

/storage/data2/write-pool:
total 4
drwxr-xr-x 2 root root 38 Nov 2 17:19 data
-rw-r--r-- 1 root root 0 Nov 2 16:17 lock
drwxr-xr-x 2 root root 122 Nov 2 17:19 meta
-rwxr-xr-x 1 root root 1442 Nov 2 16:17 setup

/storage/data2/write-pool/data:
total 0

/storage/data2/write-pool/meta:
total 4
-rw-r--r-- 1 root root 1396 Nov 2 16:17 00000000.jdb
-rw-r--r-- 1 root root 0 Nov 2 16:17 je.info.0
-rw-r--r-- 1 root root 0 Nov 2 16:17 je.info.0.lck
-rw-r--r-- 1 root root 0 Nov 2 16:17 je.lck
cmsstor419:
ssh -l root cmsstor419 ls -lR /storage/*
/usr/bin/xauth: creating new authority file /root/.Xauthority
/storage/data1:
total 0
drwxr-xr-x 4 root root 71 Nov 2 16:17 write-pool

/storage/data1/write-pool:
total 4
drwxr-xr-x 2 root root 38 Nov 2 17:19 data
-rw-r--r-- 1 root root 0 Nov 2 16:17 lock
drwxr-xr-x 2 root root 122 Nov 2 17:19 meta
-rwxr-xr-x 1 root root 1442 Nov 2 16:17 setup

/storage/data1/write-pool/data:
total 0

/storage/data1/write-pool/meta:
total 4
-rw-r--r-- 1 root root 1396 Nov 2 16:17 00000000.jdb
-rw-r--r-- 1 root root 0 Nov 2 16:17 je.info.0
-rw-r--r-- 1 root root 0 Nov 2 16:17 je.info.0.lck
-rw-r--r-- 1 root root 0 Nov 2 16:17 je.lck

/storage/data2:
total 0
drwxr-xr-x 4 root root 71 Nov 2 16:17 write-pool

/storage/data2/write-pool:
total 4
drwxr-xr-x 2 root root 38 Nov 2 17:19 data
-rw-r--r-- 1 root root 0 Nov 2 16:17 lock
drwxr-xr-x 2 root root 122 Nov 2 17:19 meta
-rwxr-xr-x 1 root root 1442 Nov 2 16:17 setup

/storage/data2/write-pool/data:
total 0

/storage/data2/write-pool/meta:
total 4
-rw-r--r-- 1 root root 1396 Nov 2 16:17 00000000.jdb
-rw-r--r-- 1 root root 0 Nov 2 16:17 je.info.0
-rw-r--r-- 1 root root 0 Nov 2 16:17 je.info.0.lck
-rw-r--r-- 1 root root 0 Nov 2 16:17 je.lck
cmsstor420:
ssh -l root cmsstor420 ls -lR /storage/*
/usr/bin/xauth: creating new authority file /root/.Xauthority
/storage/data1:
total 0
drwxr-xr-x 4 root root 71 Nov 2 16:17 write-pool

/storage/data1/write-pool:
total 4
drwxr-xr-x 2 root root 38 Nov 2 17:19 data
-rw-r--r-- 1 root root 0 Nov 2 16:17 lock
drwxr-xr-x 2 root root 122 Nov 2 17:19 meta
-rwxr-xr-x 1 root root 1442 Nov 2 16:17 setup

/storage/data1/write-pool/data:
total 0

/storage/data1/write-pool/meta:
total 4
-rw-r--r-- 1 root root 1396 Nov 2 16:17 00000000.jdb
-rw-r--r-- 1 root root 0 Nov 2 16:17 je.info.0
-rw-r--r-- 1 root root 0 Nov 2 16:17 je.info.0.lck
-rw-r--r-- 1 root root 0 Nov 2 16:17 je.lck

/storage/data2:
total 0
drwxr-xr-x 4 root root 71 Nov 2 16:17 write-pool

/storage/data2/write-pool:
total 4
drwxr-xr-x 2 root root 38 Nov 2 17:19 data
-rw-r--r-- 1 root root 0 Nov 2 16:17 lock
drwxr-xr-x 2 root root 122 Nov 2 17:19 meta
-rwxr-xr-x 1 root root 1442 Nov 2 16:17 setup

/storage/data2/write-pool/data:
total 0

/storage/data2/write-pool/meta:
total 4
-rw-r--r-- 1 root root 1396 Nov 2 16:17 00000000.jdb
-rw-r--r-- 1 root root 0 Nov 2 16:17 je.info.0
-rw-r--r-- 1 root root 0 Nov 2 16:17 je.info.0.lck
-rw-r--r-- 1 root root 0 Nov 2 16:17 je.lck
sh-4.1$
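
`check_empty_disk.sh` itself is not included in this ticket; its output suggests it runs `ls -lR /storage/*` on each host over ssh. Reading ten listings by eye is error-prone, so a small filter could flag any pool `data` directory that is not empty. The sketch below is an assumption about how that might be done, and `nonempty_data_dirs` is a made-up name.

```shell
#!/bin/sh
# Hypothetical filter, not part of check_empty_disk.sh.
# Reads `ls -lR` output on stdin and prints every directory whose path
# ends in /data and whose listing does not start with "total 0".
# Exit status: 0 if all such directories are empty, 1 otherwise.
nonempty_data_dirs() {
    awk '
        # A header such as "/storage/data1/write-pool/data:" opens a listing.
        /\/data:$/ { dir = $0; sub(/:$/, "", dir); want = 1; next }
        # The first "total N" line after that header gives the block count.
        want && /^total / { if ($2 != 0) { print dir; bad = 1 } want = 0 }
        END { exit bad + 0 }
    '
}
```

It could be used per host as `ssh -l root cmsstor411 "ls -lR /storage/*" | nonempty_data_dirs`, where the quoting makes the remote shell, not the local one, expand the glob. Headers such as `/storage/data1:` are ignored because they do not end in `/data:`.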

#14 Updated by Chih-Hao Huang about 4 years ago

Check for errors after the systems came up:

sh-4.1$ date
Mon Nov 2 17:25:48 CST 2015
sh-4.1$ for i in `cat newpools`; do echo $i:; ssh -l root $i tail /var/log/messages; done
cmsstor411:
Nov 2 16:17:01 cmsstor411 kernel: SGI XFS with ACLs, security attributes, large block/inode numbers, no debug enabled
Nov 2 16:17:01 cmsstor411 kernel: SGI XFS Quota Management subsystem
Nov 2 16:17:01 cmsstor411 kernel: XFS (sdb): Mounting Filesystem
Nov 2 16:17:01 cmsstor411 kernel: XFS (sdb): Starting recovery (logdev: internal)
Nov 2 16:17:01 cmsstor411 kernel: XFS (sdb): Ending recovery (logdev: internal)
Nov 2 16:17:01 cmsstor411 kernel: XFS (sdc): Mounting Filesystem
Nov 2 16:17:01 cmsstor411 kernel: XFS (sdc): Starting recovery (logdev: internal)
Nov 2 16:17:01 cmsstor411 kernel: XFS (sdc): Ending recovery (logdev: internal)
Nov 2 16:29:55 cmsstor411 ntpd[7260]: 0.0.0.0 c612 02 freq_set kernel -41.848 PPM
Nov 2 16:29:55 cmsstor411 ntpd[7260]: 0.0.0.0 c615 05 clock_sync
cmsstor412:
Nov 2 16:16:17 cmsstor412 kernel: SGI XFS with ACLs, security attributes, large block/inode numbers, no debug enabled
Nov 2 16:16:17 cmsstor412 kernel: SGI XFS Quota Management subsystem
Nov 2 16:16:17 cmsstor412 kernel: XFS (sdb): Mounting Filesystem
Nov 2 16:16:17 cmsstor412 kernel: XFS (sdb): Starting recovery (logdev: internal)
Nov 2 16:16:17 cmsstor412 kernel: XFS (sdb): Ending recovery (logdev: internal)
Nov 2 16:16:17 cmsstor412 kernel: XFS (sdc): Mounting Filesystem
Nov 2 16:16:17 cmsstor412 kernel: XFS (sdc): Starting recovery (logdev: internal)
Nov 2 16:16:17 cmsstor412 kernel: XFS (sdc): Ending recovery (logdev: internal)
Nov 2 16:29:44 cmsstor412 ntpd[7256]: 0.0.0.0 c612 02 freq_set kernel -37.278 PPM
Nov 2 16:29:44 cmsstor412 ntpd[7256]: 0.0.0.0 c615 05 clock_sync
cmsstor413:
Nov 2 16:13:48 cmsstor413 kernel: SGI XFS with ACLs, security attributes, large block/inode numbers, no debug enabled
Nov 2 16:13:48 cmsstor413 kernel: SGI XFS Quota Management subsystem
Nov 2 16:13:48 cmsstor413 kernel: XFS (sdb): Mounting Filesystem
Nov 2 16:13:48 cmsstor413 kernel: XFS (sdb): Starting recovery (logdev: internal)
Nov 2 16:13:48 cmsstor413 kernel: XFS (sdb): Ending recovery (logdev: internal)
Nov 2 16:13:48 cmsstor413 kernel: XFS (sdc): Mounting Filesystem
Nov 2 16:13:48 cmsstor413 kernel: XFS (sdc): Starting recovery (logdev: internal)
Nov 2 16:13:48 cmsstor413 kernel: XFS (sdc): Ending recovery (logdev: internal)
Nov 2 16:28:03 cmsstor413 ntpd[7285]: 0.0.0.0 c612 02 freq_set kernel -42.526 PPM
Nov 2 16:28:03 cmsstor413 ntpd[7285]: 0.0.0.0 c615 05 clock_sync
cmsstor414:
Nov 2 16:12:51 cmsstor414 kernel: SGI XFS with ACLs, security attributes, large block/inode numbers, no debug enabled
Nov 2 16:12:51 cmsstor414 kernel: SGI XFS Quota Management subsystem
Nov 2 16:12:51 cmsstor414 kernel: XFS (sdb): Mounting Filesystem
Nov 2 16:12:51 cmsstor414 kernel: XFS (sdb): Starting recovery (logdev: internal)
Nov 2 16:12:51 cmsstor414 kernel: XFS (sdb): Ending recovery (logdev: internal)
Nov 2 16:12:51 cmsstor414 kernel: XFS (sdc): Mounting Filesystem
Nov 2 16:12:51 cmsstor414 kernel: XFS (sdc): Starting recovery (logdev: internal)
Nov 2 16:12:51 cmsstor414 kernel: XFS (sdc): Ending recovery (logdev: internal)
Nov 2 16:27:06 cmsstor414 ntpd[7282]: 0.0.0.0 c612 02 freq_set kernel -40.233 PPM
Nov 2 16:27:06 cmsstor414 ntpd[7282]: 0.0.0.0 c615 05 clock_sync
cmsstor415:
Nov 2 16:13:52 cmsstor415 kernel: SGI XFS with ACLs, security attributes, large block/inode numbers, no debug enabled
Nov 2 16:13:52 cmsstor415 kernel: SGI XFS Quota Management subsystem
Nov 2 16:13:52 cmsstor415 kernel: XFS (sdb): Mounting Filesystem
Nov 2 16:13:52 cmsstor415 kernel: XFS (sdb): Starting recovery (logdev: internal)
Nov 2 16:13:52 cmsstor415 kernel: XFS (sdb): Ending recovery (logdev: internal)
Nov 2 16:13:52 cmsstor415 kernel: XFS (sdc): Mounting Filesystem
Nov 2 16:13:52 cmsstor415 kernel: XFS (sdc): Starting recovery (logdev: internal)
Nov 2 16:13:52 cmsstor415 kernel: XFS (sdc): Ending recovery (logdev: internal)
Nov 2 16:28:05 cmsstor415 ntpd[7283]: 0.0.0.0 c612 02 freq_set kernel -45.165 PPM
Nov 2 16:28:05 cmsstor415 ntpd[7283]: 0.0.0.0 c615 05 clock_sync
cmsstor416:
Nov 2 16:16:18 cmsstor416 kernel: SGI XFS with ACLs, security attributes, large block/inode numbers, no debug enabled
Nov 2 16:16:18 cmsstor416 kernel: SGI XFS Quota Management subsystem
Nov 2 16:16:18 cmsstor416 kernel: XFS (sdb): Mounting Filesystem
Nov 2 16:16:18 cmsstor416 kernel: XFS (sdb): Starting recovery (logdev: internal)
Nov 2 16:16:19 cmsstor416 kernel: XFS (sdb): Ending recovery (logdev: internal)
Nov 2 16:16:19 cmsstor416 kernel: XFS (sdc): Mounting Filesystem
Nov 2 16:16:19 cmsstor416 kernel: XFS (sdc): Starting recovery (logdev: internal)
Nov 2 16:16:19 cmsstor416 kernel: XFS (sdc): Ending recovery (logdev: internal)
Nov 2 16:29:57 cmsstor416 ntpd[7261]: 0.0.0.0 c612 02 freq_set kernel -44.557 PPM
Nov 2 16:29:57 cmsstor416 ntpd[7261]: 0.0.0.0 c615 05 clock_sync
cmsstor417:
Nov 2 16:14:53 cmsstor417 kernel: SGI XFS with ACLs, security attributes, large block/inode numbers, no debug enabled
Nov 2 16:14:53 cmsstor417 kernel: SGI XFS Quota Management subsystem
Nov 2 16:14:53 cmsstor417 kernel: XFS (sdb): Mounting Filesystem
Nov 2 16:14:53 cmsstor417 kernel: XFS (sdb): Starting recovery (logdev: internal)
Nov 2 16:14:53 cmsstor417 kernel: XFS (sdb): Ending recovery (logdev: internal)
Nov 2 16:14:53 cmsstor417 kernel: XFS (sdc): Mounting Filesystem
Nov 2 16:14:53 cmsstor417 kernel: XFS (sdc): Starting recovery (logdev: internal)
Nov 2 16:14:53 cmsstor417 kernel: XFS (sdc): Ending recovery (logdev: internal)
Nov 2 16:29:09 cmsstor417 ntpd[7280]: 0.0.0.0 c612 02 freq_set kernel -47.100 PPM
Nov 2 16:29:09 cmsstor417 ntpd[7280]: 0.0.0.0 c615 05 clock_sync
cmsstor418:
Nov 2 16:17:12 cmsstor418 kernel: SGI XFS with ACLs, security attributes, large block/inode numbers, no debug enabled
Nov 2 16:17:12 cmsstor418 kernel: SGI XFS Quota Management subsystem
Nov 2 16:17:12 cmsstor418 kernel: XFS (sdb): Mounting Filesystem
Nov 2 16:17:12 cmsstor418 kernel: XFS (sdb): Starting recovery (logdev: internal)
Nov 2 16:17:12 cmsstor418 kernel: XFS (sdb): Ending recovery (logdev: internal)
Nov 2 16:17:13 cmsstor418 kernel: XFS (sdc): Mounting Filesystem
Nov 2 16:17:13 cmsstor418 kernel: XFS (sdc): Starting recovery (logdev: internal)
Nov 2 16:17:13 cmsstor418 kernel: XFS (sdc): Ending recovery (logdev: internal)
Nov 2 16:33:54 cmsstor418 ntpd[7282]: 0.0.0.0 0612 02 freq_set kernel -27.188 PPM
Nov 2 16:33:54 cmsstor418 ntpd[7282]: 0.0.0.0 0615 05 clock_sync
cmsstor419:
Nov 2 16:17:13 cmsstor419 kernel: SGI XFS with ACLs, security attributes, large block/inode numbers, no debug enabled
Nov 2 16:17:13 cmsstor419 kernel: SGI XFS Quota Management subsystem
Nov 2 16:17:13 cmsstor419 kernel: XFS (sdb): Mounting Filesystem
Nov 2 16:17:14 cmsstor419 kernel: XFS (sdb): Starting recovery (logdev: internal)
Nov 2 16:17:14 cmsstor419 kernel: XFS (sdb): Ending recovery (logdev: internal)
Nov 2 16:17:14 cmsstor419 kernel: XFS (sdc): Mounting Filesystem
Nov 2 16:17:14 cmsstor419 kernel: XFS (sdc): Starting recovery (logdev: internal)
Nov 2 16:17:14 cmsstor419 kernel: XFS (sdc): Ending recovery (logdev: internal)
Nov 2 16:31:38 cmsstor419 ntpd[7260]: 0.0.0.0 0612 02 freq_set kernel -26.509 PPM
Nov 2 16:31:38 cmsstor419 ntpd[7260]: 0.0.0.0 0615 05 clock_sync
cmsstor420:
Nov 2 16:17:43 cmsstor420 kernel: SGI XFS with ACLs, security attributes, large block/inode numbers, no debug enabled
Nov 2 16:17:43 cmsstor420 kernel: SGI XFS Quota Management subsystem
Nov 2 16:17:43 cmsstor420 kernel: XFS (sdb): Mounting Filesystem
Nov 2 16:17:43 cmsstor420 kernel: XFS (sdb): Starting recovery (logdev: internal)
Nov 2 16:17:43 cmsstor420 kernel: XFS (sdb): Ending recovery (logdev: internal)
Nov 2 16:17:43 cmsstor420 kernel: XFS (sdc): Mounting Filesystem
Nov 2 16:17:43 cmsstor420 kernel: XFS (sdc): Starting recovery (logdev: internal)
Nov 2 16:17:43 cmsstor420 kernel: XFS (sdc): Ending recovery (logdev: internal)
Nov 2 16:30:46 cmsstor420 ntpd[7255]: 0.0.0.0 0612 02 freq_set kernel -1.920 PPM
Nov 2 16:30:46 cmsstor420 ntpd[7255]: 0.0.0.0 0615 05 clock_sync
sh-4.1$
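
A possible way to condense the per-host syslog output above into a pass/fail summary is sketched below. This is only an illustration and not part of the PoolAdd procedure: the function name `summarize_syslog` is hypothetical, and the criteria (a completed XFS recovery on both sdb and sdc, plus an ntpd clock_sync event) are assumptions drawn from what the output above shows for each node.

```python
import re
from collections import defaultdict

# Generic syslog prefix: "Nov 2 16:12:51 cmsstor414 <rest of message>"
PREFIX = re.compile(r"^\w{3}\s+\d+\s+\d\d:\d\d:\d\d\s+(\S+)\s+(.*)$")
# Kernel message logged when XFS finishes log recovery on a device.
XFS_DONE = re.compile(r"kernel: XFS \((\w+)\): Ending recovery")
# ntpd sync event; the tag is accepted with or without a "[pid]" part.
NTP_SYNC = re.compile(r"ntpd\S*: .*\bclock_sync\b")


def summarize_syslog(lines, devices=("sdb", "sdc")):
    """Return {host: {"missing": [...], "ntp_synced": bool}}.

    "missing" lists the expected devices (an assumption: sdb and sdc)
    for which no "Ending recovery" message was seen.
    """
    recovered = defaultdict(set)
    synced = set()
    for line in lines:
        m = PREFIX.match(line.strip())
        if not m:
            continue  # host banners such as "cmsstor414:" and shell prompts
        host, rest = m.groups()
        recovered[host]  # register the host even if nothing else matches
        xfs = XFS_DONE.match(rest)
        if xfs:
            recovered[host].add(xfs.group(1))
        elif NTP_SYNC.match(rest):
            synced.add(host)
    return {
        host: {
            "missing": [d for d in devices if d not in devs],
            "ntp_synced": host in synced,
        }
        for host, devs in recovered.items()
    }


# Sample lines copied from the output above.
SAMPLE = [
    "cmsstor414:",
    "Nov 2 16:12:51 cmsstor414 kernel: XFS (sdb): Mounting Filesystem",
    "Nov 2 16:12:51 cmsstor414 kernel: XFS (sdb): Ending recovery (logdev: internal)",
    "Nov 2 16:12:51 cmsstor414 kernel: XFS (sdc): Ending recovery (logdev: internal)",
    "Nov 2 16:27:06 cmsstor414 ntpd[7282]: 0.0.0.0 c615 05 clock_sync",
]

if __name__ == "__main__":
    for host, status in sorted(summarize_syslog(SAMPLE).items()):
        print(host, status)
```

Feeding it the combined output of the loop over `newpools` would give one entry per node, so a node with an unmounted disk or an unsynchronized clock stands out without reading every line.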

Checking dCache logs:
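
The tail output below is long, so a small filter could reduce it to one entry per pool. A minimal sketch, assuming the log line layout seen in this ticket (the names `summarize_pool_logs`, `MODE` and `SPACE` are hypothetical and not part of dCache): it keeps the last "Pool mode changed to ..." state of each pool and, where a pool logs the free-space warning, the difference between the expected and the reported free space.

```python
import re

# "(pool) [] Pool mode changed to <mode>"
MODE = re.compile(r"\((?P<pool>[^)]+)\) \[\] Pool mode changed to (?P<mode>.+)$")
# "(pool) [] The file system ... less free space (N bytes) than expected (M bytes)"
SPACE = re.compile(
    r"\((?P<pool>[^)]+)\) \[\] The file system containing the data files "
    r"appears to have less free space \((?P<free>[\d,]+) bytes\) "
    r"than expected \((?P<expected>[\d,]+) bytes\)"
)


def _to_int(text):
    """Convert a number with thousands separators, e.g. '4,096', to int."""
    return int(text.replace(",", ""))


def summarize_pool_logs(lines):
    """Return {pool: {"mode": last mode seen, "shortfall_bytes": int}}."""
    pools = {}
    for line in lines:
        line = line.strip()
        m = MODE.search(line)
        if m:
            entry = pools.setdefault(m["pool"], {"mode": None, "shortfall_bytes": 0})
            entry["mode"] = m["mode"]  # later lines overwrite earlier ones
            continue
        m = SPACE.search(line)
        if m:
            entry = pools.setdefault(m["pool"], {"mode": None, "shortfall_bytes": 0})
            entry["shortfall_bytes"] = _to_int(m["expected"]) - _to_int(m["free"])
    return pools


# Sample lines taken from the cmsstor412 log in this ticket (warning shortened).
SAMPLE = [
    "02 Nov 2015 16:16:31 (w-cmsstor412-disk-disk1) [] Pool mode changed to disabled(store,stage,p2p-client): Initializing",
    "02 Nov 2015 16:16:31 (w-cmsstor412-disk-disk1) [] Pool mode changed to enabled",
    "02 Nov 2015 16:17:31 (w-cmsstor412-disk-disk1) [] The file system containing the data files appears to have less free space (77,219,547,144,192 bytes) than expected (77,219,547,148,288 bytes); reducing the pool size to 77,219,547,144,192 bytes to compensate.",
]

if __name__ == "__main__":
    for pool, status in sorted(summarize_pool_logs(SAMPLE).items()):
        print(pool, status)
```

With the figures in that warning, the shortfall works out to 77,219,547,148,288 - 77,219,547,144,192 = 4,096 bytes.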

sh-4.1$ for i in `cat newpools`; do echo $i:; ssh -l root $i tail /var/log/dcache/w-cmsstor*; done
cmsstor411:
==> /var/log/dcache/w-cmsstor411-disk-disk1Domain.log <==

2015-11-02 16:17:08 Launching /usr/bin/java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.net.inetaddr.ttl=1800 -Dorg.globus.tcp.port.range=15000,35000 -Djava.net.preferIPv4Stack=true -Dorg.dcache.dcap.port=0 -Dorg.dcache.net.tcp.portrange=15000:35000 -Dorg.globus.jglobus.delegation.cache.lifetime=30000 -Dorg.globus.jglobus.crl.cache.lifetime=60000 -Djava.security.krb5.realm=EXAMPLE.ORG -Djava.security.krb5.kdc=localhost -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/etc/dcache/jgss.conf -Djava.awt.headless=true -DwantLog4jSetup=n -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dcache/w-cmsstor411-disk-disk1Domain-oom.hprof -javaagent:/usr/share/dcache/classes/spring-instrument-3.1.4.RELEASE.jar -Ddcache.home=/usr/share/dcache -Ddcache.paths.defaults=/usr/share/dcache/defaults org.dcache.boot.BootLoader start w-cmsstor411-disk-disk1Domain
02 Nov 2015 16:17:11 (System) [] Created : w-cmsstor411-disk-disk1Domain
02 Nov 2015 16:17:14 (w-cmsstor411-disk-disk1) [] Queue not created, name already exists: p2p
02 Nov 2015 16:17:14 (w-cmsstor411-disk-disk1) [] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server): Initializing
02 Nov 2015 16:17:14 (w-cmsstor411-disk-disk1) [] Reading inventory from [data=/storage/data1/write-pool/data;meta=/storage/data1/write-pool/meta]
02 Nov 2015 16:17:14 (w-cmsstor411-disk-disk1) [] Pool mode changed to disabled(store,stage,p2p-client): Initializing
02 Nov 2015 16:17:14 (w-cmsstor411-disk-disk1) [] Pool mode changed to enabled

==> /var/log/dcache/w-cmsstor411-disk-disk2Domain.log <==

2015-11-02 16:17:09 Launching /usr/bin/java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.net.inetaddr.ttl=1800 -Dorg.globus.tcp.port.range=15000,35000 -Djava.net.preferIPv4Stack=true -Dorg.dcache.dcap.port=0 -Dorg.dcache.net.tcp.portrange=15000:35000 -Dorg.globus.jglobus.delegation.cache.lifetime=30000 -Dorg.globus.jglobus.crl.cache.lifetime=60000 -Djava.security.krb5.realm=EXAMPLE.ORG -Djava.security.krb5.kdc=localhost -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/etc/dcache/jgss.conf -Djava.awt.headless=true -DwantLog4jSetup=n -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dcache/w-cmsstor411-disk-disk2Domain-oom.hprof -javaagent:/usr/share/dcache/classes/spring-instrument-3.1.4.RELEASE.jar -Ddcache.home=/usr/share/dcache -Ddcache.paths.defaults=/usr/share/dcache/defaults org.dcache.boot.BootLoader start w-cmsstor411-disk-disk2Domain
02 Nov 2015 16:17:11 (System) [] Created : w-cmsstor411-disk-disk2Domain
02 Nov 2015 16:17:14 (w-cmsstor411-disk-disk2) [] Queue not created, name already exists: p2p
02 Nov 2015 16:17:14 (w-cmsstor411-disk-disk2) [] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server): Initializing
02 Nov 2015 16:17:14 (w-cmsstor411-disk-disk2) [] Reading inventory from [data=/storage/data2/write-pool/data;meta=/storage/data2/write-pool/meta]
02 Nov 2015 16:17:14 (w-cmsstor411-disk-disk2) [] Pool mode changed to disabled(store,stage,p2p-client): Initializing
02 Nov 2015 16:17:14 (w-cmsstor411-disk-disk2) [] Pool mode changed to enabled
cmsstor412:
==> /var/log/dcache/w-cmsstor412-disk-disk1Domain.log <==

2015-11-02 16:16:26 Launching /usr/bin/java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.net.inetaddr.ttl=1800 -Dorg.globus.tcp.port.range=15000,35000 -Djava.net.preferIPv4Stack=true -Dorg.dcache.dcap.port=0 -Dorg.dcache.net.tcp.portrange=15000:35000 -Dorg.globus.jglobus.delegation.cache.lifetime=30000 -Dorg.globus.jglobus.crl.cache.lifetime=60000 -Djava.security.krb5.realm=EXAMPLE.ORG -Djava.security.krb5.kdc=localhost -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/etc/dcache/jgss.conf -Djava.awt.headless=true -DwantLog4jSetup=n -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dcache/w-cmsstor412-disk-disk1Domain-oom.hprof -javaagent:/usr/share/dcache/classes/spring-instrument-3.1.4.RELEASE.jar -Ddcache.home=/usr/share/dcache -Ddcache.paths.defaults=/usr/share/dcache/defaults org.dcache.boot.BootLoader start w-cmsstor412-disk-disk1Domain
02 Nov 2015 16:16:28 (System) [] Created : w-cmsstor412-disk-disk1Domain
02 Nov 2015 16:16:31 (w-cmsstor412-disk-disk1) [] Queue not created, name already exists: p2p
02 Nov 2015 16:16:31 (w-cmsstor412-disk-disk1) [] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server): Initializing
02 Nov 2015 16:16:31 (w-cmsstor412-disk-disk1) [] Reading inventory from [data=/storage/data1/write-pool/data;meta=/storage/data1/write-pool/meta]
02 Nov 2015 16:16:31 (w-cmsstor412-disk-disk1) [] Pool mode changed to disabled(store,stage,p2p-client): Initializing
02 Nov 2015 16:16:31 (w-cmsstor412-disk-disk1) [] Pool mode changed to enabled
02 Nov 2015 16:17:31 (w-cmsstor412-disk-disk1) [] The file system containing the data files appears to have less free space (77,219,547,144,192 bytes) than expected (77,219,547,148,288 bytes); reducing the pool size to 77,219,547,144,192 bytes to compensate. Notice that this does not leave any space for the meta data. If such data is stored on the same file system, then it is paramount that the pool size is reconfigured to leave enough space for the meta data.

==> /var/log/dcache/w-cmsstor412-disk-disk2Domain.log <==

2015-11-02 16:16:26 Launching /usr/bin/java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.net.inetaddr.ttl=1800 -Dorg.globus.tcp.port.range=15000,35000 -Djava.net.preferIPv4Stack=true -Dorg.dcache.dcap.port=0 -Dorg.dcache.net.tcp.portrange=15000:35000 -Dorg.globus.jglobus.delegation.cache.lifetime=30000 -Dorg.globus.jglobus.crl.cache.lifetime=60000 -Djava.security.krb5.realm=EXAMPLE.ORG -Djava.security.krb5.kdc=localhost -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/etc/dcache/jgss.conf -Djava.awt.headless=true -DwantLog4jSetup=n -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dcache/w-cmsstor412-disk-disk2Domain-oom.hprof -javaagent:/usr/share/dcache/classes/spring-instrument-3.1.4.RELEASE.jar -Ddcache.home=/usr/share/dcache -Ddcache.paths.defaults=/usr/share/dcache/defaults org.dcache.boot.BootLoader start w-cmsstor412-disk-disk2Domain
02 Nov 2015 16:16:28 (System) [] Created : w-cmsstor412-disk-disk2Domain
02 Nov 2015 16:16:31 (w-cmsstor412-disk-disk2) [] Queue not created, name already exists: p2p
02 Nov 2015 16:16:31 (w-cmsstor412-disk-disk2) [] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server): Initializing
02 Nov 2015 16:16:31 (w-cmsstor412-disk-disk2) [] Reading inventory from [data=/storage/data2/write-pool/data;meta=/storage/data2/write-pool/meta]
02 Nov 2015 16:16:31 (w-cmsstor412-disk-disk2) [] Pool mode changed to disabled(store,stage,p2p-client): Initializing
02 Nov 2015 16:16:31 (w-cmsstor412-disk-disk2) [] Pool mode changed to enabled
cmsstor413:
==> /var/log/dcache/w-cmsstor413-disk-disk1Domain.log <==

2015-11-02 16:13:56 Launching /usr/bin/java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.net.inetaddr.ttl=1800 -Dorg.globus.tcp.port.range=15000,35000 -Djava.net.preferIPv4Stack=true -Dorg.dcache.dcap.port=0 -Dorg.dcache.net.tcp.portrange=15000:35000 -Dorg.globus.jglobus.delegation.cache.lifetime=30000 -Dorg.globus.jglobus.crl.cache.lifetime=60000 -Djava.security.krb5.realm=EXAMPLE.ORG -Djava.security.krb5.kdc=localhost -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/etc/dcache/jgss.conf -Djava.awt.headless=true -DwantLog4jSetup=n -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dcache/w-cmsstor413-disk-disk1Domain-oom.hprof -javaagent:/usr/share/dcache/classes/spring-instrument-3.1.4.RELEASE.jar -Ddcache.home=/usr/share/dcache -Ddcache.paths.defaults=/usr/share/dcache/defaults org.dcache.boot.BootLoader start w-cmsstor413-disk-disk1Domain
02 Nov 2015 16:13:58 (System) [] Created : w-cmsstor413-disk-disk1Domain
02 Nov 2015 16:14:01 (w-cmsstor413-disk-disk1) [] Queue not created, name already exists: p2p
02 Nov 2015 16:14:01 (w-cmsstor413-disk-disk1) [] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server): Initializing
02 Nov 2015 16:14:01 (w-cmsstor413-disk-disk1) [] Reading inventory from [data=/storage/data1/write-pool/data;meta=/storage/data1/write-pool/meta]
02 Nov 2015 16:14:01 (w-cmsstor413-disk-disk1) [] Pool mode changed to disabled(store,stage,p2p-client): Initializing
02 Nov 2015 16:14:01 (w-cmsstor413-disk-disk1) [] Pool mode changed to enabled

==> /var/log/dcache/w-cmsstor413-disk-disk2Domain.log <==

2015-11-02 16:13:56 Launching /usr/bin/java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.net.inetaddr.ttl=1800 -Dorg.globus.tcp.port.range=15000,35000 -Djava.net.preferIPv4Stack=true -Dorg.dcache.dcap.port=0 -Dorg.dcache.net.tcp.portrange=15000:35000 -Dorg.globus.jglobus.delegation.cache.lifetime=30000 -Dorg.globus.jglobus.crl.cache.lifetime=60000 -Djava.security.krb5.realm=EXAMPLE.ORG -Djava.security.krb5.kdc=localhost -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/etc/dcache/jgss.conf -Djava.awt.headless=true -DwantLog4jSetup=n -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dcache/w-cmsstor413-disk-disk2Domain-oom.hprof -javaagent:/usr/share/dcache/classes/spring-instrument-3.1.4.RELEASE.jar -Ddcache.home=/usr/share/dcache -Ddcache.paths.defaults=/usr/share/dcache/defaults org.dcache.boot.BootLoader start w-cmsstor413-disk-disk2Domain
02 Nov 2015 16:13:58 (System) [] Created : w-cmsstor413-disk-disk2Domain
02 Nov 2015 16:14:01 (w-cmsstor413-disk-disk2) [] Queue not created, name already exists: p2p
02 Nov 2015 16:14:01 (w-cmsstor413-disk-disk2) [] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server): Initializing
02 Nov 2015 16:14:01 (w-cmsstor413-disk-disk2) [] Reading inventory from [data=/storage/data2/write-pool/data;meta=/storage/data2/write-pool/meta]
02 Nov 2015 16:14:01 (w-cmsstor413-disk-disk2) [] Pool mode changed to disabled(store,stage,p2p-client): Initializing
02 Nov 2015 16:14:01 (w-cmsstor413-disk-disk2) [] Pool mode changed to enabled
cmsstor414:
==> /var/log/dcache/w-cmsstor414-disk-disk1Domain.log <==

2015-11-02 16:12:59 Launching /usr/bin/java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.net.inetaddr.ttl=1800 -Dorg.globus.tcp.port.range=15000,35000 -Djava.net.preferIPv4Stack=true -Dorg.dcache.dcap.port=0 -Dorg.dcache.net.tcp.portrange=15000:35000 -Dorg.globus.jglobus.delegation.cache.lifetime=30000 -Dorg.globus.jglobus.crl.cache.lifetime=60000 -Djava.security.krb5.realm=EXAMPLE.ORG -Djava.security.krb5.kdc=localhost -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/etc/dcache/jgss.conf -Djava.awt.headless=true -DwantLog4jSetup=n -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dcache/w-cmsstor414-disk-disk1Domain-oom.hprof -javaagent:/usr/share/dcache/classes/spring-instrument-3.1.4.RELEASE.jar -Ddcache.home=/usr/share/dcache -Ddcache.paths.defaults=/usr/share/dcache/defaults org.dcache.boot.BootLoader start w-cmsstor414-disk-disk1Domain
02 Nov 2015 16:13:01 (System) [] Created : w-cmsstor414-disk-disk1Domain
02 Nov 2015 16:13:05 (w-cmsstor414-disk-disk1) [] Queue not created, name already exists: p2p
02 Nov 2015 16:13:05 (w-cmsstor414-disk-disk1) [] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server): Initializing
02 Nov 2015 16:13:05 (w-cmsstor414-disk-disk1) [] Reading inventory from [data=/storage/data1/write-pool/data;meta=/storage/data1/write-pool/meta]
02 Nov 2015 16:13:05 (w-cmsstor414-disk-disk1) [] Pool mode changed to disabled(store,stage,p2p-client): Initializing
02 Nov 2015 16:13:05 (w-cmsstor414-disk-disk1) [] Pool mode changed to enabled

==> /var/log/dcache/w-cmsstor414-disk-disk2Domain.log <==

2015-11-02 16:12:59 Launching /usr/bin/java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.net.inetaddr.ttl=1800 -Dorg.globus.tcp.port.range=15000,35000 -Djava.net.preferIPv4Stack=true -Dorg.dcache.dcap.port=0 -Dorg.dcache.net.tcp.portrange=15000:35000 -Dorg.globus.jglobus.delegation.cache.lifetime=30000 -Dorg.globus.jglobus.crl.cache.lifetime=60000 -Djava.security.krb5.realm=EXAMPLE.ORG -Djava.security.krb5.kdc=localhost -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/etc/dcache/jgss.conf -Djava.awt.headless=true -DwantLog4jSetup=n -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dcache/w-cmsstor414-disk-disk2Domain-oom.hprof -javaagent:/usr/share/dcache/classes/spring-instrument-3.1.4.RELEASE.jar -Ddcache.home=/usr/share/dcache -Ddcache.paths.defaults=/usr/share/dcache/defaults org.dcache.boot.BootLoader start w-cmsstor414-disk-disk2Domain
02 Nov 2015 16:13:01 (System) [] Created : w-cmsstor414-disk-disk2Domain
02 Nov 2015 16:13:05 (w-cmsstor414-disk-disk2) [] Queue not created, name already exists: p2p
02 Nov 2015 16:13:05 (w-cmsstor414-disk-disk2) [] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server): Initializing
02 Nov 2015 16:13:05 (w-cmsstor414-disk-disk2) [] Reading inventory from [data=/storage/data2/write-pool/data;meta=/storage/data2/write-pool/meta]
02 Nov 2015 16:13:05 (w-cmsstor414-disk-disk2) [] Pool mode changed to disabled(store,stage,p2p-client): Initializing
02 Nov 2015 16:13:05 (w-cmsstor414-disk-disk2) [] Pool mode changed to enabled
cmsstor415:
==> /var/log/dcache/w-cmsstor415-disk-disk1Domain.log <==

2015-11-02 16:14:00 Launching /usr/bin/java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.net.inetaddr.ttl=1800 -Dorg.globus.tcp.port.range=15000,35000 -Djava.net.preferIPv4Stack=true -Dorg.dcache.dcap.port=0 -Dorg.dcache.net.tcp.portrange=15000:35000 -Dorg.globus.jglobus.delegation.cache.lifetime=30000 -Dorg.globus.jglobus.crl.cache.lifetime=60000 -Djava.security.krb5.realm=EXAMPLE.ORG -Djava.security.krb5.kdc=localhost -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/etc/dcache/jgss.conf -Djava.awt.headless=true -DwantLog4jSetup=n -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dcache/w-cmsstor415-disk-disk1Domain-oom.hprof -javaagent:/usr/share/dcache/classes/spring-instrument-3.1.4.RELEASE.jar -Ddcache.home=/usr/share/dcache -Ddcache.paths.defaults=/usr/share/dcache/defaults org.dcache.boot.BootLoader start w-cmsstor415-disk-disk1Domain
02 Nov 2015 16:14:02 (System) [] Created : w-cmsstor415-disk-disk1Domain
02 Nov 2015 16:14:06 (w-cmsstor415-disk-disk1) [] Queue not created, name already exists: p2p
02 Nov 2015 16:14:06 (w-cmsstor415-disk-disk1) [] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server): Initializing
02 Nov 2015 16:14:06 (w-cmsstor415-disk-disk1) [] Reading inventory from [data=/storage/data1/write-pool/data;meta=/storage/data1/write-pool/meta]
02 Nov 2015 16:14:06 (w-cmsstor415-disk-disk1) [] Pool mode changed to disabled(store,stage,p2p-client): Initializing
02 Nov 2015 16:14:06 (w-cmsstor415-disk-disk1) [] Pool mode changed to enabled

==> /var/log/dcache/w-cmsstor415-disk-disk2Domain.log <==

2015-11-02 16:14:01 Launching /usr/bin/java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.net.inetaddr.ttl=1800 -Dorg.globus.tcp.port.range=15000,35000 -Djava.net.preferIPv4Stack=true -Dorg.dcache.dcap.port=0 -Dorg.dcache.net.tcp.portrange=15000:35000 -Dorg.globus.jglobus.delegation.cache.lifetime=30000 -Dorg.globus.jglobus.crl.cache.lifetime=60000 -Djava.security.krb5.realm=EXAMPLE.ORG -Djava.security.krb5.kdc=localhost -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/etc/dcache/jgss.conf -Djava.awt.headless=true -DwantLog4jSetup=n -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dcache/w-cmsstor415-disk-disk2Domain-oom.hprof -javaagent:/usr/share/dcache/classes/spring-instrument-3.1.4.RELEASE.jar -Ddcache.home=/usr/share/dcache -Ddcache.paths.defaults=/usr/share/dcache/defaults org.dcache.boot.BootLoader start w-cmsstor415-disk-disk2Domain
02 Nov 2015 16:14:02 (System) [] Created : w-cmsstor415-disk-disk2Domain
02 Nov 2015 16:14:06 (w-cmsstor415-disk-disk2) [] Queue not created, name already exists: p2p
02 Nov 2015 16:14:06 (w-cmsstor415-disk-disk2) [] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server): Initializing
02 Nov 2015 16:14:06 (w-cmsstor415-disk-disk2) [] Reading inventory from [data=/storage/data2/write-pool/data;meta=/storage/data2/write-pool/meta]
02 Nov 2015 16:14:06 (w-cmsstor415-disk-disk2) [] Pool mode changed to disabled(store,stage,p2p-client): Initializing
02 Nov 2015 16:14:06 (w-cmsstor415-disk-disk2) [] Pool mode changed to enabled
cmsstor416:
==> /var/log/dcache/w-cmsstor416-disk-disk1Domain.log <==

2015-11-02 16:16:26 Launching /usr/bin/java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.net.inetaddr.ttl=1800 -Dorg.globus.tcp.port.range=15000,35000 -Djava.net.preferIPv4Stack=true -Dorg.dcache.dcap.port=0 -Dorg.dcache.net.tcp.portrange=15000:35000 -Dorg.globus.jglobus.delegation.cache.lifetime=30000 -Dorg.globus.jglobus.crl.cache.lifetime=60000 -Djava.security.krb5.realm=EXAMPLE.ORG -Djava.security.krb5.kdc=localhost -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/etc/dcache/jgss.conf -Djava.awt.headless=true -DwantLog4jSetup=n -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dcache/w-cmsstor416-disk-disk1Domain-oom.hprof -javaagent:/usr/share/dcache/classes/spring-instrument-3.1.4.RELEASE.jar -Ddcache.home=/usr/share/dcache -Ddcache.paths.defaults=/usr/share/dcache/defaults org.dcache.boot.BootLoader start w-cmsstor416-disk-disk1Domain
02 Nov 2015 16:16:28 (System) [] Created : w-cmsstor416-disk-disk1Domain
02 Nov 2015 16:16:31 (w-cmsstor416-disk-disk1) [] Queue not created, name already exists: p2p
02 Nov 2015 16:16:31 (w-cmsstor416-disk-disk1) [] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server): Initializing
02 Nov 2015 16:16:31 (w-cmsstor416-disk-disk1) [] Reading inventory from [data=/storage/data1/write-pool/data;meta=/storage/data1/write-pool/meta]
02 Nov 2015 16:16:32 (w-cmsstor416-disk-disk1) [] Pool mode changed to disabled(store,stage,p2p-client): Initializing
02 Nov 2015 16:16:32 (w-cmsstor416-disk-disk1) [] Pool mode changed to enabled
02 Nov 2015 16:17:32 (w-cmsstor416-disk-disk1) [] The file system containing the data files appears to have less free space (77,219,547,144,192 bytes) than expected (77,219,547,148,288 bytes); reducing the pool size to 77,219,547,144,192 bytes to compensate. Notice that this does not leave any space for the meta data. If such data is stored on the same file system, then it is paramount that the pool size is reconfigured to leave enough space for the meta data.

==> /var/log/dcache/w-cmsstor416-disk-disk2Domain.log <==

2015-11-02 16:16:26 Launching /usr/bin/java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.net.inetaddr.ttl=1800 -Dorg.globus.tcp.port.range=15000,35000 -Djava.net.preferIPv4Stack=true -Dorg.dcache.dcap.port=0 -Dorg.dcache.net.tcp.portrange=15000:35000 -Dorg.globus.jglobus.delegation.cache.lifetime=30000 -Dorg.globus.jglobus.crl.cache.lifetime=60000 -Djava.security.krb5.realm=EXAMPLE.ORG -Djava.security.krb5.kdc=localhost -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/etc/dcache/jgss.conf -Djava.awt.headless=true -DwantLog4jSetup=n -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dcache/w-cmsstor416-disk-disk2Domain-oom.hprof -javaagent:/usr/share/dcache/classes/spring-instrument-3.1.4.RELEASE.jar -Ddcache.home=/usr/share/dcache -Ddcache.paths.defaults=/usr/share/dcache/defaults org.dcache.boot.BootLoader start w-cmsstor416-disk-disk2Domain
02 Nov 2015 16:16:28 (System) [] Created : w-cmsstor416-disk-disk2Domain
02 Nov 2015 16:16:31 (w-cmsstor416-disk-disk2) [] Queue not created, name already exists: p2p
02 Nov 2015 16:16:31 (w-cmsstor416-disk-disk2) [] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server): Initializing
02 Nov 2015 16:16:31 (w-cmsstor416-disk-disk2) [] Reading inventory from [data=/storage/data2/write-pool/data;meta=/storage/data2/write-pool/meta]
02 Nov 2015 16:16:32 (w-cmsstor416-disk-disk2) [] Pool mode changed to disabled(store,stage,p2p-client): Initializing
02 Nov 2015 16:16:32 (w-cmsstor416-disk-disk2) [] Pool mode changed to enabled
02 Nov 2015 16:17:32 (w-cmsstor416-disk-disk2) [] The file system containing the data files appears to have less free space (77,219,547,144,192 bytes) than expected (77,219,547,148,288 bytes); reducing the pool size to 77,219,547,144,192 bytes to compensate. Notice that this does not leave any space for the meta data. If such data is stored on the same file system, then it is paramount that the pool size is reconfigured to leave enough space for the meta data.
cmsstor417:
==> /var/log/dcache/w-cmsstor417-disk-disk1Domain.log <==

2015-11-02 16:15:01 Launching /usr/bin/java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.net.inetaddr.ttl=1800 -Dorg.globus.tcp.port.range=15000,35000 -Djava.net.preferIPv4Stack=true -Dorg.dcache.dcap.port=0 -Dorg.dcache.net.tcp.portrange=15000:35000 -Dorg.globus.jglobus.delegation.cache.lifetime=30000 -Dorg.globus.jglobus.crl.cache.lifetime=60000 -Djava.security.krb5.realm=EXAMPLE.ORG -Djava.security.krb5.kdc=localhost -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/etc/dcache/jgss.conf -Djava.awt.headless=true -DwantLog4jSetup=n -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dcache/w-cmsstor417-disk-disk1Domain-oom.hprof -javaagent:/usr/share/dcache/classes/spring-instrument-3.1.4.RELEASE.jar -Ddcache.home=/usr/share/dcache -Ddcache.paths.defaults=/usr/share/dcache/defaults org.dcache.boot.BootLoader start w-cmsstor417-disk-disk1Domain
02 Nov 2015 16:15:03 (System) [] Created : w-cmsstor417-disk-disk1Domain
02 Nov 2015 16:15:06 (w-cmsstor417-disk-disk1) [] Queue not created, name already exists: p2p
02 Nov 2015 16:15:06 (w-cmsstor417-disk-disk1) [] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server): Initializing
02 Nov 2015 16:15:07 (w-cmsstor417-disk-disk1) [] Reading inventory from [data=/storage/data1/write-pool/data;meta=/storage/data1/write-pool/meta]
02 Nov 2015 16:15:07 (w-cmsstor417-disk-disk1) [] Pool mode changed to disabled(store,stage,p2p-client): Initializing
02 Nov 2015 16:15:07 (w-cmsstor417-disk-disk1) [] Pool mode changed to enabled

==> /var/log/dcache/w-cmsstor417-disk-disk2Domain.log <==

2015-11-02 16:15:01 Launching /usr/bin/java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.net.inetaddr.ttl=1800 -Dorg.globus.tcp.port.range=15000,35000 -Djava.net.preferIPv4Stack=true -Dorg.dcache.dcap.port=0 -Dorg.dcache.net.tcp.portrange=15000:35000 -Dorg.globus.jglobus.delegation.cache.lifetime=30000 -Dorg.globus.jglobus.crl.cache.lifetime=60000 -Djava.security.krb5.realm=EXAMPLE.ORG -Djava.security.krb5.kdc=localhost -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/etc/dcache/jgss.conf -Djava.awt.headless=true -DwantLog4jSetup=n -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dcache/w-cmsstor417-disk-disk2Domain-oom.hprof -javaagent:/usr/share/dcache/classes/spring-instrument-3.1.4.RELEASE.jar -Ddcache.home=/usr/share/dcache -Ddcache.paths.defaults=/usr/share/dcache/defaults org.dcache.boot.BootLoader start w-cmsstor417-disk-disk2Domain
02 Nov 2015 16:15:03 (System) [] Created : w-cmsstor417-disk-disk2Domain
02 Nov 2015 16:15:06 (w-cmsstor417-disk-disk2) [] Queue not created, name already exists: p2p
02 Nov 2015 16:15:06 (w-cmsstor417-disk-disk2) [] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server): Initializing
02 Nov 2015 16:15:07 (w-cmsstor417-disk-disk2) [] Reading inventory from [data=/storage/data2/write-pool/data;meta=/storage/data2/write-pool/meta]
02 Nov 2015 16:15:07 (w-cmsstor417-disk-disk2) [] Pool mode changed to disabled(store,stage,p2p-client): Initializing
02 Nov 2015 16:15:07 (w-cmsstor417-disk-disk2) [] Pool mode changed to enabled
cmsstor418:
==> /var/log/dcache/w-cmsstor418-disk-disk1Domain.log <==

2015-11-02 16:17:21 Launching /usr/bin/java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.net.inetaddr.ttl=1800 -Dorg.globus.tcp.port.range=15000,35000 -Djava.net.preferIPv4Stack=true -Dorg.dcache.dcap.port=0 -Dorg.dcache.net.tcp.portrange=15000:35000 -Dorg.globus.jglobus.delegation.cache.lifetime=30000 -Dorg.globus.jglobus.crl.cache.lifetime=60000 -Djava.security.krb5.realm=EXAMPLE.ORG -Djava.security.krb5.kdc=localhost -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/etc/dcache/jgss.conf -Djava.awt.headless=true -DwantLog4jSetup=n -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dcache/w-cmsstor418-disk-disk1Domain-oom.hprof -javaagent:/usr/share/dcache/classes/spring-instrument-3.1.4.RELEASE.jar -Ddcache.home=/usr/share/dcache -Ddcache.paths.defaults=/usr/share/dcache/defaults org.dcache.boot.BootLoader start w-cmsstor418-disk-disk1Domain
02 Nov 2015 16:17:23 (System) [] Created : w-cmsstor418-disk-disk1Domain
02 Nov 2015 16:17:26 (w-cmsstor418-disk-disk1) [] Queue not created, name already exists: p2p
02 Nov 2015 16:17:26 (w-cmsstor418-disk-disk1) [] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server): Initializing
02 Nov 2015 16:17:26 (w-cmsstor418-disk-disk1) [] Reading inventory from [data=/storage/data1/write-pool/data;meta=/storage/data1/write-pool/meta]
02 Nov 2015 16:17:26 (w-cmsstor418-disk-disk1) [] Pool mode changed to disabled(store,stage,p2p-client): Initializing
02 Nov 2015 16:17:26 (w-cmsstor418-disk-disk1) [] Pool mode changed to enabled
02 Nov 2015 16:18:26 (w-cmsstor418-disk-disk1) [] The file system containing the data files appears to have less free space (77,219,547,144,192 bytes) than expected (77,219,547,148,288 bytes); reducing the pool size to 77,219,547,144,192 bytes to compensate. Notice that this does not leave any space for the meta data. If such data is stored on the same file system, then it is paramount that the pool size is reconfigured to leave enough space for the meta data.

==> /var/log/dcache/w-cmsstor418-disk-disk2Domain.log <==

2015-11-02 16:17:21 Launching /usr/bin/java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.net.inetaddr.ttl=1800 -Dorg.globus.tcp.port.range=15000,35000 -Djava.net.preferIPv4Stack=true -Dorg.dcache.dcap.port=0 -Dorg.dcache.net.tcp.portrange=15000:35000 -Dorg.globus.jglobus.delegation.cache.lifetime=30000 -Dorg.globus.jglobus.crl.cache.lifetime=60000 -Djava.security.krb5.realm=EXAMPLE.ORG -Djava.security.krb5.kdc=localhost -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/etc/dcache/jgss.conf -Djava.awt.headless=true -DwantLog4jSetup=n -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dcache/w-cmsstor418-disk-disk2Domain-oom.hprof -javaagent:/usr/share/dcache/classes/spring-instrument-3.1.4.RELEASE.jar -Ddcache.home=/usr/share/dcache -Ddcache.paths.defaults=/usr/share/dcache/defaults org.dcache.boot.BootLoader start w-cmsstor418-disk-disk2Domain
02 Nov 2015 16:17:23 (System) [] Created : w-cmsstor418-disk-disk2Domain
02 Nov 2015 16:17:26 (w-cmsstor418-disk-disk2) [] Queue not created, name already exists: p2p
02 Nov 2015 16:17:26 (w-cmsstor418-disk-disk2) [] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server): Initializing
02 Nov 2015 16:17:26 (w-cmsstor418-disk-disk2) [] Reading inventory from [data=/storage/data2/write-pool/data;meta=/storage/data2/write-pool/meta]
02 Nov 2015 16:17:26 (w-cmsstor418-disk-disk2) [] Pool mode changed to disabled(store,stage,p2p-client): Initializing
02 Nov 2015 16:17:26 (w-cmsstor418-disk-disk2) [] Pool mode changed to enabled
cmsstor419:
==> /var/log/dcache/w-cmsstor419-disk-disk1Domain.log <==

2015-11-02 16:17:21 Launching /usr/bin/java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.net.inetaddr.ttl=1800 -Dorg.globus.tcp.port.range=15000,35000 -Djava.net.preferIPv4Stack=true -Dorg.dcache.dcap.port=0 -Dorg.dcache.net.tcp.portrange=15000:35000 -Dorg.globus.jglobus.delegation.cache.lifetime=30000 -Dorg.globus.jglobus.crl.cache.lifetime=60000 -Djava.security.krb5.realm=EXAMPLE.ORG -Djava.security.krb5.kdc=localhost -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/etc/dcache/jgss.conf -Djava.awt.headless=true -DwantLog4jSetup=n -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dcache/w-cmsstor419-disk-disk1Domain-oom.hprof -javaagent:/usr/share/dcache/classes/spring-instrument-3.1.4.RELEASE.jar -Ddcache.home=/usr/share/dcache -Ddcache.paths.defaults=/usr/share/dcache/defaults org.dcache.boot.BootLoader start w-cmsstor419-disk-disk1Domain
02 Nov 2015 16:17:23 (System) [] Created : w-cmsstor419-disk-disk1Domain
02 Nov 2015 16:17:26 (w-cmsstor419-disk-disk1) [] Queue not created, name already exists: p2p
02 Nov 2015 16:17:26 (w-cmsstor419-disk-disk1) [] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server): Initializing
02 Nov 2015 16:17:26 (w-cmsstor419-disk-disk1) [] Reading inventory from [data=/storage/data1/write-pool/data;meta=/storage/data1/write-pool/meta]
02 Nov 2015 16:17:26 (w-cmsstor419-disk-disk1) [] Pool mode changed to disabled(store,stage,p2p-client): Initializing
02 Nov 2015 16:17:26 (w-cmsstor419-disk-disk1) [] Pool mode changed to enabled
02 Nov 2015 16:18:26 (w-cmsstor419-disk-disk1) [] The file system containing the data files appears to have less free space (77,219,547,144,192 bytes) than expected (77,219,547,148,288 bytes); reducing the pool size to 77,219,547,144,192 bytes to compensate. Notice that this does not leave any space for the meta data. If such data is stored on the same file system, then it is paramount that the pool size is reconfigured to leave enough space for the meta data.

> /var/log/dcache/w-cmsstor419-disk-disk2Domain.log <

2015-11-02 16:17:21 Launching /usr/bin/java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.net.inetaddr.ttl=1800 -Dorg.globus.tcp.port.range=15000,35000 -Djava.net.preferIPv4Stack=true -Dorg.dcache.dcap.port=0 -Dorg.dcache.net.tcp.portrange=15000:35000 -Dorg.globus.jglobus.delegation.cache.lifetime=30000 -Dorg.globus.jglobus.crl.cache.lifetime=60000 -Djava.security.krb5.realm=EXAMPLE.ORG -Djava.security.krb5.kdc=localhost -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/etc/dcache/jgss.conf -Djava.awt.headless=true -DwantLog4jSetup=n -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dcache/w-cmsstor419-disk-disk2Domain-oom.hprof -javaagent:/usr/share/dcache/classes/spring-instrument-3.1.4.RELEASE.jar -Ddcache.home=/usr/share/dcache -Ddcache.paths.defaults=/usr/share/dcache/defaults org.dcache.boot.BootLoader start w-cmsstor419-disk-disk2Domain
02 Nov 2015 16:17:23 (System) [] Created : w-cmsstor419-disk-disk2Domain
02 Nov 2015 16:17:26 (w-cmsstor419-disk-disk2) [] Queue not created, name already exists: p2p
02 Nov 2015 16:17:26 (w-cmsstor419-disk-disk2) [] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server): Initializing
02 Nov 2015 16:17:26 (w-cmsstor419-disk-disk2) [] Reading inventory from [data=/storage/data2/write-pool/data;meta=/storage/data2/write-pool/meta]
02 Nov 2015 16:17:26 (w-cmsstor419-disk-disk2) [] Pool mode changed to disabled(store,stage,p2p-client): Initializing
02 Nov 2015 16:17:26 (w-cmsstor419-disk-disk2) [] Pool mode changed to enabled
02 Nov 2015 16:18:26 (w-cmsstor419-disk-disk2) [] The file system containing the data files appears to have less free space (77,219,547,144,192 bytes) than expected (77,219,547,148,288 bytes); reducing the pool size to 77,219,547,144,192 bytes to compensate. Notice that this does not leave any space for the meta data. If such data is stored on the same file system, then it is paramount that the pool size is reconfigured to leave enough space for the meta data.
cmsstor420:
> /var/log/dcache/w-cmsstor420-disk-disk1Domain.log <

2015-11-02 16:17:51 Launching /usr/bin/java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.net.inetaddr.ttl=1800 -Dorg.globus.tcp.port.range=15000,35000 -Djava.net.preferIPv4Stack=true -Dorg.dcache.dcap.port=0 -Dorg.dcache.net.tcp.portrange=15000:35000 -Dorg.globus.jglobus.delegation.cache.lifetime=30000 -Dorg.globus.jglobus.crl.cache.lifetime=60000 -Djava.security.krb5.realm=EXAMPLE.ORG -Djava.security.krb5.kdc=localhost -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/etc/dcache/jgss.conf -Djava.awt.headless=true -DwantLog4jSetup=n -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dcache/w-cmsstor420-disk-disk1Domain-oom.hprof -javaagent:/usr/share/dcache/classes/spring-instrument-3.1.4.RELEASE.jar -Ddcache.home=/usr/share/dcache -Ddcache.paths.defaults=/usr/share/dcache/defaults org.dcache.boot.BootLoader start w-cmsstor420-disk-disk1Domain
02 Nov 2015 16:17:53 (System) [] Created : w-cmsstor420-disk-disk1Domain
02 Nov 2015 16:17:56 (w-cmsstor420-disk-disk1) [] Queue not created, name already exists: p2p
02 Nov 2015 16:17:56 (w-cmsstor420-disk-disk1) [] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server): Initializing
02 Nov 2015 16:17:56 (w-cmsstor420-disk-disk1) [] Reading inventory from [data=/storage/data1/write-pool/data;meta=/storage/data1/write-pool/meta]
02 Nov 2015 16:17:56 (w-cmsstor420-disk-disk1) [] Pool mode changed to disabled(store,stage,p2p-client): Initializing
02 Nov 2015 16:17:56 (w-cmsstor420-disk-disk1) [] Pool mode changed to enabled

> /var/log/dcache/w-cmsstor420-disk-disk2Domain.log <

2015-11-02 16:17:51 Launching /usr/bin/java -server -Xmx5096m -XX:MaxDirectMemorySize=5096m -Dsun.net.inetaddr.ttl=1800 -Dorg.globus.tcp.port.range=15000,35000 -Djava.net.preferIPv4Stack=true -Dorg.dcache.dcap.port=0 -Dorg.dcache.net.tcp.portrange=15000:35000 -Dorg.globus.jglobus.delegation.cache.lifetime=30000 -Dorg.globus.jglobus.crl.cache.lifetime=60000 -Djava.security.krb5.realm=EXAMPLE.ORG -Djava.security.krb5.kdc=localhost -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/etc/dcache/jgss.conf -Djava.awt.headless=true -DwantLog4jSetup=n -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dcache/w-cmsstor420-disk-disk2Domain-oom.hprof -javaagent:/usr/share/dcache/classes/spring-instrument-3.1.4.RELEASE.jar -Ddcache.home=/usr/share/dcache -Ddcache.paths.defaults=/usr/share/dcache/defaults org.dcache.boot.BootLoader start w-cmsstor420-disk-disk2Domain
02 Nov 2015 16:17:53 (System) [] Created : w-cmsstor420-disk-disk2Domain
02 Nov 2015 16:17:56 (w-cmsstor420-disk-disk2) [] Queue not created, name already exists: p2p
02 Nov 2015 16:17:56 (w-cmsstor420-disk-disk2) [] Pool mode changed to disabled(fetch,store,stage,p2p-client,p2p-server): Initializing
02 Nov 2015 16:17:56 (w-cmsstor420-disk-disk2) [] Reading inventory from [data=/storage/data2/write-pool/data;meta=/storage/data2/write-pool/meta]
02 Nov 2015 16:17:56 (w-cmsstor420-disk-disk2) [] Pool mode changed to disabled(store,stage,p2p-client): Initializing
02 Nov 2015 16:17:56 (w-cmsstor420-disk-disk2) [] Pool mode changed to enabled
sh-4.1$
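
The free-space warnings logged for the cmsstor419 pools quote two byte counts that are hard to compare by eye. A minimal sketch of the arithmetic, with the two values copied from the log lines above (commas removed); the variable names are illustrative only:

```shell
# Byte counts taken from the cmsstor419 warnings, commas stripped.
expected=77219547148288   # pool size dCache expected
actual=77219547144192     # free space the file system reported

# The shortfall is the amount by which dCache reduced the pool size.
shortfall=$((expected - actual))
echo "shortfall: $shortfall bytes"
```

The difference works out to 4096 bytes, so the adjustment itself is tiny; the part of the warning that still deserves attention is the note about leaving room for the meta data.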

Checking dCache:

sh-4.1$ for i in `cat newpools`; do echo $i:; ssh -l root $i dcache status; done
cmsstor411:
DOMAIN STATUS PID USER
w-cmsstor411-disk-disk1Domain running 10180 root
w-cmsstor411-disk-disk2Domain running 10240 root
gridftp-cmsstor411Domain running 10300 root
cmsstor412:
DOMAIN STATUS PID USER
w-cmsstor412-disk-disk1Domain running 10177 root
w-cmsstor412-disk-disk2Domain running 10237 root
gridftp-cmsstor412Domain running 10301 root
cmsstor413:
DOMAIN STATUS PID USER
w-cmsstor413-disk-disk1Domain running 10203 root
w-cmsstor413-disk-disk2Domain running 10261 root
gridftp-cmsstor413Domain running 10324 root
cmsstor414:
DOMAIN STATUS PID USER
w-cmsstor414-disk-disk1Domain running 10202 root
w-cmsstor414-disk-disk2Domain running 10262 root
gridftp-cmsstor414Domain running 10326 root
cmsstor415:
DOMAIN STATUS PID USER
w-cmsstor415-disk-disk1Domain running 10198 root
w-cmsstor415-disk-disk2Domain running 10258 root
gridftp-cmsstor415Domain running 10319 root
cmsstor416:
DOMAIN STATUS PID USER
w-cmsstor416-disk-disk1Domain running 10181 root
w-cmsstor416-disk-disk2Domain running 10241 root
gridftp-cmsstor416Domain running 10301 root
cmsstor417:
DOMAIN STATUS PID USER
w-cmsstor417-disk-disk1Domain running 10201 root
w-cmsstor417-disk-disk2Domain running 10260 root
gridftp-cmsstor417Domain running 10321 root
cmsstor418:
DOMAIN STATUS PID USER
w-cmsstor418-disk-disk1Domain running 10202 root
w-cmsstor418-disk-disk2Domain running 10262 root
gridftp-cmsstor418Domain running 10322 root
cmsstor419:
DOMAIN STATUS PID USER
w-cmsstor419-disk-disk1Domain running 10180 root
w-cmsstor419-disk-disk2Domain running 10239 root
gridftp-cmsstor419Domain running 10304 root
cmsstor420:
DOMAIN STATUS PID USER
w-cmsstor420-disk-disk1Domain running 10175 root
w-cmsstor420-disk-disk2Domain running 10233 root
gridftp-cmsstor420Domain running 10295 root
sh-4.1$
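
With ten hosts and three domains each, the status listing is easy to misread. A small sketch of a filter that prints only the domains whose STATUS column is not "running"; the function name `not_running` and the "stopped" sample line are assumptions made for illustration, not part of the procedure that was followed:

```shell
# Read `dcache status` output on stdin and print every domain whose
# STATUS column is anything other than "running".
# Header lines ("DOMAIN STATUS PID USER") and host labels ("cmsstor411:")
# are skipped because their first field does not end in "Domain".
not_running() {
    awk '$1 ~ /Domain$/ && $2 != "running" { print $1, $2 }'
}

# Demonstration on sample text; the "stopped" line is a made-up failure case.
printf '%s\n' \
    'cmsstor411:' \
    'DOMAIN STATUS PID USER' \
    'w-cmsstor411-disk-disk1Domain running 10180 root' \
    'w-cmsstor411-disk-disk2Domain stopped' \
    | not_running
```

Against the real nodes, the same filter could be appended to the loop above, for example `ssh -l root $i dcache status | not_running`; no output would then mean every domain is running.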

#15 Updated by Chih-Hao Huang about 4 years ago

sh-3.2$ git branch
* huangch_add_cmsstor411_to_cmsstor420_to_production
  huangch_cmsstor263_issue
  itb
  master
  production
sh-3.2$ git diff itb
diff --git a/modules/dcache/files/etc/dcache/poolmanager-disk.conf b/modules/dcache/files/etc/dcache/poolmanager-disk.conf
index 2eb2120..1f49bd4 100644
--- a/modules/dcache/files/etc/dcache/poolmanager-disk.conf
+++ b/modules/dcache/files/etc/dcache/poolmanager-disk.conf
@@ -604,6 +604,26 @@ psu create pool w-cmsstor409-disk-disk1
 psu create pool w-cmsstor409-disk-disk2
 psu create pool w-cmsstor410-disk-disk1
 psu create pool w-cmsstor410-disk-disk2
+psu create pool w-cmsstor411-disk-disk1
+psu create pool w-cmsstor411-disk-disk2
+psu create pool w-cmsstor412-disk-disk1
+psu create pool w-cmsstor412-disk-disk2
+psu create pool w-cmsstor413-disk-disk1
+psu create pool w-cmsstor413-disk-disk2
+psu create pool w-cmsstor414-disk-disk1
+psu create pool w-cmsstor414-disk-disk2
+psu create pool w-cmsstor415-disk-disk1
+psu create pool w-cmsstor415-disk-disk2
+psu create pool w-cmsstor416-disk-disk1
+psu create pool w-cmsstor416-disk-disk2
+psu create pool w-cmsstor417-disk-disk1
+psu create pool w-cmsstor417-disk-disk2
+psu create pool w-cmsstor418-disk-disk1
+psu create pool w-cmsstor418-disk-disk2
+psu create pool w-cmsstor419-disk-disk1
+psu create pool w-cmsstor419-disk-disk2
+psu create pool w-cmsstor420-disk-disk1
+psu create pool w-cmsstor420-disk-disk2
 #
 # The pool groups ...
@@ -1180,6 +1200,26 @@ psu addto pgroup flushPools w-cmsstor409-disk-disk1
 psu addto pgroup flushPools w-cmsstor409-disk-disk2
 psu addto pgroup flushPools w-cmsstor410-disk-disk1
 psu addto pgroup flushPools w-cmsstor410-disk-disk2
+psu addto pgroup flushPools w-cmsstor411-disk-disk1
+psu addto pgroup flushPools w-cmsstor411-disk-disk2
+psu addto pgroup flushPools w-cmsstor412-disk-disk1
+psu addto pgroup flushPools w-cmsstor412-disk-disk2
+psu addto pgroup flushPools w-cmsstor413-disk-disk1
+psu addto pgroup flushPools w-cmsstor413-disk-disk2
+psu addto pgroup flushPools w-cmsstor414-disk-disk1
+psu addto pgroup flushPools w-cmsstor414-disk-disk2
+psu addto pgroup flushPools w-cmsstor415-disk-disk1
+psu addto pgroup flushPools w-cmsstor415-disk-disk2
+psu addto pgroup flushPools w-cmsstor416-disk-disk1
+psu addto pgroup flushPools w-cmsstor416-disk-disk2
+psu addto pgroup flushPools w-cmsstor417-disk-disk1
+psu addto pgroup flushPools w-cmsstor417-disk-disk2
+psu addto pgroup flushPools w-cmsstor418-disk-disk1
+psu addto pgroup flushPools w-cmsstor418-disk-disk2
+psu addto pgroup flushPools w-cmsstor419-disk-disk1
+psu addto pgroup flushPools w-cmsstor419-disk-disk2
+psu addto pgroup flushPools w-cmsstor420-disk-disk1
+psu addto pgroup flushPools w-cmsstor420-disk-disk2
 #
 # Sections
sh-3.2$
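
The 40 added lines follow a regular pattern: two pools per host, once for `psu create pool` and once for `psu addto pgroup flushPools`. How they were actually produced is not recorded here. As one possible way to avoid typing them by hand, a sketch with a hypothetical helper `gen_pool_lines`:

```shell
# Emit one line per pool for hosts cmsstor411..cmsstor420, two pools each.
# $1 is the poolmanager command prefix, e.g. "psu create pool".
# (gen_pool_lines is an illustrative helper, not an existing tool.)
gen_pool_lines() {
    prefix=$1
    n=411
    while [ "$n" -le 420 ]; do
        for d in 1 2; do
            printf '%s w-cmsstor%s-disk-disk%s\n' "$prefix" "$n" "$d"
        done
        n=$((n + 1))
    done
}

# The two groups belong in different sections of poolmanager-disk.conf,
# so they are generated separately and pasted where each section ends.
gen_pool_lines "psu create pool"
gen_pool_lines "psu addto pgroup flushPools"
```

Each call prints 20 lines, matching the two 20-line hunks in the diff above.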

#16 Updated by Chih-Hao Huang about 4 years ago

sh-3.2$ push_to_itb -b huangch_add_cmsstor411_to_cmsstor420_to_production -s Gerard
Branch to merge: huangch_add_cmsstor411_to_cmsstor420_to_production
Signed off by: Gerard
Merging to branch: itb
Switched to branch 'itb'
Your branch is up-to-date with 'origin/itb'.
Updating 4b623bc..de53f54
Fast-forward
Squash commit -- not updating HEAD
 modules/dcache/files/etc/dcache/poolmanager-disk.conf | 40 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)
[itb 74f9561] Add new pools to configuration file
 1 file changed, 40 insertions(+)
Counting objects: 1, done.
Writing objects: 100% (1/1), 222 bytes | 0 bytes/s, done.
Total 1 (delta 0), reused 0 (delta 0)
remote:
remote: diff-tree:
remote: :100644 100644 2eb2120c9b90d7d91cff344e69ea32d97c349db4 1f49bd4725fcdae18052c644df4b2e04fbea2a28 M modules/dcache/files/etc/dcache/poolmanager-disk.conf
remote: Recieved from stdin:
remote: oldrev: 4b623bc19b449c6696451e334f8548aa1ce30086
remote: newrev: 74f9561aa58c9f384f87223cee9013ff6d768a4c
remote: refname: refs/heads/itb
remote: Derived Configuration:
remote: REPO: puppet@cms-git:/var/lib/puppet/puppet.git
remote: BRANCH: itb
remote: BRANCH_DIR: /srv/puppet/environments
remote: PUPPET_SERVERS:
remote: Updating remote branch /srv/puppet/environments/itb on
remote: attempting to pull branch itb to /srv/puppet/environments/itb
remote: From cms-git:/var/lib/puppet/puppet
remote: * branch itb -> FETCH_HEAD
remote: Updating 4b623bc..74f9561
remote: Fast-forward
remote: .../dcache/files/etc/dcache/poolmanager-disk.conf | 40 ++++++++++++++++++
remote: 1 files changed, 40 insertions(+), 0 deletions(-)
remote: Updating remote branch /srv/puppet/environments/itb on
remote: attempting to pull branch itb to /srv/puppet/environments/itb
remote: From cms-git:/var/lib/puppet/puppet
remote: * branch itb -> FETCH_HEAD
remote: Updating 4b623bc..74f9561
remote: Fast-forward
remote: .../dcache/files/etc/dcache/poolmanager-disk.conf | 40 +++++++++++++++++++
remote: 1 files changed, 40 insertions(+), 0 deletions(-)
remote: Updating remote branch /srv/puppet/environments/itb on
remote: attempting to pull branch itb to /srv/puppet/environments/itb
remote: From cms-git:/var/lib/puppet/puppet
remote: * branch itb -> FETCH_HEAD
remote: Updating 4b623bc..74f9561
remote: Fast-forward
remote: .../dcache/files/etc/dcache/poolmanager-disk.conf | 40 +++++++++++++++++++
remote: 1 files changed, 40 insertions(+), 0 deletions(-)
To :/var/lib/puppet/puppet.git
4b623bc..74f9561 itb -> itb
Switched to branch 'huangch_add_cmsstor411_to_cmsstor420_to_production'
Your branch is up-to-date with 'origin/huangch_add_cmsstor411_to_cmsstor420_to_production'.
sh-3.2$

#17 Updated by Chih-Hao Huang about 4 years ago

  • Status changed from Assigned to Resolved
  • % Done changed from 90 to 100

cmsstor411-cmsstor420 have been in production for three weeks.


