
Task #8928

Enhance procedure for adding new pools

Added by Natalia Ratnikova over 4 years ago. Updated over 4 years ago.

Status:
Resolved
Priority:
Normal
Start date:
05/27/2015
Due date:
06/11/2015
% Done:

100%

Estimated time:
10.00 h
Spent time:
Duration: 16

Description

Review current steps for adding new pools and include extra step for testing the hardware.

Gerards_nodes_testing_sketch.jpg (61 KB) - Natalia Ratnikova, 05/26/2015 10:15 AM

Related issues

Precedes CMS dCache - Task #8955: Add cmsstor409 and cmsstor410 in dCache Disk Production following new enhanced procedure (Resolved, 06/12/2015 - 06/17/2015)

Precedes CMS dCache - Bug #8957: Follow the enhanced procedure for adding new pools on cmsstor112 and add it to dCache disk Test (Accepted, 06/12/2015 - 06/16/2015)

Precedes (1 day) CMS dCache - Task #8910: Configure cmsstor108 and cmsstor110 to dCache-tape test stand (Closed, 08/11/2015 - 08/11/2015)

History

#1 Updated by Natalia Ratnikova over 4 years ago

The idea is that from now on the DCSO will add pools to the test dCache instance, perform a "burning" procedure (similar to ECF-CIS?) and then move the pools into production.

Another option would be to use a special group for test pools within the production instance.

Proposal sketch by Gerard is attached.

#2 Updated by Gerard Bernabeu Altayo over 4 years ago

The procedure that we need to enhance is:

https://cmsweb.fnal.gov/bin/view/Storage/DCache22Procedures#Add_new_dCache_pool_procedure

Actually we should create a proper workflow that does:

1. Apply https://cmsweb.fnal.gov/bin/view/Storage/DCache22Procedures#Add_new_dCache_pool_procedure to the TEST instance (of disk and/or tape), on a 'burnin poolgroup'

2. Start 'migration copy' processes so that all the pools receive and send data around; copying a few TB (~5TB) should be enough. All migrations should succeed with no errors for the pools to pass the acceptance test (see the admin-CLI sketch after this list).
- Migrations should look like:
-- On all the 'permanent pools' we run: migration copy 'permanent pool with persistent data'->pgroup_burnin
-- On all the pools in pgroup_burnin we run: migration copy -permanent to pgroup_burnin. This distributes files within pgroup_burnin itself, causing data reads and writes on the new pools.

3. Stop all migration processes (in dCache test instance). Remove all data from the pools (there is a pool purge command that will do this, on the dCache CLI after you login to the pool cell).

4. Remove the pools from dCache test instance (PoolManager)

5. https://cmsweb.fnal.gov/bin/view/Storage/DCache22Procedures#Add_new_dCache_pool_procedure to the PRODUCTION instance (of disk and/or tape), on the right poolgroup
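
For illustration, a minimal admin-CLI sketch of the step 2 migrations (pool and pgroup names here are placeholders; the exact commands used during the test are recorded in the notes below):

[admin] (local) admin > cd <permanent-pool>
[admin] (<permanent-pool>) admin > migration copy -target=pgroup pgroup_burnin
[admin] (<permanent-pool>) admin > ..
[admin] (local) admin > cd <burnin-pool>
[admin] (<burnin-pool>) admin > migration copy -permanent -target=pgroup pgroup_burnin
[admin] (<burnin-pool>) admin > migration ls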

#3 Updated by Gerard Bernabeu Altayo over 4 years ago

cmsstor409/410 already had working XFS filesystems; it is not clear whether this is because ECF now creates the partitions. Emailed ECF-CIS to ask.

Our old procedure relies on DCSO creating the partitions, so to wipe the existing partitions I did:
cmsstor410.fnal.gov - dcachepooldisk/production (SLF 6.6)
16-core Opteron 6320 (H8DGU); 62.90 GB RAM, 20.00 GB swap
[root@cmsstor410 ~]# service puppet stop
Stopping puppet agent: [ OK ]
[root@cmsstor410 ~]# service dcache-server stop
Stopping gridftp-cmsstor410Domain 0 done
Stopping w-cmsstor410-disk-disk2Domain 0 1 done
Stopping w-cmsstor410-disk-disk1Domain 0 1 done
[root@cmsstor410 ~]# umount /dev/sdc
[root@cmsstor410 ~]# umount /dev/sdb
[root@cmsstor410 ~]# dd if=/dev/zero of=/dev/sdc bs=10240 count=10000
10000+0 records in
10000+0 records out
102400000 bytes (102 MB) copied, 0.615781 s, 166 MB/s
[root@cmsstor410 ~]# dd if=/dev/zero of=/dev/sdb bs=10240 count=10000
10000+0 records in
10000+0 records out
102400000 bytes (102 MB) copied, 0.646658 s, 158 MB/s
[root@cmsstor410 ~]# mount -a
mount: special device LABEL=dcache-disk1 does not exist
mount: special device LABEL=dcache-disk2 does not exist
[root@cmsstor410 ~]#
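
The same wipe written as a loop, for reuse on similar nodes (a sketch; the device list is per-node, here assuming /dev/sdb and /dev/sdc):

service puppet stop
service dcache-server stop
for d in /dev/sdb /dev/sdc; do
    umount $d
    # overwrite the start of the device so the XFS superblock and label are gone
    dd if=/dev/zero of=$d bs=10240 count=10000
done
mount -a    # now fails: the dcache-disk* labels no longer exist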

#4 Updated by Gerard Bernabeu Altayo over 4 years ago

  • Precedes Task #8955: Add cmsstor409 and cmsstor410 in dCache Disk Production following new enhanced procedure added

#5 Updated by Gerard Bernabeu Altayo over 4 years ago

  • Due date set to 06/02/2015
  • Start date changed from 05/26/2015 to 05/27/2015
  • Estimated time set to 10.00 h

#6 Updated by Gerard Bernabeu Altayo over 4 years ago

  • Precedes Bug #8957: Follow the enhanced procedure for adding new pools on cmsstor112 and add it to dCache disk Test added

#7 Updated by Gerard Bernabeu Altayo over 4 years ago

  • Precedes Task #8910: Configure cmsstor108 and cmsstor110 to dCache-tape test stand added

#8 Updated by Natalia Ratnikova over 4 years ago

  • Created puppet branch natalia_burning_dcache_pools
    - added new group for burning pools
    - merged in Gerard's general-type changes from 4702b16f, which remove the Resilient-pool-related configuration and add the readonlyPools group.
    - added cmsstor151 pools to readonlyPools group, as we already have cmsstor155 pools in the flush group.

    Refer to dCache docs for groups/links setup instructions.
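
    For reference, a minimal sketch of the poolmanager.conf fragment behind such a pgroup/link pair (psu syntax per the dCache 2.2 Book; the unit group and preference values here are illustrative - the real preferences are visible in the psu ls pgroup -l output below):

    psu create pgroup burningPools
    psu create link burning-link world-net
    psu set link burning-link -readpref=10 -writepref=10 -cachepref=10 -p2ppref=-1
    psu add link burning-link burningPools
    psu create pool w-cmsstor409-disk_itb-disk1
    psu addto pgroup burningPools w-cmsstor409-disk_itb-disk1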

#9 Updated by Natalia Ratnikova over 4 years ago

  • Tested the block of instructions for creating XFS filesystems / labeling data disks on cmsstor151, which already has dCache installed

[root@cmsstor151 ~]# service puppet stop
Stopping puppet agent: [ OK ]
[root@cmsstor151 ~]# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda3 222630532 13899504 197415368 7% /
tmpfs 8164516 0 8164516 0% /dev/shm
/dev/sda1 999320 60404 886488 7% /boot
/dev/sdd 11718276608 33188 11718243420 1% /storage/data3
/dev/sdb 11718276608 33188 11718243420 1% /storage/data1
/dev/sdc 11718276608 33188 11718243420 1% /storage/data2
[root@cmsstor151 ~]# xfs_admin -l /dev/sdd
label = "dcache-disk3"
[root@cmsstor151 ~]# xfs_admin -l /dev/sdb
label = "dcache-disk1"
[root@cmsstor151 ~]# xfs_admin -l /dev/sdc
label = "dcache-disk2"
[root@cmsstor151 ~]# dcache status
DOMAIN STATUS PID USER
w-cmsstor151-disk_itb-disk1Domain running 24813 root
w-cmsstor151-disk_itb-disk2Domain running 24868 root
w-cmsstor151-disk_itb-disk3Domain running 24928 root
gridftp-cmsstor151Domain running 24987 root
[root@cmsstor151 ~]# dcache stop
Stopping gridftp-cmsstor151Domain 0 done
Stopping w-cmsstor151-disk_itb-disk3Domain 0 done
Stopping w-cmsstor151-disk_itb-disk2Domain 0 done
Stopping w-cmsstor151-disk_itb-disk1Domain 0 done
[root@cmsstor151 ~]# mkfs.xfs /dev/sdb;
mkfs.xfs: /dev/sdb contains a mounted filesystem
Usage: mkfs.xfs
/* blocksize */         [-b log=n|size=num]
/* data subvol */       [-d agcount=n,agsize=n,file,name=xxx,size=num,
                            (sunit=value,swidth=value|su=num,sw=num),
                            sectlog=n|sectsize=num
/* inode size */        [-i log=n|perblock=n|size=num,maxpct=n,attr=0|1|2,
                            projid32bit=0|1]
/* log subvol */        [-l agnum=n,internal,size=num,logdev=xxx,version=n
                            sunit=value|su=num,sectlog=n|sectsize=num,
                            lazy-count=0|1]
/* label */             [-L label (maximum 12 characters)]
/* naming */            [-n log=n|size=num,version=2|ci]
/* prototype file */    [-p fname]
/* quiet */             [-q]
/* realtime subvol */   [-r extsize=num,size=num,rtdev=xxx]
/* sectorsize */        [-s log=n|size=num]
/* version */           [-V]
                        devicename
<devicename> is required unless -d name=xxx is given.
<num> is xxx (bytes), xxxs (sectors), xxxb (fs blocks), xxxk (xxx KiB),
xxxm (xxx MiB), xxxg (xxx GiB), xxxt (xxx TiB) or xxxp (xxx PiB).
<value> is xxx (512 byte blocks).
[root@cmsstor151 ~]# xfs_admin -L dcache-disk1 /dev/sdb;
xfs_admin: /dev/sdb contains a mounted filesystem

fatal error -- couldn't initialize XFS library
[root@cmsstor151 ~]# xfs_admin -l /dev/sdc
label = "dcache-disk2"
[root@cmsstor151 ~]# xfs_admin -l /dev/sdb
label = "dcache-disk1"
[root@cmsstor151 ~]# puppet agent -t
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts in /var/lib/puppet/lib/facter/facter_dot_d.rb
Info: Loading facts in /var/lib/puppet/lib/facter/code_server.rb
Info: Loading facts in /var/lib/puppet/lib/facter/cvmfsversion.rb
Info: Loading facts in /var/lib/puppet/lib/facter/postgres_default_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/cvmfspartsize.rb
Info: Loading facts in /var/lib/puppet/lib/facter/certificate_facts.rb
Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
Info: Loading facts in /var/lib/puppet/lib/facter/concat_basedir.rb
Info: Loading facts in /var/lib/puppet/lib/facter/ip6tables_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
Info: Loading facts in /var/lib/puppet/lib/facter/os_maj_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/iptables_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/rsyslog_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/iptables_persistent_version.rb
Info: Caching catalog for cmsstor151.fnal.gov
Info: Applying configuration version '1432862178'
Notice: /Stage[main]/Dcache/Service[dcache-server]/ensure: ensure changed 'stopped' to 'running'
Info: /Stage[main]/Dcache/Service[dcache-server]: Unscheduling refresh on Service[dcache-server]
Notice: Finished catalog run in 22.84 seconds

#10 Updated by Natalia Ratnikova over 4 years ago

  • Stop dCache and break the file systems on the disks to simulate an uninitialized state; see Gerard's hints above.

[root@cmsstor151 ~]# dcache stop
Stopping gridftp-cmsstor151Domain 0 done
Stopping w-cmsstor151-disk_itb-disk3Domain 0 1 done
Stopping w-cmsstor151-disk_itb-disk2Domain 0 1 done
Stopping w-cmsstor151-disk_itb-disk1Domain 0 1 done
[root@cmsstor151 ~]# umount /dev/sdb
[root@cmsstor151 ~]# umount /dev/sdc
[root@cmsstor151 ~]# umount /dev/sdd
[root@cmsstor151 ~]# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda3 222630532 13899712 197415160 7% /
tmpfs 8164516 0 8164516 0% /dev/shm
/dev/sda1 999320 60404 886488 7% /boot
[root@cmsstor151 ~]# dd if=/dev/zero of=/dev/sdb bs=10240 count=10000
10000+0 records in
10000+0 records out
102400000 bytes (102 MB) copied, 2.38226 s, 43.0 MB/s
[root@cmsstor151 ~]# dd if=/dev/zero of=/dev/sdc bs=10240 count=10000
10000+0 records in
10000+0 records out
102400000 bytes (102 MB) copied, 2.27509 s, 45.0 MB/s
[root@cmsstor151 ~]# dd if=/dev/zero of=/dev/sdd bs=10240 count=10000
10000+0 records in
10000+0 records out
102400000 bytes (102 MB) copied, 2.26636 s, 45.2 MB/s
[root@cmsstor151 ~]# mount -a
mount: special device LABEL=dcache-disk3 does not exist
mount: special device LABEL=dcache-disk1 does not exist
mount: special device LABEL=dcache-disk2 does not exist

#11 Updated by Natalia Ratnikova over 4 years ago

Follow steps at:
https://cmsweb.fnal.gov/bin/view/Storage/DCache22Procedures#Add_new_dCache_pool_procedure

Push ENC change: e3c1987f796dac2a242352f7876c203e3197004d.

On the node:

[root@cmsstor151 ~]# modprobe qla2xxx;
[root@cmsstor151 ~]# yum install -y xfsprogs;
Loaded plugins: priorities, security
Setting up Install Process
7749 packages excluded due to repository priority protections
Package xfsprogs-3.1.1-16.el6.x86_64 already installed and latest version
Nothing to do
[root@cmsstor151 ~]# mkfs.xfs /dev/sdb;
meta-data=/dev/sdb isize=256 agcount=32, agsize=91565344 blks = sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=2930090880, imaxpct=5 = sunit=32 swidth=98304 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=521728, version=2 = sectsz=512 sunit=32 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
[root@cmsstor151 ~]# xfs_admin -L dcache-disk1 /dev/sdb;
writing all SBs
new label = "dcache-disk1"
[root@cmsstor151 ~]# mkfs.xfs /dev/sdc;
meta-data=/dev/sdc isize=256 agcount=32, agsize=91565344 blks = sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=2930090880, imaxpct=5 = sunit=32 swidth=98304 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=521728, version=2 = sectsz=512 sunit=32 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
[root@cmsstor151 ~]# xfs_admin -L dcache-disk2 /dev/sdc;
writing all SBs
new label = "dcache-disk2"
[root@cmsstor151 ~]# mkfs.xfs /dev/sdd;
meta-data=/dev/sdd isize=256 agcount=32, agsize=91565344 blks = sectsz=512 attr=2, projid32bit=0
data = bsize=4096 blocks=2930090880, imaxpct=5 = sunit=32 swidth=98304 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=521728, version=2 = sectsz=512 sunit=32 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
[root@cmsstor151 ~]# xfs_admin -L dcache-disk3 /dev/sdd;
writing all SBs
new label = "dcache-disk3"
[root@cmsstor151 ~]#

Run puppet twice: first run mounts the pools and starts dcache; second run is clean.

Move cmsstor151 from "unconfigured" into "dCache test" group in Zabbix.

On the admin node check old configuration:

[cmsstor152.fnal.gov] (PoolManager) admin > psu ls pgroup
flushPools
[cmsstor152.fnal.gov] (PoolManager) admin > psu ls link
flush-link
Resilient-link
[cmsstor152.fnal.gov] (PoolManager) admin > psu ls pool
w-cmsstor155-disk_itb-disk1
w-cmsstor151-disk_itb-disk2
w-cmsstor151-disk_itb-disk3
w-cmsstor155-disk_itb-disk2
w-cmsstor151-disk_itb-disk1
w-cmsstor155-disk_itb-disk3

In ENC: move admin node cmsstor152 to the natalia_burning_dcache_pools branch.

Stop puppet on the admin node.

Push ENC changes.

Run puppet with --noop - everything looks fine, so run puppet again to load the modifications to poolmanager.conf.
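
For the record, the two runs (sketch):

puppet agent -t --noop    # dry run: review the pending poolmanager.conf change
puppet agent -t           # apply it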

Check the effect from the admin interface:

[cmsstor152.fnal.gov] (PoolManager) admin > psu ls pgroup
flushPools
burningPools
readonlyPools
[cmsstor152.fnal.gov] (PoolManager) admin > psu ls link
flush-link
burning-link
readonly-link
[cmsstor152.fnal.gov] (PoolManager) admin > psu ls pool
w-cmsstor155-disk_itb-disk1
w-cmsstor151-disk_itb-disk2
w-cmsstor151-disk_itb-disk3
w-cmsstor155-disk_itb-disk2
w-cmsstor155-disk_itb-disk3
w-cmsstor151-disk_itb-disk1
[cmsstor152.fnal.gov] (PoolManager) admin > psu ls pool -l w-cmsstor151-disk_itb-disk2
w-cmsstor151-disk_itb-disk2 (enabled=true;active=23;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
linkList :
[cmsstor152.fnal.gov] (PoolManager) admin > psu ls pool -l w-cmsstor155-disk_itb-disk2
w-cmsstor155-disk_itb-disk2 (enabled=true;active=29;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
linkList :
[cmsstor152.fnal.gov] (PoolManager) admin > psu ls pgroup -l readonlyPools
readonlyPools
linkList :
readonly-link (pref=5/0/0/0;flush-section;ugroups=3;pools=1)
poolList :
[cmsstor152.fnal.gov] (PoolManager) admin > psu ls pgroup -l flushPools
flushPools
linkList :
flush-link (pref=10/0/10/10;flush-section;ugroups=3;pools=1)
poolList :
w-cmsstor155-disk_itb-disk1 (enabled=true;active=3;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
w-cmsstor151-disk_itb-disk2 (enabled=true;active=12;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
w-cmsstor151-disk_itb-disk3 (enabled=true;active=12;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
w-cmsstor155-disk_itb-disk2 (enabled=true;active=10;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
w-cmsstor155-disk_itb-disk3 (enabled=true;active=25;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
w-cmsstor151-disk_itb-disk1 (enabled=true;active=12;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)

#12 Updated by Natalia Ratnikova over 4 years ago

Added cmsstor410 and cmsstor409 to the disk_itb instance burning group via puppet commit 0bffd062c5d024bc52d550329ef7da2ee23794e4.

Check that it takes effect:

[cmsstor152.fnal.gov] (PoolManager) admin > psu ls pgroup -l burningPools
burningPools
linkList :
burning-link (pref=10/0/10/10;flush-section;ugroups=3;pools=1)
poolList :
w-cmsstor410-disk_itb-disk1 (enabled=true;active=6;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
w-cmsstor409-disk_itb-disk2 (enabled=true;active=10;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
w-cmsstor410-disk_itb-disk2 (enabled=true;active=6;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
w-cmsstor409-disk_itb-disk1 (enabled=true;active=10;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)

Leave these two pools in the dcache-pool group in Zabbix for now, since they will eventually end up there anyway.

Updated twiki instructions: added a "start puppet" step on the pool once the work is complete.

Fixed formatting broken by Gerard's previous editing.

#13 Updated by Natalia Ratnikova over 4 years ago

  • % Done changed from 0 to 30

#14 Updated by Natalia Ratnikova over 4 years ago

===== Current space usage status =====

- 860 GB of data on the whole instance, according to du on the namespace
- dCache monitoring stats page shows: 879.661 MB
- spacecount run on the dump of the namespace: 922391818304 bytes ≈ 859.04 GiB

-bash-3.2$ spacecount posix --dump ~/T1_US_FNAL_DISK_ITB.1430760780.txt --level 3
upload parameter: strict > 0
upload parameter: /pnfs/dcache/uscms_test > 922391818304
upload parameter: timestamp > 1430760780
upload parameter: node >
upload parameter: /pnfs > 922391818304
upload parameter: /pnfs/dcache > 922391818304
total number of records: 6

=== Mount nfs v3 on cmsstor155 pool ===

This is only for convenience, to check the contents.
Use a non-standard mount point:

mkdir  /dcache_itb
mount -o nfsvers=3,intr,hard,rw,noac,noacl,noatime,nodiratime cmsstor153:/dcache /dcache_itb

==== Create directory for test data ====

The dCache itb instance re-uses the disk instance authentication/authorization:
the cmsphedex role is mapped to cmsprod and is allowed to write; normal users can only read.

[natasha@cmslpc23 natasha]$ voms-proxy-init -voms cms:/cms/Role=cmsphedex
....
[natasha@cmslpc23 ~]$ voms-proxy-info -all | grep phedex
attribute : /cms/Role=cmsphedex/Capability=NULL
[natasha@cmslpc23 ~]$ srmmkdir 'srm://cmsstor153.fnal.gov:8443/srm/managerv2?SFN=/dcache/uscms_test/burning'

==== Create data files ======

[natasha@cmslpc23 natasha]$ pwd
/uscmst1b_scratch/lpc1/3DayLifetime/natasha
[natasha@cmslpc23 natasha]$ dd if=/dev/zero of=1g.burningtest bs=1 count=0 seek=1G
0+0 records in
0+0 records out
0 bytes (0 B) copied, 0.000171006 s, 0.0 kB/s

Same for 2, 3, and 4 GB, e.g.:

[natasha@cmslpc23 natasha]$ dd if=/dev/zero of=4g.burningtest bs=4 count=0 seek=1G
[natasha@cmslpc23 natasha]$ ls -latr
total 10485856
drwxrwxrwt 386 root    root         59392 Jun 1 14:49 ..
-rw-r--r--   1 natasha us_cms  1073741824 Jun 1 14:54 1g.burningtest
-rw-r--r--   1 natasha us_cms  2147483648 Jun 1 14:56 2g.burningtest
-rw-r--r--   1 natasha us_cms  3221225472 Jun 1 14:56 3g.burningtest
-rw-r--r--   1 natasha us_cms  4294967296 Jun 1 14:56 4g.burningtest
drwxr-xr-x   2 natasha us_cms        2048 Jun 1 14:57 .
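
The file creation can also be written as a loop (a sketch; with count=0 and seek, dd only extends the file, so these are sparse):

for n in 1 2 3 4; do dd if=/dev/zero of=${n}g.burningtest bs=1 count=0 seek=${n}G; done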
bash-4.1$ for f in *.burningtest; do srmcp file:///`pwd`/$f srm://cmsstor153.fnal.gov:8443/srm/managerv2?SFN=/dcache/uscms_test/burning/$f; done

Check namespace on cmsstor155:
drwxrwxrwx 110 root    root           512 Jun 1 16:33 ..
-rw-rw-rw-   1 cmsprod us_cms  1073741824 Jun 1 17:11 1g.burningtest
-rw-rw-rw-   1 cmsprod us_cms  2147483648 Jun 1 17:12 2g.burningtest
-rw-rw-rw-   1 cmsprod us_cms  3221225472 Jun 1 17:13 3g.burningtest
-rw-rw-rw-   1 cmsprod us_cms  4294967296 Jun 1 17:14 4g.burningtest

#15 Updated by Natalia Ratnikova over 4 years ago

Start and monitor data migration on the burning pools.

The srmcp-ed data were written to the burning pools as well. To fix this:
set the affected pools read-only and migration-move the data onto the flushPools group.
Get a dump of files with their locations on the pools.

Add four more files of 1, 2, 3, 4 GB size.
Get the dump again.
Tomorrow: clarify Gerard's idea of continuous migration - what options need to be used.
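
A sketch of the read-only + move fix described above, in the admin CLI (the rdonly flag follows man psu set, consulted in note #16 below; migration move is the command echoed in note #27):

[cmsstor152.fnal.gov] (local) admin > cd PoolManager
[cmsstor152.fnal.gov] (PoolManager) admin > psu set pool w-cmsstor409-disk_itb-disk1 rdonly
[cmsstor152.fnal.gov] (PoolManager) admin > ..
[cmsstor152.fnal.gov] (local) admin > cd w-cmsstor409-disk_itb-disk1
[cmsstor152.fnal.gov] (w-cmsstor409-disk_itb-disk1) admin > migration move -target=pgroup flushPools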

#16 Updated by Natalia Ratnikova over 4 years ago

  • Status changed from Assigned to Accepted

[cmsstor152.fnal.gov] (PoolManager) admin > psu ls pool -l
w-cmsstor410-disk_itb-disk1 (enabled=true;active=0;rdOnly=false;links=0;pgroups=1;hsm=[];mode=disabled(store,stage,p2p-client))
linkList :
w-cmsstor409-disk_itb-disk2 (enabled=true;active=17;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
linkList :
w-cmsstor155-disk_itb-disk1 (enabled=true;active=8;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
linkList :
w-cmsstor410-disk_itb-disk2 (enabled=true;active=12;rdOnly=false;links=0;pgroups=1;hsm=[];mode=disabled(store,stage,p2p-client))
linkList :
w-cmsstor409-disk_itb-disk1 (enabled=true;active=17;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
linkList :
w-cmsstor151-disk_itb-disk2 (enabled=true;active=8;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
linkList :
w-cmsstor151-disk_itb-disk3 (enabled=true;active=5;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
linkList :
w-cmsstor155-disk_itb-disk2 (enabled=true;active=3;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
linkList :
w-cmsstor155-disk_itb-disk3 (enabled=true;active=0;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
linkList :
w-cmsstor151-disk_itb-disk1 (enabled=true;active=0;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
linkList :
[cmsstor152.fnal.gov] (PoolManager) admin > man psu set

#17 Updated by Natalia Ratnikova over 4 years ago

[cmsstor152.fnal.gov] (local) admin > cd w-cmsstor410-disk_itb-disk2
[cmsstor152.fnal.gov] (w-cmsstor410-disk_itb-disk2) admin > pool enable
Pool w-cmsstor410-disk_itb-disk2 enabled
[cmsstor152.fnal.gov] (w-cmsstor410-disk_itb-disk2) admin > ..

  1. Repeat the same for w-cmsstor409-disk_itb-disk1

[cmsstor152.fnal.gov] (local) admin > cd PoolManager
[cmsstor152.fnal.gov] (PoolManager) admin > psu ls pool -l
w-cmsstor410-disk_itb-disk1 (enabled=true;active=23;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
linkList :
w-cmsstor409-disk_itb-disk2 (enabled=true;active=10;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
linkList :
w-cmsstor155-disk_itb-disk1 (enabled=true;active=1;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
linkList :
w-cmsstor410-disk_itb-disk2 (enabled=true;active=5;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
linkList :
w-cmsstor409-disk_itb-disk1 (enabled=true;active=10;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
linkList :
w-cmsstor151-disk_itb-disk2 (enabled=true;active=1;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
linkList :
w-cmsstor151-disk_itb-disk3 (enabled=true;active=28;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
linkList :
w-cmsstor155-disk_itb-disk2 (enabled=true;active=26;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
linkList :
w-cmsstor155-disk_itb-disk3 (enabled=true;active=23;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
linkList :
w-cmsstor151-disk_itb-disk1 (enabled=true;active=23;rdOnly=false;links=0;pgroups=1;hsm=[];mode=enabled)
linkList :

#18 Updated by Natalia Ratnikova over 4 years ago

[cmsstor152.fnal.gov] (w-cmsstor155-disk_itb-disk1) admin > migration copy -target=pgroup burningPools
[1] INITIALIZING migration copy -target=pgroup -
burningPools
[cmsstor152.fnal.gov] (w-cmsstor155-disk_itb-disk1) admin > ..
[cmsstor152.fnal.gov] (local) admin > cd w-cmsstor155-disk_itb-disk2
[cmsstor152.fnal.gov] (w-cmsstor155-disk_itb-disk2) admin > migration copy -target=pgroup burningPools
[1] INITIALIZING migration copy -target=pgroup -
burningPools
[cmsstor152.fnal.gov] (w-cmsstor155-disk_itb-disk2) admin > ..
[cmsstor152.fnal.gov] (local) admin > cd w-cmsstor155-disk_itb-disk3
[cmsstor152.fnal.gov] (w-cmsstor155-disk_itb-disk3) admin > migration copy -target=pgroup burningPools
[1] INITIALIZING migration copy -target=pgroup -
burningPools
[cmsstor152.fnal.gov] (w-cmsstor155-disk_itb-disk3) admin > ..

#19 Updated by Natalia Ratnikova over 4 years ago

Some p2pserver and p2pclient transfers are showing up in the pool queue monitoring.

Now start permanent migration on the burning pools to the burning group.

Repeat the following for the four pools in testing:

[cmsstor152.fnal.gov] (w-cmsstor409-disk_itb-disk1) admin > migration copy -permanent -target=pgroup burningPools
[1] INITIALIZING migration copy -permanent -target=pgroup -
burningPools
[cmsstor152.fnal.gov] (w-cmsstor409-disk_itb-disk1) admin > migration ls
[1] RUNNING migration copy -permanent -target=pgroup - burningPools
[cmsstor152.fnal.gov] (w-cmsstor409-disk_itb-disk1) admin > migration info 1
Command : migration copy -permanent -target=pgroup - burningPools
State : RUNNING
Queued : 2293
Attempts : 135
Targets : w-cmsstor410-disk_itb-disk1,w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk2
Completed : 134 files; 607972874 bytes; 5%
Total : 11016105508 bytes
Concurrency: 1
Running tasks:
[134] 0000B7300BD669C34542958F4BFDFA7C7F0C: TASK.Copying -> [w-cmsstor410-disk_itb-disk2@local]
[cmsstor152.fnal.gov] (w-cmsstor409-disk_itb-disk1) admin > ..

#20 Updated by Natalia Ratnikova over 4 years ago

  • % Done changed from 30 to 60

Cancel the migration on one of the pools and start a new one with extra options (checked with Gerard).

This can be done later on all pools, including the source ones with persistent data; the -permanent option in particular will also migrate newly added files.

[cmsstor152.fnal.gov] (w-cmsstor410-disk_itb-disk2) admin > migration copy -verify -concurrency=10 -tmode=cached -pins=keep -permanent -target=pgroup burningPools
[2] INITIALIZING migration copy -verify -concurrency=10 -tmode=cached -pins=keep -permanent -target=pgroup -
burningPools
[cmsstor152.fnal.gov] (w-cmsstor410-disk_itb-disk2) admin > migration info 2
Command : migration copy -verify -concurrency=10 -tmode=cached -pins=keep -permanent -target=pgroup - burningPools
State : RUNNING
Queued : 13785
Attempts : 2259
Targets : w-cmsstor410-disk_itb-disk1,w-cmsstor409-disk_itb-disk2,w-cmsstor409-disk_itb-disk1
Completed : 2249 files; 10203962639 bytes; 14%
Total : 72793408884 bytes
Concurrency: 10
Running tasks:
[14855] 00007AE58B46679A4E7BAC9ED6F489837CCA: TASK.Copying -> [w-cmsstor409-disk_itb-disk2@local]
[14857] 00002C481976419D45BCAA795A46C49B6DC6: TASK.Copying -> [w-cmsstor409-disk_itb-disk1@local]
[14860] 00006A76679D124E4EE9AC9D39E8AF6F7C1F: TASK.Copying -> [w-cmsstor409-disk_itb-disk1@local]
[14861] 00001990CC860D3F468698E67282EC2508D3: TASK.Copying -> [w-cmsstor409-disk_itb-disk1@local]
[14862] 0000CB573030608D47E59A28F62170974039: TASK.Copying -> [w-cmsstor409-disk_itb-disk2@local]
[14863] 000017E76068339E46738974996B37B64493: TASK.Copying -> [w-cmsstor409-disk_itb-disk1@local]
[14864] 000072F71632319F4C6483870DE39A3B7AA3: TASK.Copying -> [w-cmsstor410-disk_itb-disk1@local]
[14865] 0000AAF9392C975541FF993DF67F36EECDBD: TASK.Copying -> [w-cmsstor409-disk_itb-disk1@local]
[14866] 000097020C79D7AB4B5C876B7C55EA3C11B9: TASK.UpdatingExistingFile -> [w-cmsstor409-disk_itb-disk2@local]
[14867] 00009AE2FE8868A44F908B689564A4D055B7: TASK.GettingLocations
[cmsstor152.fnal.gov] (w-cmsstor410-disk_itb-disk2) admin > ..

#21 Updated by Natalia Ratnikova over 4 years ago

  • % Done changed from 60 to 70

test time log

#22 Updated by Natalia Ratnikova over 4 years ago

  • % Done changed from 70 to 60

#23 Updated by Natalia Ratnikova over 4 years ago

Run the storage dump on cmspnfs1 again - most burning files have migrated to other pools.
Not every pool has every file, but they are more or less evenly distributed among all pools:

[root@cmspnfs1 chimera-list]# for p in `grep burning chimera_2015-06-03_1608 | awk '{print $NF}' | tr ',' "\n" | sort -u `; do echo $p; grep burning chimera_2015-06-03_1608| grep -c $p ; done
/mnt/dcache/uscms_test/burning
1
w-cmsstor155-disk_itb-disk1
2
w-cmsstor155-disk_itb-disk2
2
w-cmsstor155-disk_itb-disk3
5
w-cmsstor409-disk_itb-disk1
4
w-cmsstor409-disk_itb-disk2
5
w-cmsstor410-disk_itb-disk1
4
w-cmsstor410-disk_itb-disk2
5

#24 Updated by Natalia Ratnikova over 4 years ago

ADMIN interface:

Two pools have locked-file errors (the PNFS IDs are those of the test files):

[cmsstor152.fnal.gov] (w-cmsstor410-disk_itb-disk1) admin > migration info 2
Command : migration copy -permanent -target=pgroup - burningPools
State : SLEEPING
Queued : 0
Attempts : 103033
Targets : w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk2,w-cmsstor409-disk_itb-disk1
Completed : 103020 files; 474632812306 bytes; 100%
Total : 474632812306 bytes
Concurrency: 1
Running tasks:
Most recent errors:
18:03:52 [29185] 00006CBC9BCB10F84E3284BFB42B33AD18D2: Replica is locked on target pool
[...]

Other errors:

17:51:08 [32735] 00001CF3A4708D7D431587F4B980ADED405D: Replica is locked on target pool
18:03:49 [40476] 0000904DC5D8386A48D89F9EC3E6A7F75929: PnfsManager failed (Unexpected reply: )
18:06:49 [40462] 00006CBC9BCB10F84E3284BFB42B33AD18D2: Pool [w-cmsstor410-disk_itb-disk1@local] failed (no response)

In the logs: no messages except for the pool size adjustment message, which is normal.

#25 Updated by Natalia Ratnikova over 4 years ago

On cmslpc24 open a screen session with
screen -S burning_test

cd /uscmst1b_scratch/lpc1/3DayLifetime/natasha
[natasha@cmslpc24 natasha]$ ls -latr
total 11993216
drwxrwxrwt 386 root    root         59392 Jun 1 14:49 ..
-rw-r--r--   1 natasha us_cms  1073741824 Jun 1 14:54 1g.burningtest
-rw-r--r--   1 natasha us_cms  2147483648 Jun 1 14:56 2g.burningtest
-rw-r--r--   1 natasha us_cms  3221225472 Jun 1 14:56 3g.burningtest
-rw-r--r--   1 natasha us_cms  4294967296 Jun 1 14:56 4g.burningtest
drwxr-xr-x   3 natasha us_cms        2048 Jun 3 09:55 .

Now copy every file via srmcp into 4 files with different names (16 total):

for i in {2..5}; do for f in *.burningtest; do srmcp file:///`pwd`/$f srm://cmsstor153.fnal.gov:8443/srm/managerv2?SFN=/dcache/uscms_test/burning/$f-$i; done; done

Detach from screen and leave it running for a while.

Once the files arrive in the system, they are expected to be distributed over all pools, including the burning pool group.
Then the migration jobs should replicate them further.

#26 Updated by Natalia Ratnikova over 4 years ago

For the record, the resulting copy commands are:

bash-4.1$ for i in {2..5}; do for f in *.burningtest; do echo srmcp file:///`pwd`/$f srm://cmsstor153.fnal.gov:8443/srm/managerv2?SFN=/dcache/uscms_test/burning/$f-$i; done; done

srmcp file:////uscmst1b_scratch/lpc1/3DayLifetime/natasha/1g.burningtest srm://cmsstor153.fnal.gov:8443/srm/managerv2?SFN=/dcache/uscms_test/burning/1g.burningtest-2
srmcp file:////uscmst1b_scratch/lpc1/3DayLifetime/natasha/2g.burningtest srm://cmsstor153.fnal.gov:8443/srm/managerv2?SFN=/dcache/uscms_test/burning/2g.burningtest-2
srmcp file:////uscmst1b_scratch/lpc1/3DayLifetime/natasha/3g.burningtest srm://cmsstor153.fnal.gov:8443/srm/managerv2?SFN=/dcache/uscms_test/burning/3g.burningtest-2
srmcp file:////uscmst1b_scratch/lpc1/3DayLifetime/natasha/4g.burningtest srm://cmsstor153.fnal.gov:8443/srm/managerv2?SFN=/dcache/uscms_test/burning/4g.burningtest-2
srmcp file:////uscmst1b_scratch/lpc1/3DayLifetime/natasha/1g.burningtest srm://cmsstor153.fnal.gov:8443/srm/managerv2?SFN=/dcache/uscms_test/burning/1g.burningtest-3
srmcp file:////uscmst1b_scratch/lpc1/3DayLifetime/natasha/2g.burningtest srm://cmsstor153.fnal.gov:8443/srm/managerv2?SFN=/dcache/uscms_test/burning/2g.burningtest-3
srmcp file:////uscmst1b_scratch/lpc1/3DayLifetime/natasha/3g.burningtest srm://cmsstor153.fnal.gov:8443/srm/managerv2?SFN=/dcache/uscms_test/burning/3g.burningtest-3
srmcp file:////uscmst1b_scratch/lpc1/3DayLifetime/natasha/4g.burningtest srm://cmsstor153.fnal.gov:8443/srm/managerv2?SFN=/dcache/uscms_test/burning/4g.burningtest-3
srmcp file:////uscmst1b_scratch/lpc1/3DayLifetime/natasha/1g.burningtest srm://cmsstor153.fnal.gov:8443/srm/managerv2?SFN=/dcache/uscms_test/burning/1g.burningtest-4
srmcp file:////uscmst1b_scratch/lpc1/3DayLifetime/natasha/2g.burningtest srm://cmsstor153.fnal.gov:8443/srm/managerv2?SFN=/dcache/uscms_test/burning/2g.burningtest-4
srmcp file:////uscmst1b_scratch/lpc1/3DayLifetime/natasha/3g.burningtest srm://cmsstor153.fnal.gov:8443/srm/managerv2?SFN=/dcache/uscms_test/burning/3g.burningtest-4
srmcp file:////uscmst1b_scratch/lpc1/3DayLifetime/natasha/4g.burningtest srm://cmsstor153.fnal.gov:8443/srm/managerv2?SFN=/dcache/uscms_test/burning/4g.burningtest-4
srmcp file:////uscmst1b_scratch/lpc1/3DayLifetime/natasha/1g.burningtest srm://cmsstor153.fnal.gov:8443/srm/managerv2?SFN=/dcache/uscms_test/burning/1g.burningtest-5
srmcp file:////uscmst1b_scratch/lpc1/3DayLifetime/natasha/2g.burningtest srm://cmsstor153.fnal.gov:8443/srm/managerv2?SFN=/dcache/uscms_test/burning/2g.burningtest-5
srmcp file:////uscmst1b_scratch/lpc1/3DayLifetime/natasha/3g.burningtest srm://cmsstor153.fnal.gov:8443/srm/managerv2?SFN=/dcache/uscms_test/burning/3g.burningtest-5
srmcp file:////uscmst1b_scratch/lpc1/3DayLifetime/natasha/4g.burningtest srm://cmsstor153.fnal.gov:8443/srm/managerv2?SFN=/dcache/uscms_test/burning/4g.burningtest-5

#27 Updated by Natalia Ratnikova over 4 years ago

Check locations of data added to the migration.

Some files landed only on cmsstor155 - the permanent pool, see dump in [*] below.

Check migration status: all migrations on cmsstor155 have finished, see admin interface output in [**].

Now we want to start permanent migration on the "source" pool (repeat for all three pools):

[cmsstor152.fnal.gov] (local) admin > cd w-cmsstor155-disk_itb-disk1
[cmsstor152.fnal.gov] (w-cmsstor155-disk_itb-disk1) admin > migration copy -verify -concurrency=10 -tmode=cached -pins=keep -permanent -target=pgroup burningPools
[2] INITIALIZING migration copy -verify -concurrency=10 -tmode=cached -pins=keep -permanent -target=pgroup -
burningPools

Check data volume:
flushPools group ~ 0.9 TB
burningPools group ~ 1.8 TB

Add more data into the system.
In screen on cmslpc24: create a long-lived proxy, just in case, and run this:

for i in {6..100}; do for f in *.burningtest; do srmcp -debug file:///`pwd`/$f srm://cmsstor153.fnal.gov:8443/srm/managerv2?SFN=/dcache/uscms_test/burning/$f$i; done; done

[*]
[root@cmspnfs1 chimera-list]# grep burning chimera_2015-06-05_1606
4M.burningtest 00006500B72CC40345F0B012352E668D735F 03c00001 4194304 1433189301 w-cmsstor155-disk_itb-disk3,w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk1
/mnt/dcache/uscms_test/burning
1g.burningtest 000056542939ECAC4AE4B65837E95EEC7E79 c02d0001 1073741824 1433196695 w-cmsstor155-disk_itb-disk2,w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk1
1g.burningtest-1 00008B0D2E76D6F54DCEA2996BE3D1A2AF8F c02d0001 1073741824 1433202894 w-cmsstor155-disk_itb-disk1,w-cmsstor410-disk_itb-disk1,w-cmsstor410-disk_itb-disk2
1g.burningtest-2 0000B29829D3503748F2B7E4EE98BABC0851 c02d0001 1073741824 1433372235 w-cmsstor155-disk_itb-disk3
1g.burningtest-3 0000709AD38722DA4068A13577AB59824B83 c02d0001 1073741824 1433372473 w-cmsstor409-disk_itb-disk1,w-cmsstor410-disk_itb-disk1
1g.burningtest-4 0000CBD8C97C803F48088B484D2BD349E639 c02d0001 1073741824 1433372643 w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk2
1g.burningtest-5 000032FB7480FC6C455090209381507C2EB1 c02d0001 1073741824 1433372804 w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk2
2g.burningtest 00000A09D788C87A4FABB7F01AD0EF279892 80690001 2147483648 1433196734 w-cmsstor155-disk_itb-disk3,w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk1
2g.burningtest-1 0000CF79AAE9371045A88E7F674ED500F755 80690001 2147483648 1433202922 w-cmsstor155-disk_itb-disk3,w-cmsstor409-disk_itb-disk1,w-cmsstor410-disk_itb-disk2
2g.burningtest-2 000083D169DC09EA44AEA747CA177319FC37 80690001 2147483648 1433372275 w-cmsstor409-disk_itb-disk1,w-cmsstor409-disk_itb-disk2
2g.burningtest-3 00004200F5FDF5B34F939169E406ACE46825 80690001 2147483648 1433372498 w-cmsstor409-disk_itb-disk1,w-cmsstor410-disk_itb-disk2
2g.burningtest-4 0000EBE3B7A4871049D9B23204369C25FF17 80690001 2147483648 1433372663 w-cmsstor409-disk_itb-disk1,w-cmsstor410-disk_itb-disk2
2g.burningtest-5 0000845B0200115749A3A1D585280026E71D 80690001 2147483648 1433372825 w-cmsstor409-disk_itb-disk1,w-cmsstor410-disk_itb-disk2
3g.burningtest 000000BC6E780FC1472DA52FF1A50AE70FC2 40a50001 3221225472 1433196797 w-cmsstor155-disk_itb-disk3,w-cmsstor409-disk_itb-disk1,w-cmsstor409-disk_itb-disk2
3g.burningtest-1 0000B41911047FA046EEA021115E2D089235 40a50001 3221225472 1433202962 w-cmsstor155-disk_itb-disk3,w-cmsstor409-disk_itb-disk1,w-cmsstor410-disk_itb-disk2
3g.burningtest-2 0000C00652BD62F94EB9AD63DB97EC422865 40a50001 3221225472 1433372335 w-cmsstor410-disk_itb-disk1,w-cmsstor410-disk_itb-disk2
3g.burningtest-3 00004FE1C12D0FB04CB7AB6BEA758E9F85CE 40a50001 3221225472 1433372533 w-cmsstor155-disk_itb-disk3
3g.burningtest-4 0000DFEE466CDA974A7286A36AC5E5CF01C1 40a50001 3221225472 1433372698 w-cmsstor155-disk_itb-disk1
3g.burningtest-5 00007C45836252A8471B8F72B028A0BD9CE3 40a50001 3221225472 1433372860 w-cmsstor155-disk_itb-disk2
4g.burningtest 00008DD4DBB180364478B9BC55547019EB38 00e10001 4294967296 1433196881 w-cmsstor155-disk_itb-disk1,w-cmsstor409-disk_itb-disk1,w-cmsstor410-disk_itb-disk2
4g.burningtest-1 0000E5D22106D8B84501924680AAF38CFD96 00e10001 4294967296 1433203023 w-cmsstor155-disk_itb-disk2,w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk2
4g.burningtest-2 00007D88F64C4A454928BE5A402C9A74CEEE 00e10001 4294967296 1433372418 w-cmsstor409-disk_itb-disk1,w-cmsstor409-disk_itb-disk2
4g.burningtest-3 000072515886E060447D8768CF74FB86116C 00e10001 4294967296 1433372587 w-cmsstor155-disk_itb-disk1
4g.burningtest-4 0000B97D2A915462453D98A813B2D8AEBF80 00e10001 4294967296 1433372752 w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk1
4g.burningtest-5 0000C29A9454867F467685C95C86AF755935 00e10001 4294967296 1433372913 w-cmsstor409-disk_itb-disk1,w-cmsstor410-disk_itb-disk1
[root@cmspnfs1 chimera-list]#

[**]

[cmsstor152.fnal.gov] (w-cmsstor155-disk_itb-disk3) admin > migration ls
[1] FINISHED migration copy -target=pgroup - burningPools
[cmsstor152.fnal.gov] (w-cmsstor155-disk_itb-disk3) admin > ..
[cmsstor152.fnal.gov] (local) admin > cd w-cmsstor155-disk_itb-disk2
[cmsstor152.fnal.gov] (w-cmsstor155-disk_itb-disk2) admin > migration ls
[1] FINISHED migration copy -target=pgroup - burningPools
[cmsstor152.fnal.gov] (w-cmsstor155-disk_itb-disk2) admin > ..
[cmsstor152.fnal.gov] (local) admin > cd w-cmsstor155-disk_itb-disk1
[cmsstor152.fnal.gov] (w-cmsstor155-disk_itb-disk1) admin > migration ls
[1] FINISHED migration copy -target=pgroup - burningPools
[cmsstor152.fnal.gov] (w-cmsstor155-disk_itb-disk1) admin > ..
[cmsstor152.fnal.gov] (local) admin > cd w-cmsstor410-disk_itb-disk1
[cmsstor152.fnal.gov] (w-cmsstor410-disk_itb-disk1) admin > migration ls
[2] SLEEPING migration copy -permanent -target=pgroup - burningPools
[1] FINISHED migration move -target=pgroup - flushPools

#28 Updated by Natalia Ratnikova over 4 years ago

Check that the new data files (*.burningtest-6 and higher) start arriving in the system:

[root@cmsstor155 ~]# ls /dcache_itb/uscms_test/burning/
1g.burningtest 2g.burningtest 3g.burningtest 4g.burningtest
1g.burningtest-1 2g.burningtest-1 3g.burningtest-1 4g.burningtest-1
1g.burningtest-2 2g.burningtest-2 3g.burningtest-2 4g.burningtest-2
1g.burningtest-3 2g.burningtest-3 3g.burningtest-3 4g.burningtest-3
1g.burningtest-4 2g.burningtest-4 3g.burningtest-4 4g.burningtest-4
1g.burningtest-5 2g.burningtest-5 3g.burningtest-5 4g.burningtest-5
1g.burningtest-6 2g.burningtest-6 3g.burningtest-6 4g.burningtest-6
1g.burningtest-7 2g.burningtest-7 3g.burningtest-7

#29 Updated by Natalia Ratnikova over 4 years ago

Check the file distribution over the pools:

[root@cmspnfs1 chimera-list]# for p in `grep burning chimera_2015-06-08_1315 | awk '{print $NF}' | tr ',' "\n" | sort -u `; do echo $p; grep burning chimera_2015-06-08_1315| grep -c $p ; done
/mnt/dcache/uscms_test/burning
1
w-cmsstor155-disk_itb-disk1
58
w-cmsstor155-disk_itb-disk2
54
w-cmsstor155-disk_itb-disk3
60
w-cmsstor409-disk_itb-disk1
186
w-cmsstor409-disk_itb-disk2
199
w-cmsstor410-disk_itb-disk1
199
w-cmsstor410-disk_itb-disk2
226

#30 Updated by Natalia Ratnikova over 4 years ago

  • Status changed from Accepted to Closed

#31 Updated by Natalia Ratnikova over 4 years ago

  • Status changed from Closed to Accepted

#32 Updated by Natalia Ratnikova over 4 years ago

Before decommissioning the 409/410 pools from the test instance, the data need to be migrated back onto the permanent pool(s).
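
A per-pool drain sketch (migration move empties the source pool; rep ls on the pool cell should then show no remaining replicas):

[cmsstor152.fnal.gov] (local) admin > cd w-cmsstor410-disk_itb-disk1
[cmsstor152.fnal.gov] (w-cmsstor410-disk_itb-disk1) admin > migration move -target=pgroup flushPools
[cmsstor152.fnal.gov] (w-cmsstor410-disk_itb-disk1) admin > migration ls    (wait for FINISHED)
[cmsstor152.fnal.gov] (w-cmsstor410-disk_itb-disk1) admin > rep ls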

[root@cmspnfs1 chimera-list]# grep burning chimera_2015-06-08_1315 | awk '{print $1}' | sort -u | wc -l
406
[root@cmspnfs1 chimera-list]# for f in `grep burning chimera_2015-06-08_1315 | awk '{print $1}' | sort -u |head `; do grep $f chimera_2015-06-08_1315 | grep 155 ; done
1g.burningtest 000056542939ECAC4AE4B65837E95EEC7E79 c02d0001 1073741824 1433196695 w-cmsstor155-disk_itb-disk2,w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk1
1g.burningtest-1 00008B0D2E76D6F54DCEA2996BE3D1A2AF8F c02d0001 1073741824 1433202894 w-cmsstor155-disk_itb-disk1,w-cmsstor410-disk_itb-disk1,w-cmsstor410-disk_itb-disk2
1g.burningtest-10 00007CAA659D4377495DA63CDB4643BBA761 c02d0001 1073741824 1433542156 w-cmsstor155-disk_itb-disk1,w-cmsstor409-disk_itb-disk1,w-cmsstor410-disk_itb-disk1
1g.burningtest-100 0000C42BC90C8B50485593CDABF0396B2E35 c02d0001 1073741824 1433557155 w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk1
1g.burningtest-15 0000DB3D5FCA185E4C3CAEB789B493F73DA1 c02d0001 1073741824 1433542978 w-cmsstor155-disk_itb-disk3,w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk2
1g.burningtest-17 0000853020BE01E8489AA5C255AD9C34FAA8 c02d0001 1073741824 1433543311 w-cmsstor155-disk_itb-disk2,w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk1
1g.burningtest-18 0000BB18308A451D47EC96DED12958E28929 c02d0001 1073741824 1433543480 w-cmsstor155-disk_itb-disk1,w-cmsstor409-disk_itb-disk1,w-cmsstor409-disk_itb-disk2
1g.burningtest-2 0000B29829D3503748F2B7E4EE98BABC0851 c02d0001 1073741824 1433372235 w-cmsstor155-disk_itb-disk3,w-cmsstor409-disk_itb-disk1,w-cmsstor410-disk_itb-disk1
1g.burningtest-20 000014D3FC06C44C42049ADFA80C2AFCC27D c02d0001 1073741824 1433543806 w-cmsstor155-disk_itb-disk2,w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk2
1g.burningtest-22 000082D9A48B008041AA82CABED3574A971A c02d0001 1073741824 1433544133 w-cmsstor155-disk_itb-disk3,w-cmsstor410-disk_itb-disk1,w-cmsstor410-disk_itb-disk2
1g.burningtest-24 0000CA11CD50F0BA44878FFE863C6EF8F3D4 c02d0001 1073741824 1433544456 w-cmsstor155-disk_itb-disk1,w-cmsstor409-disk_itb-disk1,w-cmsstor410-disk_itb-disk1
1g.burningtest-25 0000B66FDCD8ABD34EA88F9B145185F4DEFE c02d0001 1073741824 1433544617 w-cmsstor155-disk_itb-disk2,w-cmsstor409-disk_itb-disk1,w-cmsstor410-disk_itb-disk2
1g.burningtest-26 0000796A216741694AE5A1C0550B996BDF58 c02d0001 1073741824 1433544776 w-cmsstor155-disk_itb-disk2,w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk2
1g.burningtest-27 0000CB0C0A7DE5D74009BDF3432CFE8882A7 c02d0001 1073741824 1433544940 w-cmsstor155-disk_itb-disk1,w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk1
1g.burningtest-28 00003888162C53BC4B0FA3AF8EE7A1206064 c02d0001 1073741824 1433545101 w-cmsstor155-disk_itb-disk3,w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk2
1g.burningtest-31 00006EE6AECDF6E245E4A3797255F1892406 c02d0001 1073741824 1433545589 w-cmsstor155-disk_itb-disk3,w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk2
1g.burningtest-32 000092350305DD934A4C8D21E6CFBAD7425C c02d0001 1073741824 1433545757 w-cmsstor155-disk_itb-disk1,w-cmsstor409-disk_itb-disk1,w-cmsstor409-disk_itb-disk2
1g.burningtest-33 0000503F924D5BF74A70AD01363285528B25 c02d0001 1073741824 1433545934 w-cmsstor155-disk_itb-disk1,w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk2
1g.burningtest-37 0000C066FA03E89747DBA26B0339C9A64A06 c02d0001 1073741824 1433546592 w-cmsstor155-disk_itb-disk3,w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk1
1g.burningtest-43 0000EF4746F5F9CE4F74B64CC720C64717D2 c02d0001 1073741824 1433547571 w-cmsstor155-disk_itb-disk1,w-cmsstor410-disk_itb-disk1,w-cmsstor410-disk_itb-disk2
1g.burningtest-45 000048054DFF1C6F4E3ABAF3134362D86B6E c02d0001 1073741824 1433547906 w-cmsstor155-disk_itb-disk1,w-cmsstor409-disk_itb-disk1,w-cmsstor409-disk_itb-disk2
1g.burningtest-49 0000509DACD391D74F0EAFF0593B64F8F47F c02d0001 1073741824 1433548574 w-cmsstor155-disk_itb-disk1,w-cmsstor409-disk_itb-disk1,w-cmsstor410-disk_itb-disk2
1g.burningtest-50 00008D524093BB154374AB86A14950246FF0 c02d0001 1073741824 1433548748 w-cmsstor155-disk_itb-disk2,w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk1
1g.burningtest-51 0000457592C665FA40938346183DF8292F6A c02d0001 1073741824 1433548921 w-cmsstor155-disk_itb-disk2,w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk2
1g.burningtest-55 00001336CAC43CC0459DB041B80201D20307 c02d0001 1073741824 1433549585 w-cmsstor155-disk_itb-disk1,w-cmsstor409-disk_itb-disk1,w-cmsstor410-disk_itb-disk2
1g.burningtest-6 00002912D8EA53BC4316AF0B1869144E4149 c02d0001 1073741824 1433541426 w-cmsstor155-disk_itb-disk1,w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk1
1g.burningtest-60 00000C482A5B3B86491586D6591547A6A135 c02d0001 1073741824 1433550429 w-cmsstor155-disk_itb-disk3,w-cmsstor410-disk_itb-disk1,w-cmsstor410-disk_itb-disk2
1g.burningtest-61 000000ECC7B7974746C5815A48FFD7CCF027 c02d0001 1073741824 1433550599 w-cmsstor155-disk_itb-disk3,w-cmsstor409-disk_itb-disk1,w-cmsstor409-disk_itb-disk2
1g.burningtest-66 00005BB63F2317E0447991B414785C10FD77 c02d0001 1073741824 1433551427 w-cmsstor155-disk_itb-disk1,w-cmsstor410-disk_itb-disk1,w-cmsstor410-disk_itb-disk2
1g.burningtest-67 0000DA78FCCD15F7490E92A8EC240191E35A c02d0001 1073741824 1433551598 w-cmsstor155-disk_itb-disk1,w-cmsstor409-disk_itb-disk1,w-cmsstor410-disk_itb-disk1
1g.burningtest-68 00008041AFC262B24065BCAF715837D30D93 c02d0001 1073741824 1433551772 w-cmsstor155-disk_itb-disk1,w-cmsstor409-disk_itb-disk1,w-cmsstor410-disk_itb-disk2
1g.burningtest-69 00003114C483796941F297CD9FEB2C8B9756 c02d0001 1073741824 1433551942 w-cmsstor155-disk_itb-disk1,w-cmsstor409-disk_itb-disk1,w-cmsstor409-disk_itb-disk2
1g.burningtest-7 00005B1405FA550C489590DF7E599CD79B41 c02d0001 1073741824 1433541665 w-cmsstor155-disk_itb-disk3,w-cmsstor410-disk_itb-disk1,w-cmsstor410-disk_itb-disk2
1g.burningtest-70 000089BD26C14287434486BF49156E07D19B c02d0001 1073741824 1433552110 w-cmsstor155-disk_itb-disk3,w-cmsstor410-disk_itb-disk1,w-cmsstor410-disk_itb-disk2
1g.burningtest-74 000071A5C06C539F46A08F317F77D1158BC2 c02d0001 1073741824 1433552783 w-cmsstor155-disk_itb-disk2,w-cmsstor410-disk_itb-disk1,w-cmsstor410-disk_itb-disk2
1g.burningtest-75 0000795E696DDAC94EE5B25327AA04E76AAD c02d0001 1073741824 1433552958 w-cmsstor155-disk_itb-disk2,w-cmsstor409-disk_itb-disk1,w-cmsstor410-disk_itb-disk1
1g.burningtest-76 0000D4B2716AC5C04232ACFB128C96288651 c02d0001 1073741824 1433553130 w-cmsstor155-disk_itb-disk3,w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk2
1g.burningtest-78 00006F549893A27241029B38391C119AF6A1 c02d0001 1073741824 1433553458 w-cmsstor155-disk_itb-disk3,w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk2
1g.burningtest-79 00002A5B7D72D2154955A2157647D8130D57 c02d0001 1073741824 1433553622 w-cmsstor155-disk_itb-disk3,w-cmsstor409-disk_itb-disk1,w-cmsstor410-disk_itb-disk2
1g.burningtest-8 000081CFDF7E7CB44FCE9AF62CACD2DA59CF c02d0001 1073741824 1433541831 w-cmsstor155-disk_itb-disk1,w-cmsstor409-disk_itb-disk1,w-cmsstor409-disk_itb-disk2
1g.burningtest-81 0000ADA151076A384A2DA284E98E5C311CB3 c02d0001 1073741824 1433553960 w-cmsstor155-disk_itb-disk2,w-cmsstor409-disk_itb-disk1,w-cmsstor410-disk_itb-disk2
1g.burningtest-83 000097A9F155D3DC453FB06BF7F25BD2541F c02d0001 1073741824 1433554290 w-cmsstor410-disk_itb-disk1,w-cmsstor410-disk_itb-disk2
1g.burningtest-84 0000B523F2E6B00D4295808486B0D6EA4E1B c02d0001 1073741824 1433554458 w-cmsstor155-disk_itb-disk3,w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk2
1g.burningtest-86 00003B301EF207A546088272F60BC5907B07 c02d0001 1073741824 1433554793 w-cmsstor155-disk_itb-disk3,w-cmsstor409-disk_itb-disk1,w-cmsstor409-disk_itb-disk2
1g.burningtest-88 0000EDA21B3B989A4C02AF8858000A13805A c02d0001 1073741824 1433555122 w-cmsstor155-disk_itb-disk2,w-cmsstor409-disk_itb-disk1,w-cmsstor409-disk_itb-disk2
1g.burningtest-92 0000715A29104515413B9BF36A6700312A16 c02d0001 1073741824 1433555782 w-cmsstor155-disk_itb-disk2,w-cmsstor410-disk_itb-disk1,w-cmsstor410-disk_itb-disk2
1g.burningtest-95 00003A36948BCFDC448F9CB282CE88BB6F43 c02d0001 1073741824 1433556288 w-cmsstor155-disk_itb-disk2,w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk2
1g.burningtest-98 0000F75563B746B8465BB0FA718868459D74 c02d0001 1073741824 1433556803 w-cmsstor155-disk_itb-disk3,w-cmsstor409-disk_itb-disk1,w-cmsstor409-disk_itb-disk2
1g.burningtest-99 0000CDD69018859B4816B6D8A50223400CA1 c02d0001 1073741824 1433556974 w-cmsstor155-disk_itb-disk1,w-cmsstor409-disk_itb-disk1,w-cmsstor410-disk_itb-disk2
1g.burningtest-1 00008B0D2E76D6F54DCEA2996BE3D1A2AF8F c02d0001 1073741824 1433202894 w-cmsstor155-disk_itb-disk1,w-cmsstor410-disk_itb-disk1,w-cmsstor410-disk_itb-disk2
1g.burningtest-10 00007CAA659D4377495DA63CDB4643BBA761 c02d0001 1073741824 1433542156 w-cmsstor155-disk_itb-disk1,w-cmsstor409-disk_itb-disk1,w-cmsstor410-disk_itb-disk1
1g.burningtest-100 0000C42BC90C8B50485593CDABF0396B2E35 c02d0001 1073741824 1433557155 w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk1
1g.burningtest-15 0000DB3D5FCA185E4C3CAEB789B493F73DA1 c02d0001 1073741824 1433542978 w-cmsstor155-disk_itb-disk3,w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk2
1g.burningtest-17 0000853020BE01E8489AA5C255AD9C34FAA8 c02d0001 1073741824 1433543311 w-cmsstor155-disk_itb-disk2,w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk1
1g.burningtest-18 0000BB18308A451D47EC96DED12958E28929 c02d0001 1073741824 1433543480 w-cmsstor155-disk_itb-disk1,w-cmsstor409-disk_itb-disk1,w-cmsstor409-disk_itb-disk2
1g.burningtest-10 00007CAA659D4377495DA63CDB4643BBA761 c02d0001 1073741824 1433542156 w-cmsstor155-disk_itb-disk1,w-cmsstor409-disk_itb-disk1,w-cmsstor410-disk_itb-disk1
1g.burningtest-100 0000C42BC90C8B50485593CDABF0396B2E35 c02d0001 1073741824 1433557155 w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk1
1g.burningtest-100 0000C42BC90C8B50485593CDABF0396B2E35 c02d0001 1073741824 1433557155 w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk1
1g.burningtest-15 0000DB3D5FCA185E4C3CAEB789B493F73DA1 c02d0001 1073741824 1433542978 w-cmsstor155-disk_itb-disk3,w-cmsstor409-disk_itb-disk2,w-cmsstor410-disk_itb-disk2
[root@cmspnfs1 chimera-list]# for f in `grep burning chimera_2015-06-08_1315 | awk '{print $1}' | sort -u |head `; do grep $f chimera_2015-06-08_1315 | grep 155 ; done | wc -l
59

#33 Updated by Gerard Bernabeu Altayo over 4 years ago

  • Due date changed from 06/02/2015 to 06/11/2015

#34 Updated by Natalia Ratnikova over 4 years ago

Prevent clients from writing or reading data to/from the burning pools:

Removed burning-link from modules/dcache/files/etc/dcache/poolmanager-disk_itb.conf
in the natalia_burning_dcache_pools branch.

Push the change to puppet central repo.
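
A quick sanity check that the link is really gone from the managed file (sketch, run in the puppet working copy):

grep -c burning-link modules/dcache/files/etc/dcache/poolmanager-disk_itb.conf    # expect 0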

Check links via admin interface:

[cmsstor152.fnal.gov] (PoolManager) admin > psu ls link
flush-link
readonly-link
[cmsstor152.fnal.gov] (PoolManager) admin >

There are only two links already.

That's because puppet was running; here is the update record in the puppet log:

2015-06-10T17:39:12.963283-05:00 cmsstor152 puppet-agent23615: (/Stage[main]/Dcache::Dcachedomain/File[/etc/dcache/poolmanager.conf]/content) content changed '{md5}2f4670f5c5ce1b17b4a57eefcbaa4b90' to '{md5}84516872dc662e68840a46b10957ffd5'
2015-06-10T17:45:10.850496-05:00 cmsstor152 puppet-agent29326: Applying configuration version '1433976282'

Restarted dCache on the head node so that any change triggered by the updated poolmanager.conf takes effect.
Ran puppet agent -t:
- no change, clean run.

Check links again:

[cmsstor152.fnal.gov] (local) admin > cd PoolManager
[cmsstor152.fnal.gov] (PoolManager) admin > psu ls link
flush-link
readonly-link
[cmsstor152.fnal.gov] (PoolManager) admin >

#35 Updated by Natalia Ratnikova over 4 years ago

The files created from /dev/zero with the "seek" option do not actually contain any data.

To create files with real random data, use /dev/urandom:

cd /uscmst1b_scratch/lpc1/3DayLifetime/natasha/pool_test_data

time dd if=/dev/urandom of=1g.burningtest-source bs=1G count=1
time dd if=/dev/urandom of=2g.burningtest-source bs=1G count=2
cat 1g.burningtest-source 2g.burningtest-source > 3g.burningtest-source
cat 3g.burningtest-source 1g.burningtest-source > 4g.burningtest-source

Now copy them into the system:

bash-4.1$ time for f in *.burningtest-source; do srmcp file:///`pwd`/$f srm://cmsstor153.fnal.gov:8443/srm/managerv2?SFN=/dcache/uscms_test/burning/$f; done

real 2m59.811s
user 1m52.489s
sys 0m36.141s

#36 Updated by Natalia Ratnikova over 4 years ago

To replicate the source files up to 5 TB of data:

in screen on cmslpc23:

bash
cd /uscmst1b_scratch/lpc1/3DayLifetime/natasha/pool_test_data
for i in {1..500}; do for c in *.burningtest-source; do f=`echo $c | sed 's?test-source??'`; srmcp -debug file:///`pwd`/$c srm://cmsstor153.fnal.gov:8443/srm/managerv2?SFN=/dcache/uscms_test/burning/$f$i; done; done

#37 Updated by Natalia Ratnikova over 4 years ago

Created a new page for the enhanced procedure:
https://cmsweb.fnal.gov/bin/view/Storage/PoolAdd

Replaced the old procedure with a link to the new one.
Described the workflow in the introduction section.
Reworked the previous documentation based on input from various discussions.
Added a description of the procedure for pool burning tests.

#38 Updated by Natalia Ratnikova over 4 years ago

Write and test scripts for starting and checking csm status and errors.

Corresponding dCache functionality is described in:
https://www.dcache.org/manuals/Book-2.2/cookbook/cb-pool-checksumming-fhs.shtml

The commands can be executed in batch via a connectChimera-disk_itb.sh script.

We do not have it installed from rpm yet, but it can be created by patching the disk-instance script which is installed on the disk_itb instance servers:

cat /usr/libexec/dcache/connectChimera-disk.sh | sed "s/cmsdcacheadmindisk/cmsstor152/" > connectChimera-disk_itb.sh

cat admin.query | /usr/libexec/dcache/connectChimera-disk_itb.sh | tee admin.query.`date +%s`.log

Example for verifying checksums for all data on the cmsstor409 and 410 pools:

1) get pool names from /etc/dcache/poolmanager.conf and create admin query for csm status:

touch csm_status.query; for p in `for n in cmsstor409 cmsstor410; do grep ^psu /etc/dcache/poolmanager.conf | grep $n | tr " " "\n" | grep $n| sort -u ;done` ; do echo "cd $p" >> csm_status.query ; echo "csm status" >> csm_status.query ; echo ".." >> csm_status.query ; done; echo "logoff" >> csm_status.query;

2) And this is a query for initializing checksumming:

touch csm_check.query; for p in `for n in cmsstor409 cmsstor410; do grep ^psu /etc/dcache/poolmanager.conf | grep $n | tr " " "\n" | grep $n| sort -u ;done` ; do echo "cd $p" >> csm_check.query ; echo "csm check *" >> csm_check.query ; echo ".." >> csm_check.query ; done; echo "logoff" >> csm_check.query;
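
The same generation logic written out readably (an equivalent sketch for the status query; swap in "csm check *" for the check query):

: > csm_status.query
for n in cmsstor409 cmsstor410; do
    for p in `grep ^psu /etc/dcache/poolmanager.conf | grep $n | tr " " "\n" | grep $n | sort -u`; do
        printf 'cd %s\ncsm status\n..\n' "$p" >> csm_status.query
    done
done
echo "logoff" >> csm_status.query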

Now execute the check and then the status query:

cat csm_check.query | ./connectChimera-disk_itb.sh | tee csm_check.query`date +%s`.log

cat csm_status.query | ./connectChimera-disk_itb.sh | tee csm_status.query`date +%s`.log

The output is:

dCache Admin (VII) (user=admin)

[cmsstor152.fnal.gov] (local) admin > cd w-cmsstor409-disk_itb-disk1
[cmsstor152.fnal.gov] (w-cmsstor409-disk_itb-disk1) admin > csm check *
java.lang.IllegalStateException: Still active
[cmsstor152.fnal.gov] (w-cmsstor409-disk_itb-disk1) admin > ..
[cmsstor152.fnal.gov] (local) admin > cd w-cmsstor409-disk_itb-disk2
[cmsstor152.fnal.gov] (w-cmsstor409-disk_itb-disk2) admin > csm check *
Started ...; check 'csm status' for status
[cmsstor152.fnal.gov] (w-cmsstor409-disk_itb-disk2) admin > ..
[cmsstor152.fnal.gov] (local) admin > cd w-cmsstor410-disk_itb-disk1
[cmsstor152.fnal.gov] (w-cmsstor410-disk_itb-disk1) admin > csm check *
Started ...; check 'csm status' for status
[cmsstor152.fnal.gov] (w-cmsstor410-disk_itb-disk1) admin > ..
[cmsstor152.fnal.gov] (local) admin > cd w-cmsstor410-disk_itb-disk2
[cmsstor152.fnal.gov] (w-cmsstor410-disk_itb-disk2) admin > csm check *
Started ...; check 'csm status' for status
[cmsstor152.fnal.gov] (w-cmsstor410-disk_itb-disk2) admin > ..
[cmsstor152.fnal.gov] (local) admin > logoff
dmg.util.CommandExitException: (0) Done

dCache Admin (VII) (user=admin)

[cmsstor152.fnal.gov] (local) admin > cd w-cmsstor409-disk_itb-disk1
[cmsstor152.fnal.gov] (w-cmsstor409-disk_itb-disk1) admin > csm status
FullScan Active 99561 checked; 0 errors detected
SingeScan Idle 0000C742FF12CA914F5A8495A89FFF0C7874 OK 1:ee310cc8
Scrubber Idle 0 of 0 checked; 0 errors detected
[cmsstor152.fnal.gov] (w-cmsstor409-disk_itb-disk1) admin > ..
[cmsstor152.fnal.gov] (local) admin > cd w-cmsstor409-disk_itb-disk2
[cmsstor152.fnal.gov] (w-cmsstor409-disk_itb-disk2) admin > csm status
FullScan Active 2157 checked; 0 errors detected
SingeScan Idle
Scrubber Idle 0 of 0 checked; 0 errors detected
[cmsstor152.fnal.gov] (w-cmsstor409-disk_itb-disk2) admin > ..
[cmsstor152.fnal.gov] (local) admin > cd w-cmsstor410-disk_itb-disk1
[cmsstor152.fnal.gov] (w-cmsstor410-disk_itb-disk1) admin > csm status
FullScan Active 1870 checked; 0 errors detected
SingeScan Idle
Scrubber Idle 0 of 0 checked; 0 errors detected
[cmsstor152.fnal.gov] (w-cmsstor410-disk_itb-disk1) admin > ..
[cmsstor152.fnal.gov] (local) admin > cd w-cmsstor410-disk_itb-disk2
[cmsstor152.fnal.gov] (w-cmsstor410-disk_itb-disk2) admin > csm status
FullScan Active 1449 checked; 0 errors detected
SingeScan Idle
Scrubber Idle 0 of 0 checked; 0 errors detected
[cmsstor152.fnal.gov] (w-cmsstor410-disk_itb-disk2) admin > ..
[cmsstor152.fnal.gov] (local) admin > logoff
dmg.util.CommandExitException: (0) Done

NO ERRORS!

#39 Updated by Natalia Ratnikova over 4 years ago

The dCache disk_itb instance contains many relatively small test files from January, all the same size and owned by root.
This interferes with the burning tests: many of these small files are migrating to the pools and create load on the namespace without writing a large volume of data, thus slowing down the burning tests.

Checked with Chih-Hao: he does not need these data, and it's OK to remove them.

Started a somewhat "gentle" removal of all TEST data in the namespace mounted on cmsstor155.
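
A sketch of such a gentle removal (the path and file pattern are hypothetical; the point is to pace the deletes so the namespace is not hammered):

# run on cmsstor155, where the namespace is mounted at /dcache_itb
for f in /dcache_itb/uscms_test/<old-test-dir>/*; do    # <old-test-dir> is a placeholder
    rm -f "$f"
    sleep 1    # throttle namespace operations
done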

This seems to confuse the csm process and gives exceptions for PnfsManager timeouts, and also for the removed files:

[cmsstor152.fnal.gov] (w-cmsstor409-disk_itb-disk1) admin > csm status
FullScan Idle CacheException(rc=10006;msg=Request to [>PnfsManager@local] timed out.) 99561 checked; 0 errors detected
SingeScan Idle 0000C742FF12CA914F5A8495A89FFF0C7874 OK 1:ee310cc8
Scrubber Idle 0 of 0 checked; 0 errors detected
[cmsstor152.fnal.gov] (w-cmsstor409-disk_itb-disk1) admin > ..
[cmsstor152.fnal.gov] (local) admin > cd w-cmsstor409-disk_itb-disk2
[cmsstor152.fnal.gov] (w-cmsstor409-disk_itb-disk2) admin > csm status
FullScan Idle CacheException(rc=10001;msg=No such file or directory: 0000DFEAA26B888B41A286886C786DD1CA84) 5435 checked; 0 errors detected
SingeScan Idle
Scrubber Idle 0 of 0 checked; 0 errors detected
[cmsstor152.fnal.gov] (w-cmsstor409-disk_itb-disk2) admin > ..
[cmsstor152.fnal.gov] (local) admin > cd w-cmsstor410-disk_itb-disk1
[cmsstor152.fnal.gov] (w-cmsstor410-disk_itb-disk1) admin > csm status
FullScan Idle CacheException(rc=10001;msg=No such file or directory: 000008BACD216CA14F84B19B3519AE64468A) 5728 checked; 0 errors detected
SingeScan Idle
Scrubber Idle 0 of 0 checked; 0 errors detected
[cmsstor152.fnal.gov] (w-cmsstor410-disk_itb-disk1) admin > ..
[cmsstor152.fnal.gov] (local) admin > cd w-cmsstor410-disk_itb-disk2
[cmsstor152.fnal.gov] (w-cmsstor410-disk_itb-disk2) admin > csm status
FullScan Idle CacheException(rc=10001;msg=No such file or directory: 000042CA2E8909234B2ABBEC27CD3E1F8103) 5090 checked; 0 errors detected
SingeScan Idle
Scrubber Idle 0 of 0 checked; 0 errors detected
[cmsstor152.fnal.gov] (w-cmsstor410-disk_itb-disk2) admin > ..
[cmsstor152.fnal.gov] (local) admin > logoff
dmg.util.CommandExitException: (0) Done


#40 Updated by Natalia Ratnikova over 4 years ago

  • Status changed from Accepted to Feedback

Waiting for feedback from involved parties.

#41 Updated by Natalia Ratnikova over 4 years ago

Add some more data into the dCache testbed instance, using third-party transfers.

The previous process did not finish because the proxy expired. The file copying was rather slow, probably because it was copying files from the 3DayLifetime area on the node. Only files in the range 1..191 were copied, while the command asked for 1..500.

Now try to copy the rest using third-party transfers and a long-lived proxy (190 hours), from the screen "populate_dcache_itb" on cmslpc23:

$ bash # my default shell is tcsh
bash-4.1$ for i in {192..500}; do for g in {1..4}; do srmcp "srm://cmsstor153.fnal.gov:8443/srm/managerv2?SFN=/dcache/uscms_test/burning/${g}g.burningtest-source" "srm://cmsstor153.fnal.gov:8443/srm/managerv2?SFN=/dcache/uscms_test/burning/${g}g.burning-$i"; done; done

#42 Updated by Natalia Ratnikova over 4 years ago

  • Status changed from Feedback to Resolved
  • % Done changed from 60 to 100

Feedback from Gerard received, documentation updated.
The procedure has been verified using the cmsstor409/410 nodes, which are now put in production.

Resolving the issue.


