Task #12147
Handling cmsstor437 / cmsdisk437438 issue
Description
The dCache monitor shows w-cmsstor437-disk-disk1 with a total space of 58862258 instead of the normal 73591133 (as on w-cmsstor437-disk-disk2).
Looking at cmsdisk437438, we cannot find anything wrong.
One possibility is that the file system on cmsstor437 was created incorrectly or is corrupted.
Therefore, we want to drain w-cmsstor437-disk-disk1 first and then take a look at the disk and the file system.
This is a record of what has been done.
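To put the two monitor figures in perspective, here is a minimal sketch that computes how much space disk1 is missing relative to disk2. The ticket does not state the units of these values, so the result is only meaningful as a ratio; the variable names are illustrative.

```python
# Total space reported by the dCache monitor (units are not stated in the ticket).
disk1_total = 58862258  # w-cmsstor437-disk-disk1 (suspect)
disk2_total = 73591133  # w-cmsstor437-disk-disk2 (normal)

# Absolute and relative shortfall of disk1 compared with disk2.
missing = disk2_total - disk1_total
fraction_missing = missing / disk2_total

print(f"missing: {missing} ({fraction_missing:.2%} of normal)")
```

The shortfall comes out at roughly one fifth of the normal size.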
History
#1 Updated by Chih-Hao Huang almost 5 years ago
Since this involves the new (fef) puppet and the new dCache admin interface (2.13), it will take some time to figure out all the details.
fef puppet, unlike deso puppet, uses ssh public key authentication.
#2 Updated by Chih-Hao Huang almost 5 years ago
2:50pm phoebehannah:~/FEFGIT> git clone ssh://git@ssi-git.fnal.gov:2222/fef/sss/puppet
Cloning into 'puppet'...
X11 forwarding request failed on channel 0
remote: Counting objects: 150678, done.
remote: Compressing objects: 100% (42195/42195), done.
remote: Total 150678 (delta 104499), reused 149565 (delta 103678)
Receiving objects: 100% (150678/150678), 460.00 MiB | 758.00 KiB/s, done.
Resolving deltas: 100% (104499/104499), done.
Checking connectivity... done.
warning: remote HEAD refers to nonexistent ref, unable to checkout.
DING! phoebehannah:~/FEFGIT> cd puppet
3:01pm phoebehannah:~/FEFGIT/puppet> ls
3:01pm phoebehannah:~/FEFGIT/puppet> git checkout dcache
Checking out files: 100% (10717/10717), done.
Branch dcache set up to track remote branch dcache from origin.
Switched to a new branch 'dcache'
3:01pm phoebehannah:~/FEFGIT/puppet> cd modules/
3:01pm phoebehannah:~/FEFGIT/puppet/modules> cd dcache
3:01pm phoebehannah:~/FEFGIT/puppet/modules/dcache> cd files
3:01pm phoebehannah:~/FEFGIT/puppet/modules/dcache/files> cd etc/dcache
3:01pm phoebehannah:~/FEFGIT/puppet/modules/dcache/files/etc/dcache> ls -l
total 160
-rw-r--r-- 1 huangch staff 145 Apr 1 15:01 exports-disk
-rw-r--r-- 1 huangch staff 145 Apr 1 15:01 exports-disk_itb
-rw-r--r-- 1 huangch staff 357 Apr 1 15:01 exports-tape
-rw-r--r-- 1 huangch staff 392 Apr 1 15:01 gplazma.conf
-rw-r--r-- 1 huangch staff 50441 Apr 1 15:01 poolmanager-disk.conf
-rw-r--r-- 1 huangch staff 2679 Apr 1 15:01 poolmanager-disk_itb.conf
-rw-r--r-- 1 huangch staff 6724 Apr 1 15:01 poolmanager-tape.conf
3:01pm phoebehannah:~/FEFGIT/puppet/modules/dcache/files/etc/dcache>
However, the content of poolmanager-disk.conf is still that of the production dCache disk instance. It is not clear where the new one comes from.
We need to fall back to the old-fashioned way.
#3 Updated by Chih-Hao Huang almost 5 years ago
- % Done changed from 0 to 10
Set w-cmsstor437-disk-disk1 to readonly and start migration
bash-3.2$ dcacheadmin-dca
dCache (2.13.24)
Type "\?" for help.
[cmsdcadisk02] (local) admin > \c w-cmsstor437-disk-disk1
[cmsdcadisk02] (w-cmsstor437-disk-disk1@w-cmsstor437-disk-disk1Domain) admin > pool disable -rdonly
Pool w-cmsstor437-disk-disk1 disabled(store,stage,p2p-client)
[cmsdcadisk02] (w-cmsstor437-disk-disk1@w-cmsstor437-disk-disk1Domain) admin > save
[cmsdcadisk02] (w-cmsstor437-disk-disk1@w-cmsstor437-disk-disk1Domain) admin > migration move -target=pgroup flushPools
[1] INITIALIZING migration move -target=pgroup -- flushPools
[cmsdcadisk02] (w-cmsstor437-disk-disk1@w-cmsstor437-disk-disk1Domain) admin > migration info 1
Command : migration move -target=pgroup -- flushPools
State : RUNNING
Queued : 25304
Attempts : 3
Targets : w-cmsstor387-disk-disk2,w-cmsstor387-disk-disk1,w-cmsstor470-disk-disk2,w-cmsstor470-disk-disk1,w-cmsstor430-disk-disk2,w-cmsstor463-disk-disk1,w-cmsstor440-disk-disk2,w-cmsstor440-disk-disk1,w-cmsstor450-disk-disk2,w-cmsstor453-disk-disk2,w-cmsstor443-disk-disk1,w-cmsstor453-disk-disk1,w-cmsstor443-disk-disk2,w-cmsstor450-disk-disk1,w-cmsstor463-disk-disk2,w-cmsstor430-disk-disk1,w-cmsstor436-disk-disk2,w-cmsstor436-disk-disk1,w-cmsstor446-disk-disk2,w-cmsstor446-disk-disk1,w-cmsstor456-disk-disk1,w-cmsstor456-disk-disk2,w-cmsstor466-disk-disk2,w-cmsstor466-disk-disk1,w-cmsstor384-disk-disk1,w-cmsstor384-disk-disk2,w-cmsstor426-disk-disk2,w-cmsstor426-disk-disk1,w-cmsstor390-disk-disk2,w-cmsstor390-disk-disk1,w-cmsstor380-disk-disk2,w-cmsstor380-disk-disk1,w-cmsstor393-disk-disk1,w-cmsstor393-disk-disk2,w-cmsstor427-disk-disk1,w-cmsstor427-disk-disk2,w-cmsstor383-disk-disk2,w-cmsstor383-disk-disk1,w-cmsstor437-disk-disk2,w-cmsstor447-disk-disk2,w-cmsstor447-disk-disk1,w-cmsstor457-disk-disk1,w-cmsstor457-disk-disk2,w-cmsstor467-disk-disk2,w-cmsstor467-disk-disk1,w-cmsstor382-disk-disk2,w-cmsstor469-disk-disk1,w-cmsstor469-disk-disk2,w-cmsstor449-disk-disk1,w-cmsstor382-disk-disk1,w-cmsstor439-disk-disk2,w-cmsstor429-disk-disk1,w-cmsstor439-disk-disk1,w-cmsstor429-disk-disk2,w-cmsstor391-disk-disk2,w-cmsstor391-disk-disk1,w-cmsstor381-disk-disk2,w-cmsstor381-disk-disk1,w-cmsstor449-disk-disk2,w-cmsstor438-disk-disk2,w-cmsstor448-disk-disk2,w-cmsstor448-disk-disk1,w-cmsstor425-disk-disk2,w-cmsstor468-disk-disk2,w-cmsstor468-disk-disk1,w-cmsstor425-disk-disk1,w-cmsstor438-disk-disk1,w-cmsstor392-disk-disk1,w-cmsstor428-disk-disk2,w-cmsstor392-disk-disk2,w-cmsstor428-disk-disk1,w-cmsstor385-disk-disk1,w-cmsstor385-disk-disk2,w-cmsstor445-disk-disk2,w-cmsstor435-disk-disk2,w-cmsstor455-disk-disk2,w-cmsstor435-disk-disk1,w-cmsstor455-disk-disk1,w-cmsstor445-disk-disk1,w-cmsstor465-disk-disk2,w-cmsstor465-disk-disk1,w-cmsstor454-disk-disk1,w-cmsstor421-disk-disk2,w-cmsstor421-disk-disk1,w-cmsstor454-disk-disk2,w-cmsstor441-disk-disk2,w-cmsstor444-disk-disk1,w-cmsstor444-disk-disk2,w-cmsstor441-disk-disk1,w-cmsstor431-disk-disk1,w-cmsstor431-disk-disk2,w-cmsstor464-disk-disk1,w-cmsstor464-disk-disk2,w-cmsstor386-disk-disk1,w-cmsstor434-disk-disk2,w-cmsstor386-disk-disk2,w-cmsstor434-disk-disk1,w-cmsstor424-disk-disk1,w-cmsstor424-disk-disk2,w-cmsstor452-disk-disk1,w-cmsstor433-disk-disk2,w-cmsstor433-disk-disk1,w-cmsstor423-disk-disk1,w-cmsstor452-disk-disk2,w-cmsstor389-disk-disk2,w-cmsstor389-disk-disk1,w-cmsstor423-disk-disk2,w-cmsstor388-disk-disk2,w-cmsstor388-disk-disk1,w-cmsstor442-disk-disk2,w-cmsstor451-disk-disk1,w-cmsstor432-disk-disk1,w-cmsstor442-disk-disk1,w-cmsstor451-disk-disk2,w-cmsstor432-disk-disk2,w-cmsstor422-disk-disk2,w-cmsstor422-disk-disk1
Completed : 2 files; 4415855411 bytes; 0%
Total : 61717256133915 bytes
Concurrency: 1
Running tasks:
[2] 0000D52BBBB445B547AAB4436B4383988AAC: TASK.Copying -> [w-cmsstor391-disk-disk2@local]
[cmsdcadisk02] (w-cmsstor437-disk-disk1@w-cmsstor437-disk-disk1Domain) admin > migration concurrency 1 5
[1] Concurrency set to 5
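The "0%" in the migration info output is presumably rounded, so it says little about how far the drain has actually got. As a rough sketch, the Completed and Total lines can be parsed to get a finer-grained figure. The field layout is assumed from the session recorded above and may differ in other dCache versions; `parse_progress` is a hypothetical helper, not part of dCache.

```python
import re

# Excerpt of the `migration info 1` output recorded above (target list omitted).
INFO = """\
State : RUNNING
Queued : 25304
Completed : 2 files; 4415855411 bytes; 0%
Total : 61717256133915 bytes
"""

def parse_progress(text):
    """Return (completed_bytes, total_bytes) parsed from `migration info` text."""
    completed = re.search(r"^Completed\s*:\s*\d+ files; (\d+) bytes", text, re.M)
    total = re.search(r"^Total\s*:\s*(\d+) bytes", text, re.M)
    if completed is None or total is None:
        raise ValueError("Completed/Total lines not found")
    return int(completed.group(1)), int(total.group(1))

done, total = parse_progress(INFO)
percent = 100.0 * done / total
print(f"{done} of {total} bytes moved ({percent:.4f}%)")
```

Running this on successive snapshots of the output would give a crude way to estimate the remaining drain time after raising the concurrency.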
#4 Updated by Chih-Hao Huang almost 5 years ago
- Status changed from Assigned to Resolved
- % Done changed from 10 to 100
The pool w-cmsstor437-disk-disk1 was drained.
Nexsan disk was reconfigured.
The pool was put back into service on Tuesday, April 5, and it has been running fine since.