Task #12147

Handling cmsstor437 / cmsdisk437438 issue

Added by Chih-Hao Huang over 3 years ago. Updated over 3 years ago.

Status:
Resolved
Priority:
Normal
Start date:
04/01/2016
Due date:
04/08/2016
% Done:

100%

Estimated time:
8.00 h
Spent time:
Duration: 8

Description

The dCache monitor shows w-cmsstor437-disk-disk1 with a total space of 58862258 instead of the normal 73591133 (as on w-cmsstor437-disk-disk2).
Looking at cmsdisk437438, we cannot find anything wrong.
The thought is that the file system on cmsstor437 may have been created incorrectly or is corrupted.
Therefore, we want to drain w-cmsstor437-disk-disk1 first and then take a look at the disk and file system.
This is a record of what has been done.
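
To put the discrepancy in proportion, here is a minimal sketch in Python that compares the two totals quoted above. The ticket does not state the units the monitor reports, so the numbers are treated as opaque counts; the variable names are illustrative only.

```python
# Totals as shown by the dCache monitor (units not stated in this ticket).
normal_total = 73591133    # w-cmsstor437-disk-disk2
reported_total = 58862258  # w-cmsstor437-disk-disk1

shortfall = normal_total - reported_total
ratio = reported_total / normal_total

print(f"missing: {shortfall} ({100 * (1 - ratio):.1f}% of normal)")
# prints "missing: 14728875 (20.0% of normal)"
```

In other words, disk1 reports roughly four fifths of the space that disk2 does.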

History

#1 Updated by Chih-Hao Huang over 3 years ago

Since this involves a new puppet (fef) and a new dCache admin interface (2.13), it will take some time to figure out all the details.

fef puppet, unlike deso puppet, uses ssh public key authentication.

#2 Updated by Chih-Hao Huang over 3 years ago

2:50pm phoebehannah:~/FEFGIT> git clone ssh://:2222/fef/sss/puppet
Cloning into 'puppet'...
X11 forwarding request failed on channel 0
remote: Counting objects: 150678, done.
remote: Compressing objects: 100% (42195/42195), done.
remote: Total 150678 (delta 104499), reused 149565 (delta 103678)
Receiving objects: 100% (150678/150678), 460.00 MiB | 758.00 KiB/s, done.
Resolving deltas: 100% (104499/104499), done.
Checking connectivity... done.
warning: remote HEAD refers to nonexistent ref, unable to checkout.

DING! phoebehannah:~/FEFGIT> cd puppet
3:01pm phoebehannah:~/FEFGIT/puppet> ls
3:01pm phoebehannah:~/FEFGIT/puppet> git checkout dcache
Checking out files: 100% (10717/10717), done.
Branch dcache set up to track remote branch dcache from origin.
Switched to a new branch 'dcache'
3:01pm phoebehannah:~/FEFGIT/puppet> cd modules/
3:01pm phoebehannah:~/FEFGIT/puppet/modules> cd dcache
3:01pm phoebehannah:~/FEFGIT/puppet/modules/dcache> cd files
3:01pm phoebehannah:~/FEFGIT/puppet/modules/dcache/files> cd etc/dcache
3:01pm phoebehannah:~/FEFGIT/puppet/modules/dcache/files/etc/dcache> ls -l
total 160
-rw-r--r-- 1 huangch staff 145 Apr 1 15:01 exports-disk
-rw-r--r-- 1 huangch staff 145 Apr 1 15:01 exports-disk_itb
-rw-r--r-- 1 huangch staff 357 Apr 1 15:01 exports-tape
-rw-r--r-- 1 huangch staff 392 Apr 1 15:01 gplazma.conf
-rw-r--r-- 1 huangch staff 50441 Apr 1 15:01 poolmanager-disk.conf
-rw-r--r-- 1 huangch staff 2679 Apr 1 15:01 poolmanager-disk_itb.conf
-rw-r--r-- 1 huangch staff 6724 Apr 1 15:01 poolmanager-tape.conf
3:01pm phoebehannah:~/FEFGIT/puppet/modules/dcache/files/etc/dcache>

However, the content of poolmanager-disk.conf is still that of the production dCache disk instance. I don't know where the new one comes from.
Need to fall back to the old-fashioned way.

#3 Updated by Chih-Hao Huang over 3 years ago

  • % Done changed from 0 to 10

Set w-cmsstor437-disk-disk1 to read-only and started the migration:

bash-3.2$ dcacheadmin-dca
dCache (2.13.24)
Type "\?" for help.

[cmsdcadisk02] (local) admin > \c w-cmsstor437-disk-disk1
[cmsdcadisk02] (w-cmsstor437-disk-disk1@w-cmsstor437-disk-disk1Domain) admin > pool disable -rdonly
Pool w-cmsstor437-disk-disk1 disabled(store,stage,p2p-client)
[cmsdcadisk02] (w-cmsstor437-disk-disk1@w-cmsstor437-disk-disk1Domain) admin > save
[cmsdcadisk02] (w-cmsstor437-disk-disk1@w-cmsstor437-disk-disk1Domain) admin > migration move -target=pgroup flushPools
[1] INITIALIZING migration move -target=pgroup -- flushPools
[cmsdcadisk02] (w-cmsstor437-disk-disk1@w-cmsstor437-disk-disk1Domain) admin > migration info 1
Command    : migration move -target=pgroup -- flushPools
State      : RUNNING
Queued     : 25304
Attempts   : 3
Targets    : w-cmsstor387-disk-disk2,w-cmsstor387-disk-disk1,w-cmsstor470-disk-disk2,w-cmsstor470-disk-disk1,w-cmsstor430-disk-disk2,w-cmsstor463-disk-disk1,w-cmsstor440-disk-disk2,w-cmsstor440-disk-disk1,w-cmsstor450-disk-disk2,w-cmsstor453-disk-disk2,w-cmsstor443-disk-disk1,w-cmsstor453-disk-disk1,w-cmsstor443-disk-disk2,w-cmsstor450-disk-disk1,w-cmsstor463-disk-disk2,w-cmsstor430-disk-disk1,w-cmsstor436-disk-disk2,w-cmsstor436-disk-disk1,w-cmsstor446-disk-disk2,w-cmsstor446-disk-disk1,w-cmsstor456-disk-disk1,w-cmsstor456-disk-disk2,w-cmsstor466-disk-disk2,w-cmsstor466-disk-disk1,w-cmsstor384-disk-disk1,w-cmsstor384-disk-disk2,w-cmsstor426-disk-disk2,w-cmsstor426-disk-disk1,w-cmsstor390-disk-disk2,w-cmsstor390-disk-disk1,w-cmsstor380-disk-disk2,w-cmsstor380-disk-disk1,w-cmsstor393-disk-disk1,w-cmsstor393-disk-disk2,w-cmsstor427-disk-disk1,w-cmsstor427-disk-disk2,w-cmsstor383-disk-disk2,w-cmsstor383-disk-disk1,w-cmsstor437-disk-disk2,w-cmsstor447-disk-disk2,w-cmsstor447-disk-disk1,w-cmsstor457-disk-disk1,w-cmsstor457-disk-disk2,w-cmsstor467-disk-disk2,w-cmsstor467-disk-disk1,w-cmsstor382-disk-disk2,w-cmsstor469-disk-disk1,w-cmsstor469-disk-disk2,w-cmsstor449-disk-disk1,w-cmsstor382-disk-disk1,w-cmsstor439-disk-disk2,w-cmsstor429-disk-disk1,w-cmsstor439-disk-disk1,w-cmsstor429-disk-disk2,w-cmsstor391-disk-disk2,w-cmsstor391-disk-disk1,w-cmsstor381-disk-disk2,w-cmsstor381-disk-disk1,w-cmsstor449-disk-disk2,w-cmsstor438-disk-disk2,w-cmsstor448-disk-disk2,w-cmsstor448-disk-disk1,w-cmsstor425-disk-disk2,w-cmsstor468-disk-disk2,w-cmsstor468-disk-disk1,w-cmsstor425-disk-disk1,w-cmsstor438-disk-disk1,w-cmsstor392-disk-disk1,w-cmsstor428-disk-disk2,w-cmsstor392-disk-disk2,w-cmsstor428-disk-disk1,w-cmsstor385-disk-disk1,w-cmsstor385-disk-disk2,w-cmsstor445-disk-disk2,w-cmsstor435-disk-disk2,w-cmsstor455-disk-disk2,w-cmsstor435-disk-disk1,w-cmsstor455-disk-disk1,w-cmsstor445-disk-disk1,w-cmsstor465-disk-disk2,w-cmsstor465-disk-disk1,w-cmsstor454-disk-disk1,w-cmsstor421-disk-disk2,w-cmsstor421-disk-disk1,w-cmsstor454-disk-disk2,w-cmsstor441-disk-disk2,w-cmsstor444-disk-disk1,w-cmsstor444-disk-disk2,w-cmsstor441-disk-disk1,w-cmsstor431-disk-disk1,w-cmsstor431-disk-disk2,w-cmsstor464-disk-disk1,w-cmsstor464-disk-disk2,w-cmsstor386-disk-disk1,w-cmsstor434-disk-disk2,w-cmsstor386-disk-disk2,w-cmsstor434-disk-disk1,w-cmsstor424-disk-disk1,w-cmsstor424-disk-disk2,w-cmsstor452-disk-disk1,w-cmsstor433-disk-disk2,w-cmsstor433-disk-disk1,w-cmsstor423-disk-disk1,w-cmsstor452-disk-disk2,w-cmsstor389-disk-disk2,w-cmsstor389-disk-disk1,w-cmsstor423-disk-disk2,w-cmsstor388-disk-disk2,w-cmsstor388-disk-disk1,w-cmsstor442-disk-disk2,w-cmsstor451-disk-disk1,w-cmsstor432-disk-disk1,w-cmsstor442-disk-disk1,w-cmsstor451-disk-disk2,w-cmsstor432-disk-disk2,w-cmsstor422-disk-disk2,w-cmsstor422-disk-disk1
Completed  : 2 files; 4415855411 bytes; 0%
Total      : 61717256133915 bytes
Concurrency: 1
Running tasks:
[2] 0000D52BBBB445B547AAB4436B4383988AAC: TASK.Copying -> [w-cmsstor391-disk-disk2@local]

[cmsdcadisk02] (w-cmsstor437-disk-disk1@w-cmsstor437-disk-disk1Domain) admin > migration concurrency 1 5
[1] Concurrency set to 5
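
The `0%` in the `migration info` output above is too coarse to track a drain of this size. As a rough sketch, the counters can be pulled out of a saved copy of that output with a few regular expressions. This assumes the output format is exactly as pasted in this ticket; `parse_migration_info` and `SAMPLE` are hypothetical names used for illustration, not part of dCache.

```python
import re


def parse_migration_info(text):
    """Extract progress counters from `migration info` output.

    Assumes the line layout shown in this ticket (dCache 2.13.24).
    """
    queued = re.search(r"^Queued\s*:\s*(\d+)", text, re.MULTILINE)
    completed = re.search(
        r"^Completed\s*:\s*(\d+) files; (\d+) bytes", text, re.MULTILINE
    )
    total = re.search(r"^Total\s*:\s*(\d+) bytes", text, re.MULTILINE)
    if not (queued and completed and total):
        raise ValueError("unrecognized migration info output")

    done_bytes = int(completed.group(2))
    total_bytes = int(total.group(1))
    return {
        "queued_files": int(queued.group(1)),
        "completed_files": int(completed.group(1)),
        "completed_bytes": done_bytes,
        "total_bytes": total_bytes,
        "percent_done": 100.0 * done_bytes / total_bytes,
    }


# Excerpt of the output pasted above (the long Targets line is omitted).
SAMPLE = """\
Command    : migration move -target=pgroup -- flushPools
State      : RUNNING
Queued     : 25304
Attempts   : 3
Completed  : 2 files; 4415855411 bytes; 0%
Total      : 61717256133915 bytes
Concurrency: 1
"""

info = parse_migration_info(SAMPLE)
print(f"{info['percent_done']:.4f}% of {info['total_bytes']} bytes moved, "
      f"{info['queued_files']} files still queued")
```

Working from the raw byte counts gives a finer-grained progress figure than the rounded percentage the admin interface prints.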

#4 Updated by Chih-Hao Huang over 3 years ago

  • Status changed from Assigned to Resolved
  • % Done changed from 10 to 100

The pool w-cmsstor437-disk-disk1 was drained.
The Nexsan disk was reconfigured.
The pool was put back into service on Tuesday, April 5, and has been running fine since.


