Bug #10621

test seagate1

Added by Gerard Bernabeu Altayo almost 4 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
Start date: 10/23/2015
Due date:
% Done: 0%
Estimated time:
Duration:

Description

This is a node with many, many disks (84) that I need to test with EOS for performance. The procedure for adding it is close to the original, but I'll document the specifics here.

Original procedure: https://cmsweb.fnal.gov/bin/view/Storage/EOSOperationalProcedures#Install_a_new_EOS_FST_node

History

#1 Updated by Gerard Bernabeu Altayo almost 4 years ago

This node's deployment was not standard; at some point I should try to reinstall it.

The system disk is:

[1:0:0:0] disk ATA SanDisk SD6SA1M1 X231 /dev/sdaj

The full lsscsi output for all disks is:

[root@seagate1 ~]# lsscsi
[0:0:0:0] enclosu XYRATEX UD-8435-CS-6000 4026 -
[0:0:1:0] process XYRATEX DEFAULT-SD-L36H 4026 -
[0:0:2:0] process XYRATEX DEFAULT-SD-L24H 4026 -
[0:0:3:0] process XYRATEX DEFAULT-SD-L24H 4026 -
[0:0:4:0] process XYRATEX DEFAULT-SD-L36H 4026 -
[0:0:5:0] disk ATA ST6000NM0024-1HT SND2 /dev/sda
[0:0:6:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdb
[0:0:7:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdc
[0:0:8:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdd
[0:0:9:0] disk ATA ST6000NM0024-1HT SND2 /dev/sde
[0:0:10:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdf
[0:0:11:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdg
[0:0:12:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdh
[0:0:13:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdi
[0:0:14:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdj
[0:0:15:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdk
[0:0:16:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdl
[0:0:17:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdm
[0:0:18:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdn
[0:0:19:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdo
[0:0:20:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdp
[0:0:21:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdq
[0:0:22:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdr
[0:0:23:0] disk ATA ST6000NM0024-1HT SND2 /dev/sds
[0:0:24:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdt
[0:0:25:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdu
[0:0:26:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdv
[0:0:27:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdw
[0:0:28:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdx
[0:0:29:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdy
[0:0:30:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdz
[0:0:31:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdaa
[0:0:32:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdab
[0:0:33:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdac
[0:0:34:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdad
[0:0:35:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdae
[0:0:36:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdaf
[0:0:37:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdag
[0:0:38:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdah
[0:0:39:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdai
[0:0:40:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdak
[0:0:41:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdal
[0:0:42:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdam
[0:0:43:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdan
[0:0:44:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdao
[0:0:45:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdap
[0:0:46:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdaq
[0:0:47:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdar
[0:0:48:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdas
[0:0:49:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdat
[0:0:50:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdau
[0:0:51:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdav
[0:0:52:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdaw
[0:0:53:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdax
[0:0:54:0] disk ATA ST6000NM0024-1HT SND2 /dev/sday
[0:0:55:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdaz
[0:0:56:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdba
[0:0:57:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdbb
[0:0:58:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdbc
[0:0:59:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdbd
[0:0:60:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdbe
[0:0:61:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdbf
[0:0:62:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdbg
[0:0:63:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdbh
[0:0:64:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdbi
[0:0:65:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdbj
[0:0:66:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdbk
[0:0:67:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdbl
[0:0:68:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdbm
[0:0:69:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdbp
[0:0:70:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdbq
[0:0:71:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdbr
[0:0:72:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdbs
[0:0:73:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdbt
[0:0:74:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdbu
[0:0:75:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdbv
[0:0:76:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdbw
[0:0:77:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdbx
[0:0:78:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdby
[0:0:79:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdbz
[0:0:80:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdca
[0:0:81:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdcb
[0:0:82:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdcc
[0:0:83:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdcd
[0:0:84:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdce
[0:0:85:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdcf
[0:0:86:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdcg
[0:0:87:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdch
[0:0:88:0] disk ATA ST6000NM0024-1HT SND2 /dev/sdci
[1:0:0:0] disk ATA SanDisk SD6SA1M1 X231 /dev/sdaj
[7:0:0:0] cd/dvd AMI Virtual CDROM0 1.00 /dev/sr0
[8:0:0:0] disk AMI Virtual Floppy0 1.00 /dev/sdbn
[9:0:0:0] disk AMI Virtual HDISK0 1.00 /dev/sdbo

I only care about the ST6000NM0024 disks for the EOS FSTs; I will create one FST filesystem per disk. First I need to format them with XFS (the 3 that the puppet EOS profile expects were already done):

[root@seagate1 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sdaj2 103G 5.1G 93G 6% /
tmpfs 32G 160K 32G 1% /dev/shm
tmpfs 32G 38M 32G 1% /tmp
/dev/sda 5.5T 34M 5.5T 1% /storage/data1
/dev/sdb 5.5T 34M 5.5T 1% /storage/data2
/dev/sdc 5.5T 34M 5.5T 1% /storage/data3
eosmain 33T 2.5G 33T 1% /eos
[root@seagate1 ~]#

#2 Updated by Gerard Bernabeu Altayo almost 4 years ago

I have unmounted the three existing data filesystems (/dev/sd{a,b,c} on /storage/data1-3) and am doing the following for the mounts:

[root@seagate1 ~]# num=1; for i in `lsscsi  | grep ST600 | awk '{print $6}'`; do echo "formatting $i with label eos${num}"; mkfs.xfs -q -f $i && xfs_admin -L eos${num} $i; let num++; done

Now I am manually adding the fstab entries, generating the list with:

[root@seagate1 ~]# num=1; for i in `lsscsi  | grep ST600 | awk '{print $6}'`; do echo "LABEL=eos$num      /storage/data$num  xfs     nobarrier,inode64,defaults      1       1"; let num++; done

And creating the directories too:

[root@seagate1 ~]# num=1; for i in `lsscsi  | grep ST600 | awk '{print $6}'`; do mkdir /storage/data$num; let num++; done
mkdir: cannot create directory `/storage/data1': File exists
mkdir: cannot create directory `/storage/data2': File exists
mkdir: cannot create directory `/storage/data3': File exists
[root@seagate1 ~]# 

After a mount -a:

[root@seagate1 ~]# mount | grep -c /storage/data
84

So now I can proceed with step #3 of the procedure:

[root@seagate1 ~]# mgm=cmssrv153.fnal.gov  #For production: cmssrv222.fnal.gov
[root@seagate1 ~]# eosfstregister `grep '^ *all.manager' /etc/xrd.cf.fst | awk '{print $2}'` /storage/data spare:`mount | grep -c /storage/data`
###########################
# <eosfstregister> v1.0.0
###########################
/storage/data1 : uuid=e48a1350-a804-4e94-b8e5-f6f9978210ac fsid=undef
success:   mapped 'e48a1350-a804-4e94-b8e5-f6f9978210ac' <=> fsid=4
/storage/data10 : uuid=9edcad0b-603e-4162-8f17-62fa047278f4 fsid=undef
success:   mapped '9edcad0b-603e-4162-8f17-62fa047278f4' <=> fsid=5
/storage/data11 : uuid=574d3ef1-51b3-4e95-ace4-737002a36d7e fsid=undef
success:   mapped '574d3ef1-51b3-4e95-ace4-737002a36d7e' <=> fsid=6
/storage/data12 : uuid=820bc1c8-5378-48d4-aa0f-b504cacacb76 fsid=undef
success:   mapped '820bc1c8-5378-48d4-aa0f-b504cacacb76' <=> fsid=7
/storage/data13 : uuid=cd71a332-2e5b-48c9-844e-bd14aee681d4 fsid=undef
success:   mapped 'cd71a332-2e5b-48c9-844e-bd14aee681d4' <=> fsid=8
/storage/data14 : uuid=5d11c326-3599-40bd-8ad8-9008739c903b fsid=undef
success:   mapped '5d11c326-3599-40bd-8ad8-9008739c903b' <=> fsid=9
/storage/data15 : uuid=7934c630-bcb8-4a89-9365-a51228d0a895 fsid=undef
success:   mapped '7934c630-bcb8-4a89-9365-a51228d0a895' <=> fsid=10
/storage/data16 : uuid=b677bdf4-2857-40c4-9470-886abb54f880 fsid=undef
success:   mapped 'b677bdf4-2857-40c4-9470-886abb54f880' <=> fsid=11
/storage/data17 : uuid=2ba0952f-e8c8-4b19-a041-aee1741036e4 fsid=undef
success:   mapped '2ba0952f-e8c8-4b19-a041-aee1741036e4' <=> fsid=12
/storage/data18 : uuid=361fa98c-ae8e-4fe7-8230-c3e00360e190 fsid=undef
success:   mapped '361fa98c-ae8e-4fe7-8230-c3e00360e190' <=> fsid=13
/storage/data19 : uuid=ac1e1472-5ed0-4ca5-999c-77facb49e5a2 fsid=undef
success:   mapped 'ac1e1472-5ed0-4ca5-999c-77facb49e5a2' <=> fsid=14
/storage/data2 : uuid=509e5514-2ee4-4699-9d49-739740cad1cf fsid=undef
success:   mapped '509e5514-2ee4-4699-9d49-739740cad1cf' <=> fsid=15
/storage/data20 : uuid=cfe62855-67fc-4d7d-b38e-2a82ff724542 fsid=undef
success:   mapped 'cfe62855-67fc-4d7d-b38e-2a82ff724542' <=> fsid=16
/storage/data21 : uuid=2d5f414e-c392-4d7e-8006-ce8d952b71fe fsid=undef
success:   mapped '2d5f414e-c392-4d7e-8006-ce8d952b71fe' <=> fsid=17
/storage/data22 : uuid=51a717e8-e4c6-4fa0-a9ca-080184b16011 fsid=undef
success:   mapped '51a717e8-e4c6-4fa0-a9ca-080184b16011' <=> fsid=18
/storage/data23 : uuid=6fdcd807-d463-4e84-ba46-9436faf67bf3 fsid=undef
success:   mapped '6fdcd807-d463-4e84-ba46-9436faf67bf3' <=> fsid=19
/storage/data24 : uuid=ab133d97-a563-4a3b-ba99-3172e502fcbc fsid=undef
success:   mapped 'ab133d97-a563-4a3b-ba99-3172e502fcbc' <=> fsid=20
/storage/data25 : uuid=220af07f-ccc6-47fa-8115-92e53e57daba fsid=undef
success:   mapped '220af07f-ccc6-47fa-8115-92e53e57daba' <=> fsid=21
/storage/data26 : uuid=3359449f-e517-4426-a40d-a091c263d5c4 fsid=undef
success:   mapped '3359449f-e517-4426-a40d-a091c263d5c4' <=> fsid=22
/storage/data27 : uuid=e1b0dea2-b120-4fbc-920f-464771f8e004 fsid=undef
success:   mapped 'e1b0dea2-b120-4fbc-920f-464771f8e004' <=> fsid=23
/storage/data28 : uuid=d72b0d0c-1f53-459a-b6b1-93046974f7c4 fsid=undef
success:   mapped 'd72b0d0c-1f53-459a-b6b1-93046974f7c4' <=> fsid=24
/storage/data29 : uuid=f91f3953-988d-4c7a-91f7-fe9b67b30073 fsid=undef
success:   mapped 'f91f3953-988d-4c7a-91f7-fe9b67b30073' <=> fsid=25
/storage/data3 : uuid=5f0bf57c-9394-4f7e-9f7c-fa5261767b69 fsid=undef
success:   mapped '5f0bf57c-9394-4f7e-9f7c-fa5261767b69' <=> fsid=26
/storage/data30 : uuid=e645b24a-b645-4484-8b69-c60667f705fb fsid=undef
success:   mapped 'e645b24a-b645-4484-8b69-c60667f705fb' <=> fsid=27
/storage/data31 : uuid=5f7fb988-3671-4dfd-ac6a-af6c56d1e4d6 fsid=undef
success:   mapped '5f7fb988-3671-4dfd-ac6a-af6c56d1e4d6' <=> fsid=28
/storage/data32 : uuid=4fb889cd-56f1-462f-a3b0-360adf9d0ae8 fsid=undef
success:   mapped '4fb889cd-56f1-462f-a3b0-360adf9d0ae8' <=> fsid=29
/storage/data33 : uuid=f023d7f5-8c3c-4558-a27b-857985fd7727 fsid=undef
success:   mapped 'f023d7f5-8c3c-4558-a27b-857985fd7727' <=> fsid=30
/storage/data34 : uuid=2bd5aaa9-5117-459b-9bdf-760589e44efe fsid=undef
success:   mapped '2bd5aaa9-5117-459b-9bdf-760589e44efe' <=> fsid=31
/storage/data35 : uuid=8401bba1-9668-4bbb-b7af-2833704bb08d fsid=undef
success:   mapped '8401bba1-9668-4bbb-b7af-2833704bb08d' <=> fsid=32
/storage/data36 : uuid=82e9d328-10c4-401b-a73c-a3beebd397ad fsid=undef
success:   mapped '82e9d328-10c4-401b-a73c-a3beebd397ad' <=> fsid=33
/storage/data37 : uuid=54cbbaa1-3e0d-48ef-a679-8f222d8fc968 fsid=undef
success:   mapped '54cbbaa1-3e0d-48ef-a679-8f222d8fc968' <=> fsid=34
/storage/data38 : uuid=c2958f02-54ad-4051-a6a1-a749e877d3f4 fsid=undef
success:   mapped 'c2958f02-54ad-4051-a6a1-a749e877d3f4' <=> fsid=35
/storage/data39 : uuid=4a53d9da-179c-4c3c-ad48-b085ddade167 fsid=undef
success:   mapped '4a53d9da-179c-4c3c-ad48-b085ddade167' <=> fsid=36
/storage/data4 : uuid=9c702699-4312-451a-8a71-838fae275139 fsid=undef
success:   mapped '9c702699-4312-451a-8a71-838fae275139' <=> fsid=37
/storage/data40 : uuid=21d9416c-b299-4459-944e-4f7c785f5e97 fsid=undef
success:   mapped '21d9416c-b299-4459-944e-4f7c785f5e97' <=> fsid=38
/storage/data41 : uuid=2121a2cd-6d38-4697-a846-9aaf73b89811 fsid=undef
success:   mapped '2121a2cd-6d38-4697-a846-9aaf73b89811' <=> fsid=39
/storage/data42 : uuid=073d5723-d9e7-4801-ad43-44adaf45cc5c fsid=undef
success:   mapped '073d5723-d9e7-4801-ad43-44adaf45cc5c' <=> fsid=40
/storage/data43 : uuid=6390e88b-954c-4137-9ea0-a6b9a4285a80 fsid=undef
success:   mapped '6390e88b-954c-4137-9ea0-a6b9a4285a80' <=> fsid=41
/storage/data44 : uuid=16af1cd2-a23a-4da3-829b-63e756408d4e fsid=undef
success:   mapped '16af1cd2-a23a-4da3-829b-63e756408d4e' <=> fsid=42
/storage/data45 : uuid=34cd68cb-a8f8-4580-b129-cdd660e725f0 fsid=undef
success:   mapped '34cd68cb-a8f8-4580-b129-cdd660e725f0' <=> fsid=43
/storage/data46 : uuid=dd133c20-55a8-47cd-a186-d7b1f1ec6ddd fsid=undef
success:   mapped 'dd133c20-55a8-47cd-a186-d7b1f1ec6ddd' <=> fsid=44
/storage/data47 : uuid=e4da3fba-af8c-4132-9a42-819cf6d7ad45 fsid=undef
success:   mapped 'e4da3fba-af8c-4132-9a42-819cf6d7ad45' <=> fsid=45
/storage/data48 : uuid=bbdcee84-23b7-40ee-85b3-5bdd816e46a9 fsid=undef
success:   mapped 'bbdcee84-23b7-40ee-85b3-5bdd816e46a9' <=> fsid=46
/storage/data49 : uuid=3ad0e3ec-6168-491d-973f-ea73a2d51294 fsid=undef
success:   mapped '3ad0e3ec-6168-491d-973f-ea73a2d51294' <=> fsid=47
/storage/data5 : uuid=e23b1679-df2e-45f4-96f3-9b023fef6cd6 fsid=undef
success:   mapped 'e23b1679-df2e-45f4-96f3-9b023fef6cd6' <=> fsid=48
/storage/data50 : uuid=1dcbe415-88a3-49ed-8e48-0b24da184cf1 fsid=undef
success:   mapped '1dcbe415-88a3-49ed-8e48-0b24da184cf1' <=> fsid=49
/storage/data51 : uuid=8b796b82-0ea3-45fc-b7a5-08d3ee02643d fsid=undef
success:   mapped '8b796b82-0ea3-45fc-b7a5-08d3ee02643d' <=> fsid=50
/storage/data52 : uuid=6995a582-6052-4d19-810d-2f27e8532145 fsid=undef
success:   mapped '6995a582-6052-4d19-810d-2f27e8532145' <=> fsid=51
/storage/data53 : uuid=31ced723-b339-4be6-8ffb-8f84cf0c5c6f fsid=undef
success:   mapped '31ced723-b339-4be6-8ffb-8f84cf0c5c6f' <=> fsid=52
/storage/data54 : uuid=00fed805-c0db-494a-99c4-c8723624ebde fsid=undef
success:   mapped '00fed805-c0db-494a-99c4-c8723624ebde' <=> fsid=53
/storage/data55 : uuid=4834b8ef-2248-4aa0-bc31-3ec1f2f775e8 fsid=undef
success:   mapped '4834b8ef-2248-4aa0-bc31-3ec1f2f775e8' <=> fsid=54
/storage/data56 : uuid=9fee3cee-c0ec-4378-a45a-d542971b69fc fsid=undef
success:   mapped '9fee3cee-c0ec-4378-a45a-d542971b69fc' <=> fsid=55
/storage/data57 : uuid=b781835b-5c27-4855-ac0b-b08fe04ae2a1 fsid=undef
success:   mapped 'b781835b-5c27-4855-ac0b-b08fe04ae2a1' <=> fsid=56
/storage/data58 : uuid=4d068195-35ce-44fc-a669-03824708d578 fsid=undef
success:   mapped '4d068195-35ce-44fc-a669-03824708d578' <=> fsid=57
/storage/data59 : uuid=a59c6272-58ea-4684-940b-e58a04173aeb fsid=undef
success:   mapped 'a59c6272-58ea-4684-940b-e58a04173aeb' <=> fsid=58
/storage/data6 : uuid=9b74e290-ef08-42ba-b768-6d593561a11c fsid=undef
success:   mapped '9b74e290-ef08-42ba-b768-6d593561a11c' <=> fsid=59
/storage/data60 : uuid=9f663523-29a6-49aa-a237-d06bf2de161d fsid=undef
success:   mapped '9f663523-29a6-49aa-a237-d06bf2de161d' <=> fsid=60
/storage/data61 : uuid=98e1ea9f-6071-4c8d-836e-1994701e83d9 fsid=undef
success:   mapped '98e1ea9f-6071-4c8d-836e-1994701e83d9' <=> fsid=61
/storage/data62 : uuid=fd7491e7-b751-4da2-9e12-7e418839f001 fsid=undef
success:   mapped 'fd7491e7-b751-4da2-9e12-7e418839f001' <=> fsid=62
/storage/data63 : uuid=4c1ad4dc-d451-4218-b8a3-574be12ed27d fsid=undef
success:   mapped '4c1ad4dc-d451-4218-b8a3-574be12ed27d' <=> fsid=63
/storage/data64 : uuid=df6165c2-9590-410b-a623-ca59b16f82aa fsid=undef
success:   mapped 'df6165c2-9590-410b-a623-ca59b16f82aa' <=> fsid=64
/storage/data65 : uuid=dd5099b2-17d9-4784-bc0c-f167de3d7bf6 fsid=undef
success:   mapped 'dd5099b2-17d9-4784-bc0c-f167de3d7bf6' <=> fsid=65
/storage/data66 : uuid=c86fbb56-6ef2-45d8-8cb7-28880583612c fsid=undef
success:   mapped 'c86fbb56-6ef2-45d8-8cb7-28880583612c' <=> fsid=66
/storage/data67 : uuid=8a0a56d7-3b24-49ba-b873-0dfccd05c23c fsid=undef
success:   mapped '8a0a56d7-3b24-49ba-b873-0dfccd05c23c' <=> fsid=67
/storage/data68 : uuid=5ef439ff-dca2-4687-94f1-25b293588080 fsid=undef
success:   mapped '5ef439ff-dca2-4687-94f1-25b293588080' <=> fsid=68
/storage/data69 : uuid=fb3a24b6-66fc-45ae-ac52-f9f35bb00660 fsid=undef
success:   mapped 'fb3a24b6-66fc-45ae-ac52-f9f35bb00660' <=> fsid=69
/storage/data7 : uuid=a3087b7f-8ab6-468c-9ed0-3310d5b1eb65 fsid=undef
success:   mapped 'a3087b7f-8ab6-468c-9ed0-3310d5b1eb65' <=> fsid=70
/storage/data70 : uuid=de713767-34c1-49f6-b936-a2eff2037d82 fsid=undef
success:   mapped 'de713767-34c1-49f6-b936-a2eff2037d82' <=> fsid=71
/storage/data71 : uuid=ac7ed73f-4095-4488-b3ac-0ace7b44f990 fsid=undef
success:   mapped 'ac7ed73f-4095-4488-b3ac-0ace7b44f990' <=> fsid=72
/storage/data72 : uuid=4f251404-a9bb-4526-9313-623f3c201d08 fsid=undef
success:   mapped '4f251404-a9bb-4526-9313-623f3c201d08' <=> fsid=73
/storage/data73 : uuid=f38051b8-ba50-451a-8e3e-cafa262ef38e fsid=undef
success:   mapped 'f38051b8-ba50-451a-8e3e-cafa262ef38e' <=> fsid=74
/storage/data74 : uuid=e1730089-aee5-46b3-a9f0-751c9a290511 fsid=undef
success:   mapped 'e1730089-aee5-46b3-a9f0-751c9a290511' <=> fsid=75
/storage/data75 : uuid=207a4808-103d-423a-acc7-8305fcb51d7c fsid=undef
success:   mapped '207a4808-103d-423a-acc7-8305fcb51d7c' <=> fsid=76
/storage/data76 : uuid=25bd6c07-0d29-4ffa-9944-e8e85c8aa9a4 fsid=undef
success:   mapped '25bd6c07-0d29-4ffa-9944-e8e85c8aa9a4' <=> fsid=77
/storage/data77 : uuid=7bde8cb7-c43c-456c-a023-a830e2b69fc8 fsid=undef
success:   mapped '7bde8cb7-c43c-456c-a023-a830e2b69fc8' <=> fsid=78
/storage/data78 : uuid=271deb69-e3b3-460b-8d1c-89a67c506697 fsid=undef
success:   mapped '271deb69-e3b3-460b-8d1c-89a67c506697' <=> fsid=79
/storage/data79 : uuid=e4efb1f3-e08b-4e4e-bc73-f102994d4a22 fsid=undef
success:   mapped 'e4efb1f3-e08b-4e4e-bc73-f102994d4a22' <=> fsid=80
/storage/data8 : uuid=96fb5a35-293d-4da0-9102-bf5791c676c9 fsid=undef
success:   mapped '96fb5a35-293d-4da0-9102-bf5791c676c9' <=> fsid=81
/storage/data80 : uuid=9ee6710e-8be6-4bb5-a10c-0e25104c51d6 fsid=undef
success:   mapped '9ee6710e-8be6-4bb5-a10c-0e25104c51d6' <=> fsid=82
/storage/data81 : uuid=2e680d7d-a0c6-44dd-8913-cf5ada5a10e5 fsid=undef
success:   mapped '2e680d7d-a0c6-44dd-8913-cf5ada5a10e5' <=> fsid=83
/storage/data82 : uuid=68a0460d-7234-4e99-b258-e280b81e9aad fsid=undef
success:   mapped '68a0460d-7234-4e99-b258-e280b81e9aad' <=> fsid=84
/storage/data83 : uuid=509f0e0a-39e8-41ae-af72-c2879dd1e3b6 fsid=undef
success:   mapped '509f0e0a-39e8-41ae-af72-c2879dd1e3b6' <=> fsid=85
/storage/data84 : uuid=2f73b10a-96df-4d8f-8b7d-6f5d332a9542 fsid=undef
success:   mapped '2f73b10a-96df-4d8f-8b7d-6f5d332a9542' <=> fsid=86
/storage/data9 : uuid=36f8c8f4-7a93-4d02-a5c4-1ed77f182187 fsid=undef
success:   mapped '36f8c8f4-7a93-4d02-a5c4-1ed77f182187' <=> fsid=87
[root@seagate1 ~]# ssh $mgm eos node set ${HOSTNAME}:1095 on
[root@seagate1 ~]# ssh $mgm eos vid add gateway ${HOSTNAME}
success: set vid [  eos.rgid=0 eos.ruid=0 mgm.cmd=vid mgm.subcmd=set mgm.vid.auth=tident mgm.vid.cmd=map mgm.vid.gid=0 mgm.vid.key=<key> mgm.vid.pattern="*@seagate1.fnal.gov" mgm.vid.uid=0 ]

#3 Updated by Gerard Bernabeu Altayo almost 4 years ago

The pools failed to start because the directories were not owned by 'daemon'; fixed with:

chown daemon.daemon /storage/data*
service eos restart

Now it works and all the pools show up like this:

seagate1.fnal.gov (1095)     67  /storage/data67            spare                        booted             rw      nodrain

And also:

[root@cmssrv153 ~]# eos node ls
#----------------------------------------------------------------------------------------------------------------------------------------------
#     type #                hostport #   geotag # status # status # txgw # gw-queued # gw-ntx # gw-rate # heartbeatdelta # nofs
#----------------------------------------------------------------------------------------------------------------------------------------------
nodesview  cmsstor150.fnal.gov:1095     online    on    off    0    10    120    2     3
nodesview  seagate1.fnal.gov:1095       online    on    off    0    10    120    3    84
[root@cmssrv153 ~]#

We also see the pools in the spare space:

[root@cmssrv153 ~]# eos space ls
#------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
#      type #    name # groupsize # groupmod # N(fs) # N(fs-rw) # sum(usedbytes) # sum(capacity) # capacity(rw) # nom.capacity # quota # balancing # threshold # converter # ntx # active # intergroup
#------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
spaceview  default   0   0    3   3   2.65 G    36.00 T   36.00 T   0   off   off   20   off   2   0   off
spaceview  spare     0   0   84   0   2.94 G   503.92 T   0         0   off   off   20   off   2   0   off
[root@cmssrv153 ~]#

In order to group them the RIGHT way I would have to create 84 groups in the default space and add one disk to each.

1. Space is formed by 'groups' in which 'fs' are placed.

2. For example, 2 replicas of one file go to 2 different fs in the same group. EOS never looks at which node an 'fs' belongs to; therefore one should never place 2 'fs' (disks) from the same node in the same 'group'. Automatic registration (eosfstregister) makes sure this happens automatically.

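The loop below uses $mgm and $fsids, which are not shown being set in this log. A minimal sketch of how they could have been populated, assuming the contiguous fsid range 4-87 that eosfstregister printed above (parsing `eos fs ls` on the MGM would be more robust):

mgm=cmssrv153.fnal.gov
fsids=`seq 4 87`    # assumption: the fsids assigned by eosfstregister above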
I'm gonna do that:

[root@seagate1 ~]# num=1; for i in $fsids; do cmd="eos fs mv $i default.$num"; echo $cmd; ssh $mgm $cmd; let num++; done
eos fs mv 4 default.1
success: moved filesystem 4 into space default.1
eos fs mv 5 default.2
success: moved filesystem 5 into space default.2
eos fs mv 6 default.3
success: moved filesystem 6 into space default.3
eos fs mv 7 default.4
success: moved filesystem 7 into space default.4
eos fs mv 8 default.5
success: moved filesystem 8 into space default.5
eos fs mv 9 default.6
success: moved filesystem 9 into space default.6
eos fs mv 10 default.7
success: moved filesystem 10 into space default.7
eos fs mv 11 default.8
success: moved filesystem 11 into space default.8
eos fs mv 12 default.9
success: moved filesystem 12 into space default.9
eos fs mv 13 default.10
success: moved filesystem 13 into space default.10
eos fs mv 14 default.11
success: moved filesystem 14 into space default.11
eos fs mv 15 default.12
success: moved filesystem 15 into space default.12
eos fs mv 16 default.13
success: moved filesystem 16 into space default.13
eos fs mv 17 default.14
success: moved filesystem 17 into space default.14
eos fs mv 18 default.15
success: moved filesystem 18 into space default.15
eos fs mv 19 default.16
success: moved filesystem 19 into space default.16
eos fs mv 20 default.17
success: moved filesystem 20 into space default.17
eos fs mv 21 default.18
success: moved filesystem 21 into space default.18
eos fs mv 22 default.19
success: moved filesystem 22 into space default.19
eos fs mv 23 default.20
success: moved filesystem 23 into space default.20
eos fs mv 24 default.21
success: moved filesystem 24 into space default.21
eos fs mv 25 default.22
success: moved filesystem 25 into space default.22
eos fs mv 26 default.23
success: moved filesystem 26 into space default.23
eos fs mv 27 default.24
success: moved filesystem 27 into space default.24
eos fs mv 28 default.25
success: moved filesystem 28 into space default.25
eos fs mv 29 default.26
success: moved filesystem 29 into space default.26
eos fs mv 30 default.27
success: moved filesystem 30 into space default.27
eos fs mv 31 default.28
success: moved filesystem 31 into space default.28
eos fs mv 32 default.29
success: moved filesystem 32 into space default.29
eos fs mv 33 default.30
success: moved filesystem 33 into space default.30
eos fs mv 34 default.31
success: moved filesystem 34 into space default.31
eos fs mv 35 default.32
success: moved filesystem 35 into space default.32
eos fs mv 36 default.33
success: moved filesystem 36 into space default.33
eos fs mv 37 default.34
success: moved filesystem 37 into space default.34
eos fs mv 38 default.35
success: moved filesystem 38 into space default.35
eos fs mv 39 default.36
success: moved filesystem 39 into space default.36
eos fs mv 40 default.37
success: moved filesystem 40 into space default.37
eos fs mv 41 default.38
success: moved filesystem 41 into space default.38
eos fs mv 42 default.39
success: moved filesystem 42 into space default.39
eos fs mv 43 default.40
success: moved filesystem 43 into space default.40
eos fs mv 44 default.41
success: moved filesystem 44 into space default.41
eos fs mv 45 default.42
success: moved filesystem 45 into space default.42
eos fs mv 46 default.43
success: moved filesystem 46 into space default.43
eos fs mv 47 default.44
success: moved filesystem 47 into space default.44
eos fs mv 48 default.45
success: moved filesystem 48 into space default.45
eos fs mv 49 default.46
success: moved filesystem 49 into space default.46
eos fs mv 50 default.47
success: moved filesystem 50 into space default.47
eos fs mv 51 default.48
success: moved filesystem 51 into space default.48
eos fs mv 52 default.49
success: moved filesystem 52 into space default.49
eos fs mv 53 default.50
success: moved filesystem 53 into space default.50
eos fs mv 54 default.51
success: moved filesystem 54 into space default.51
eos fs mv 55 default.52
success: moved filesystem 55 into space default.52
eos fs mv 56 default.53
success: moved filesystem 56 into space default.53
eos fs mv 57 default.54
success: moved filesystem 57 into space default.54
eos fs mv 58 default.55
success: moved filesystem 58 into space default.55
eos fs mv 59 default.56
success: moved filesystem 59 into space default.56
eos fs mv 60 default.57
success: moved filesystem 60 into space default.57
eos fs mv 61 default.58
success: moved filesystem 61 into space default.58
eos fs mv 62 default.59
success: moved filesystem 62 into space default.59
eos fs mv 63 default.60
success: moved filesystem 63 into space default.60
eos fs mv 64 default.61
success: moved filesystem 64 into space default.61
eos fs mv 65 default.62
success: moved filesystem 65 into space default.62
eos fs mv 66 default.63
success: moved filesystem 66 into space default.63
eos fs mv 67 default.64
success: moved filesystem 67 into space default.64
eos fs mv 68 default.65
success: moved filesystem 68 into space default.65
eos fs mv 69 default.66
success: moved filesystem 69 into space default.66
eos fs mv 70 default.67
success: moved filesystem 70 into space default.67
eos fs mv 71 default.68
success: moved filesystem 71 into space default.68
eos fs mv 72 default.69
success: moved filesystem 72 into space default.69
eos fs mv 73 default.70
success: moved filesystem 73 into space default.70
eos fs mv 74 default.71
success: moved filesystem 74 into space default.71
eos fs mv 75 default.72
success: moved filesystem 75 into space default.72
eos fs mv 76 default.73
success: moved filesystem 76 into space default.73
eos fs mv 77 default.74
success: moved filesystem 77 into space default.74
eos fs mv 78 default.75
success: moved filesystem 78 into space default.75
eos fs mv 79 default.76
success: moved filesystem 79 into space default.76
eos fs mv 80 default.77
success: moved filesystem 80 into space default.77
eos fs mv 81 default.78
success: moved filesystem 81 into space default.78
eos fs mv 82 default.79
success: moved filesystem 82 into space default.79
eos fs mv 83 default.80
success: moved filesystem 83 into space default.80
eos fs mv 84 default.81
success: moved filesystem 84 into space default.81
eos fs mv 85 default.82
success: moved filesystem 85 into space default.82
eos fs mv 86 default.83
success: moved filesystem 86 into space default.83
eos fs mv 87 default.84
success: moved filesystem 87 into space default.84
[root@seagate1 ~]# 

The groups (default.1-84) get created automagically when the filesystems are moved into them.
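Not captured in this log, but the resulting groups should be listable from the MGM with the standard EOS CLI, e.g.:

ssh $mgm eos group ls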

#4 Updated by Gerard Bernabeu Altayo almost 4 years ago

Copying data, and data is moving on seagate1!

-bash-4.1$ df -h; for i in `seq 1 999`; do cp bin.tar bin.tar.$i; done; df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 213G 20G 182G 10% /
tmpfs 16G 0 16G 0% /dev/shm
/dev/sda1 976M 88M 838M 10% /boot
eosmain 492T 5.4G 492T 1% /eos

[root@seagate1 ~]# dstat
----total-cpu-usage---- dsk/total net/total ---paging-- ---system--
usr sys idl wai hiq siq| read writ| recv send| in out | int csw
0 0 100 0 0 0| 38k 158k| 0 0 | 0 0 | 287 447
0 2 90 8 0 0| 0 0 |5672k 287k| 0 0 | 10k 13k
0 0 93 7 0 0|2048k 1024k| 913B 1428B| 0 0 |1773 4912
0 1 92 7 0 0| 0 0 |2809B 568B| 0 0 |2010 5229
0 0 91 8 0 0| 0 48k|3844B 354B| 0 0 |1927 5139
1 1 96 3 0 0|4096k 2048k| 132k 91k| 0 0 |2286 5260
0 0 91 8 0 0| 0 0 |6960B 370B| 0 0 |1781 4897
0 1 90 8 0 0| 0 72k|3293k 170k| 0 0 |7418 11k
0 2 92 6 0 0|2048k 1052k|7084k 349k| 0 0 | 12k 15k
0 2 89 8 0 0| 0 0 |6953k 350k| 0 0 | 13k 16k^C
[root@seagate1 ~]#

#5 Updated by Gerard Bernabeu Altayo almost 4 years ago

It seems to work well with EOS in the sense that the data is getting spread and the performance looks decent. Focusing now on more standard disk tests; the next thing will be to test the network too.

Using the following script for testing:

[root@seagate1 ~]# cat testsperf.sh 
#!/bin/bash

nFS=84
testdir=/storage/data #Here is where we will add 1.nFS FS
output=/root/iozone.results
fSTAMP=`date +"%m%d%Y-%H%M%S" | tr -d '\n'`
IOFILE=$output/iostat.all.$fSTAMP.out
mkdir -p $output

# start - added by Amitoj on Jul 15, 2015
print_debug "fio started ..." 
df > $IOFILE
iostat -d 5 -m -x -t >> $IOFILE &
PID=$!
STAMP=`date`
echo my pid is $PID - $STAMP >> $IOFILE
sleep 10
# end - added by Amitoj on Jul 15, 2015

echo `date` Single FS
sync
iozone -a -g 4G -b $output/test1.fs1 ${testdir}1/iozone.test1
sync
sleep 10

echo `date` Throughput test
sync
filelist=""
for i in `seq 1 $nFS`; do filelist="$filelist ${testdir}$i/iozone.test2"; done
iozone -l 84 -u 84 -r 128k -s 2G -b $output/test2.throughput -F $filelist
sync
sleep 10

echo `date` Parallel single x #FS
sync
for i in `seq 1 $nFS`; do 
  iozone -n 128K -g 4G -b $output/test3.fs$i ${testdir}$i/iozone.test3 
done
sync
sleep 10

echo "`date` Parallel 256 streams (max for iozone)"
sync
nstreams=256
a=0
filelist="" 
while [ $a -lt $nstreams ]; do
  for i in `seq 1 $nFS`; do
    if [ $a -lt $nstreams ]; then 
      filelist="$filelist ${testdir}$i/iozone.test4.$a" 
      let a++
    fi
  done
done
iozone -l $nstreams -u $nstreams -r 128k -s 2G -b $output/test4.throughput -F $filelist
sync
sleep 10

# start - added by Amitoj on Jul 15, 2015
sleep 20
STAMP=`date`
echo killing pid $PID $STAMP >> $IOFILE
kill $PID
# end - added by Amitoj on Jul 15, 2015

echo `date` END
exit 0

#Ideally I should be testing the following:
#
# after analyzing the real workload on CMS' dCache disk instance my suggestion for the IO benchmark is as follows:
# 
# ----
# 
# Assuming a solution like the SATABeasts we have right now and given all this data I'd make the benchmark such that it does:
# 
# - 603 parallel 2GB data file transfers per server at a ratio of 200 reads per write.
# - 3 Writes should be at full speed (no cap, as much as the system can deliver)
# - 600 reads should be capped as follows:
# -- 5 reads at full speed (no cap, as much as the system can deliver)
# -- 5 reads capped at 25MB/s (125MB/s)
# -- 130 reads capped at 3MB/s (390MB/s)
# -- 160 reads capped at 0.5MB/s (80MB/s)
# -- 300 reads capped at 0.05MB/s (15MB/s)
# 
# This totals 610MB/s in capped reads; the rest of the available capacity should be used by the non-capped transfers.
# ----
# 
# You can see more details and pointers to the source of info in https://cdcvs.fnal.gov/redmine/issues/8994
[root@seagate1 ~]# nohup bash ./testsperf.sh &
[1] 79118
[root@seagate1 ~]# nohup: ignoring input and appending output to `nohup.out'

[root@seagate1 ~]# 

#6 Updated by Gerard Bernabeu Altayo almost 4 years ago

I've fixed a few bugs in the script. I still need to implement the special mix of writes/reads, but I will do that after analyzing the current results because we may already have enough valid information.
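For reference, a minimal sketch of how that read/write mix could be expressed with fio (an assumption on my part, not something that was run; fio is only hinted at by the "fio started" message in the script, and the stream counts and rate caps are taken from the comment block at the end of the script):

# sketch only: single directory for simplicity; a real run would spread files across /storage/data*
fio --directory=/storage/data1 --size=2g --bs=128k \
    --name=writes-uncapped --rw=write --numjobs=3 \
    --name=reads-uncapped  --rw=read  --numjobs=5 \
    --name=reads-25MBs     --rw=read  --numjobs=5   --rate=25m \
    --name=reads-3MBs      --rw=read  --numjobs=130 --rate=3m \
    --name=reads-512KBs    --rw=read  --numjobs=160 --rate=512k \
    --name=reads-50KBs     --rw=read  --numjobs=300 --rate=50k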

Now running the latest version of the test on seagate1, cmsstor150 (old SATABeast), and cmsstor411 (new E60).

[root@cmsstor411 ~]# cat testsperf.sh 
#!/bin/bash
#
# Data partitions should be mounted as /storage/data[1-9]+
#
# Requirements: yum install http://pkgs.repoforge.org/iozone/iozone-3.424-2.el6.rf.x86_64.rpm
#

nFS=`mount | grep -c /storage/data`
nstreams=256
testdir=/storage/data #Here is where we will add 1.nFS FS
output=/root/iozone.results
fSTAMP=`date +"%m%d%Y-%H%M%S" | tr -d '\n'`
IOFILE=$output/iostat.all.$fSTAMP.out
mkdir -p $output

# start - added by Amitoj on Jul 15, 2015
print_debug "fio started ..." 
df > $IOFILE
iostat -d 5 -m -x -t >> $IOFILE &
PID=$!
dstat >> ${IOFILE}.dstat &
PID1=$!
STAMP=`date`
echo "my pid is $PID - $STAMP" >> $IOFILE
sleep 10
# end - added by Amitoj on Jul 15, 2015

echo `date` Single FS full iozone without caching
sync
iozone -+u -a -g 4G -b $output/test1.fs1.xls -p -U ${testdir}1 ${testdir}1/iozone.test1
sync
sleep 10

echo "`date` Incremental One stream per FS Throughput test with 70GB file (nocaching)" 
sync
filelist="" 
for i in `seq 1 $nFS`; do
  filelist="$filelist ${testdir}$i/iozone.test2" 
done
iozone -+u -l 1 -u $nFS -r 128k -s 70G -b $output/test2.1streamperfs.throughput.xls -F $filelist
sync
sleep 10

echo "`date` Parallel 256 streams (max for iozone) across all partitions" 
sync
nstreams=256
a=0
filelist="" 
while [ $a -lt $nstreams ]; do
  for i in `seq 1 $nFS`; do
    if [ $a -lt $nstreams ]; then 
      filelist="$filelist ${testdir}$i/iozone.test3.$a" 
      let a++
    fi
  done
done
iozone -+u -l $nstreams -u $nstreams -r 128k -s 2G -b $output/test3.throughput -F $filelist
sync
sleep 10

# start - added by Amitoj on Jul 15, 2015
sleep 20
STAMP=`date`
echo killing pid $PID $STAMP >> $IOFILE
kill $PID $PID1
# end - added by Amitoj on Jul 15, 2015

echo `date` END
exit 0

#Ideally I should be testing the following:
#
# after analyzing the real workload on CMS' dCache disk instance my suggestion for the IO benchmark is as follows:
# 
# ----
# 
# Assuming a solution like the SATABeasts we have right now and given all this data I'd make the benchmark such that it does:
# 
# - 603 parallel 2GB data file transfers per server at a ratio of 200 reads per write.
# - 3 Writes should be at full speed (no cap, as much as the system can deliver)
# - 600 reads should be capped as follows:
# -- 5 reads at full speed (no cap, as much as the system can deliver)
# -- 5 reads capped at 25MB/s (125MB/s)
# -- 130 reads capped at 3MB/s (390MB/s)
# -- 160 reads capped at 0.5MB/s (80MB/s)
# -- 300 reads capped at 0.05MB/s (15MB/s)
# 
# This totals 610MB/s in capped reads; the rest of the available capacity should be used by the non-capped transfers.
# ----
# 
# You can see more details and pointers to the source of info in https://cdcvs.fnal.gov/redmine/issues/8994
[root@cmsstor411 ~]# [root@cmsstor411 ~]# rm -rf iozone.results nohup.out; nohup bash ./testsperf.sh &
-bash: [root@cmsstor411: command not found
[1] 23377
[root@cmsstor411 ~]# nohup: ignoring input and appending output to `nohup.out'

#7 Updated by Gerard Bernabeu Altayo almost 4 years ago

The test on cmsstor411 is done; moving the results to seagate1 so that the box can be reshot (reinstalled):

[root@cmsstor411 ~]# tar -czf cmsstor411.iostat.test.tgz testsperf.sh nohup.out iozone.results
[root@cmsstor411 ~]# less nohup.out 
[root@cmsstor411 ~]# scp cmsstor411.iostat.test.tgz seagate1:/root/
cmsstor411.iostat.test.tgz                                                                                                                         100% 1293KB   1.3MB/s   00:00    
[root@cmsstor411 ~]# 

On seagate1 it was still running test 2, so I changed the order of the tests to run the 256-stream test first, and limited test 2 to at most 10 of the 70GB stream tests.
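The modified test 2 line was not captured here; presumably something along the lines of (a sketch, based on the test 2 invocation in the script above with the upper stream count reduced to 10):

iozone -+u -l 1 -u 10 -r 128k -s 70G -b $output/test2.1streamperfs.throughput.xls -F $filelist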

#8 Updated by Gerard Bernabeu Altayo over 3 years ago

  • Status changed from New to Resolved

The server has been returned; all the tests have been summarized at https://fermipoint.fnal.gov/organization/cs/scd/sci_comp_acq/Pages/Technical-Evaluations.aspx


