Necessary Maintenance #9844

Check and recover missing files in Disk instance

Added by Natalia Ratnikova over 4 years ago. Updated over 4 years ago.

Status: Assigned
Priority: Normal
Start date: 08/19/2015
Due date:
% Done: 0%
Estimated time:
Spent time:
Stakeholders:
Duration:

Description

dCache lazy restore monitoring shows these files, which were created before the disk/tape separation (notice short PNFS IDs):

0007000000000000F334A970 0.0.0.0/0-*/* N.N. 08.19 13:27:33 3 0 Suspended (pool unavailable) 08.19 13:27:33
/dcache/uscmsdisk/store/mc/Summer12_DR53X/QCD_HT-100To250_TuneZ2star_8TeV-madgraph-pythia/AODSIM/PU_S10_START53_V7A-v1/00001/509AEE4E-B70F-E211-AFCD-001A92971B36.root
0007000000000000F338CEF0 0.0.0.0/0-*/* N.N. 08.19 12:56:31 4 0 Suspended (pool unavailable) 08.19 12:56:31
/dcache/uscmsdisk/store/mc/Summer12_DR53X/QCD_HT-100To250_TuneZ2star_8TeV-madgraph-pythia/AODSIM/PU_S10_START53_V7A-v1/00001/0E5ABBC6-F40F-E211-8EB5-0026189438B5.root
0007000000000000FE854FF8 0.0.0.0/0-*/* N.N. 08.19 11:12:46 4 0 Suspended (pool unavailable) 08.19 11:12:46
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-70To100_8TeV/AODSIM/PU_S10_START53_V7C-v1/30000/A099FE24-F186-E211-B1F4-0026189437ED.root
0007000000000000FE8CA5F0 0.0.0.0/0-*/* N.N. 08.19 12:58:37 4 0 Suspended (pool unavailable) 08.19 12:58:37
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-70To100_8TeV/AODSIM/PU_S10_START53_V7C-v1/20001/0EAFD8AB-4387-E211-9C69-00248C0BE013.root
0007000000000000FE8D7CD0 0.0.0.0/0-*/* N.N. 08.19 13:14:27 4 0 Suspended (pool unavailable) 08.19 13:14:27
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-70To100_8TeV/AODSIM/PU_S10_START53_V7C-v1/20001/3A1D03AC-4C87-E211-B079-002618FDA207.root

All pools are currently ONLINE. Check the file locations on the disk instance pools.
If none are found, restore the files from the tape instance replicas, following:

https://cmsweb.fnal.gov/bin/view/Storage/MissingFileRecovery
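
To turn the monitoring entries above into a plain list of paths to check, a filter along these lines may help. It is only a sketch: the function name suspended_short_ids is hypothetical, the two-line entry layout (a header line starting with the PNFS ID, followed by the path) is inferred from the sample above, and "short" is assumed to mean exactly 24 hex digits as in those samples.

```shell
#!/usr/bin/env bash
# Print the path of every suspended restore request whose PNFS ID is "short".
# Assumption: "short" means exactly 24 hex digits, as in the sample entries.
# Reads the monitoring output from the given files, or from stdin if none.
suspended_short_ids() {
    awk '
        # Header line: starts with the PNFS ID and reports a suspended request.
        $1 ~ /^[0-9A-Fa-f]+$/ && length($1) == 24 && /Suspended/ { want = 1; next }
        # The line right after a qualifying header carries the file path.
        want && /^\// { print; want = 0; next }
        # Anything else resets the state.
        { want = 0 }
    ' "$@"
}
```

The resulting paths can then be fed to whatever location check the recovery procedure prescribes.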
Attachment: no_location_files.list (436 KB), Natalia Ratnikova, 08/19/2015 05:12 PM

History

#1 Updated by Natalia Ratnikova over 4 years ago

Check:
- PhEDEx transfer quality for outgoing transfers from T1_US_FNAL_Disk
- the transfer details page, for any red lines indicating stuck transfers
- recent transfer errors in PhEDEx: none for the missing files

Conclusion: PhEDEx transfers are not affected.

(Note: our local PhEDEx logs are only useful for incoming transfers, not outgoing ones.)

Next, check the billing DB to see who is trying to access these data.

Below is an example [1].

Conclusion: this is a user's job trying to read the files via xrootd.

[1]
/PU_S10_START53_V7A-v1/00001/66956ECA-AA0F-E211-A05C-0030486792B4.root] cms.cms11@enstore 3076352 0 {0:""}
08.19 14:10:02 [door:XrootdLFNs-cmssrmdisk@xrootdLFNs-cmssrmdiskDomain:request] ["/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=hyunyong/CN=693411/CN=Hyunyong Kim":50125:5063:129.93.183.89] [0007000000000000F338C8D0,0] [/dcache/uscmsdisk/store/mc/Summer12_DR53X/QCD_HT-100To250_TuneZ2star_8TeV-madgraph-pythia/AODSIM/PU_S10_START53_V7A-v1/00001/B67B5AA1-F20F-E211-8162-0018F3D09660.root] cms.cms11@enstore 6368418 0 {0:""}
08.19 14:12:06 [door:XrootdLFNs-cmssrmdisk@xrootdLFNs-cmssrmdiskDomain:request] ["/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=hyunyong/CN=693411/CN=Hyunyong Kim":50125:5063:128.211.155.34] [0007000000000000F3390368,0] [/dcache/uscmsdisk/store/mc/Summer12_DR53X/QCD_HT-100To250_TuneZ2star_8TeV-madgraph-pythia/AODSIM/PU_S10_START53_V7A-v1/00001/A275645D-F60F-E211-A42E-0018F3D09702.root] cms.cms11@enstore 5342363 0 {0:""}
[root@cmsdcacheadmindisk ~]# grep /dcache/uscmsdisk/store/mc/Summer12_DR53X/QCD_HT-100To250_TuneZ2star_8TeV-madgraph-pythia/AODSIM/PU_S10_START53_V7A-v1/00001/ /var/log/dcache/2015/08/billing-2015.08.19 | wc -l
358
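
The grep above only counts matching billing lines. To see who issued the requests, the same log can be grouped by user DN. This is a rough sketch: requests_per_dn is a hypothetical helper, and the record layout (a "[door:" tag, with the DN as the first double-quoted string on the line) is assumed from the excerpt in [1] only.

```shell
#!/usr/bin/env bash
# Count door request records per user DN for paths containing a given prefix.
# Usage: requests_per_dn PREFIX [BILLING_FILE...]   (reads stdin if no files)
# Assumption: the DN is the first double-quoted string on each door record.
requests_per_dn() {
    local prefix=$1
    shift
    awk -F'"' -v p="$prefix" '
        # Keep door records that mention the prefix and carry a non-empty DN.
        index($0, "[door:") && index($0, p) && $2 != "" { n[$2]++ }
        END { for (dn in n) printf "%d\t%s\n", n[dn], dn }
    ' "$@" | sort -rn
}
```

For example, requests_per_dn /dcache/uscmsdisk/store/mc/Summer12_DR53X/ /var/log/dcache/2015/08/billing-2015.08.19 would print one line per DN, with the busiest user first.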

#2 Updated by Natalia Ratnikova over 4 years ago

Follow the instructions in

https://cmsweb.fnal.gov/bin/view/Storage/DCache22Procedures#Files_in_the_namespace_with_zero_replicas_on_pools

to check for all files with zero locations on the pools.

Found 3418 files without a location, including the files reported in this ticket.
See the list in the attached no_location_files.list.

#3 Updated by Natalia Ratnikova over 4 years ago

Got SCC results from CERN lxplus.
Trimmed PFNs into LFNs in no_location_files.list (attached).

None of the files are orphans:

[cmsdev33 12:12] for f in `cat no_location_files.list| sed 's?/dcache/uscmsdisk??'`; do grep $f FNAL_Disk_SCC_orphan_list.txt ; done 2>&1 | tee no_location_files_listed_as_orphans
[cmsdev33 12:12]
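
The loop above runs one grep per file and treats each path as a regular expression. A single fixed-string lookup gives the same kind of answer in one pass. This is a sketch, not the command that was actually run: orphans_among is a hypothetical helper, and it assumes the orphan list holds exactly one LFN per line.

```shell
#!/usr/bin/env bash
# List LFNs that appear both in a PFN list and in an orphan list.
# Usage: orphans_among PFN_LIST ORPHAN_LIST
# Assumption: ORPHAN_LIST contains exactly one LFN per line.
orphans_among() {
    local pfn_list=$1 orphan_list=$2
    # Strip the site prefix to turn each PFN into an LFN, then look all of
    # them up at once as fixed strings matching whole lines.
    grep -Fxf <(sed 's?^/dcache/uscmsdisk??' "$pfn_list") "$orphan_list"
}
```

Empty output, with grep's exit status 1, means none of the listed files are orphans, which matches the empty result recorded above.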

#4 Updated by Natalia Ratnikova over 4 years ago

For the record:

The dump produced by chimera-list, which lists files with their corresponding pool locations, does not include the files from no_location_files.list. This may indicate a problem in the tool.
The dump was created on cmschimeradiskbackup.

#5 Updated by Natalia Ratnikova over 4 years ago

Now 43 files are missing; all of them are in no_location_files.list:

/dcache/uscmsdisk/store/mc/Summer12_DR53X/GJets_HT-200To400_8TeV-madgraph/AODSIM/PU_S10_START53_V7A-v1/0000/3096BD1E-EFDA-E111-B97D-002618943933.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/GJets_HT-200To400_8TeV-madgraph/AODSIM/PU_S10_START53_V7A-v1/0000/8692B6BF-F3DA-E111-9569-002618943900.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/GJets_HT-200To400_8TeV-madgraph/AODSIM/PU_S10_START53_V7A-v1/0000/C234B20A-1FDB-E111-A09C-002618943920.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/GJets_HT-200To400_8TeV-madgraph/AODSIM/PU_S10_START53_V7A-v1/0000/CC705084-20DB-E111-8A8C-002618943832.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/GJets_HT-200To400_8TeV-madgraph/AODSIM/PU_S10_START53_V7A-v1/0000/16BB74A7-2BDB-E111-A7D4-003048679012.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/QCD_HT-100To250_TuneZ2star_8TeV-madgraph-pythia/AODSIM/PU_S10_START53_V7A-v1/00001/509AEE4E-B70F-E211-AFCD-001A92971B36.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/QCD_HT-100To250_TuneZ2star_8TeV-madgraph-pythia/AODSIM/PU_S10_START53_V7A-v1/00001/0E5ABBC6-F40F-E211-8EB5-0026189438B5.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/QCD_HT-100To250_TuneZ2star_8TeV-madgraph-pythia/AODSIM/PU_S10_START53_V7A-v1/00002/B89B5726-3A10-E211-9190-003048FFCB6A.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-100_8TeV-madgraph/AODSIM/PU_S10_START53_V7C-v1/30000/E0EB9158-B672-E211-BD31-003048D3FC94.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-100_8TeV-madgraph/AODSIM/PU_S10_START53_V7C-v1/30000/36C7F581-D972-E211-8C34-00261894397A.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-100_8TeV-madgraph/AODSIM/PU_S10_START53_V7C-v1/30000/A20451F4-DF72-E211-A2FB-003048FFD7C2.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-100_8TeV-madgraph/AODSIM/PU_S10_START53_V7C-v1/20000/0EE3F9DF-E272-E211-A1B8-0026189438E2.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-100_8TeV-madgraph/AODSIM/PU_S10_START53_V7C-v1/30000/C2613955-EC72-E211-AD6F-003048FFD71A.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-100_8TeV-madgraph/AODSIM/PU_S10_START53_V7C-v1/30000/3AE5A9DB-F972-E211-AB50-003048678B3C.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-100_8TeV-madgraph/AODSIM/PU_S10_START53_V7C-v1/20000/BAE21319-0273-E211-A673-003048FFD7BE.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-100_8TeV-madgraph/AODSIM/PU_S10_START53_V7C-v1/30000/E43A329E-0473-E211-893C-003048678AE4.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-100_8TeV-madgraph/AODSIM/PU_S10_START53_V7C-v1/30000/E28EA6C3-0973-E211-9FD2-003048678F84.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-100_8TeV-madgraph/AODSIM/PU_S10_START53_V7C-v1/30000/E29E08B1-0E73-E211-BAF6-002618FDA277.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-100_8TeV-madgraph/AODSIM/PU_S10_START53_V7C-v1/20000/F038AFE2-1073-E211-8FE7-00261894386C.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-100_8TeV-madgraph/AODSIM/PU_S10_START53_V7C-v1/30000/C03CE780-2573-E211-B150-0026189438BC.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-100_8TeV-madgraph/AODSIM/PU_S10_START53_V7C-v1/20000/6886093F-3A73-E211-8741-003048678A80.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-100_8TeV-madgraph/AODSIM/PU_S10_START53_V7C-v1/30000/64EB8B24-4473-E211-83D2-002618FDA208.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-100_8TeV-madgraph/AODSIM/PU_S10_START53_V7C-v1/20000/2463E6A1-5C73-E211-991C-002618943986.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-100_8TeV-madgraph/AODSIM/PU_S10_START53_V7C-v1/30000/38A51FF8-6C73-E211-AED3-00248C0BE005.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-100_8TeV-madgraph/AODSIM/PU_S10_START53_V7C-v1/20000/48A14785-F673-E211-B47E-002618943951.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-70To100_8TeV/AODSIM/PU_S10_START53_V7C-v1/20000/787DA9CE-B386-E211-A6F6-00261894393A.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-70To100_8TeV/AODSIM/PU_S10_START53_V7C-v1/20000/50FC70A9-B586-E211-A888-002618FDA287.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-70To100_8TeV/AODSIM/PU_S10_START53_V7C-v1/20000/D2C9114C-DC86-E211-887A-002618943964.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-70To100_8TeV/AODSIM/PU_S10_START53_V7C-v1/20000/983A9C4F-E086-E211-B5CF-0030486791F2.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-70To100_8TeV/AODSIM/PU_S10_START53_V7C-v1/30000/E0F1E4C2-ED86-E211-8F26-00261894388B.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-70To100_8TeV/AODSIM/PU_S10_START53_V7C-v1/30000/A099FE24-F186-E211-B1F4-0026189437ED.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-70To100_8TeV/AODSIM/PU_S10_START53_V7C-v1/30000/F0620921-EF86-E211-9AE9-003048FFD730.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-70To100_8TeV/AODSIM/PU_S10_START53_V7C-v1/20000/3209A552-F386-E211-80C9-003048678FF4.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-70To100_8TeV/AODSIM/PU_S10_START53_V7C-v1/20000/8A350B86-F886-E211-862E-0025905964C0.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-70To100_8TeV/AODSIM/PU_S10_START53_V7C-v1/20000/740C679C-0687-E211-978E-003048679182.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-70To100_8TeV/AODSIM/PU_S10_START53_V7C-v1/20000/1CCA038C-2387-E211-8021-002618943919.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-70To100_8TeV/AODSIM/PU_S10_START53_V7C-v1/20001/0EAFD8AB-4387-E211-9C69-00248C0BE013.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/ZJetsToNuNu_PtZ-70To100_8TeV/AODSIM/PU_S10_START53_V7C-v1/20001/3A1D03AC-4C87-E211-B079-002618FDA207.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/VBF_HToBB_M-125_8TeV-powheg-pythia8/AODSIM/PU_S10_START53_V7C-v1/10000/EC5266FA-43A1-E211-8AE3-0026189438EB.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/TTJets_SemiLeptMGDecays_TuneP11TeV_8TeV-madgraph-tauola/AODSIM/PU_S10_START53_V19-v1/20000/FCE4D80F-53C2-E211-975F-00248C0BE014.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/TTJets_SemiLeptMGDecays_TuneP11TeV_8TeV-madgraph-tauola/AODSIM/PU_S10_START53_V19-v1/20000/6A98FC97-6CC2-E211-9650-003048678FFA.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/TT_CT10_AUET2_8TeV-powheg-herwig/AODSIM/PU_S10_START53_V19-v1/10000/2E3DD8C8-16CB-E211-AD8D-00261894386A.root
/dcache/uscmsdisk/store/mc/Summer12_DR53X/TT_CT10_AUET2_8TeV-powheg-herwig/AODSIM/PU_S10_START53_V19-v1/10003/66E2989F-62DD-E211-A317-003048FFD7C2.root

#6 Updated by Natalia Ratnikova over 4 years ago

Three users are affected right now:

[root@cmsdcacheadmindisk 08]# for f in `grep ^/dcache/ /tmp/lazymon_output.2015-08-20 `; do grep $f billing-error-2015.08.*; done | cut -d\" -f2 | sort -u
/C=DE/O=GermanGrid/OU=RWTH/CN=Deboarh Duchardt
/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=gdeleon/CN=766922/CN=Gregorio Iii Tabbu De Leon
/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=hyunyong/CN=693411/CN=Hyunyong Kim

#7 Updated by Natalia Ratnikova over 4 years ago

Summary for Gerard:

1) The T1_US_FNAL_Disk instance is currently failing to serve 43 files to users via the xrootd protocol; see previous log entries #5 and #6.
All files have 'old' PNFS IDs, i.e. they were stored at FNAL before the disk-tape separation.

2) None of these files appear in the list of orphans produced by the Data Transfer team, i.e. all these files were valid files in our PhEDEx node at the time Jorge ran the check.

3) There are 3402 files that have no location on disk. The list was obtained with the query in log entry #2; the results are attached to this ticket.

According to an email from Chih-Hao back in May, there were 207 no-location files and he dealt with all of them.


