Datasets targeted for removal in Aug 2017

In response to a request from the Data Storage Services group, MicroBooNE is removing old datasets that are not used in any analysis and that have been superseded by more recent processing. The MicroBooNE conveners were surveyed several times in July and August, and a summary of their responses is recorded at the bottom of this wiki page.

Reconstructed Monte Carlo Datasets to be deleted in Aug 2017

Format/Data Tier Dataset Version num of files volume (TB) Deleted
MC Reconstructed
< v04_12_xx (mcc6 and earlier) 18376 87 Y
v04_36_xx (mcc7 phase 1) 10205 32 Y
v06_16_00[_xx] 2091 15 Y
v06_17_00 390 4 Y
v06_18_01 3453 43 Y
v06_19_00 2808 45 Y
v06_22_00[_xx] 2341 12 Y
v06_23_00 22659 179 Y
v06_25_00 11131 83 Y

Det Sim Monte Carlo Datasets to be deleted in Aug 2017

Format/Data Tier Dataset Version num of files volume (TB) Deleted
Detector Simulated
< v04_12_xx (mcc6 and earlier) 0 0 Y
v06_16_00[_xx] 2059 8 Y
v06_17_00 504 3 Y
v06_18_01 4800 41 Y
v06_19_00 5618 35 Y
v06_22_00[_xx] 35463 153 Y
v06_23_00 0 0 Y
v06_25_00 19605 83 Y

Reconstructed data to be deleted

Run Range Dataset Version num of files volume (TB) Deleted
Runs 4952-6998
v04_36_00_xx 15700 49 Y
v06_19_00 5267 18 Y
v06_23_00 8082 26 Y
v06_25_00 4239 14 Y
v06_26_01 (Cincinnati) 28633 101 Y
v06_26_01_04 (mcc8.0 optical filtered) 22909 30 Y

Swizzled data to be deleted

Run Range Dataset Version num of files volume (TB) Deleted
Runs 1-3419
Excluding 3077, 3161, 3165, 3166, 3191, 3271, 3273, 3276, 3277, 3300, 3301
v04_05_00 13627 7 Y
v04_13_01 2258 1 Y
v04_16_00 917 0 Y
v04_17_00 6906 3 Y
v04_18_00 4718 2 Y
v04_19_00 20414 9 Y
v04_22_00 9621 15 Y
v04_26_02 1556 3 Y
Runs 3420-4951
v04_26_02 174 0 Y
v04_26_03_xx 16173 5 Y
v04_26_04_02 9886 17 Y
v04_26_04_03 1531 2 Y
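
As a rough cross-check of the tables above, the per-category volumes can be summed with a short script. This is only a sketch: the numbers are transcribed by hand from the "volume (TB)" columns, the names `volumes_tb` and `totals_by_category` are illustrative, and because the tables round to whole TB (some rows show 0 TB for a non-zero number of files) the totals are approximate.

```python
# Volumes in TB, transcribed from the "volume (TB)" column of each table above.
# Values are rounded to whole TB in the source, so the sums are approximate.
volumes_tb = {
    "MC reconstructed": [87, 32, 15, 4, 43, 45, 12, 179, 83],
    "Detector simulated": [0, 8, 3, 41, 35, 153, 0, 83],
    "Reconstructed data": [49, 18, 26, 14, 101, 30],
    "Swizzled data": [7, 1, 0, 3, 2, 9, 15, 3, 0, 5, 17, 2],
}


def totals_by_category(volumes):
    """Return the summed volume (TB) for each dataset category."""
    return {name: sum(values) for name, values in volumes.items()}


totals = totals_by_category(volumes_tb)
grand_total = sum(totals.values())

for name, tb in totals.items():
    print(f"{name}: {tb} TB")
print(f"All categories: {grand_total} TB")
```

With the values as transcribed, the four categories come to about 500, 323, 238 and 64 TB respectively, or roughly 1125 TB in total.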

Original dataset volume catalog

The complete list of datasets was originally cataloged in the Google spreadsheet linked below. Note that it lists dataset volumes as of July 1, 2017. Only datasets marked in RED are targeted for deletion.

https://docs.google.com/spreadsheets/d/1XcaNkr_hku2SNYAWzp3Rj3S0B45ULuVTupTYUsRWQiU/edit?usp=sharing

Convener and Analyzer responses collected via email.

From the Oscillation Group (nothing on the list needs to be saved):

I reached out to a number of analyzers (Bobby, David C., Andy M., Colton, Pandora LEE team) and received three responses that the datasets are not needed (Bobby, David C., Andy M.) and no response from the other two (Colton, Pandora LEE team).  I told them to get back to me by today.  I'm 99% sure that the others don't need the older datasets.

I will let you know if I get a response from the others today, but if you need to move on with dataset deletion immediately, I am pretty sure that it's fine.

Best,
Mike

From the Cross Section Group (We will not delete v06_26_02 processed data):

Hi Matt, Kirby,

Raquel has requested to keep, for a short while, the datasets using v06_26_02.  I think this was because there was a misunderstanding of what that version was, but as she is currently using it, can we keep it around?

Thanks,
Andy

From the Detector Physics Group (The runs requested by Matthias will not be deleted, but nothing else needs to be excluded):

Hi Jyoti,

We have some laser runs in the before beam epoch (Run Number 1-3419). The run numbers I would like to keep are the following: 3077, 3161, 3165, 3166, 3191, 3271, 3273, 3276, 3277, 3300, 3301

Thanks, Matthias

From Chris Barnes:

We can now get t0-tagged tracks from production from the mcc8.2 release. Those are available in the following directory: /pnfs/uboone/data/uboone/reconstructed/reco_extbnb_data_mcc8/.

However, are these Run 1 or Run 2 events?  We are planning on performing the cosmic portion of the SCE analysis in the bulk with Run 1 events.  

I'll defer to Mike Mooney on whether, if the reconstructed events in that directory are from Run 1, we can be certain that we will not need the v04_36_00_xx reconstructed data.

Chris

From Jyoti:

Hi Mike,

I haven’t heard any specific request from analyzers. 

For some time, I was worried about the raw swizzled files for the special runs (7196-7276 & 7590-7611) taken during last summer's shutdown, but I just realized that these are not marked red in your spreadsheet, so we should be good.

Thanks for checking again!
- Jyoti

Detailed summary of commands deleting datasets