Copy user files to tape using sam4users » History » Version 5

Heidi Schellman, 11/21/2019 12:33 PM


Copy user files to tape using sam4users

Based on Adam Lister's instructions for MicroBooNE:
https://cdcvs.fnal.gov/redmine/projects/uboonecode/wiki/Storing_files

This procedure renames a set of files to have unique names and creates a sam dataset that allows you to catalog and access those files.

You can then "archive" those files - they will be moved to the tape_backed pool.

You can access them via the SAM dataset, or by listing the dataset and running samweb locate-file <filename>. If the files have not been used for a while, you may need to prestage the dataset to get them back.

Setup permissions:

- Start a screen session
- Set up dunetpc in the usual way
- Set up the FIFE utilities:

setup fife_utils

- Make sure your token won't expire:

export KRB5CCNAME=FILE:/tmp/$USER`date +%s`
setup kx509

## Gets a full-week renewable token
kinit -r7d 

## setup background loop to refresh kerberos cred every 12 hours for a week
(for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14; do sleep 43200; kinit -R; done)&
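
The refresh loop above runs 14 times at 12-hour intervals, which matches the 7-day renewable lifetime requested with kinit -r7d. As a minimal sketch, the same loop can be written as a function with the count and interval as parameters. The names refresh_loop and coverage_days are hypothetical helpers, not part of fife_utils or Kerberos.

```shell
# Hypothetical helper (not part of any package): wait INTERVAL seconds,
# then run CMD, repeating COUNT times.
#   refresh_loop COUNT INTERVAL CMD...
refresh_loop() {
    local count=$1 interval=$2 i=0
    shift 2
    while [ "$i" -lt "$count" ]; do
        sleep "$interval"
        # Report a failed renewal but keep looping.
        "$@" || echo "renewal attempt $((i + 1)) failed" >&2
        i=$((i + 1))
    done
}

# Whole days covered by COUNT iterations of INTERVAL seconds each.
#   coverage_days COUNT INTERVAL
coverage_days() {
    echo $(( $1 * $2 / 86400 ))
}

# Usage (not run here), equivalent to the loop above:
#   refresh_loop 14 43200 kinit -R &
```

coverage_days 14 43200 prints 7, so the loop spans the full week of the renewable ticket.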

How to do it:

- Make list of input files:

ls -d /path/to/some/files/*.root > input.list
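
Before handing the list to SAM, it can be worth checking that it is non-empty and that every entry is readable. The following is only a sketch of such a check. check_list is a hypothetical helper name, and it assumes one path per line, as produced by the ls command above.

```shell
# Hypothetical helper: print the number of readable entries in a list file.
# Returns non-zero if the list is empty or any entry is missing/unreadable.
#   check_list LISTFILE
check_list() {
    local list=$1 n=0 bad=0 f
    while IFS= read -r f; do
        [ -n "$f" ] || continue          # skip blank lines
        if [ -r "$f" ]; then
            n=$((n + 1))
        else
            echo "missing or unreadable: $f" >&2
            bad=$((bad + 1))
        fi
    done < "$list"
    echo "$n"
    [ "$n" -gt 0 ] && [ "$bad" -eq 0 ]
}

# Usage (not run here):
#   check_list input.list
```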

- add dataset to SAM:

The -f option gives the file list; you can also just pass files by wildcard.

nohup sam_add_dataset --name=tjyang_diffusion_off_sce_off_cosmics_v08_27_01 -f input.list 

One can also define files in a directory, e.g.

nohup sam_add_dataset --name=tjyang_diffusion_off_sce_off_cosmics_v08_27_01 -d /path/to/a/directory 

Remember to note the name. You can also find your definitions later by doing

 samweb list-definitions | grep <username> 

- archive dataset:

This means that when you pre-stage the dataset it gets put into the "read/write pool", which basically looks like scratch to the analyser.

-N option defines number of copy threads.

Note from tjyang: I had some difficulty here. I assume dCache was being a little bit dodgy. Just keep an eye on it; if it fails, rerun the sam_archive_dataset command. SAM is smart enough to know what has been sent to the archive and what hasn't. I had 600 files: the first 250 took 3 days and the last 350 took several hours, so the time seems to be pretty variable.
You can check the status of the jobs by going to http://samweb.fnal.gov:8480/station_monitor/, clicking through to DUNE/dune, and looking for your name.

sam_archive_dataset --name="tjyang_diffusion_off_sce_off_cosmics_v08_27_01" -N 4 
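
Since the note above suggests rerunning the command when it fails, one option is to wrap it in a small retry loop. This is a sketch, not a tested recipe: retry is a hypothetical helper, and it assumes that sam_archive_dataset exits with a non-zero status on failure, which is not confirmed here.

```shell
# Hypothetical helper: run CMD until it exits 0, at most MAX times,
# sleeping DELAY seconds between attempts.
#   retry MAX DELAY CMD...
retry() {
    local max=$1 delay=$2 attempt=1
    shift 2
    while ! "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        echo "attempt $attempt failed; retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}

# Usage (not run here); ASSUMES a failed archive run returns non-zero:
#   retry 5 600 sam_archive_dataset --name="tjyang_diffusion_off_sce_off_cosmics_v08_27_01" -N 4
```

If the exit status turns out not to reflect failures, checking the station monitor by hand, as described above, remains the safer route.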

- You can detach your screen session with control-A control-D
- Come back later and use screen -r to re-attach the session
- ...and now check that each of your files has only one location, in the archive (for example enstore:/pnfs/dune/archive/sam_managed_users/tjyang/data/c/b/6/e):
      

samweb list-files "defname:tjyang_diffusion_off_sce_off_cosmics_v08_27_01" 
samweb locate-file <some_file_name>

This should output a single location, e.g.:
      
enstore:/pnfs/dune/archive/sam_managed_users/tjyang/data/c/b/6/e
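
Checking files one at a time gets tedious for a large dataset. The sketch below loops over a list of file names and reports any file that does not have exactly one location. count_locations and check_single_location are hypothetical helpers, and the sketch assumes that samweb locate-file prints one location per line.

```shell
# Count non-empty lines on stdin (ASSUMED: one location per line).
count_locations() {
    awk 'NF { n++ } END { print n + 0 }'
}

# Hypothetical helper: read file names on stdin, run LOCATE_CMD <name>
# for each, and report names whose location count is not exactly 1.
#   check_single_location LOCATE_CMD...
check_single_location() {
    local f n bad=0
    while IFS= read -r f; do
        [ -n "$f" ] || continue
        # </dev/null keeps the locate command from consuming the name list.
        n=$("$@" "$f" < /dev/null | count_locations)
        if [ "$n" -ne 1 ]; then
            echo "$f has $n locations" >&2
            bad=$((bad + 1))
        fi
    done
    [ "$bad" -eq 0 ]
}

# Usage (not run here):
#   samweb list-files "defname:tjyang_diffusion_off_sce_off_cosmics_v08_27_01" \
#       | check_single_location samweb locate-file
```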

- Before exiting the screen session, delete the Kerberos ticket:
      
kdestroy