Archiving data and access to data on tapes

File Sets Identified for Archival

List of the files that need to be archived:

1. /nova/prod/: There has been a recent surge in the occupancy of this area. The following data sets have been identified, in priority order, to be archived:

2. /nova/../users/ : These directories belong to users who have already left the collaboration. They also need to be archived:

182G        /nova/app/users/denis
178G        /nova/app/users/timdkut
176G        /nova/app/users/betan009
2078G        /nova/ana/users/nsmayer

3. /nova/data/mc/.ToBeArchived ~ permission obtained November 3rd, 2014

For some of these files, this location is declared to SAM, and that SAM location also needs to be deleted. Directories that include files like this are:
S13-01-15 (A few of these files are already on tape; their BlueArc location isn't recorded, so this handful can simply be deleted)
S13-07-22 (These files all seem to be declared to SAM, but only with the BlueArc location. They need the full treatment: they need to be archived, the new file location needs to be declared to SAM, and the current file location needs to be removed)
S13-10-11 (These files all seem to be declared to SAM, but only with the BlueArc location. They need the full treatment: they need to be archived, the new file location needs to be declared to SAM, and the current file location needs to be removed)
S13-12-13 (These files are already on tape; they don't need to move, so just delete the file and its SAM location)
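
The cases above can be summarised as a small decision function. This is only an illustrative sketch in Python: FileState and plan_actions are hypothetical names, no SAM client is called, and the returned strings are reminders of what has to be done rather than real commands.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileState:
    """Hypothetical summary of one file's status (not a SAM data structure)."""
    on_tape: bool          # a tape-backed copy of the file already exists
    bluearc_in_sam: bool   # the BlueArc location is recorded in SAM


def plan_actions(state: FileState) -> List[str]:
    """Return the clean-up steps for one file, in the order they should run."""
    actions = []
    if not state.on_tape:
        # Full treatment: the file has to reach tape before anything is deleted.
        actions.append("archive to tape")
        actions.append("declare new location to SAM")
    if state.bluearc_in_sam:
        actions.append("remove BlueArc location from SAM")
    actions.append("delete BlueArc file")
    return actions
```

Under these assumptions, a file that is already on tape and has no BlueArc location in SAM yields only the delete step, while a file known to SAM only at its BlueArc location yields all four steps.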

4. /nova/prod/mc/.ToBeArchived ~ pending offline permission, set to be obtained November 3rd, 2014

File Transfer details (BlueArc to dCache)

Files on BlueArc that are no longer in use need to be archived on pnfs. The first such data set moved to pnfs is /nova/prod/mc/S13-06-05/cosmics/fd/, which is about 6.7 TB. Its pnfs location is /pnfs/nova/archives/2014-APR/mc/S13-06-05/cosmics/fd/xx, where xx is the file number. The files on BlueArc were divided into smaller sub-directories according to their run numbers. Sub-directories with fewer files make file operations much more efficient than putting all the files together in one directory. The files in this data set were divided into 100 sub-directories containing 111 files each.
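
The grouping into sub-directories can be sketched as follows. This is an illustration, not the script that was actually used: split_into_subdirs is a hypothetical helper, the sub-directories are assumed to be numbered 1, 2, 3, ..., and sorting by file name is assumed to keep files from the same run next to each other.

```python
from typing import Dict, List


def split_into_subdirs(filenames: List[str], files_per_dir: int = 111) -> Dict[str, List[str]]:
    """Group a flat list of file names into numbered sub-directories."""
    ordered = sorted(filenames)  # assumption: name order follows run number
    groups = {}
    for start in range(0, len(ordered), files_per_dir):
        subdir = str(start // files_per_dir + 1)  # "1", "2", ...
        groups[subdir] = ordered[start:start + files_per_dir]
    return groups
```

With 11100 file names and the default of 111 files per sub-directory, this produces 100 groups; a file count that is not a multiple of 111 leaves a shorter final group.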

Copy, Verify and Remove

Operations such as copy, verify and remove are done with a script that takes copy, verify or remove as a command-line argument. Before a file is moved from BlueArc, the script checks whether it already exists on pnfs. If it does, the copy is skipped and the script moves on to the next file.
After a file is copied to pnfs, it is verified by checking its size. Once the verification is done, the file is removed from BlueArc. The following steps are involved in doing file operations:

1. Log in to any of the VMs (novagpvm's)
2. Go to the directory where the script is located and use the following commands to get access to file operations as 'novapro'
3. ksu novapro (This gives one permission to do operations with files)
4. source novaprosource

After these steps one is ready to run scripts for file operations.
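
The copy, verify and remove logic described above can be sketched in Python as below. This is not the actual script: the function names and the argument layout (action, source directory, destination directory) are assumptions, and plain filesystem copies stand in for whatever transfer tool the real script uses to write to pnfs.

```python
import argparse
import os
import shutil


def copy_tree(src_dir, dst_dir):
    """Copy every file under src_dir to dst_dir, skipping files already there."""
    copied = []
    for root, _dirs, files in os.walk(src_dir):
        rel = os.path.relpath(root, src_dir)
        target_root = os.path.normpath(os.path.join(dst_dir, rel))
        os.makedirs(target_root, exist_ok=True)
        for name in files:
            dst = os.path.join(target_root, name)
            if os.path.exists(dst):
                continue  # already archived: skip and go on to the next file
            shutil.copyfile(os.path.join(root, name), dst)
            copied.append(dst)
    return copied


def verify_tree(src_dir, dst_dir):
    """Return the source files whose copy is missing or has a different size."""
    bad = []
    for root, _dirs, files in os.walk(src_dir):
        rel = os.path.relpath(root, src_dir)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.normpath(os.path.join(dst_dir, rel, name))
            if not os.path.isfile(dst) or os.path.getsize(dst) != os.path.getsize(src):
                bad.append(src)
    return bad


def remove_verified(src_dir, dst_dir):
    """Delete only those source files that pass the size check."""
    bad = set(verify_tree(src_dir, dst_dir))
    removed = []
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            src = os.path.join(root, name)
            if src not in bad:
                os.remove(src)
                removed.append(src)
    return removed


def main(argv):
    """Dispatch on the first argument: copy, verify or remove."""
    parser = argparse.ArgumentParser(description="copy, verify or remove files")
    parser.add_argument("action", choices=["copy", "verify", "remove"])
    parser.add_argument("src_dir")
    parser.add_argument("dst_dir")
    args = parser.parse_args(argv)
    if args.action == "copy":
        print("copied", len(copy_tree(args.src_dir, args.dst_dir)), "files")
        return 0
    if args.action == "verify":
        bad = verify_tree(args.src_dir, args.dst_dir)
        for path in bad:
            print("missing or size mismatch:", path)
        return 1 if bad else 0
    print("removed", len(remove_verified(args.src_dir, args.dst_dir)), "files")
    return 0
```

Calling main(sys.argv[1:]) would turn this into a command-line tool. In this sketch the remove step repeats the size check itself, so a file is never deleted from the source unless a same-size copy exists at the destination.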

Details of files moved off BlueArc to dCache

S13-06-05/cosmics/fd ~ 6.7 TB

The files on BlueArc (/nova/prod/mc/S13-06-05/cosmics/fd), ~6.7 TB, have been moved to dCache (/pnfs/nova/archives/2014-APR/mc/S13-06-05/cosmics/fd/xx), where xx is the file number. These files have been divided into smaller sets to make file operations easier.

S14-02-05/cosmics/fd ~ 7.1 TB

Following a similar procedure, these files were also transferred to dCache (/pnfs/nova/archives/2014-MAY/mc/B13-10-23/cosmics/fd/).

B13-10-23/cosmics/fd ~ 16 TB

Again, the files from this area were split into smaller sets of files for quicker operations. The following files were removed from BlueArc without being copied to dCache:

The following files were first copied to dCache and then removed from BlueArc:

These files were copied to: /pnfs/nova/archives/2014-APR/mc/S13-06-05/cosmics/fd/
This area has directories named by run number for ease of access.

S13-10-11/cosmics/fd ~ 5.1 TB

The same strategy was used for this data set as well. This time the files have been kept in the iteration sub-folders, as they were on BlueArc.


These files are archived in dCache at:

It has three sub-directories:

The folder /pnfs/nova/archives/2014-JULY/mc/S13-10-11/cosmics/fd/Oct10ReadoutSim_ChanMask11342 is empty for now because the directories 1 to 10 in /pnfs/nova/archives/2014-JULY/mc/S13-10-11/cosmics/fd/ should have been inside it. Work is under way to move them to their correct location, and this space will be updated again. The Google spreadsheet is attached.

S13-06-13/cosmics/fd ~ 5.8 TB

All the files from:

have been moved to:


The Google spreadsheet is attached.

/nova/ana/calibration/FarDet/ ~ (7.1) TB

All the *pchits and *pchitstop files from /nova/ana/calibration/FarDet/S13-09-04/ are to be transferred to dCache (/pnfs/nova/archives/2014-JULY/mc/calibFiles/S13-09-04/). The files of the reference directory are kept in the reference directory (/pnfs/nova/archives/2014-JULY/mc/calibFiles/S13-09-04/reference) and the reference.MANGLED files are in the reference.MANGLED directory (/pnfs/nova/archives/2014-JULY/mc/calibFiles/S13-09-04/reference.MANGLED).
Since novapro cannot delete these files from their current BlueArc location (/nova/ana/calibration/FarDet/S13-09-04/), Gavin has to delete them to free up this space. The Google spreadsheet is attached.
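
Picking out the files to transfer could look like the sketch below. It is only an illustration: select_calib_files is a hypothetical helper, and the patterns *pchits and *pchitstop are read literally from the description above as name patterns, which may not match the exact naming of the real files.

```python
import fnmatch
import os

# Patterns taken literally from the text; adjust if the real names differ.
PATTERNS = ("*pchits", "*pchitstop")


def select_calib_files(top_dir, patterns=PATTERNS):
    """Return paths under top_dir whose file name matches any pattern."""
    selected = []
    for root, _dirs, files in os.walk(top_dir):
        for name in sorted(files):
            # fnmatchcase keeps the match case-sensitive on every platform
            if any(fnmatch.fnmatchcase(name, p) for p in patterns):
                selected.append(os.path.join(root, name))
    return selected
```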

S12-12-12/genie/{fd, nd} ~ 13TB

The FD and ND (/nova/prod/mc/S12-12-12/genie/) files have been archived to dCache (/pnfs/nova/archives/2014-AUG/mc/S12-12-12/genie/).

mdc_S12-06-17 ~ 2.4TB

The FD and ND (/nova/ana/caf/mdc/) files have been archived to dCache (/pnfs/nova/archives/2014-Oct/mdc/). A few of the files could not be verified by size in dCache. They were copied again to /pnfs/nova/archives/2014-Oct/mdc/uncopied.

/nova/ana/caf/base/ ~ 12 TB

The FD and ND (/nova/ana/caf/base/) files have been archived to dCache (/pnfs/nova/archives/2014-Oct/base/).

/nova/ana/MOVED/chadj ~ 114 G (permission from Mark Messier on Oct 13, 2014)

These files have been copied to /pnfs/nova/archives/2014-Oct/MOVED/chadj/ and the same directory structure has been maintained. The logs and labview folders were tarred and copied to /pnfs/nova/archives/2014-Oct/MOVED/chadj/logs and /pnfs/nova/archives/2014-Oct/MOVED/chadj/labview respectively. As these files belong to chadj, novapro can't delete them. Andrew Norman has agreed to delete them; the deletion is still pending.
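
Tarring a folder before copying it can be sketched with Python's standard tarfile module. This is an assumption about how the tarring could be done, not a record of the command that was used: tar_directory is a hypothetical helper, and the archive is written uncompressed because the text does not say whether compression was applied.

```python
import os
import tarfile


def tar_directory(src_dir, tar_path):
    """Pack src_dir into an uncompressed tar file and return the tar path."""
    top_name = os.path.basename(os.path.normpath(src_dir))
    with tarfile.open(tar_path, "w") as tar:
        # arcname keeps only the folder name, not the full BlueArc path
        tar.add(src_dir, arcname=top_name)
    return tar_path
```

Packing a folder that holds many small files into one archive means dCache stores a single large file instead of many small ones.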

/nova/prod/mc/S13-02-26 ~ 12 TB, except NDOS files

It has been archived to /pnfs/nova/archives/2014-Nov/mc/S13-02-26/.

86G /nova/data/mc/.ToBeArchived/S12.06.17_MDC_reco

It has been archived to /pnfs/nova/archives/2014-Nov/mc/S12.06.17_MDC_reco/.

13G /nova/data/mc/.ToBeArchived/S12-10-04_to_FTS/

It has been archived to /pnfs/nova/archives/2014-Nov/mc/S12-10-04_to_FTS/cosmics/ndos/.

2.2 TB /nova/ana/users/cerretan to /pnfs/nova/archives/2015-May/cerretan

It has been archived to /pnfs/nova/archives/2015-May/cerretan.

8.7 TB /nova/data/mc/S12-11-16/

All these files only needed to be removed, and this has been done.

2.0 TB /nova/prod/mc/S12-12-12/ to /pnfs/nova/archives/2015-June/mc/S12-12-12/

These files have been archived to /pnfs/nova/archives/2015-June/mc/S12-12-12/

