Project

General

Profile

Tape and Cache

The primary storage system at Fermilab is the dCache/Enstore mass storage system. It is essentially a large collection of high-performance disk/RAID pools that sit in front of the magnetic tape systems.

  • dCache (standing for disk cache) is the disk portion of the system
  • Enstore is the tape system

When you want to retrieve or store a file, you read from or write to the dCache system, and dCache interacts with the tape robots (if it has to). This makes life easy for you and, at the same time, keeps the tape robots happy. In this model dCache is the "disk cache" layer of the system and Enstore is the "archival storage system".

Writing a File

When a file is written to dCache (using a supported protocol), it is stored on a high-speed disk pool and scheduled to be written to tape at some point in the future. Once the file has been flushed to tape, there are two replicas: one on disk and one on tape. If space on the disk becomes tight (and the file is old and unused), the disk replica is expired (deleted) and the space reclaimed, but the tape copy remains untouched.
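The replica lifecycle above can be sketched as a toy model. This is purely illustrative: the `FileRecord` class and its methods are inventions for this sketch, not any dCache API.

```python
# Toy model of the dCache write path: a fresh write creates a disk
# replica, a later flush adds a tape replica, and disk pressure can
# evict the disk replica while the tape copy survives.

class FileRecord(object):
    def __init__(self, name):
        self.name = name
        self.on_disk = False
        self.on_tape = False

    def write(self):
        # A new write lands on a high-speed disk pool and is
        # queued for migration to tape.
        self.on_disk = True

    def flush_to_tape(self):
        # The scheduled tape write happens: two replicas now exist.
        assert self.on_disk
        self.on_tape = True

    def evict(self):
        # Under disk pressure an old, unused disk replica is expired;
        # the tape copy is untouched.
        assert self.on_tape, "never evict the only copy"
        self.on_disk = False

f = FileRecord("fardet_r00020925_s05_t02.artdaq.root")
f.write()          # disk only
f.flush_to_tape()  # disk and tape
f.evict()          # tape only
```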

Reading a File

When it is time to read a file from dCache, the file is accessed using a supported protocol. The dCache/Enstore system then does a number of things. First it checks whether a copy of the requested file is currently sitting on any of the disk pools. If a copy is available (a "cache hit"), that cached copy is returned with little to no delay. If the file does NOT reside in the cache (a "cache miss"), the system contacts the tape robots and schedules the file to be restored from tape. This can be a very time-consuming operation: a tape restore can take a few minutes or many hours, depending on how busy the tape robot system is with other requests. When the file is restored from tape, a copy is placed in the cache pools and that copy is returned to the user. Subsequent accesses to the file then use the cached copy until it expires.
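The read path can be summarized in a short sketch. The dictionaries here are stand-ins for the disk pools and the tape library, not real dCache interfaces:

```python
# Toy sketch of the dCache read path: a cache hit returns immediately;
# a cache miss triggers a (slow) tape restore that also populates the
# cache for subsequent reads.

disk_pools = {"a.root": b"contents of a"}    # files currently cached
tape = {"a.root": b"contents of a",
        "b.root": b"contents of b"}          # everything lives on tape

def read(name):
    if name in disk_pools:
        # cache hit: served straight from a disk pool
        return disk_pools[name]
    # cache miss: in reality this is a tape restore that can take
    # minutes to hours; the restored copy is placed in the cache
    data = tape[name]
    disk_pools[name] = data
    return data

read("a.root")   # hit: fast
read("b.root")   # miss: restored from tape, now cached
```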

Cache Status

What should you do when you need a file?

Sometimes you need specific files and sometimes any file will do. If you need a specific file then you are stuck retrieving it even if it is only on tape. If any file will do, it is better to pick one that is already on disk so that you avoid triggering a tape restore. The way to do this is to check the "locality" status of a file.
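A hypothetical filter along these lines might look as follows. The `localities` dictionary is a fake lookup invented for this sketch; in practice you would get the locality from cache_state.py or from pnfs directly:

```python
# Hedged sketch: when any file will do, keep only the ones whose
# locality says a disk replica already exists, so no tape restore
# is needed. The locality strings are the real dCache values; the
# lookup table itself is made up for illustration.

localities = {
    "f1.root": "NEARLINE",             # tape only: reading triggers a restore
    "f2.root": "ONLINE_AND_NEARLINE",  # cached and on tape: fast to read
    "f3.root": "ONLINE",               # cached only
}

def cached_files(names):
    """Return the subset of names readable without a tape restore."""
    return [n for n in names if localities[n].startswith("ONLINE")]

print(cached_files(["f1.root", "f2.root", "f3.root"]))
# ['f2.root', 'f3.root']
```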

cache_state.py

The script novaart:source:trunk/NovaGridUtils/bin/cache_state.py is the preferred way to check the cache status of a single file, a set of files matched by a wildcard, or a SAM dataset definition.

For a single file known to SAM:

$ cache_state.py fardet_r00020925_s05_t02_S15-03-11_v1_data.artdaq.root

CACHED

For a single file in a disk area (same file as above, but with full path):

$ cache_state.py /pnfs/nova/production/raw2root/S15-03-11/fardet/000209/20925/02/fardet_r00020925_s05_t02_S15-03-11_v1_data.artdaq.root

CACHED

Multiple files can be specified via a wildcard:

$ cache_state.py /pnfs/nova/production/raw2root/S15-03-11/fardet/000209/20925/02/*.root
Finding locations for 64 files:
 0%  9%  18%  28%  37%  46%  56%  65%  75%  84%  93%  100%
Checking 64 files:
 0%  100%
Cached: 64 (100%)       Tape only: 0 (0%)

SAM dataset definitions are also supported:

$ cache_state.py -d prod_artdaq_fd_cosmic_prod3_subset8of12 -ss latest -s 100
Retrieving file list for SAM dataset definition name: 'prod_artdaq_fd_cosmic_prod3_subset8of12'...  done.
Finding locations for 321 files:
 0%  9%  19%  29%  39%  49%  59%  69%  79%  89%  99%  100%
Checking 321 files:
 0%  31%  62%  93%  100%
Cached: 12 (4%) Tape only: 309 (96%)

Here we have requested the most recent snapshot of this definition with -ss latest and a "stride" of every 100th file with -s 100.
See cache_state.py --help for more options.

Via filesystem access to pnfs

Cache status can also be checked by interacting with the /pnfs filesystem directly, in a very specific manner.

Assume you have a file that has been copied to some directory path in pnfs. Reading a specially named virtual file in that directory reports the file's state.

# Retrieve the cache information about a file
cat '<path>/.(get)(<filename>)(locality)'
#
# Example:
cat './caf/000134/13464/.(get)(fardet_r00013464_s19_t00_FA14-10-28_v1_data.caf.root)(locality)'
NEARLINE

cat '/pnfs/nova/rawdata/FarDet/000210/21006/.(get)(fardet_r00021006_s00_DDContained.raw)(locality)'
ONLINE_AND_NEARLINE

The string "NEARLINE" means the file is only on tape. The string "ONLINE" means the file is only in the cache. The string "ONLINE_AND_NEARLINE" means it is both in the cache and on tape.
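The magic-path construction can be wrapped in a small helper. This is a sketch: the special ".(get)(name)(locality)" file is synthesized by dCache and only exists on pnfs mounts, so any test of the path-building logic has to run against an ordinary directory.

```python
import os

def pnfs_locality(filepath):
    """Read the dCache locality of a pnfs file via the magic
    '.(get)(<name>)(locality)' virtual file in the same directory."""
    dirname, name = os.path.split(filepath)
    magic = os.path.join(dirname, ".(get)(%s)(locality)" % name)
    with open(magic) as f:
        return f.read().strip()

def is_cached(filepath):
    # Both ONLINE and ONLINE_AND_NEARLINE mean a disk replica exists.
    return pnfs_locality(filepath).startswith("ONLINE")
```

On a real pnfs mount the magic file is not a regular file; dCache generates its contents on the fly when you read it.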

Note: this method of cache status checking is quite slow. Prefer the cache_state.py tool described in the previous section.

Via XRootD

WARNING: the method below no longer works properly and is unsupported

Newer versions of the tools used to access the filesystem via the XRootD protocol have a stat command whose result includes a "status" flag indicating the cache status of a file. Unfortunately, the version of the xrootd tools in NOvA's art bundle for version 1.17.06 (xrootd 3.3.4d, current as of July 2017) is too old. I have built a newer version (4.6.1) in /nova/app/users/jwolcott/xrootd/, which can be put on $PATH or, more usefully, on $PYTHONPATH:

$ export PYTHONPATH=/nova/app/users/jwolcott/xrootd/lib/python2.7/site-packages:$PYTHONPATH
$ python
Python 2.7.10 (default, Jun  1 2015, 15:53:12)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-11)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import XRootD.client
>>> c = XRootD.client.FileSystem("root://fndca1.fnal.gov:1094")
>>> fn = "/pnfs/fnal.gov/usr/nova/production/raw2root/S15-03-11/fardet/000209/20923/02/fardet_r00020923_s59_t02_S15-03-11_v1_data.artdaq.root" 
>>> s = c.stat(fn)
>>> XRootD.client.flags.StatInfoFlags.OFFLINE
8
>>> help(XRootD.client.flags.StatInfoFlags)
Help on class Enum in module XRootD.client.flags:

class Enum(__builtin__.object)
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)
 |  
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |  
 |  BACKUP_EXISTS = 128
 |  
 |  IS_DIR = 2
 |  
 |  IS_READABLE = 16
 |  
 |  IS_WRITABLE = 32
 |  
 |  OFFLINE = 8
 |  
 |  OTHER = 4
 |  
 |  POSC_PENDING = 64
 |  
 |  X_BIT_SET = 1
 |  
 |  reverse_mapping = {1: 'X_BIT_SET', 2: 'IS_DIR', 4: 'OTHER', 8: 'OFFLIN...
>>> s[1].flags & XRootD.client.flags.StatInfoFlags.OFFLINE
0
>>> s[1].flags & XRootD.client.flags.StatInfoFlags.OFFLINE == 0
True
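The bit test at the end of the transcript amounts to a one-line helper. OFFLINE = 8 comes from the StatInfoFlags listing above; the function name is invented here:

```python
# OFFLINE bit from XRootD.client.flags.StatInfoFlags, per the
# transcript above. A file is cached (on disk) iff OFFLINE is NOT set.
OFFLINE = 8
IS_READABLE = 16  # another flag from the same enum, used below

def xrootd_says_cached(stat_flags):
    """Return True if the stat flags indicate a disk replica exists."""
    return (stat_flags & OFFLINE) == 0

xrootd_says_cached(0)                      # True: no flags set
xrootd_says_cached(OFFLINE | IS_READABLE)  # False: OFFLINE bit is set
```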

NOvA plans to switch to art v2 soon, and we will request that a newer version of xrootd and the associated Python bindings be bundled with it. In the meantime, the Python bindings for the newer version have been worked into the batch script in the next section.