ANNIE raw data archiving

ARCHITECTURE

ANNIE DAQ

The ANNIE DAQ system runs as root on annie-daq01.
ANNIE DAQ is responsible for producing valid /data/archive/*.root files.
For use with FTS, it should also generate JSON files with the necessary metadata
( file name, length, checksum, start/end times, Run, Subrun, first/last event ).
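
A minimal sketch of how such a metadata file might be written, assuming a POSIX shell with md5sum available. The function name make_metadata and the JSON key names are assumptions for illustration, not a confirmed SAM or FTS schema. Run, subrun, event numbers and times are passed in as arguments, since only the DAQ knows them.

```shell
# Hypothetical helper: print JSON metadata for one raw data file.
# Key names are illustrative; check them against the real SAM schema.
# Usage: make_metadata FILE RUN SUBRUN FIRST_EVENT LAST_EVENT START END
make_metadata() {
    file="$1"; run="$2"; subrun="$3"
    first="$4"; last="$5"; start="$6"; end="$7"

    name=$(basename "${file}")
    size=$(wc -c < "${file}" | tr -d ' ')            # length in bytes
    sum=$(md5sum < "${file}" | awk '{print $1}')     # md5 hex digest

    # Assumes the file name contains no quotes or backslashes.
    cat <<EOF
{
  "file_name": "${name}",
  "file_size": ${size},
  "checksum": "md5:${sum}",
  "start_time": "${start}",
  "end_time": "${end}",
  "run": ${run},
  "subrun": ${subrun},
  "first_event": ${first},
  "last_event": ${last}
}
EOF
}
```

The output could be redirected to a file next to the data file, for example make_metadata FILE ... > FILE.json.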

ANNIE Archiver on annie-daq01

The ANNIE Archiver processes run as annie on annie-daq01. We may change this to annieraw in July.
They move data files via Kerberized ftp to the FTS Dropbox area /pnfs/annie/persistent/raw.
See details later in this page.
Scripts are in ${HOME}/bin, logs in /data/logs/.

script        log                     action
archdriver                            runs archdo and archftp
archdo        archiver/archdo.log     gets list of files needing archival
archiver      archiver/archiver.log   copies files to the dCache dropbox
rsyncstored   rsyncstored.log         updates storage status files from anniegpvm01
preen         preen.log               removes files which are safely stored
probe                                 checks file integrity

crontab

@reboot sleep 3600 ; ${HOME}/bin/archdriver
28 00  * * *         ${HOME}/bin/rsyncstored

FTS on anniegpvm01

FTS is a standard service for moving files from a dropbox to dCache/Enstore storage.
It needs a dropbox, and SAM metadata for the files.
It provides a nice web page for easy monitoring.
The Data Handling group will help us set up SAM and FTS.

Until SAM and FTS are set up, ANNIE is running scripts emulating FTS as annieraw on anniegpvm01.
Scripts and logs are in ${HOME}/archiver.

script   log          action
fts-dd   fts-dd.log   writes files to /pnfs/annie/raw
stored   stored.log   writes file md5sum to /annie/data/users/annieraw/stored when on tape
probar                checks file integrity
badraw   badraw.log   moves file to /pnfs/annie/BAD/raw

crontab

08 01 * * * ${HOME}/archiver/fts-dd 
08 00 * * * ${HOME}/archiver/stored

Draft plan 2016/05/14 Arthur Kreymer -

To get us started, we will initially copy raw data
  • from annie@annie-daq01:/data/archive
  • to /pnfs/annie/persistent/raw
  • with md5 sums, for eventual use, in /annie/app/users/annieraw/raw/md5

I have taken the liberty of removing : from the file names when copying.
The colon and similar characters will trip up scripts,
and cause database issues later, so should be avoided.
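
The renaming above can be sketched as a one-line filter. The function name safe_name and the sample file name below are made up for illustration; only the colon is stripped, matching what the text describes.

```shell
# Hypothetical helper: print a file name with every ':' removed.
safe_name() {
    printf '%s\n' "$1" | tr -d ':'
}

# Example (file name invented for illustration):
#   safe_name 'run100_2016:05:14.root'
```

Other awkward characters could be added to the tr set if they ever show up in DAQ file names.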

Until automatic processes are set up, scripts are running from /home/annie/bin.

Starting 2016/05/23, we run the archiving scripts every hour in the background with

set nohup ; /home/annie/bin/archdriver &

The log is /data/logs/archiver/archiver.log

Scripts in /home/annie/bin on annie-daq01

archdriver

This calls
  • archdo - to find files in need of archiving
  • archiver - to copy files to /pnfs/annie/persistent/raw
  • sleep for 1 hour, rinse and repeat.
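
The loop above might look roughly like this. This is a sketch, not the actual archdriver. The arguments (script directory, pause in seconds, optional cycle limit) are assumptions added so the loop can be exercised without waiting an hour.

```shell
# Hypothetical sketch of the archdriver cycle.
# Usage: archdriver_loop BINDIR [PAUSE_SECONDS] [MAX_CYCLES]
# MAX_CYCLES=0 (the default) means run forever.
archdriver_loop() {
    bin="$1"
    pause="${2:-3600}"
    max="${3:-0}"
    n=0
    while :; do
        # A failure in one step is reported but does not stop the loop.
        "${bin}/archdo"   || echo "archdo failed" >&2
        "${bin}/archiver" || echo "archiver failed" >&2
        n=$((n + 1))
        if [ "${max}" -gt 0 ] && [ "${n}" -ge "${max}" ]; then
            break
        fi
        sleep "${pause}"
    done
}
```

With the defaults, archdriver_loop /home/annie/bin would repeat the two steps once an hour.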

archdo

Makes a /data/archiver/do entry for any /data/archive file not yet under /data/archiver/do*.
The files must be unmodified for at least 10 minutes.

This tells the archiver to move the files.
This is a short-term hack; the DAQ could eventually do this when files are closed.
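
A sketch of that selection logic, under stated assumptions: an entry is taken to be an empty marker file named after the data file, and do* is taken to cover the do, doing and done directories. The real archdo may record entries differently. It relies on find supporting -mmin, as GNU and BSD find do.

```shell
# Hypothetical sketch of archdo.
# Usage: archdo ARCHIVE_DIR WORK_DIR
#   e.g. archdo /data/archive /data/archiver
archdo() {
    archive="$1"
    work="$2"
    mkdir -p "${work}/do"

    # Only files unmodified for more than 10 minutes are candidates.
    find "${archive}" -type f -mmin +10 | while IFS= read -r f; do
        name=$(basename "${f}")
        known=no
        for d in "${work}"/do*; do          # do, doing, done
            if [ -e "${d}/${name}" ]; then
                known=yes
            fi
        done
        if [ "${known}" = no ]; then
            : > "${work}/do/${name}"        # empty marker = "please archive"
        fi
    done
}
```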

archiver

Copies the files in /data/archiver/do to /pnfs/annie/persistent/raw

As of 05/14 this is a symlink to archscp, which copies with scp through anniegpvm02.
A service principal authorizes the copies -

The script uses the do, doing and done directories under /data/archscp to track work.
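
The do / doing / done bookkeeping can be sketched as follows. This is an illustration, not the archscp code. The copy step is passed in as a command name (a hypothetical hook taking the source path and the file name), and putting a failed entry back into do for a later retry is an assumption about the recovery policy.

```shell
# Hypothetical sketch of the do -> doing -> done bookkeeping.
# Usage: process_queue ARCHIVE_DIR WORK_DIR COPY_CMD
# COPY_CMD is called as: COPY_CMD SOURCE_PATH FILE_NAME
process_queue() {
    archive="$1"
    work="$2"
    copy_cmd="$3"
    mkdir -p "${work}/do" "${work}/doing" "${work}/done"

    for entry in "${work}/do"/*; do
        [ -e "${entry}" ] || continue               # empty queue
        name=$(basename "${entry}")
        mv "${entry}" "${work}/doing/${name}" || continue
        if "${copy_cmd}" "${archive}/${name}" "${name}"; then
            mv "${work}/doing/${name}" "${work}/done/${name}"
        else
            # Assumed policy: requeue so the next pass retries the copy.
            mv "${work}/doing/${name}" "${work}/do/${name}"
        fi
    done
}
```

An entry left in doing after a crash shows exactly which copy was interrupted.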

Kerberized ftp was recently installed on annie-daq01.
The principal is registered with dCache, mapping to account annieraw per RITM0394953.

Arthur Kreymer is working on adapting the Minos archftp script for use here,
providing protection against multiple runs, and proper error recovery.
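
One common way to guard against multiple runs is a pid file. The sketch below is an assumption about how this could be done, not the Minos archftp code. The function names and the lock path are hypothetical.

```shell
# Hypothetical pid-file lock.
# acquire_lock PIDFILE returns 0 if we now hold the lock,
# 1 if another live process already holds it.
acquire_lock() {
    pidfile="$1"
    # noclobber (set -C) makes the create fail if the file already exists.
    if ( set -C; echo "$$" > "${pidfile}" ) 2>/dev/null; then
        return 0
    fi
    oldpid=$(cat "${pidfile}" 2>/dev/null)
    if [ -n "${oldpid}" ] && kill -0 "${oldpid}" 2>/dev/null; then
        return 1                       # holder is still running
    fi
    # Stale lock from a dead process: take it over.
    # (There is a small race window here if two processes do this at once.)
    echo "$$" > "${pidfile}"
}

release_lock() {
    rm -f "$1"
}
```

A script would call acquire_lock /data/archiver/archftp.pid || exit 0 at the top (path is illustrative) and release_lock on the way out, ideally from a trap.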

rsyncstored

Updates /data/archiver/stored status file from anniegpvm01, via rsync

preen

Removes up to a given number of files from annie-daq01:/data/archive whose md5sums are listed in /data/archiver/stored.
Verifies the local checksum before removing the file.
Acts on files older than a given age, by default 7 days.
By default it only previews the action.

preen 10+7 # test 10 files over 7 days old
preen 10 do # really remove them

A log is kept at /data/logs/preen.log
Output is also tee'd to stdout for interactive sessions
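
The verify-then-remove step for a single file might look like this. It is a sketch under assumptions: the stored status file is taken to be named after the data file, with the md5 hex digest as its first word, and the function name preen_one is hypothetical. The age and count selection are left out.

```shell
# Hypothetical sketch of the per-file preen check.
# Usage: preen_one FILE STORED_DIR [do]
# Returns 1 if there is no stored record, 2 on checksum mismatch.
preen_one() {
    file="$1"
    stored="$2"
    mode="${3:-preview}"
    name=$(basename "${file}")

    [ -f "${stored}/${name}" ] || return 1           # not confirmed stored

    want=$(awk '{print $1; exit}' "${stored}/${name}")
    have=$(md5sum < "${file}" | awk '{print $1}')
    if [ "${want}" != "${have}" ]; then
        echo "MISMATCH ${name}" >&2                  # keep the local file
        return 2
    fi

    if [ "${mode}" = "do" ]; then
        rm -f "${file}" && echo "removed ${name}"
    else
        echo "would remove ${name}"                  # preview is the default
    fi
}
```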

  • Added feature 2016/08/11
    • can specify an age (days) requirement as well as a count.
    • Format :
      preen NUM+AGE
      
    • NUM defaults to 9999, AGE defaults to 7
  • Crontab
    • Starting 2016/08/16 preen is running via cron
      28 01  * * *         ${HOME}/bin/preen +14 do
      
    • Initially preen +14, more aggressive later if needed
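
The NUM+AGE format and its defaults can be parsed with plain shell parameter expansion. This is a sketch, not the preen source. The function name is hypothetical and it only handles the first argument.

```shell
# Hypothetical parser for the preen NUM+AGE argument.
# Sets NUM (default 9999) and AGE in days (default 7).
parse_preen_spec() {
    spec="${1:-}"
    case "${spec}" in
        *+*) num="${spec%%+*}"      # text before the '+'
             age="${spec#*+}" ;;    # text after the '+'
        *)   num="${spec}"
             age="" ;;
    esac
    NUM="${num:-9999}"
    AGE="${age:-7}"
}

# parse_preen_spec 10+7   gives NUM=10   AGE=7
# parse_preen_spec +14    gives NUM=9999 AGE=14
# parse_preen_spec 10     gives NUM=10   AGE=7
```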

probe

This lists all local and offline copies of a given file,
with file length and checksums ( md5sum and ecrc ).
Use this to see whether a file was corrupted on the way to /pnfs/annie/raw.
Use badraw to set aside /pnfs/annie/raw files.

stash ( obsolete 2016-06-14 )

We prefer to remove files from /data/archive when they are on tape.
Tape is not set up yet (2016-05-26), so meanwhile we will live with a copy in /pnfs/annie/persistent/raw,
and another copy on annie-daq02

The stash script copies to annie-daq02:/data/archive :
  • files in /data/archive
  • and listed in /data/archiver/done
  • and not yet having a checksum in /data/archiver/stashed

The log is /data/logs/stash.log
File md5sums are checked on both ends, and recorded in /data/archiver/stashed/${FILE}

Scripts in ${HOME}/archiver on anniegpvm01/02

fts-dd

This is a pale shadow of what FTS will do eventually.
FTS is standard supported code, with web status available.

For FTS we need SAM metadata, so SAM needs to be set up
RITM0394415 05/12 SAM for ANNIE - no action 06/14

Presently, given a count, fts-dd will copy that many new files from /pnfs/annie/persistent/raw to /pnfs/annie/raw.
It writes ecrc and md5sum checksums to /annie/data/users/annieraw/ftscrc and ftssum.
The log is fts-dd.log
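
A rough sketch of the count-limited copy, with assumptions stated: plain cp stands in for the real transfer, only md5sum is recorded (ecrc is left out), and a copy whose checksum does not match is removed, as in the 2016/09/05 History entry. The argument layout is hypothetical.

```shell
# Hypothetical sketch of the fts-dd copy step.
# Usage: fts_dd COUNT SRC_DIR DST_DIR SUM_DIR
# Prints the number of files copied and verified.
fts_dd() {
    count="$1"; src="$2"; dst="$3"; sums="$4"
    mkdir -p "${dst}" "${sums}"
    copied=0
    for f in "${src}"/*; do
        [ -f "${f}" ] || continue
        [ "${copied}" -lt "${count}" ] || break
        name=$(basename "${f}")
        if [ -e "${dst}/${name}" ]; then
            continue                                # already copied
        fi
        cp "${f}" "${dst}/${name}" || continue
        a=$(md5sum < "${f}" | awk '{print $1}')
        b=$(md5sum < "${dst}/${name}" | awk '{print $1}')
        if [ "${a}" = "${b}" ]; then
            echo "${a}" > "${sums}/${name}"         # record the checksum
            copied=$((copied + 1))
        else
            rm -f "${dst}/${name}"                  # drop the bad copy
        fi
    done
    echo "${copied}"
}
```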

First run was 2016/06/04, on anniegpvm01.

Data rates seem to be around 100 MB/sec.
I would prefer to be doing third party ftp, as supported by ifdh cp.
But I do not know how to convert our annieraw project principal to a corresponding SSL proxy.

set nohup ; ${HOME}/archiver/fts-dd &

stored

This writes named files with md5 sums to /annie/data/users/annieraw/stored
for any file that is confirmed to be stored in Enstore.
The log is stored.log

probar

This lists file sizes and checksums in persistent and Enstore areas.
Called by the DAQ system probe script
Used prior to removal of bad Enstore files with badraw.

badraw

This moves data files from /pnfs/annie/raw to /pnfs/annie/BAD/raw.
This script self-logs to badraw.log

badper

This moves data files from /pnfs/annie/persistent/raw to /pnfs/annie/persistent/BAD/raw.
This script self-logs to badper.log

History

  • 2016/09/01 fts-dd added pid file to prevent multiple runs
  • 2016/09/01 samweb server was deployed, fts server was requested
  • 2016/09/05 deployed probe and probar on DAQ and offline
  • 2016/09/05 deployed badraw offline
  • 2016/09/05 fts-dd checks md5sum in /pnfs/annie/raw, removes bad copies

Plans

Fairly soon we will
  • move the annie-daq01 scripts from the annie to the annieraw account
  • commit the scripts to git
  • add SAM metadata
  • decide on a directory structure for longer term storage under /pnfs/annie/raw (tape)