Nova FTS Guide

The File Transfer System (FTS) has been installed and is currently running for the NOvA experiment. The FTS runs on machines that are part of the data acquisition (DAQ) computing cluster and is currently used to process raw data files and DAQ log files for the general DAQ system.

Installation

The FTS system and its external dependencies can be installed via the UPS/UPD mechanism. This process allows new instances of the FTS to be brought up quickly and minimizes version conflicts with existing packages that may be installed on a system.

The current FTS installation relies on the following external products which are available from UPS/UPD for the Linux 2.6 i386 and x86_64 architectures:


  • python v2_4_6
  • twisted_python v11_0
  • sam v8_8_2
  • sam_config v7_1_7
  • sam_ns_ior v7_2_3
  • samgrid_batch_adapter v7_2_7

The SAM station system:

  • sam_station v8_8_20 Linux64bit+2.6

To generate a current listing of dependencies for the FTS system, you can directly query the UPS database with the following command:

> ups depend FileTransferService

FTS Setup

Because the FTS uses UPS with a proper dependency chain, a single UPS setup command is all that is required to initialize and set up the FTS software:

> setup FileTransferService

Once the system has been set up, the FTS daemon can be started.

The FTS currently relies on several external tools to extract metadata from the files that it processes. These packages are experiment-specific: their executables must be available on the standard execution path, and any additional libraries they need must be resolvable through the library paths.
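As a minimal sketch of how the path requirement could be verified before starting the daemon, the helper below checks that each named executable is found on the execution path. The function name require_tools is hypothetical and is not part of the FTS or UPS; it only illustrates the check.

```shell
#!/bin/bash
# Hypothetical helper (not part of FTS): report any named tool that is
# not found on the current execution path.
require_tools() {
    local tool missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing from PATH: $tool" >&2
            missing=1
        fi
    done
    # Exit status 0 only if every tool was found.
    return "$missing"
}
```

For example, after the environment has been set up, calling require_tools with MetaDataRunTool and start_fts as arguments would flag a missing metadata extractor before the daemon is launched.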

To set up the NOvA DAQ specific software that the FTS will use, source the standard online scripts and optionally define a test release to use for local modifications. For the FTS installation on novadaq-datadisk-01, the initialization sequence is:

source /nova/novadaq/setup/setup_novadaq_nt1.sh
cd /home/novaraw/build/dev_tools/
srt_setup -a

This sets up the novadaq environment, correctly sets the $SRT_PUBLIC_CONTEXT and $SRT_PRIVATE_CONTEXT environment variables, and pushes them onto the correct paths.

Then to initialize the FTS and SAM products do the following:

export PRODUCTS=$HOME/sam/db:$PRODUCTS
export SETUPS_DIR=$HOME/sam/etc

This adds the UPS database containing the SAM products to the UPS paths. Now set up the products:

setup ups
setup sam
setup FileTransferService

Then start the daemon using a config file (e.g. fts_config.txt) from the config directory (/home/novaraw/fts_rundata/):

start_fts /home/novaraw/fts_rundata fts_config.txt

The FTS system will spin up and start processing files.

Transfer Delays

When the FTS starts, it immediately checks for new files in the directories that have been registered with it. If a file is new and has not already been transferred to an appropriate location, the FTS starts the data migration process. However, to prevent the FTS from attempting to transfer a file that is still open by a DAQ process (e.g. a file still being written by the datalogger, or a log file held open by an active DAQ system), the FTS provides a "hold-off" setting: a mandatory delay between the modification time of the file and the current system time of the machine that the FTS is running on.

By default this hold-off is set to 25 hours to ensure that files being synced between the DAQ spool disks are not accidentally sent to archival storage while incomplete.

The hold-off can be adjusted through the "scan-delay" configuration parameter. See Configuration.
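To see whether a given file would already be past the hold-off, the comparison between modification time and current system time can be reproduced by hand. The sketch below is only an illustration of that comparison. The function name is_past_holdoff is hypothetical, the FTS performs its own check internally, and the script assumes GNU stat.

```shell
#!/bin/bash
# Hypothetical helper (not part of FTS): succeed if FILE was last modified
# at least HOURS ago (default 25, matching the default hold-off).
is_past_holdoff() {
    local file=$1 hours=${2:-25}
    local now mtime
    now=$(date +%s)
    # GNU stat: %Y prints the modification time in seconds since the epoch.
    mtime=$(stat -c %Y "$file") || return 2
    [ $(( now - mtime )) -ge $(( hours * 3600 )) ]
}
```

A file modified 26 hours ago passes the default check, while a file that is still being written fails it because its modification time keeps advancing.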

Automatic FTS Start Up

The FTS is designed to be an "always running" system that operates independently of other DAQ systems. It does not require the normal NOvA DAQ systems to be running to accomplish its job of migrating files. As a result, most users will have very little actual interaction with the FTS aside from noting that it is running and not reporting errors.

For the current deployment of the FTS at the NOvA near detector computing cluster, the FTS is installed on the second data spool disk (novadaq-ctrl-datadisk-02.fnal.gov) and is automatically started via the cron facility.

The actual startup sequence is placed in the crontab for the novaraw user and is executed on reboot.

Because the FTS relies on parts of the NOvA DAQ software distribution, it requires that the area that houses the SRT releases be mounted prior to FTS system startup. In particular on the NOvA cluster, /nova/novadaq needs to have been mounted prior to attempting to start the FTS. The crontab entry for the FTS on datadisk-02 has the following form which takes into account this requirement:

# Crontab for user novaraw
@reboot while ! [ -d /nova/novadaq/ ]; do sleep 60; done && (other commands to execute)

In particular both the SAM station and FTS are started this way with the crontab entries:

@reboot while ! [ -d /nova/novadaq/ ]; do sleep 60; done && . /home/novaraw/setup_sam_prd.sh && ups start sam_bootstrap >& /dev/null
# The below setup puts a local build dir on the path, as we need the MetaDataRunTool. Eventually it will become part of the release
@reboot while ! [ -d /nova/novadaq/ ]; do sleep 60; done && . /home/novaraw/setup_sam_prd.sh && setup FileTransferService && start_fts /home/novaraw/fts_rundata fts_config.txt >& /dev/null
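The crontab entries wait indefinitely for /nova/novadaq to appear. If a bounded wait is preferred, the same loop can be wrapped in a small function that gives up after a number of attempts. This is a sketch under that assumption, not what is installed on the cluster. The name wait_for_dir and its interval and attempt-count arguments are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper (not part of FTS): wait until DIR exists, polling
# every INTERVAL seconds, and give up after TRIES checks.
wait_for_dir() {
    local dir=$1 interval=${2:-60} tries=${3:-30}
    while [ ! -d "$dir" ]; do
        tries=$(( tries - 1 ))
        # Out of attempts: report failure so the caller can skip startup.
        [ "$tries" -le 0 ] && return 1
        sleep "$interval"
    done
}
```

With the defaults, calling wait_for_dir on /nova/novadaq/ would check once a minute for up to about half an hour before returning a failure status, so a startup chain joined with && would stop instead of waiting forever.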

File processing

Under the current configuration that is installed for the NOvA DAQ, the FTS system performs the following combination of functions:

  1. Copies all raw NOvA data files to central disk storage (Bluearc)
  2. Copies all raw NOvA data files to archival tape storage (Enstore)
  3. Copies all NOvA DAQ log files to archival tape storage (Enstore)
  4. Bundles data files and log files into appropriate tarballs to meet the Enstore storage requirements
  5. Deletes log files that have been copied to archival storage and are more than 60 days old

The system also catalogs all files that undergo copy/transfer into the SAM metadata catalog.

In order to process each type of file the FTS system is required to have a method of generating or extracting metadata related to the file. For the current system the following tools are used to obtain the metadata:

Raw Data:

  • MetaDataRunTool from the NOvADAQ MetaDataTools package.

Log Files:

  • Internal generation from the FTS system itself

Note: the executable name MetaDataRunTool is currently hardwired into the FTS system. This should be moved to a config file to allow easy configuration of new data types and metadata extraction tools.

File Locations and filename patterns

Currently the FTS system looks in the following file tree locations for new files:

  • /daqlog/NDOS/
  • /data2/NDOS/

Only files that match predefined patterns are considered valid targets for the FTS to transfer and catalog. Under the current configuration the following filename patterns are used:

  • Data files: *.raw
  • Log Files: *.log *.gz

The system can additionally be configured to handle other types of files by adding additional filename masks to the appropriate sections of the FTS configuration file. See Configuration.

Filename patterns can also be excluded from FTS processing using a similar mechanism. Under the current setup the following filename patterns are excluded from FTS processing:

For raw data:

  • Excluded: SingleEventAtom*.raw

For Log files:

  • Excluded: ospl-*.log
  • Excluded: ospl-*.log.gz
  • Excluded: css-*.log
  • Excluded: alarmServer*.log
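The combined effect of the include and exclude patterns listed above can be sketched with a shell case statement. This is only an illustration of the matching order (exclusions first, then includes). It is not the FTS configuration syntax, and the function name fts_classify and its output labels are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper (not part of FTS): classify a file name according to
# the include/exclude patterns listed in this guide.
fts_classify() {
    local name
    name=$(basename "$1")
    case "$name" in
        # Exclusions are tested first so they override the include masks.
        SingleEventAtom*.raw)        echo excluded ;;
        ospl-*.log|ospl-*.log.gz)    echo excluded ;;
        css-*.log|alarmServer*.log)  echo excluded ;;
        # Include masks.
        *.raw)                       echo data ;;
        *.log|*.gz)                  echo log ;;
        # Anything else is not an FTS target.
        *)                           echo ignored ;;
    esac
}
```

Passing a path such as /data2/NDOS/SingleEventAtom_test.raw would therefore be reported as excluded even though it also matches the *.raw data mask.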



Raw files are subject to the 25-hour hold-off delay before they are transferred.


Currently handles:

  • Log files: copied to tape only, and deleted after 60 days
  • Raw data files: copied to Bluearc and to tape, and NOT deleted

Raw data files rely on an external metadata extractor (MetaDataRunTool). Log files do not rely on external programs.

Directories and filename patterns scanned:

  • /daqlogs/NDOS/ (*.log, *.gz)
  • /data2/NDOS/ (*.raw)

Retry Files

curl -d filename=myfile http://novadaq-ctrl-datadisk-02.fnal.gov:8888/fts/retryFiles
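When several files need to be retried, the request can be repeated once per file. The wrapper below is a minimal sketch that sends exactly the same request as the command above for each argument. The function name fts_retry and the FTS_URL and CURL variables are hypothetical conveniences. Whether the endpoint accepts more than one filename per request is not documented here, so the sketch does not assume it.

```shell
#!/bin/bash
# Base URL of the FTS web interface (taken from this guide); override
# FTS_URL in the environment to point at another instance.
FTS_URL=${FTS_URL:-http://novadaq-ctrl-datadisk-02.fnal.gov:8888/fts}

# Hypothetical helper (not part of FTS): request a retry for each file
# name given, one POST per file. CURL may be overridden for a dry run.
fts_retry() {
    local f
    for f in "$@"; do
        ${CURL:-curl} -d "filename=$f" "$FTS_URL/retryFiles" || return 1
    done
}
```

Setting CURL=echo before calling fts_retry prints the arguments that would be passed to curl instead of contacting the server, which is a convenient way to check the file list first.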

Remove a file

First remove the file itself, then run the retry command above for that filename. This clears the file from the FTS lists.

Status

General status is available from:

http://novadaq-ctrl-datadisk-02.fnal.gov:8888/fts/status

To get a JSON-formatted response, append the format=json query parameter:

/fts/status?format=json
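A simple reachability check of the status page can be scripted with curl. The sketch below only tests that the page answers successfully. It does not inspect the page for reported errors, since the layout of the status output is not described in this guide. The function name fts_check and the FTS_URL and CURL variables are hypothetical.

```shell
#!/bin/bash
# Base URL of the FTS web interface (taken from this guide).
FTS_URL=${FTS_URL:-http://novadaq-ctrl-datadisk-02.fnal.gov:8888/fts}

# Hypothetical helper (not part of FTS): succeed if the status page
# responds without an HTTP error. CURL may be overridden for testing.
fts_check() {
    # -s: silent, -f: fail on HTTP errors, -o: discard the page body.
    if ${CURL:-curl} -sf -o /dev/null "$FTS_URL/status"; then
        echo "FTS status page reachable"
    else
        echo "FTS status page NOT reachable" >&2
        return 1
    fi
}
```

A successful result means only that the daemon's web interface is answering. The status page itself still has to be read to see whether errors are listed.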

Reboot

After a reboot, if the /nova/novadaq mount is not present when the FTS tries to start, the start script gets confused.

The same is true of the Bluearc mounts, although in that case the FTS just lists them in the errors on the status page.
The same is true of the daqlogs source area.

Config

The FTS home directory is fts_rundata (/home/novaraw/fts_rundata).

The configuration is in fts_config.txt in that directory.

The FTS must be restarted for configuration changes to take effect.

Restarting

setup FileTransferService
stop_fts /home/novaraw/fts_rundata
start_fts /home/novaraw/fts_rundata fts_config.txt
(The arguments to start_fts are the run directory path, then the config file name.)

Tarballs

Currently tarballs are also being written to the local disk as a precaution. They are in:

/data2/merge_logs/archive/
/data2/merge_raw/archive/

If there is a problem, this directory may fill up:

/data2/merged_raw/build_tar/

For log files, the tarballs are being built on /scratch in:

/scratch/mergedlogs/
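Because a stalled transfer shows up as files accumulating in the build directory, a periodic count of its contents gives an early warning. The sketch below is one way to do that. The function name check_backlog and the threshold of 100 files are hypothetical and would need tuning to the normal size of the directory.

```shell
#!/bin/bash
# Hypothetical helper (not part of FTS): warn when DIR holds more than
# MAX regular files (default 100, an arbitrary illustrative threshold).
check_backlog() {
    local dir=$1 max=${2:-100} count
    [ -d "$dir" ] || { echo "missing: $dir" >&2; return 2; }
    # Arithmetic expansion strips any padding that wc may add.
    count=$(( $(find "$dir" -type f | wc -l) ))
    if [ "$count" -gt "$max" ]; then
        echo "WARNING: $dir holds $count files (limit $max)"
        return 1
    fi
    echo "OK: $dir holds $count files"
}
```

Running check_backlog on /data2/merged_raw/build_tar/ from a scheduled job would report a warning once the number of files waiting there exceeds the chosen limit.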