
2. SAM History

SAM for Tevatron Data Handling
o Forerunner of LHC data management (but important differences)
o Metadata catalog (file name, size, events, run info, luminosity, MC details, ...)
Lesson: keep metadata out of the filename
o Dataset creation and queries
Lesson: need to balance ease of use and flexibility (experts and non-experts)
o Replica catalog (where all file copies are located)
o Coordinates and manages data movement to jobs (bbftp, gridftp, dccp, SRM,
future XRootD)
Lesson: be ready to adopt new protocols - flexibility is key
o Cache management (now using dCache)
Lesson: don’t do it yourself once something more standard meets requirements
o File consumption and success tracking, recovery
Lesson: this is an important feature (especially with opportunistic resources)

Cache Management

  1. Optimizes access to tape. SAM groups and schedules tape accesses to minimize multiple mounts (e.g. dCache knows what’s requested, SAM knows what’s going to be requested).
  2. Throttles tape access by cache node and project to protect the tape system from overload.
  3. Protects cache access from overload by throttling cache access by node and project.
  4. Protects the cache from high turnover by carving it into groups, so that, e.g., production jobs reading lots of files once (high turnover) do not interfere with skims that need a long cache lifetime for analysis jobs. [We use this sparingly, but effectively.]
  5. Provides automatic file placement and expiration. With SAM, there is no need to pre-place or pin files in a cache. Regular usage brings files in from tape, popularity keeps them in the cache, and non-usage eventually makes them disappear. This is very successful and free of human effort.
  6. Automatically pre-fetches files into a cache before jobs start and while jobs are busy, within throttling and other limits.
  7. Supports many protocols: SAM can transfer files between a wide variety of storage elements, allowing opportunistic use of resources.
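The automatic placement and expiration behavior in point 5 is essentially a popularity-driven, least-recently-used cache. A minimal toy model of that policy (an illustration only, not SAM's actual implementation):

```python
from collections import OrderedDict

class LRUFileCache:
    """Toy model of popularity-based cache expiration (illustration only,
    not SAM's actual implementation)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.files = OrderedDict()  # file name -> resident flag

    def access(self, name):
        """A read request: fetch from 'tape' on a miss, refresh on a hit."""
        if name in self.files:
            self.files.move_to_end(name)    # popularity keeps it resident
            return "hit"
        if len(self.files) >= self.capacity:
            self.files.popitem(last=False)  # least-recently-used file expires
        self.files[name] = True             # staged in from tape
        return "miss"

cache = LRUFileCache(capacity=2)
assert cache.access("a.root") == "miss"     # staged in from tape
assert cache.access("a.root") == "hit"      # popularity keeps it cached
cache.access("b.root")
cache.access("c.root")                      # evicts a.root, the LRU entry
assert cache.access("a.root") == "miss"
```

Regular usage brings files in, popularity keeps them resident, and non-usage eventually evicts them, with no manual pinning.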

Query Language

SAM provides a query language for retrieving sets of files by their metadata values; it has been enhanced in the new version of SAM to take advantage of experience gained. Simple queries like “run_number 13501 and file_format raw” return all raw data files from run 13501. More complex queries are also possible: “run_number 13501 and file_format raw and notisparentof: (application reconstruction and version S12.02.14)” returns all raw files from run 13501 that do not yet have a derived file reconstructed with the specified version of the software.

Flexible datasets can be created as stored queries. These datasets are evaluated dynamically: if more files matching the query criteria have been added since the dataset was created, they will be included when it is used. End users can define their own datasets; they are not restricted to a predefined set created by an administrator.
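Because dimension queries are clauses joined with `and`, they compose mechanically. A small sketch of how such query strings can be built up (the helper functions here are hypothetical, not part of the SAM API):

```python
def sam_query(*clauses):
    """Join metadata clauses into a SAM dimension query string.
    (Helper name and approach are illustrative, not part of the SAM API.)"""
    return " and ".join(clauses)

def not_is_parent_of(subquery):
    """Exclude files that already have a matching derived child."""
    return "notisparentof: (%s)" % subquery

# The simple query from the text:
raw = sam_query("run_number 13501", "file_format raw")
assert raw == "run_number 13501 and file_format raw"

# The complex query: raw files with no reconstructed child of that version.
unprocessed = sam_query(
    raw,
    not_is_parent_of(sam_query("application reconstruction",
                               "version S12.02.14")))
assert unprocessed == ("run_number 13501 and file_format raw and "
                       "notisparentof: (application reconstruction "
                       "and version S12.02.14)")
```

Saving such a string as a named dataset definition is what makes the dataset dynamic: the query is re-evaluated each time the dataset is used.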

When running a data processing task, the job submission system first notifies SAM that a particular dataset is required by starting a project for it. Staging of the files can begin at this point, if required. Once each job has begun execution, it attaches itself to the project and begins requesting files. When a job finishes with one file, it notifies the system and is given another. This continues until all files have been processed.

SAM provides on-site near-line direct access to files on tape, or on- and off-site access to files through a disk-cache front end to the tape storage. The tape storage system, Enstore, was developed by Fermilab. Files are written to disk and then migrated to Enstore tapes. For read requests, files that do not reside in the disk cache are first retrieved from Enstore.

Steps in Processing Files
  1. Users develop a query to produce the files wanted, saving it as a dataset definition.
  2. The project, a task processing that dataset definition, retrieves the files if needed and tells the analysis process where each file is located. (Users don’t explicitly list the file names, and don’t need to know where they are.)
  3. Multiple processes can run for the same project, sharing the files between them automatically.
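The steps above can be sketched as a minimal consumer loop, with the SAM service replaced by an in-memory stand-in (all names here are illustrative):

```python
# Sketch of the dataset -> project -> shared consumption flow.
# The SAM service is replaced by an in-memory stand-in; names are illustrative.

class FakeProject:
    """Stand-in for a SAM project serving one dataset to many consumers."""
    def __init__(self, files):
        self.remaining = list(files)

    def get_next_file(self):
        # Each call hands out one file; the real service also resolves
        # the file's location so consumers never need to know it.
        return self.remaining.pop(0) if self.remaining else None

def run_consumer(project, results):
    while True:
        f = project.get_next_file()
        if f is None:          # dataset exhausted
            break
        results.append(f)      # "process" the file

project = FakeProject(["f1.raw", "f2.raw", "f3.raw"])
done = []
run_consumer(project, done)    # several consumers could share `project`
assert done == ["f1.raw", "f2.raw", "f3.raw"]
```

Several consumer processes attached to the same project would each call `get_next_file`, so the files are divided among them automatically.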

SAM is the data handling system used by multiple current and future Fermilab experiments. It builds on long experience from the Tevatron, and has been updated to use modern technologies. Little-used or unnecessary features were removed, and the CORBA Remote Procedure Call interface was replaced with an HTTP interface based on REST principles. SAM is now part of an integrated system, the Fermilab FabrIc for Frontier Experiments (FIFE), which provides common tools and systems for scientific data processing.

SAM is experiment agnostic, providing flexibility. The high level of automation reduces the administrative effort required from experiments.

[Figure: Features that SAM brings to cache management. Picture from Adam Lyon.]

File requests within Art

At some point Art demands a file to read. The input file loop is as follows:

From the base URL and the process ID, Art constructs the URL corresponding to SAMWeb’s “getNextFile”. Art performs the HTTP POST request, which may produce one of several results from SAMWeb:

SAMWeb returns error code 400-499. These codes represent unrecoverable errors (e.g. the SAM Project does not exist). Art will follow its standard fatal error procedure.

SAMWeb returns error code 500-599. These codes represent potentially recoverable errors (e.g. the DB server could not be contacted). Art should retry the “getNextFile” POST request a maximum number of times. Once that limit is exceeded, the error is deemed unrecoverable and Art will follow its standard fatal error procedure.

SAMWeb returns code 202 (try again later) along with a suggested wait time. SAMWeb returns this code if it knows a file is not immediately available. Art should wait the specified amount of time and repeat the POST call.

SAMWeb returns code 204 (no more files). This code is returned when the file pool is exhausted. Art should treat this condition as reaching the end of its file list and perform the appropriate actions.

SAMWeb returns a URI corresponding to the file location.
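The possible outcomes of “getNextFile” listed above amount to a small dispatch table. A sketch of how a client might classify the response (the function name, return values, and retry handling are assumptions, not Art's actual code):

```python
def next_file_action(status, body=None, retries_left=3):
    """Map a SAMWeb 'getNextFile' response to a client action.
    (Illustrative sketch; follows the cases described in the text.)"""
    if 400 <= status < 500:
        return "fatal"                 # unrecoverable, e.g. no such project
    if 500 <= status < 600:            # potentially recoverable server error
        return "retry" if retries_left > 0 else "fatal"
    if status == 202:
        return ("wait", body)          # body carries the suggested wait time
    if status == 204:
        return "end_of_files"          # file pool exhausted
    if status == 200:
        return ("deliver", body)       # body is the file URI
    return "fatal"                     # anything unexpected is fatal

assert next_file_action(404) == "fatal"
assert next_file_action(503, retries_left=0) == "fatal"
assert next_file_action(202, body=30) == ("wait", 30)
assert next_file_action(204) == "end_of_files"
assert next_file_action(200, body="file:///pnfs/x.root") == \
    ("deliver", "file:///pnfs/x.root")
```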

The last case, returning a URI, means that the file has been delivered and Art can retrieve it. The file URI can be of types file://, gridftp://, srm://, etc. Only file:// will be used initially. The file will have to be copied from its delivery point to the local node (many experiments have tried streaming the file directly to the application, but that has never been more efficient than simply copying the file to a local location). The copy procedure may be complex and depends on the storage type (e.g. using “cpn”, “gridftp”, or “srm” tools).

To encapsulate the mechanisms for moving the file to the local node, REX/DH will supply an Art Service providing a method that, when given the file URL, will move the file to the local node and return a local file path (or error code) that Art can use to open the file. Art should block while waiting for this method to return.

If the move failed, Art (not the transfer service) will make an HTTP POST call to SAMWeb’s “updateFileStatus” with the argument “skipped”, indicating that the file transfer failed. Art should then follow its file load failure procedure.

If the move was successful, Art (not the transfer service) will make an HTTP POST call to SAMWeb’s “updateFileStatus” with the argument “transferred”, indicating that the file was transferred successfully (and allowing SAM to unlock the source location of the file). Art then processes the file normally.

Art closes the input file when it has exhausted its data, some successful condition is reached (e.g. maximum number of desired events), or there was a processing failure. Art then makes an HTTP POST request to SAMWeb’s “updateFileStatus” with an argument of “consumed” if the processing was successful or “skipped” if the processing failed.

Art now calls SAMWeb’s “getNextFile” as described above, and the loop continues.
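The input-file loop described in this section can be sketched end-to-end, with SAMWeb and the transfer service replaced by stubs (all names are illustrative, not the real interfaces):

```python
# End-to-end sketch of the input-file loop described above, with the
# SAMWeb calls replaced by a recording stub. All names are illustrative.

class StubSAMWeb:
    def __init__(self, uris):
        self.uris = list(uris)
        self.log = []                  # records updateFileStatus calls

    def get_next_file(self):
        return self.uris.pop(0) if self.uris else None  # None ~ code 204

    def update_file_status(self, uri, status):
        self.log.append((uri, status))

def transfer(uri):
    """Stand-in for the REX/DH transfer service; fails on one marked file."""
    return not uri.endswith("bad.root")

def input_loop(samweb, process):
    while True:
        uri = samweb.get_next_file()
        if uri is None:
            break                                       # no more files
        if not transfer(uri):
            samweb.update_file_status(uri, "skipped")   # transfer failed
            continue
        samweb.update_file_status(uri, "transferred")   # source unlocked
        ok = process(uri)                               # run over the file
        samweb.update_file_status(uri, "consumed" if ok else "skipped")

sam = StubSAMWeb(["file://a.root", "file://bad.root"])
input_loop(sam, process=lambda uri: True)
assert sam.log == [("file://a.root", "transferred"),
                   ("file://a.root", "consumed"),
                   ("file://bad.root", "skipped")]
```

The status transitions ("transferred", then "consumed" or "skipped") are what let SAM track per-file success and drive recovery of failed files.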

Processing completion

At some point, Art will complete its processing and exit with success or a failure code.

Returning results to the user

Art may produce output files as a result of processing. When an output file is closed, Art will call a Service written by REX/DH, passing the file’s path to notify it that an output file is available for transfer. Art will block while waiting for the service method to return; in most instances the method should return almost immediately, deferring the transfer until later or delegating it to another process. Art will never delete output files it has created.

Art will also call the service when it opens and closes an input file, so that the service can keep track of which input files correspond to each output file. (While it is agreed that this mechanism is less error-prone, REX/DH would rather have Art itself keep track of the input-to-output file correspondence and fill the metadata accordingly.)

REX/DH needs to determine how this Service will work. Options include notifying another process on the machine and transferring the output file in parallel with the Art processing, or appending the output path to a list and then moving the files after Art processing completes.
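The second option, appending paths to a list and moving the files after Art completes, might look like this sketch (the class and method names are hypothetical):

```python
# Sketch of the "defer and move later" option described above.
# Class and method names are hypothetical, not the REX/DH interface.

class DeferredOutputService:
    def __init__(self):
        self.pending = []

    def output_file_closed(self, path):
        """Called by Art when an output file is closed; returns immediately
        so Art blocks only momentarily."""
        self.pending.append(path)

    def flush(self, mover):
        """After Art exits, transfer everything that was recorded.
        `mover` is a stand-in for the real copy mechanism."""
        moved = [p for p in self.pending if mover(p)]
        self.pending = [p for p in self.pending if p not in moved]
        return moved

svc = DeferredOutputService()
svc.output_file_closed("out1.root")
svc.output_file_closed("out2.root")
moved = svc.flush(mover=lambda path: True)   # pretend every copy succeeds
assert moved == ["out1.root", "out2.root"]
assert svc.pending == []                     # failed copies would remain
```

The alternative (notifying another process that transfers in parallel with Art) trades this simplicity for earlier delivery of the outputs.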

Job completion

When the output files are transferred to their destination, the job wrapper script makes an HTTP POST call to SAMWeb’s “setStatus” with an argument of “completed” if the job was successful or “bad” if this job would need recovery.

The analysis job ends. When all of the analysis jobs end, a final job is started to end the SAM Project with a call to SAMWeb’s “endProject”.

Basic SAM concepts

Communication with the SAM catalog is handled through the web interface. (This used to be a CORBA service.)

Installing and configuring SAM

Installing SAM has a few prerequisites: a user account on the system named sam, with a particular UID; prior installation of the Fermilab product distribution software (ups/upd); and system configuration to call ‘ups startup’ at boot time on systems that will run production SAM servers. It is then necessary to install the clients, the servers, and at least one file transfer protocol understood by SAM.

Use cases of SAM

SAM is in use in production by DØ for several different use cases. The DØ online system and several offsite Monte Carlo production centers deploy SAM File Storage Servers, using these to store collider and simulation data into ENSTORE (the Fermilab mass storage system) via SAM. These data are then accessed by the Fermilab DØ systems and by remote DØ systems running SAM stations. The onsite stations are purely Linux systems (the desktop cluster CluED0), mixed Irix-Linux systems (CAB, the reconstruction farm), and the large Irix SMP (d0mino). The CluED0 station is used primarily for small-scale analysis jobs; CAB for large-scale analysis jobs; and the d0mino station for high-throughput applications (picking individual events out of large datasets, distributing large datasets to remote stations). Remote analysis stations have been established at many remote sites; about 20 such stations are active now, with varying configurations.

SAM on Grids

SAM stations can be united in a grid with a submission system using Condor and Globus grid tools, as mentioned above. The future plans for SAM include enhancing its main components to permit even more use of Grid tools, including virtual organization tools, technology for creating run time environments on general-use clusters, and multi-layer caching strategies.

Installing and configuring SAMGrid

Installing a client, submission, monitoring, or execution site for grid-aware SAM usage is similar to installing SAM. The client sites (where the user submits a job that is sent to a submission site) are lightweight, requiring only a JIM product. Other types of site require the Globus security infrastructure, an XML database installation, and the relevant SAM and JIM products. As is the case for SAM, the Fermilab product distribution software is used. Again, sites with firewalls will need to open particular ports.

SAM Operations

An important part of a data handling system is its operations model. For SAM, the model is a three-level hierarchy of monitoring and response. An experiment using SAM supplies shifters who monitor the production SAM systems using Web and command-line tools supplied by SAM and the experiment. Shifters report problems which they cannot solve to an expert-on-call from the SAM team. The expert reports problems which are bugs or design issues to the developer(s) of the affected component(s).