Gavin Davies, 04/24/2014 08:03 PM

SAM Web Cookbook (Nova Edition)

Right now the SAM web client is not set up by default, so you must set it up yourself (this will change in due course!).
Setup SAM Web Client:

export PRODUCTS=/grid/fermiapp/products/common/db/:$PRODUCTS
setup sam_web_client <version>

The version is optional; use it if you want a specific version of the client (instead of the latest production one).

Set your experiment with an environment variable (this saves you typing later):

export EXPERIMENT=nova

If you don't have a certificate, get one (based on your Kerberos credentials):

kx509

All of this can be automated by putting it in your login files (e.g. .bash_profile or .bashrc).
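
As a rough sketch of what such a login-file fragment could look like: the guard around setup and the check for an existing PRODUCTS entry are assumptions added for illustration, not something the SAM client requires.

```shell
# Sketch of a ~/.bashrc fragment (assumes the "setup" command is defined on the node).
sam_products=/grid/fermiapp/products/common/db/

# Prepend the products area only once, even if this file is sourced repeatedly.
case ":${PRODUCTS:-}:" in
  *":${sam_products}:"*) ;;
  *) export PRODUCTS="${sam_products}${PRODUCTS:+:$PRODUCTS}" ;;
esac

# Saves typing the experiment name later.
export EXPERIMENT=nova

# Only call setup where it exists, so the file is harmless on other machines.
if command -v setup >/dev/null 2>&1; then
  setup sam_web_client
fi
```

Getting the certificate is left out on purpose, since you may not want that to happen on every login.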

Glossary and Conventions

The following conventions are used:

Anything in angle brackets denotes a placeholder, e.g. <run number> means you should insert the run number you are interested in.

Anything with a dollar sign in front of it denotes a shell variable, e.g. $BASE_QUERY

  • BASE_QUERY is the data tier and detector. It is assumed to be set like:
export BASE_QUERY="data_tier raw AND online.detector fardet" 

To get help from samweb type:

samweb --help-commands

Beginner Recipes (boiling water)

To save some typing, the recipes below assume:

export BASE_QUERY="data_tier raw AND online.detector fardet"

Queries are NOT case SeNsItIvE

To save typing, some parts of earlier queries are denoted as $QUERY_XXX

Anywhere you see "list-files" you can replace it with "count-files" to return just a count instead of the actual file names.
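
If you switch between listing and counting a lot, a tiny wrapper can help. This is only a sketch: sam_files and the SAMWEB variable are hypothetical names, and SAMWEB exists purely so the command can be swapped for echo when trying it out.

```shell
# Hypothetical helper: run the same dimensions string as a listing or as a count.
# SAMWEB is an assumed override for dry runs; by default the real samweb is called.
sam_files() {
  case "$1" in
    list|count)
      "${SAMWEB:-samweb}" "$1-files" "$2"
      ;;
    *)
      echo "usage: sam_files list|count <dimensions>" >&2
      return 2
      ;;
  esac
}
```

For example, sam_files count "$BASE_QUERY" runs samweb count-files with that query, and setting SAMWEB=echo first just prints the arguments instead.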

List Files from a data tier and detector

samweb list-files "data_tier <tier> AND online.detector <det>"


samweb list-files "data_tier raw AND online.detector fardet" 

This returned 459,532 files at the time of writing.

samweb list-files "data_tier raw AND online.detector ndos" 

This returned 91,360 files at the time of writing.

From here on we use $BASE_QUERY for this.

List Files from a Run

samweb list-files "$BASE_QUERY and online.runnumber <runnumber>"


samweb list-files "$BASE_QUERY and run_number <runnumber>"

The first form is DAQ-specific; the second is more general.


samweb list-files "$BASE_QUERY and run_number 13114" 
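
A minimal sketch of building the run query from a variable, so it can be reused in scripts. The function name run_query is hypothetical, and the check deliberately accepts plain integer run numbers only.

```shell
BASE_QUERY="data_tier raw AND online.detector fardet"

# Hypothetical helper: print the dimensions string for one run.
run_query() {
  case "$1" in
    ''|*[!0-9]*)
      echo "run number must be a plain integer: '$1'" >&2
      return 1
      ;;
  esac
  printf '%s and run_number %s\n' "$BASE_QUERY" "$1"
}
```

The result can then be passed on, e.g. samweb list-files "$(run_query 13114)".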

List Files from a Time Period

Files created between two times:

samweb list-files "$BASE_QUERY and start_time > '<start time>' and start_time < '<end time>'"


samweb list-files "$BASE_QUERY and start_time > '2014-01-30T23:29:00' and start_time < '2014-01-31T00:30:00'" 
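
The quoting in the time-window query above is easy to get wrong by hand, so here is a sketch of a helper that assembles it. The function names are hypothetical, and the format check only assumes timestamps written the way they appear in the example.

```shell
BASE_QUERY="data_tier raw AND online.detector fardet"

# Accept only timestamps shaped like 2014-01-30T23:29:00.
is_timestamp() {
  case "$1" in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]T[0-9][0-9]:[0-9][0-9]:[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: print the dimensions string for a start_time window.
time_window_query() {
  if ! is_timestamp "$1" || ! is_timestamp "$2"; then
    echo "timestamps must look like 2014-01-30T23:29:00" >&2
    return 1
  fi
  # Timestamps in this fixed-width format compare correctly as plain strings.
  if ! expr "$1" \< "$2" >/dev/null; then
    echo "start must be earlier than end" >&2
    return 1
  fi
  printf "%s and start_time > '%s' and start_time < '%s'\n" "$BASE_QUERY" "$1" "$2"
}
```

Usage would be along the lines of samweb list-files "$(time_window_query 2014-01-30T23:29:00 2014-01-31T00:30:00)".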

List Files from a specific trigger stream

You want only a given stream.

samweb list-files "$BASE_QUERY and run_number <run_no> and data_stream <stream>"


samweb list-files "$BASE_QUERY and run_number <run_no> and Online.Stream <stream>"

The second form (Online.Stream) applies to DAQ files only.

Stream is a number. Streams are fully configurable, but in general in early 2014 they looked like:

  • 0 = NuMI
  • 1 = Booster Beam
  • 2 = Min Bias

samweb list-files "$BASE_QUERY and run_number 13114 and data_stream 0"
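
Since stream numbers are hard to remember, a small lookup can keep scripts readable. This is a sketch only: the short names (numi, booster, minbias) are hypothetical labels, and the numbers simply follow the early-2014 list above, so they would need updating if the stream configuration changes.

```shell
BASE_QUERY="data_tier raw AND online.detector fardet"

# Hypothetical name-to-number lookup based on the early-2014 stream list.
stream_number() {
  case "$1" in
    numi)    echo 0 ;;
    booster) echo 1 ;;
    minbias) echo 2 ;;
    *)
      echo "unknown stream: $1" >&2
      return 1
      ;;
  esac
}

# Print the dimensions string for one run and one named stream.
stream_query() {
  stream_id=$(stream_number "$2") || return 1
  printf '%s and run_number %s and data_stream %s\n' "$BASE_QUERY" "$1" "$stream_id"
}
```

For example, samweb list-files "$(stream_query 13114 numi)" is equivalent to the data_stream 0 query above.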

List Files from DAQ Partition

You want only a specific DAQ Partition

samweb list-files "$BASE_QUERY and Online.Partition <partno>"

samweb list-files "$BASE_QUERY and Online.Partition 1"

List Metadata associated with a file:

File names do not include paths, just base names (all file names in SAM are unique).

samweb get-metadata <filename>

samweb get-metadata fardet_r00013114_s20_t00.raw

You get output like:

                    File Name: fardet_r00013114_s20_t00.raw
                      File Id: 4877797
                    File Type: importedDetector
                  File Format: raw
                    File Size: 6908296
                          Crc: 74650857 (adler 32 crc type)
               Content Status: good
                        Group: nova
                    Data Tier: raw
                  Application: online datalogger 33
                  Event Count: 110
                  First Event: 171026
                   Last Event: 179507
                   Start Time: 2014-02-14T01:34:14
                     End Time: 2014-02-14T01:37:43
                  Data Stream: 0
             Online.ConfigIDX: 0
          Online.DataLoggerID: 1
     Online.DataLoggerVersion: 33
              Online.Detector: fardet
            Online.DetectorID: 2
             Online.Partition: 1
          Online.RunControlID: 0
     Online.RunControlVersion: 0
            Online.RunEndTime: 1392341863
             Online.RunNumber: 13114
               Online.RunSize: 1727074
          Online.RunStartTime: 1392337488
               Online.RunType: 0
                Online.Stream: 0
         Online.SubRunEndTime: 1392341863
       Online.SubRunStartTime: 1392341654
                Online.Subrun: 20
           Online.TotalEvents: 110
         Online.TriggerCtrlID: 0
        Online.TriggerListIDX: 0
Online.TriggerPrescaleListIDX: 0
        Online.TriggerVersion: 0
 Online.ValidTriggerTypesHigh: 0
Online.ValidTriggerTypesHigh2: 0
  Online.ValidTriggerTypesLow: 0
                         Runs: 13114.0020 (online)
               File Partition: 20
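
To pull a single value out of that text output in a script, something like the following sketch works. The function name metadata_field is hypothetical, and the sample text is a few lines copied from the output above so the example runs without samweb.

```shell
# Hypothetical helper: read "Key: value" lines on stdin and print the value for one key.
metadata_field() {
  awk -v want="$1" '
    {
      sep = index($0, ": ")          # split at the first ": " only
      if (sep == 0) next
      key = substr($0, 1, sep - 1)
      sub(/^[ \t]+/, "", key)        # keys are right-aligned, so strip the padding
      if (key == want) { print substr($0, sep + 2); found = 1; exit }
    }
    END { exit (found ? 0 : 1) }
  '
}

# A few lines of the output shown above, used here as stand-in input.
sample_metadata() {
  cat <<'EOF'
                    File Name: fardet_r00013114_s20_t00.raw
                   Event Count: 110
                    Start Time: 2014-02-14T01:34:14
             Online.RunNumber: 13114
EOF
}

sample_metadata | metadata_field "Online.RunNumber"
```

With the real client you would pipe samweb get-metadata <filename> into metadata_field instead of the sample text.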

List files with some other parameter or parameters

samweb list-files "$BASE_QUERY and Parameter.name_1 <value> and Parameter.name_2 <value>"

samweb list-files "$BASE_QUERY and Online.TotalEvents > 123 and Online.DataLoggerVersion = 33"

Get File locations

samweb locate-file <filename>

samweb locate-file ndos_r00015701_s07_cosmic.raw

Response will be a list of locations:
  • Locations starting with "novadata" are BlueArc central disk.
  • Locations starting with "enstore" are dCache/Enstore locations (disk cache, tape backed).
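
The two prefixes above can be told apart in a script with a simple pattern match. A sketch, where location_kind and the labels it prints are hypothetical names and only the prefixes come from the list above:

```shell
# Hypothetical helper: classify a location string by its prefix.
location_kind() {
  case "$1" in
    novadata*) echo "bluearc" ;;
    enstore*)  echo "dcache-enstore" ;;
    *)         echo "unknown" ;;
  esac
}
```

One way to use it is to loop over the lines printed by samweb locate-file and keep only the kind of location you want.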

Get Ancestors or Descendants of a File

samweb file-lineage <children/descendants> <filename>

Children are files derived directly from the input file

samweb file-lineage children fardet_r00013096_s14_t00.raw

samweb file-lineage <parents/ancestors> <filename>

samweb file-lineage parents fardet_r00013096_s14_t00_numi_S14-01-20_v1_data.daq.root

Intermediate Recipes (Poaching eggs)

Get a list of all currently defined fields

Go to:
Current Nova Experiment Dimensions

Get a list of Non-DAQ data files (e.g. Laser Scans) matching a search

samweb list-files "data_tier laser_scan AND laser_scan.block_number = 23 AND laser_scan.layer_number > 4"

Listing Files with children matching a selection

List raw files that have been processed through a different stage:

samweb list-files "$BASE_QUERY and isparentof: (data_tier <stage> AND Parameter.name_1 <value>)"

samweb list-files "$BASE_QUERY and isparentof: ( data_tier artdaq AND daq2rawdigit.base_release 'S14-01-20' )" 

Listing Files that match a filename pattern

This matches parts of the file name; % acts as a wildcard.

samweb list-files "file_name like fardet%DDenergy%"

Listing Files with parents matching a selection

With BASE_QUERY2 set as:

export BASE_QUERY2="data_tier artdaq AND online.detector fardet"

samweb list-files "$BASE_QUERY2 and ischildof: ( data_tier raw AND Online.Subrun < 20 )"
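
The nested parentheses in isparentof: and ischildof: queries are easy to leave unbalanced, so here is a sketch of two helpers that wrap the inner selection. Both function names are hypothetical.

```shell
# Print a query for files matching $1 that have children matching $2.
files_with_children() {
  printf '%s and isparentof: ( %s )\n' "$1" "$2"
}

# Print a query for files matching $1 that have parents matching $2.
files_with_parents() {
  printf '%s and ischildof: ( %s )\n' "$1" "$2"
}
```

For example, samweb list-files "$(files_with_parents "$BASE_QUERY2" "data_tier raw AND Online.Subrun < 20")" reproduces the ischildof: query above.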

Listing Files with no physical locations

samweb list-files "$BASE_QUERY AND availability: virtual"

Listing Files with physical locations

samweb list-files "$BASE_QUERY AND availability: physical"

Retrieving Files with a physical location

You can retrieve files either individually or with a query pattern (multiple files).

Retrieve a single file

ifdh_fetch <filename>

ifdh_fetch fardet_r00012006_s61_t02.raw

Note: you must have a valid certificate (i.e. run kx509)

Retrieve a group of files

ifdh_fetch `ifdh translateConstraints <dimensions string>`

ifdh_fetch `ifdh translateConstraints "data_tier raw AND online.detector fardet and run_number 12006.51"`

Note: Here ifdh is used to do the lookup of the files and then the resulting names are passed to the fetch.
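
When fetching many files it can be handy to fetch them one at a time and keep going if one fails. The following is a sketch under stated assumptions: fetch_all and the FETCH_CMD override are hypothetical names, and it assumes the file names arrive one per line on stdin.

```shell
# Hypothetical helper: fetch every file name read from stdin, one per line.
# FETCH_CMD is an assumed override for dry runs; by default ifdh_fetch is used.
fetch_all() {
  fetch_failed=0
  while IFS= read -r fetch_name; do
    # Skip blank lines in the input.
    [ -n "$fetch_name" ] || continue
    # Detach stdin so the fetch command cannot swallow the remaining names.
    if ! "${FETCH_CMD:-ifdh_fetch}" "$fetch_name" </dev/null; then
      echo "failed to fetch: $fetch_name" >&2
      fetch_failed=$((fetch_failed + 1))
    fi
  done
  # Succeed only if every fetch succeeded.
  [ "$fetch_failed" -eq 0 ]
}
```

A possible use is ifdh translateConstraints "<dimensions string>" | fetch_all, assuming the lookup prints one name per line. Setting FETCH_CMD=echo first shows what would be fetched without transferring anything.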

Verifying that your file was transferred correctly

Check the checksum against the tape copy (no JSON parser installed):

# From Database
samweb get-metadata fardet_r00012006_s35_t02.raw | grep "Crc" | cut -d ':' -f 2 | cut -d ' ' -f 2
# From file on disk
samweb file-checksum fardet_r00012006_s35_t02.raw | cut -d '"' -f 4

If you have a JSON parser such as jq available, use it to parse the output instead of "cut":

samweb get-metadata fardet_r00012006_s35_t02.raw --json | jq '.crc.crc_value'
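
To turn the check into a yes/no answer in a script, the database value can be extracted and compared with the value computed from the file on disk. A sketch: crc_from_metadata reuses the grep/cut pipeline from above, and both function names are hypothetical.

```shell
# Extract the checksum value from get-metadata text output on stdin,
# using the same grep/cut pipeline as the "From Database" recipe.
crc_from_metadata() {
  grep "Crc" | cut -d ':' -f 2 | cut -d ' ' -f 2
}

# Succeed only if both checksums are present and identical.
checksums_match() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}
```

In practice the first argument would come from samweb get-metadata <filename> | crc_from_metadata and the second from the samweb file-checksum pipeline shown above.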

Advanced Recipes (Hollandaise sauce)

Pre-stage a dataset

For large datasets that are definitely on tape (i.e. haven't been used for a long time) you will want to pre-stage the data to the dCache area before submitting jobs.

Recovering a project

samweb project-recovery -e nova --useFileStatus=0 --useProcessStatus=0

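which yields a dimensions string such as:

(snapshot_id 15312 minus (project_name and consumed_status consumed))

Use that string to create a new dataset definition containing the files that were not consumed:

samweb create-definition <new_definition_name> "(snapshot_id 15312 minus (project_name and consumed_status consumed))"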

Constructing a Good Runs List

Combining it all (Eggs Benedict)