How to automate production using POMS

The example below is taken from the Decoder Files stage for the Raw Data.


Step-by-step instructions

  • Click on the cs_split_type field in the campaign editor
  • Select draining(k) as the Split Type.
    This is a snapshot-based splitter that yields up to k not-yet-processed files each time it runs. Files already processed are tracked with a snapshot stored in cs_last_split, which can be cleared to "reset" the splitter and start over. The current set-up uses k=10, which is sufficient because fewer than 10 raw data files are produced within an hour.
  • By combining the draining(k) split type with the schedule-future-jobs feature of POMS, this stage now runs automatically every hour and processes up to 10 files per submission. Files that were not consumed by samweb are automatically added to the next round of grid submission. You can monitor these jobs on the Campaign Stage Submission page.
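The draining(k) behaviour can be illustrated with a small shell sketch. This is only an illustration of the semantics, not the POMS implementation: the file names and the local snapshot file are made up, and POMS keeps the real snapshot in cs_last_split rather than a local file.

```shell
# Illustrative sketch of draining(k): each run yields up to K files not yet
# recorded in a snapshot, then records them. All file names here are made up;
# POMS stores its real snapshot in cs_last_split, not in a local file.
K=3
printf 'raw_001.root\nraw_002.root\nraw_003.root\nraw_004.root\nraw_005.root\n' > all_files.txt

draining_split() {
    touch processed.snapshot
    sort all_files.txt > .all.sorted
    sort processed.snapshot > .snap.sorted
    # files in the dataset but not in the snapshot, capped at K
    comm -23 .all.sorted .snap.sorted | head -n "$K" > .batch
    cat .batch >> processed.snapshot   # remember them for the next run
    cat .batch                         # this batch goes into the submission
}

rm -f processed.snapshot
draining_split   # first hourly run: raw_001 .. raw_003
draining_split   # second hourly run: raw_004 and raw_005
# Deleting processed.snapshot "resets" the splitter, like clearing cs_last_split.
```

Each call drains up to K new files; once every file is in the snapshot, further calls yield nothing until the snapshot is cleared.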
  • If a file takes more than 1 hour to process, it will not be marked as consumed by samweb before the next submission starts. The same file is then processed by two campaigns 1 hour apart, leading to duplicated output files. To mitigate the duplication issue, we can make use of SAMLite or SAM for User Datasets:
    setup fife_utils

    then use the sam_dataset_duplicate_kids command:
    Usage: sam_dataset_duplicate_kids [options] --dims dimensions 
     Check files in dims for duplicate children of same parent
      -h, --help            show this help message and exit
      -v, --verbose         
      -e EXPERIMENT, --experiment=EXPERIMENT
                            use this experiment server; defaults to
                            $SAM_EXPERIMENT if not set
      --dims=DIMS           dimension query for files to check
                            metadata field to include in comparisons
      --mark_bad            mark as 'bad' in content_status
      --retire_file         retire duplicate files
      --delete              delete duplicate files

    sam_dataset_duplicate_kids --retire_file --delete --dims "ischildof:( 'DataXportTesting_03Feb2020' and run_number > 2070)" 

    The command above deletes the duplicate files from their physical location in the dCache or Enstore areas and also retires them (removing the metadata that could be used to query them in the samweb database). The screen output will look as follows:
    parent data_dl1_run2090_2_20200812T171912.root:
      duplicates of hist_data_dl1_run2090_2_20200812T171912_20200813T171629_decoder.root:
        hist_data_dl1_run2090_2_20200812T171912_20200814T184446_decoder.root (deleted)(deleted)(retired)
      duplicates of data_dl1_run2090_2_20200812T171912_20200814T184445_decoder.root:
        data_dl1_run2090_2_20200812T171912_20200813T171628_decoder.root (deleted)(deleted)(retired)
    parent data_dl1_run2093_6_20200812T174814.root:
      duplicates of data_dl1_run2093_6_20200812T174814_20200814T184509_decoder.root:
        data_dl1_run2093_6_20200812T174814_20200813T202801_decoder.root (deleted)(deleted)(retired)
      duplicates of hist_data_dl1_run2093_6_20200812T174814_20200813T202801_decoder.root:
        hist_data_dl1_run2093_6_20200812T174814_20200814T184509_decoder.root (deleted)(deleted)(retired)
    parent data_dl1_run2093_5_20200812T174636.root:
      duplicates of hist_data_dl1_run2093_5_20200812T174636_20200814T184526_decoder.root:
        hist_data_dl1_run2093_5_20200812T174636_20200813T172142_decoder.root (deleted)(deleted)(retired)
      duplicates of data_dl1_run2093_5_20200812T174636_20200813T172142_decoder.root:
        data_dl1_run2093_5_20200812T174636_20200814T184526_decoder.root (deleted)(deleted)(retired)

    Note that DataXportTesting_03Feb2020 is the metadata parameter used to query the raw data files, i.e. the input files to the decoder stage.
    If you see no output after launching the command, the dataset is free of duplicated files.
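The idea behind the duplicate check can be approximated with a short sketch: decoder outputs embed the parent name plus a processing timestamp, so two children whose names differ only in that timestamp share the same parent. This is an illustration of the idea, not how sam_dataset_duplicate_kids is actually implemented; the file names are copied from the output above.

```shell
# Sketch of the duplicate-kid check: strip the processing timestamp and the
# _decoder suffix to recover each child's parent name, then flag any parent
# that occurs more than once. Illustration only, not the real implementation.
cat > kids.txt <<'EOF'
data_dl1_run2090_2_20200812T171912_20200813T171628_decoder.root
data_dl1_run2090_2_20200812T171912_20200814T184445_decoder.root
data_dl1_run2093_6_20200812T174814_20200814T184509_decoder.root
EOF

sed 's/_[0-9]\{8\}T[0-9]\{6\}_decoder\.root$/.root/' kids.txt \
    | sort | uniq -d
# prints: data_dl1_run2090_2_20200812T171912.root  (the parent with duplicate kids)
```

The first two files reduce to the same parent name, so that parent is reported once; the third file has no sibling and is ignored.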


Useful Links: