Feature #17894

More split types... split-list, n-files-per

Added by Marc Mengel over 2 years ago. Updated about 2 years ago.

Status: Closed
Priority: Normal
Assignee:
Target version:
Start date: 10/12/2017
Due date:
% Done: 100%
Estimated time:
Scope: Internal
Experiment: -
Stakeholders:
Duration:

Description

Adding two more dataset splitting types:

  • split_list -- the base_dataset value is actually a comma-separated list of datasets; we split it on commas to get the dataset to run for each submission. This can also be used to drive a set of non-dataset jobs that each need a different parameter, like energy levels for Monte Carlo. This does not declare a dataset for the launches; it just picks the next entry from the list.
  • n_files_per(n) -- make a dataset from the base dataset "with offset n*i limit n" for the i-th submission, so each submission gets n files (the last one possibly coming up short).
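As a rough illustration of the two behaviours described above, here is a minimal Python sketch. The function names (`split_list_pick`, `n_files_per_query`, `submissions_needed`) are hypothetical and not taken from the actual code; the query string simply follows the "with offset n*i limit n" wording in this description, and submissions are assumed to be numbered from 0.

```python
from typing import Optional


def split_list_pick(base_dataset: str, i: int) -> Optional[str]:
    """Return the i-th entry of a comma-separated base_dataset value.

    Returns None once the list is exhausted (an assumption about how
    "no more submissions" would be signalled).
    """
    items = [s.strip() for s in base_dataset.split(",") if s.strip()]
    return items[i] if 0 <= i < len(items) else None


def n_files_per_query(base_dataset: str, n: int, i: int) -> str:
    """Build the query for the i-th submission: n files starting at offset n*i."""
    if n <= 0 or i < 0:
        raise ValueError("n must be positive and i must be non-negative")
    return f"{base_dataset} with offset {n * i} limit {n}"


def submissions_needed(total_files: int, n: int) -> int:
    """Number of submissions needed to cover total_files (ceiling division)."""
    if n <= 0 or total_files < 0:
        raise ValueError("n must be positive and total_files non-negative")
    return -(-total_files // n)
```

With 25 files and n = 10, this sketch yields three submissions with offsets 0, 10 and 20; the last one holds only 5 files, which is the "coming up short" case mentioned above.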

Also, make sure we have the dataset split types written up in the wiki and linked from the edit page.

History

#1 Updated by Marc Mengel over 2 years ago

I merged my branch into develop; it splits the split types out into loadable modules.

The ones listed here are present.

We can even add new ones at runtime.
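One way such runtime loading could work is sketched below, using the standard `importlib` module. This is only an assumption about the mechanism: the package name `split_types` and the convention that each module exposes an object named after the split type are hypothetical, not taken from the merged branch.

```python
import importlib


def load_split_type(name: str, package: str = "split_types"):
    """Import <package>.<name> on demand and return the object called <name>.

    Because the module is located at call time, dropping a new module
    into the package makes a new split type available without changes
    to the caller.
    """
    module = importlib.import_module(f"{package}.{name}")
    try:
        return getattr(module, name)
    except AttributeError:
        raise LookupError(f"module {module.__name__} defines no '{name}'") from None
```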

#2 Updated by Marc Mengel over 2 years ago

  • Status changed from New to Resolved
  • % Done changed from 0 to 100

#3 Updated by Marc Mengel over 2 years ago

Got tests added and working for this now.

9b7eefc6
5ee5444f

We now clock all the split types through a few cycles to make sure they work.
The downside is that the test relies on an existing dataset with assorted properties.
I should have a script that will recreate that dataset.

#4 Updated by Anna Mazzacane about 2 years ago

  • Status changed from Resolved to Closed

