Updated dimensions syntax

Basic clauses

Basic clauses have the form dimension operator value. A dimension consists of alphanumeric and underscore characters, or is a category.parameter term (where category and parameter identify a generic parameter).

Operators are =, !=, <, <=, >, >=, in, not in. The operator can be omitted, in which case it means the same as =. in and not in only apply to comma-separated lists (for which = and != also work).
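
As an illustration of the clause shape described above, here is a minimal Python sketch that splits a basic clause into its dimension, operator and value, defaulting the operator to = when it is omitted. The function name parse_clause and the regular expression are hypothetical and are not taken from the actual query parser; in particular, allowing symbolic operators without surrounding whitespace is an assumption.

```python
import re

# Hypothetical sketch, not the real parser.
# Assumption: symbolic operators may appear with or without surrounding
# whitespace, while "in" / "not in" must be surrounded by whitespace.
CLAUSE = re.compile(
    r"\s*(?P<dim>\w+(?:\.\w+)?)"                      # dimension or category.parameter
    r"(?:\s*(?P<sym>!=|<=|>=|=|<|>)\s*"               # symbolic operator
    r"|\s+(?P<word>not\s+in|in)\s+"                   # word operator
    r"|\s+)"                                          # operator omitted
    r"(?P<value>\S.*?)\s*$",
    re.ASCII,
)

def parse_clause(text):
    """Return (dimension, operator, value) for one basic clause."""
    match = CLAUSE.match(text)
    if match is None:
        raise ValueError(f"not a basic clause: {text!r}")
    # An omitted operator means the same as "=".
    operator = match.group("sym") or match.group("word") or "="
    operator = " ".join(operator.split())  # normalise spacing in "not in"
    return match.group("dim"), operator, match.group("value")

print(parse_clause("file_name abc"))
print(parse_clause("run_number not in 1,2,3"))
```

The value is returned as raw text here; splitting it into terms, lists or ranges is a separate step.
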
When comparing with string values in the database, the characters % or ? in the value will cause a wildcard comparison (this works even for values in a list). Wildcard characters can be escaped with \.
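
The wildcard and escaping rule can be illustrated with a short Python sketch that turns a value into a regular expression. This assumes that % matches any run of characters and ? matches exactly one character; the text above only says that these characters trigger a wildcard comparison, so that reading is an assumption, and the helper name wildcard_to_regex is hypothetical. The real comparison is done by the database, not by code like this.

```python
import re

def wildcard_to_regex(value):
    """Translate a value with % / ? wildcards and \\ escapes into a regex.

    Assumption: % matches any sequence of characters and ? matches a
    single character. A backslash makes the next character literal.
    """
    parts = []
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            # Escaped character: always taken literally.
            parts.append(re.escape(value[i + 1]))
            i += 2
            continue
        if ch == "%":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.DOTALL)

pattern = wildcard_to_regex("abc%")
print(pattern.fullmatch("abcdef") is not None)
```

With this sketch, the value abc\% would match only the literal name "abc%", because the escaped % is no longer treated as a wildcard.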

value is either a single term, a list of terms separated by commas (,), or a range whose endpoints are separated by a hyphen (-). A term can be a number (digits with a possible .); a date in the form dd-mmm-yyyy; an unquoted string, which must start with either a letter or a digit, possibly followed by alphanumerics or any of the characters _, %, ?, . and -; or a quoted string, which uses either " or ' quotes and can contain any characters except the closing quote. [ The description of unquoted strings is a slight simplification: the full rules are more complex because they have to distinguish strings containing a - from numeric or date ranges. ]
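
The term kinds listed above can be sketched as a small Python classifier for a single term. This follows the simplified description only: it makes no attempt to tell a range such as 1234-5678 apart from an unquoted string containing a -, so a value should be split into terms before it is used. The function name classify_term, the exact regular expressions, and the use of English three-letter month abbreviations for mmm are assumptions.

```python
import re

# Assumption: "mmm" means an English three-letter month abbreviation.
MONTHS = {"jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"}

NUMBER = re.compile(r"\d+(?:\.\d*)?")                   # digits with a possible "."
DATE = re.compile(r"\d{1,2}-([A-Za-z]{3})-\d{4}")       # dd-mmm-yyyy
UNQUOTED = re.compile(r"[A-Za-z0-9][A-Za-z0-9_%?.\-]*")

def classify_term(term):
    """Classify one term as a number, date, quoted or unquoted string."""
    if len(term) >= 2 and term[0] in "\"'" and term[-1] == term[0]:
        # A quoted string may contain anything except its closing quote.
        if term[0] not in term[1:-1]:
            return "quoted string"
        return "invalid"
    if NUMBER.fullmatch(term):
        return "number"
    date = DATE.fullmatch(term)
    if date and date.group(1).lower() in MONTHS:
        return "date"
    if UNQUOTED.fullmatch(term):
        return "unquoted string"
    return "invalid"

for term in ["1234", "01-Jan-2020", "abc%", "'a b'"]:
    print(term, "->", classify_term(term))
```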


Basic clauses can be combined with not, and, or, and minus, listed here from highest to lowest precedence, so "a or b and not c" is read as "a or ( b and ( not c ) )". [ minus gives the set difference of its left and right sides. Since both sides have to be fully evaluated to do this, it is usually less efficient than combining with and not, where the right-hand clause can work as a filter. ]
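
The difference between minus and and not can be illustrated with Python sets over a toy file list. This is only a model of the semantics described above, not how the query engine is implemented; the sample records and the inclusive reading of the range 1234-5678 are assumptions.

```python
# Toy catalogue; the records are made up for illustration.
files = [
    {"file_name": "a.raw", "data_tier": "raw", "run_number": 1300},
    {"file_name": "b.raw", "data_tier": "raw", "run_number": 9000},
    {"file_name": "c.reco", "data_tier": "reco", "run_number": 1300},
]

def select(predicate):
    """Return the names of all files satisfying the predicate."""
    return {f["file_name"] for f in files if predicate(f)}

def is_raw(f):
    return f["data_tier"] == "raw"

def in_runs(f):
    # Assumption: the range 1234-5678 includes both endpoints.
    return 1234 <= f["run_number"] <= 5678

# "data_tier raw minus run_number 1234-5678":
# both sides are evaluated in full, then subtracted.
minus_result = select(is_raw) - select(in_runs)

# "data_tier raw and not run_number 1234-5678":
# the right-hand clause acts as a filter in a single pass.
and_not_result = select(lambda f: is_raw(f) and not in_runs(f))

print(minus_result, and_not_result)
```

Both forms select the same files; the minus form simply has to build the complete right-hand set first.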

Special operators

Special operators contain a colon. They introduce special behaviour into the query.

defname:<existing dataset definition> inserts an existing definition into the query.

isparentof:( <clauses> ) and ischildof:( <clauses> ) return the files which are the parents (or the children) of the files matching the clauses provided.

availability:<argument> changes the way files are filtered by availability. The argument is a comma-separated list of values. Allowed values are any, default, virtual, nonvirtual, active, retired, good, bad. default is equivalent to nonvirtual,active,good. If the query does not specify any availability criteria, then the default values are applied to the overall result.
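
A minimal Python sketch of how an availability argument might be validated and how default expands, based only on the values listed above. The function name expand_availability is hypothetical, and tolerating whitespace around the commas is an assumption; how the resulting values are then applied to the query is not modelled here.

```python
ALLOWED = {"any", "default", "virtual", "nonvirtual",
           "active", "retired", "good", "bad"}
DEFAULT = ("nonvirtual", "active", "good")

def expand_availability(argument):
    """Split an availability argument and expand 'default'.

    Returns the values in order of first appearance, without duplicates.
    Assumption: whitespace around the commas is ignored.
    """
    values = []
    for raw in argument.split(","):
        value = raw.strip()
        if value not in ALLOWED:
            raise ValueError(f"unknown availability value: {value!r}")
        expanded = DEFAULT if value == "default" else (value,)
        for item in expanded:
            if item not in values:
                values.append(item)
    return values

print(expand_availability("default"))
print(expand_availability("virtual,default"))
```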


"file_name abc,def": files named "abc" or "def"
"file_name abc% and run_number 1234-5678": files with a name starting with "abc" and a run number between 1234 and 5678
"data_tier raw and run_number 1234-5678 and not isparentof:( application reco and version v10 )": all raw files with run number between 1234 and 5678 which do not have a child with application name "reco" and version "v10"