The IDOC: Integrated Data and Operating Center supports search expressions modelled after what CDS' Vizier service provides. Our implementation is not yet complete. If you need a given feature, please let us know.

The rough idea is that you can specify the values you search for with certain operators, depending on the type of the field you are matching against.

For numeric expressions, you can use the following operators:

- No operator: this is equivalent to =.
- Comparisons: The operators <, <=, >, >=, =, != are prefixed and behave like their mathematical counterparts.
- Enumeration: You can enumerate values you are interested in using the comma (",") operator.
- Range: To specify a range of acceptable values, use the ".." operator. Note that you must surround this operator with whitespace; alternatively, you can give "value and error" with the "+/-" operator.
- Logical operators: The operators ! (not), & (and), and | (or) provide simple logic. Precedence follows the order given, with ! binding most tightly. Grouping with parentheses is not supported yet; if you need it, contact us and we will add it.

- `50` or `=50`: select only values exactly 50.
- `!=50`: select only values different from 50.
- `< 60.0`; `> 4e-8`; `>= -.5`; `<= -5.e13`: select values smaller than, greater than, greater than or equal to, or smaller than or equal to the operand, respectively.
- `50. .. 80.5`: select values between 50 and 80.5 inclusive.
- `50 +/- 10`: select values between 40 and 60 inclusive.
- `40, 50, 50.5, 60`: select any of the enumerated values.
- `!40, 50, 50.5, 60`: select anything other than the enumerated values.
- `40 | 100 +/- 5`: select values equal to 40 or between 95 and 105.

When the field matched against is itself an interval given by a min/max pair, the following rules apply:

- Equality is translated to "between min and max".
- Input intervals, whether given as ranges or using +/-, are compared for overlap with the interval given by min/max.
- Comparisons are a bit funky: < is translated to “smaller than the lower bound”, <= to “smaller than the *upper* bound”. Similarly, > is “larger than the upper bound” and >= “larger than the lower bound”. We might reconsider this particular behaviour in the future.
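To illustrate the equality and overlap rules for min/max fields, here is a minimal Python sketch. It is not the service's implementation: the function names are hypothetical, and treating the interval ends as inclusive is an assumption. The comparison operators are left out of the sketch.

```python
def pm_to_range(value, error):
    """Turn `value +/- error` into a (low, high) pair."""
    return (value - error, value + error)


def equality_matches(value, col_min, col_max):
    """Equality: the searched value lies between min and max.

    Inclusive bounds are an assumption of this sketch.
    """
    return col_min <= value <= col_max


def interval_matches(low, high, col_min, col_max):
    """Ranges and +/-: the input interval overlaps [col_min, col_max]."""
    return low <= col_max and high >= col_min


# `50 +/- 10` against a record covering 55 to 70: the intervals overlap.
print(interval_matches(*pm_to_range(50, 10), 55, 70))
# `50. .. 80.5` against a record covering 90 to 100: no overlap.
print(interval_matches(50.0, 80.5, 90, 100))
```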

For those into such things, here is the grammar we currently use to parse numeric expressions (the base nonterminal is expr).

    preOp      ::= "=" | ">=" | ">" | "<=" | "<"
    rangeOp    ::= ".."
    pmOp       ::= "+/-" | "\xb1"    (this is the ± character)
    orOp       ::= "|"
    andOp      ::= "&"
    notOp      ::= "!"
    commaOp    ::= ","
    preopExpr  ::= [preOp] floatLiteral
    rangeExpr  ::= floatLiteral rangeOp floatLiteral
    valList    ::= floatLiteral { commaOp floatLiteral }
    pmExpr     ::= floatLiteral pmOp floatLiteral
    simpleExpr ::= rangeExpr | pmExpr | valList | preopExpr
    notExpr    ::= [notOp] simpleExpr
    andExpr    ::= notExpr {andOp + notExpr}
    expr       ::= andExpr {orOp + expr}

floatLiteral is a usual C decimal integer or float literal.
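As an illustration of how this grammar hangs together, the following Python sketch parses a numeric expression into a predicate that can be applied to plain (scalar) values. It is a hand-written recursive-descent reading of the grammar, not the parser the service uses; the names `compile_numeric`, `tokenize` and `_Parser` are made up for this example, and the token regular expression is only an approximation of a C float literal.

```python
import operator
import re

# One token: an operator or a float literal (approximating C syntax).
_TOKEN = re.compile(
    r"\s*(\.\.|\+/-|±|<=|>=|[<>=!&|,]"
    r"|[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?)")

_PRE_OPS = {"=": operator.eq, "<": operator.lt, "<=": operator.le,
            ">": operator.gt, ">=": operator.ge}


def tokenize(text):
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"cannot parse {text[pos:]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class _Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        if token is None:
            raise ValueError("unexpected end of expression")
        self.pos += 1
        return token

    def number(self):
        token = self.take()
        try:
            return float(token)
        except ValueError:
            raise ValueError(f"expected a number, got {token!r}") from None

    def expr(self):                       # expr ::= andExpr {orOp expr}
        left = self.and_expr()
        if self.peek() == "|":
            self.take()
            right = self.expr()
            return lambda v: left(v) or right(v)
        return left

    def and_expr(self):                   # andExpr ::= notExpr {andOp notExpr}
        parts = [self.not_expr()]
        while self.peek() == "&":
            self.take()
            parts.append(self.not_expr())
        return lambda v: all(part(v) for part in parts)

    def not_expr(self):                   # notExpr ::= [notOp] simpleExpr
        if self.peek() == "!":
            self.take()
            inner = self.simple_expr()
            return lambda v: not inner(v)
        return self.simple_expr()

    def simple_expr(self):
        if self.peek() in _PRE_OPS:       # preopExpr with an explicit operator
            compare, operand = _PRE_OPS[self.take()], self.number()
            return lambda v: compare(v, operand)
        first = self.number()
        if self.peek() == "..":           # rangeExpr, inclusive
            self.take()
            high = self.number()
            return lambda v: first <= v <= high
        if self.peek() in ("+/-", "±"):   # pmExpr, inclusive
            self.take()
            error = self.number()
            return lambda v: first - error <= v <= first + error
        values = [first]                  # valList (a single value means "=")
        while self.peek() == ",":
            self.take()
            values.append(self.number())
        return lambda v: v in values


def compile_numeric(text):
    """Return a function telling whether a float matches the expression."""
    parser = _Parser(tokenize(text))
    predicate = parser.expr()
    if parser.peek() is not None:
        raise ValueError(f"unexpected token {parser.peek()!r}")
    return predicate
```

For instance, `compile_numeric("40 | 100 +/- 5")(95.0)` evaluates to `True`. Note that in this reading `!=50` is parsed as `!` applied to `=50`, since `!=` is not listed among the prefix operators of the grammar.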

Dates support the same operators as numeric operands, except that the "error" in expressions with +/- is a simple float (in days). Dates are given in one of the VO's preferred ISO formats, i.e., YYYY-MM-DD or YYYY-MM-DDTHH:MM:SS. You can also give simple floating point numbers as dates, which will be interpreted as Julian years if between 1000 and 3000, as MJD if between 1e4 and 1e5, and as JD if between 2e6 and 4e6. If you hit midnight when giving JD (i.e., number.5) or MJD (i.e., number.0), DaCHS will interpret the specification as “whole day” and expand the range accordingly.
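The following Python sketch illustrates how plain numbers could be classified and how the "whole day" expansion for midnight values might work. It is an illustration under assumptions, not the DaCHS code: the function names are hypothetical, the range limits are treated as inclusive, a whole day is represented as the span from midnight to the following midnight, and the conversion of Julian years to timestamps is left out.

```python
from datetime import datetime, timedelta

MJD_EPOCH = datetime(1858, 11, 17)   # MJD 0.0
JD_MINUS_MJD = 2400000.5             # JD = MJD + 2400000.5


def classify_date_number(number):
    """Tell how a bare number is interpreted, following the limits in the text."""
    if 1000 <= number <= 3000:
        return "julian year"
    if 1e4 <= number <= 1e5:
        return "mjd"
    if 2e6 <= number <= 4e6:
        return "jd"
    raise ValueError(f"{number} is not recognised as a date")


def day_number_to_interval(number):
    """Return (start, end) timestamps for an MJD or JD value.

    A value falling on midnight is expanded to the whole day; any
    other value stands for that exact moment (start == end).
    """
    kind = classify_date_number(number)
    if kind == "julian year":
        raise NotImplementedError("Julian years are not covered by this sketch")
    mjd = float(number) if kind == "mjd" else number - JD_MINUS_MJD
    start = MJD_EPOCH + timedelta(days=mjd)
    if mjd.is_integer():             # midnight: expand to the whole day
        return start, start + timedelta(days=1)
    return start, start


print(day_number_to_interval(54221))      # all of 2007-05-01
print(day_number_to_interval(2454222.0))  # the moment 2007-05-01T12:00:00
```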

In the VO, timezones are frowned upon. Hence, you will usually deal with UTC or a related time scale (like TT). For the reference positions, you will have to inspect the metadata.

Times without dates are not yet supported. In general, we try to interpret dates without times in a sensible way; for instance, if you just give a date, all records with a timestamp on that date will match.

See also the examples for numeric expressions.

- `<2003-04-06`: select values earlier than April 6th, 2003.
- `2003-04-06 +/- 4`: select values within four days of April 6th, 2003.
- `1980.233`: select values for the exact time 1980-03-23T14:28:40 (this will rarely be a good idea).
- `1980.233 +/- 1`: interpreted as a Julian year: select values between 1980-03-22T14:28:40 and 1980-03-24T14:28:40 (i.e., the tolerance remains in days even for Julian years).
- `54221`: interpreted as MJD: select values at any time on 2007-05-01 (this is because it hits midnight; 54221.5 would match the exact moment 2007-05-01T12:00:00 instead).
- `2454222.0 .. 2454225.0`: interpreted as JD: match values between 2007-05-01T12:00:00 and 2007-05-04T12:00:00.

When the field matched against is itself an interval given by a min/max pair, the following rules apply:

- Equality is translated to "between min and max"; this, for now, does *not* expand days, i.e., for interval comparisons, 1855-01-01 is exactly midnight on January 1st, 1855 (in the scale and for whatever position the data happens to adopt).
- Input intervals, whether given as ranges or using +/-, are compared for overlap with the interval given by min/max.
- Comparisons are a bit funky: < is translated to “smaller than the lower bound”, <= to “smaller than the *upper* bound”. Similarly, > is “larger than the upper bound” and >= “larger than the lower bound”. We might reconsider this particular behaviour in the future.

The grammar is identical to that of numeric expressions, except that dateLiteral takes the place of floatLiteral. The one exception is the error term of pmExpr, which remains a floatLiteral, so that pmExpr here is

    pmExpr ::= dateLiteral pmOp floatLiteral

The dates themselves currently follow the regular expression `[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]`. This will be relaxed in the future.

**Note:** Contrary to what Vizier does, the default on string
expressions on this site is to match expressions literally (i.e., the default
operator is == rather than ~).

Some of the operators below work on *patterns*. A pattern
contains the meta characters [, ], *, and ?; all others are "normal"
characters matched literally. The * matches zero or more characters, the
? exactly one; characters in square brackets match any character enumerated
within them. You can use ranges like "A-Z" within square brackets. If the
first character in the square brackets is a caret (`^`), all
characters except the ones listed are matched.

The following operators can be used when matching strings. Where an operand is called a literal, metacharacter interpretation is suppressed (i.e., the string is matched literally); otherwise, the operand is a pattern.

Note that both patterns and literally interpreted strings must match
the *whole* data field. This is substantially different from the
usual regular expression engines (but corresponds to what you may know
from filename patterns).
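To make the pattern rules concrete, here is a small Python sketch that translates such a pattern into a regular expression and matches it against the whole field. It is an illustration only and not the code used by the service; `pattern_to_regex` and `pattern_matches` are hypothetical names. Python's `fnmatch` module is deliberately not used, since it negates character sets with `!` rather than with the caret described above.

```python
import re


def pattern_to_regex(pattern):
    """Translate a pattern using *, ? and [...] into regex source."""
    parts, i = [], 0
    while i < len(pattern):
        char = pattern[i]
        if char == "*":
            parts.append(".*")           # zero or more characters
        elif char == "?":
            parts.append(".")            # exactly one character
        elif char == "[":
            end = pattern.find("]", i + 1)
            if end == -1:
                raise ValueError("unterminated [ in pattern")
            body = pattern[i + 1:end]
            negate = body.startswith("^")
            if negate:
                body = body[1:]
            if not body:
                raise ValueError("empty character set in pattern")
            # Keep "-" so ranges like A-Z work; escape other regex specials.
            body = re.sub(r"([\\\[\]^])", r"\\\1", body)
            parts.append("[" + ("^" if negate else "") + body + "]")
            i = end
        else:
            parts.append(re.escape(char))  # normal characters match literally
        i += 1
    return "".join(parts)


def pattern_matches(pattern, data, ignore_case=False):
    """True if the pattern matches the *whole* data field."""
    flags = re.DOTALL | (re.IGNORECASE if ignore_case else 0)
    return re.fullmatch(pattern_to_regex(pattern), data, flags) is not None


print(pattern_matches("[MO]4[pe]", "O4p"))
print(pattern_matches("m4", "m4e", ignore_case=True))  # whole field must match
```

The use of `re.fullmatch` rather than `re.search` is what enforces the "whole data field" rule.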

- `==`: matches if your input matches the entire data field literally, i.e., preserving case. This is the default when no operator is given, but it may be necessary to use the operator if, e.g., your input would otherwise be interpreted as an expression.
- `=~`: like `==`, but ignoring case.
- `~`: selects pattern matches of the operand, ignoring case.
- `=`: case-sensitive version of `~`.
- `!~`: reversal of `~` (i.e., matches when `~` doesn't).
- `!`: reversal of `=`.
- `!=`: selects all but literal matches of the operand.
- `>=`, `>`, `<=`, `<`: select strings larger or equal, larger, etc., than the operand. Comparisons happen according to ASCII value (i.e., in the C locale).
- `=,` and `=|`: these start enumerations, i.e., they match when a literal in the following list is matched. There are two versions of these so you can include either the comma or the vertical bar in your literals. If you need both, you are out of luck.
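The literal operators from this list can be sketched in a few lines of Python. This is only an illustration, not the service's implementation: `match_string_expr` is a hypothetical name, stripping whitespace around operands is an assumption, and the pattern operators (`~`, `=`, `!~`, `!`) are deliberately left out.

```python
import operator

# Two-character operators come first so ">=" is not mistaken for ">".
_COMPARISONS = [
    ("==", operator.eq), ("!=", operator.ne),
    (">=", operator.ge), ("<=", operator.le),
    (">", operator.gt), ("<", operator.lt),
]


def match_string_expr(expr, data):
    """Tell whether `data` matches a string expression using literal operators."""
    expr = expr.strip()
    # Enumerations: "=," splits on commas, "=|" on vertical bars.
    for prefix, separator in (("=,", ","), ("=|", "|")):
        if expr.startswith(prefix):
            items = [item.strip() for item in expr[2:].split(separator)]
            return data in items
    if expr.startswith("=~"):            # literal match ignoring case
        return expr[2:].strip().lower() == data.lower()
    for symbol, compare in _COMPARISONS:
        if expr.startswith(symbol):
            # Python compares strings by code point, which agrees with
            # ASCII ordering for ASCII data.
            return compare(data, expr[len(symbol):].strip())
    if expr[:1] in ("~", "=", "!"):
        raise NotImplementedError("pattern operators are not part of this sketch")
    return expr == data                  # no operator: same as ==


data_values = ["M4e", "M4ep", "m4e", "A4p", "O4p", "M*", "m|a", "x,a", "=x"]
print([d for d in data_values if match_string_expr(">O", d)])
print([d for d in data_values if match_string_expr("=,x,a,=x,m|a", d)])
```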

Here is a table of expressions (rows) and the data values (columns) they match:

| Expression | `M4e` | `M4ep` | `m4e` | `A4p` | `O4p` | `M*` | `m\|a` | `x,a` | `=x` |
|---|---|---|---|---|---|---|---|---|---|
| `M4e` | X | | | | | | | | |
| `=x` | | | | | | | | | |
| `== =x` | | | | | | | | | X |
| `!= =x` | X | X | X | X | X | X | X | X | |
| `==M4e` | X | | | | | | | | |
| `=~m4e` | X | | X | | | | | | |
| `=~m4` | | | | | | | | | |
| `~*` | X | X | X | X | X | X | X | X | X |
| `~m*` | X | X | X | | | X | X | | |
| `M*` | | | | | | X | | | |
| `!~m*` | | | | X | X | | | X | X |
| `~*p` | | X | | X | X | | | | |
| `!~*p` | X | | X | | | X | X | X | X |
| `~?4p` | | | | X | X | | | | |
| `~[MO]4[pe]` | X | | X | | X | | | | |
| `=[MO]4[pe]` | X | | | | X | | | | |
| `>O` | | | X | | X | | X | X | |
| `>O5` | | | X | | | | X | X | |
| `>=m` | | | X | | | | X | X | |
| `<M` | | | | X | | | | | X |
| `=\|M4e\| O4p\| x,a` | X | | | | X | | | X | |
| `=,x,a,=x,m\|a` | | | | | | | X | | X |

Maybe some comments are in order:

- `=x` matches nothing since the leading = is interpreted as an operator ("match as a case-sensitive pattern"), and there is nothing matching the pattern x in the sample (in this case, only x would match).
- `== =x` is the simplest way to search for "=x"; its analogues work for all other metacharacters as well.
- `=~m4` matches nothing, because the operand has to match the *entire* string, and all strings in the sample have some annexes to their variations of m4.
- `M*` only matches "M*" since the default on our system is literal matching. This expression would have behaved completely differently on Vizier.

The following grammar describes the parsing of string expressions:

    simpleOperator  ::= "==" | "!=" | ">=" | ">" | "<=" | "<"
    simpleOperand   ::= Regex(".*")
    simpleExpr      ::= simpleOperator + simpleOperand
    commaOperand    ::= Regex("[^,]+")
    barOperand      ::= Regex("[^|]+")
    commaEnum       ::= "=," commaOperand { "," commaOperand }
    exclusionEnum   ::= "!=," commaOperand { "," + commaOperand }
    barEnum         ::= "=|" barOperand { "|" + barOperand }
    enumExpr        ::= exclusionEnum | commaEnum | barEnum
    patLiterals     ::= CharsNotIn("[*?")
    wildStar        ::= "*"
    wildQmark       ::= "?"
    setElems        ::= CharsNotIn("]")
    setSpec         ::= "[" + setElems + "]"
    patElem         ::= setSpec | wildStar | wildQmark | patLiterals
    pattern         ::= patElem { patElem }
    patternOperator ::= "~" | "=" | "!~" | "!"
    patternExpr     ::= patternOperator pattern
    stringExpr      ::= enumExpr | simpleExpr | patternExpr | nakedExpr