(, ), *, ;. The characters can either by placed inside of quotes, or may ``escaped''
with a backslash (\) character. All of the rdb commands that use expressions can also read
them from a file, where the characters don't have to be quoted. An added
benefit in putting expressions in a file is that you can put comments in
there. Anything after a # character is ignored.
Comparison operators may be of the form: gt, ge, lt, le,
eq, ne, mat, nmat. For example,
NAME eq Hobbs
The first six are the usual operators, e.g. name eq Hobbs
or COUNT gt 100. The last two stand for 'match' and 'non-match' and are used for pattern
matching. They are exactally the same as using the Perl operators =~ or !~ respectively, except that pattern matching can be specified easier in
expressions, as in:
NAME mat /[Hh]obbs/ <<< First letter either case NAME mat /hobbs/i <<< any combination of case NAME nmat /[aeiou]/i <<< names without vowels
A warning message is produced on STDERR if either of mat or nmat
is used with a numeric type column, but the execution continues. It does
not check the =~ or !~ forms.
Logical constructors or and and may be used. Special values of null to indicate an empty data value or any to indicate any value are available.
Sometimes the expression parser doesn't recognize a value appropriately and casts it to the wrong type. This usually happens while trying to use a complicated Perl expression. There are three operators which will attempt to cast the following constant expression as the requested type. It will not cast a string as a number, though.
_fX Leave it verbatim
_fN It's a number
_fS Make it a string
You can use this feature to do some neat things:
label_text = _fX '("Label0", "Label1", "Label2")' \[ label \]
Certain functions are recognized as well:
abs,
atan2,
cos,
exp,
hex,
int,
log,
oct,
rand,
sin,
sqrt,
subst,
lc,
uc.
subst is special; it is the perl s string substitution operator, and requires the use of the =~ assignment operator. For a self referential substition, use the following
form:
zap =~ subst /foo/bar/
To assign a substition of another column, use:
\( zap = zig \) =~ subst /foo/bar/
all other. (Yes, it is clumsy).
Assignment expressions have the form
NAME <assignment op> <expression>
Valid assignment operators are =, =~, ++, --, +=, -=,
/=, *=, etc. Multiple expressions are separated by a semicolon.
Selection expressions have the form
<expression>
and return true or false.
Since column names and reserved words are parsed by the program, do not put the entire expression in a single pair of quotes as that will prevent the parsing. Also note that column names and reserved words need to be surrounded by blank spaces if they are not individually quoted. For example either form below is fine:
NAME eq "L Brown"
"NAME" "eq" "L Brown"
but do not use this form:
"NAME eq L Brown"
Note that in some rare cases there could be a column name that is identical to a data value specified in an expression using another column name that might cause a problem (this actually happened). For example if two column names are 'N' and 'T', and column 'N' has a data value of 'T', to select all rows where column 'N' is equal to 'T' the normal command would be:
table N eq T
Unfortunately the 'T' in the expression gets translated to 'column name T', not used as 'data value T'. That is, the expression askes for all rows where the data value of column N equals the data value of column T, legal, but not what was wanted. There is a simple workaround however. The 'T' in the expression can be escaped with a backslash to prevent the translation to a column name, as in the revised command:
table N eq '\T'
Thus either meaning can be specified, as desired.
if - elsif else clauses as well. This is most useful when expressions are in files. For
example:
if ( type eq A ) {
name = NEW ;
amt = count * 2 ;
type = 'AAA' ;
} elsif ( type eq B ) {
name = OLD ;
amt = count + 1000 ;
type = 'ZZZ' ;
} else {
name = FOO ;
amt = count + 2000 ;
type = 'YYY' ;
}
Note that the statements inside the decision blocks must be enclosed in curly braces. You must put spaces between all of the tokens, as on the command line, but you don't have to quote characters dangerous to the shell. Currently, all quotes around strings are treated as part of the string (yes, that's a ``feature''), and there's no way of using strings with whitespace in them.
$, the rest of the token is presumed to be a valid Perl variable name, and
is passed untouched to Perl. For example, lets say there are numerical
columns c, b, and g in the rdb table foo.rdb. One could compute
c = ( b * c ) ** 2 * ( b * g )
with the following command.
compute -x \$a = b \* c \; \$b = b \* g \; c = \$a \* \$a \* \$b
The -x option to compute will make it print out the Perl script that will be used
to generate the output. Note that rdb columns and Perl variables have
different name spaces, so you can mix them. The Perl routine in this
example is:
$. = 0;
while(<STDIN>)
{
chop ;
@F = split( /\t/, $_, 3 );
$a = $F[1] * $F[0] ;
$b = $F[1] * $F[2] ;
$F[0] = $a * $a * $b ;
print join( "\t", @F ), "\n" ;
}
Note that the rdb columns show up as elements of the array $F, while the rdb variables in the compute expression show up literally.
There are two caveats. One shouldn't use any form of $F[n]. If one actually wishes to precede a string with the character $
(for instance if you wish to check for the string $ave $ave $ave!), simply precede the first $ with a backslash character \, and it will be accepted as a string, rather than as a Perl variable name.
String eq '\$ave $ave $ave!'
_NR - the record number (starts at 1)