NAME

RDB - object methods for dealing with rdb files


SYNOPSIS

  use RDB;

  $rdb = new RDB;
  $rdb->open( 'foo.rdb' ) || die;

  $rdb = new RDB 'foo.rdb' or die;

  $rdb = new RDB \*STDIN or die;

  $rdb = new RDB ( 'name' => 'S', 'id' => 'N' );

  $rdb->init( );
  $rdb->init( 'name' => 'S', 'id' => 'N' );

  $rdb->add_col( 'slap' => 'S', 'gurgle' => 'N' );
  $rdb->add_col( $other_rdb );

  $rdb->delete_col( 'gurgle' );

  @defs = ( 'name' => 'S', 'id' => 'N' );
  $rdb->init( \@defs );
  
  $rdb->init( $other_rdb );

  $rdb->rewind;

  $rdb->bind( { col1 => \$col1, col2 => \$col2 } );
  while ( $rdb->read( ) ) { print $col1, $col2, "\n"; }

  while( $rdb->read( \%data ) ) { ... }
  while( $rdb->read( \@data ) ) { ... }

  while( $verbatim_line = $rdb->read_line ) { print $verbatim_line };

  $rdb->write_hdr( );
  $rdb->write_hdr( \*STDOUT );

  $rdb->write( \%data );
  $rdb->write( \@data );
  $rdb->write( @data );

  $rdb->set( \%attr );
  $rdb->set( { AlwaysBind => 1 } );
  
  @header_var_names = $rdb->vars;

  $foo = $rdb->getvar( 'foo' );
  $rdb->setvar( foo => 33 );
  $hdrvars = $rdb->getvars;
  print $hdrvars->{foo};
  $rdb->delvar ( 'foo' );


DESCRIPTION

This module eases use of RDB data files. It creates RDB objects which contain the necessary information for interpreting and manipulating RDB files.


Constructor

new [ file or filehandle [, mode] | \@defs ]

new is the constructor and must be called before any other method is invoked. It creates an RDB object. It may be passed a filename (with an optional mode) or a reference to a glob, which is interpreted as an already open file handle; in either case it invokes the RDB::open method on the file or file handle. If no mode is specified, the file is opened with mode <. If the argument is instead a reference to an array, RDB::init is invoked with that argument.


Object action methods

bind( \%bindhash [, \%attrs ] )

bind simplifies the processing of rdb files by automatically assigning values read from the rdb file to Perl variables or arrays. Each time the read method is called with no arguments, it updates the variables specified in preceding calls to bind. bind takes a hash of columns to be bound; the keys are the column names, and the values are references to either scalars or arrays. In the former case, the scalar is assigned the column's value. In the latter case, the column's value is pushed onto the end of the array. (Note that the argument to bind is a hash just to enforce the correct number of items.) For example,

        $rdb->bind( { col1 => \$col1, col2 => \$col2 } );
        while ( $rdb->read( ) )
        {
          print "$col1, $col2\n";
        }

Or, using arrays,

        my ( @col1, @col2 );
        $rdb->bind( { col1 => \@col1, col2 => \@col2 } );
        1 while ( $rdb->read( ) );
        for( $i = 0 ; $i < @col1 ; $i++ )
        {
          print $col1[$i], ' ', $col2[$i], "\n";
        }

If the same column is specified in succeeding calls to bind, the new binding will override the previous binding.

However, if the same column should be bound to multiple variables, the Override attribute may be reset using the second argument to bind:

        $rdb->bind( { col1 => \$col1, col2 => \$col2 } );
        $rdb->bind( { col1 => \$col1_copy }, { Override => 0 } );

The column col1 will now be written to both $col1 and $col1_copy.
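The difference between scalar and array bindings, and the effect of binding one column to several variables, can be illustrated without the module. The following is only a sketch of the documented behaviour, not the module's internals; update_bound and the %bound layout are hypothetical names invented for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of RDB: copy one row's values into the
# bound variables. %$bound maps a column name to a list of references.
sub update_bound {
    my ( $bound, $row ) = @_;
    for my $col ( keys %$bound ) {
        for my $ref ( @{ $bound->{$col} } ) {
            if ( ref($ref) eq 'ARRAY' ) {
                push @$ref, $row->{$col};    # array binding: accumulate
            }
            else {
                $$ref = $row->{$col};        # scalar binding: overwrite
            }
        }
    }
}

my ( $col1, $col1_copy, @col2 );

# col1 is bound twice (as with Override => 0), col2 to an array.
my %bound = ( col1 => [ \$col1, \$col1_copy ], col2 => [ \@col2 ] );

# Two made-up rows standing in for successive calls to read.
my @rows = ( { col1 => 'a', col2 => 1 }, { col1 => 'b', col2 => 2 } );
update_bound( \%bound, $_ ) for @rows;

print "$col1 $col1_copy @col2\n";    # prints "b b 1 2"
```

Scalar bindings hold only the most recent row, while array bindings keep every value read so far.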

close

Explicitly close an rdb file. This usually need not be called, as the file is closed when the RDB object is destroyed.

init( @defs|\@defs|$rdb)

Initialize the rdb object with a set of columns. A column is specified by both a name and a definition. Definitions technically consist of four parts: the column name, its type, output alignment, and description. The latter are optional and are usually omitted. Column types are one of N, S, or M, for numeric, string, and month data. Alignment is one of < or >.

init is passed either an array (or list), an array reference, or a reference to another RDB object. In the latter case, the column definitions of the other object are duplicated. In the former cases, the array must contain column name and definition pairs.

The definition may take any of several forms: a scalar containing the type, an array reference, or a hash reference, as shown below. These forms may be mixed:

  $rdb->init( c1 => 'N', 
              c2 => [ 'N', 32 ],
              c3 => { type => 'N', desc => 'What A Nice Column' } );

init_tpl( $file_name | \$tpl_string )

Initialize an RDB object from an RDB header template. If the passed argument is a scalar, it should contain the name of a file containing the template. If it is a reference, it should be a reference to a scalar containing the template. An RDB header template is a description of the header in the following format.

Each column is specified on a separate line, and contains up to four white space delimited fields:

  1. an optional field containing the column's zero based index. If not specified, the ordering of the field in the template is used. For example,

       fee S
       fie N
       fo  N
       fum N
            
    is equivalent to
    

       0 fee S
       1 fie N
       2 fo  N
       3 fum N
    

    Be careful when mixing lines with and without an index:

         fee S
       2 fie N
         fo  N
         fum N
    

    is equivalent to

       0 fee S
       2 fie N
       2 fo  N
       3 fum N
    

    which will result in an error. Indices must be unique.

    There's a further degeneracy which must be avoided:

       3 N S
    

    Is that an index of 3, a name of N and a type of S, or is that a name of 3, a type of N and a description of S? It is parsed as the former. To get the latter interpretation, you'll have to include an index field.

  2. the column name. It may appear in quotes.

  3. the column type. It may include the column width as a prefix.

  4. an optional column description
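The field rules above can be sketched in plain Perl. This is not the module's parser: parse_tpl_line is a hypothetical helper, and the rule it uses to recognize an index (a leading integer counts as an index only when the third field looks like a type) is an assumption chosen to agree with the "3 N S" case described above. The sketch also assumes that names contain no embedded white space.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of RDB: split one template line into
# its fields. $pos is the line's zero-based position in the template.
sub parse_tpl_line {
    my ( $line, $pos ) = @_;
    my @f = split ' ', $line;

    # Assumed rule: a leading integer is an index only if the third
    # field looks like a type, so "3 N S" gives index 3, name N, type S.
    my $index = $pos;
    if ( @f >= 3 && $f[0] =~ /^\d+$/ && $f[2] =~ /^\d*[NSM]$/ ) {
        $index = shift @f;
    }

    my ( $name, $type, @desc ) = @f;
    return unless defined($type) && $type =~ /^(\d*)([NSM])$/;
    my ( $width, $code ) = ( $1, $2 );

    $name =~ s/^"(.*)"$/$1/;    # drop surrounding quotes, if any

    return {
        index => $index,
        name  => $name,
        type  => $code,
        width => ( length($width) ? $width : undef ),
        desc  => join( ' ', @desc ),
    };
}

# The mixed template from above: unindexed lines take their position.
my @tpl  = ( '  fee S', '2 fie N', '  fo  N', '  fum N' );
my @cols = map { parse_tpl_line( $tpl[$_], $_ ) } 0 .. $#tpl;

# Indices must be unique, so report any that repeat.
my %seen;
my @dups = grep { $seen{ $_->{index} }++ } @cols;
print "duplicate index $_->{index} for column $_->{name}\n" for @dups;
```

Run on the mixed template, the sketch reports index 2 as duplicated for column fo, which is the error condition described above.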

Comment lines may be present, and are indicated by a leading # character.

For example,

  # P-to-H Decenter parameters derived from XRCF HSI off-axis images
  # (single shell); used pitch=0, yaw=-20 arcmin data.
  #
   0               fee  6S      what i get paid
   1               fie 10N      upon you
   2               fo  10N      fight or no?
   3               fum  9N      ble

write_tpl( $filename | $fh)

Write an RDB template for the current RDB object. The argument may be a scalar, in which case it should contain the name of a file to which to write the template, or a filehandle.

add_col( @defs|\@defs|$rdb)

Add new columns to the rdb object. See the description of the init() method for the specification of the column names and definitions. Existing columns are not duplicated; their definitions are changed to the passed type.

delete_col( @cols )

Delete the specified columns from the object. This is only applicable to RDB files open for writing, and only before the RDB header has been written out.

        $rdb->delete_col( 'a', 'b' );

set( \%attr )

set specifies the values of various attributes for the object. The passed reference should point to a hash which may contain the following keys:

AlwaysBind

If this is set, e.g.,

  $rdb->set( { AlwaysBind => 1} )

and RDB::bind has been called to set up bindings between columns and Perl variables, the Perl variables are always updated, regardless of which form of RDB::read is called.

open( file or filehandle [, mode] )

open connects to a file (if passed a scalar) or to an existing file handle (if passed a reference to a glob). If mode is not specified, the file is opened read only; otherwise it is opened with the specified mode. Modes are the standard Perl ones (see the Perl open function). If the mode is read only or read/write, open reads and parses the RDB header. It returns the undefined value upon error.

read( [\%data|\@data] )

Read in the next line from the rdb database, storing the columns into either a hash keyed off of the column names (if passed a reference to the hash), an array (if passed a reference to the array), or into scalars specified by previous calls to the bind method (if read is called with no arguments). It returns the undefined value upon end of file. It does not check to ensure that there are enough columns in the input. For example:

        $rdb->read(\%data);
        print "foo = $data{foo}\n";

        $rdb->read(\@data);
        print "The first column has value $data[0]\n";

        $rdb->bind( { foo => \$foo } );
        $rdb->read();
        print "Foo = $foo\n";

read_line

read_line reads a line from the rdb file without parsing it (even to chop off the end). It returns undef upon end of file.
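Because the line is returned verbatim, the caller must strip the newline and split the fields itself. A minimal sketch, assuming the usual tab-delimited RDB layout; the column names and the sample line are made up for illustration, and with a real object the names would come from the col method:

```perl
use strict;
use warnings;

# Hypothetical column names; with a real object use $rdb->col.
my @cols = qw( Name Address Zip );

# A verbatim data line as read_line would return it, newline included.
my $line = "me\there\t20\n";

# Work on a copy so the verbatim line stays intact.
chomp( my $copy = $line );

# Key the fields by column name; the -1 limit keeps trailing empty fields.
my %data;
@data{@cols} = split /\t/, $copy, -1;

print "$data{Name} lives $data{Address}, zip $data{Zip}\n";
```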

rename( \%renamehash )

rename is passed a hash of columns to be renamed; the keys are the old column names, their values are the new names. It's a hash just to enforce the correct number of items. For example,

        $rdb->rename( { oldcol => 'newcol', foocol => 'boocol' } );

reopen

reopen reopens a file that has previously been opened and closed, positioning the file pointer to where it was before the file was closed. It retains the previous access mode. It does not reopen files passed to the original call of RDB::open as references. It returns the undefined value upon error.

rewind

Rewind the file back to the first data position (i.e., after the header). Obviously this only works if the file is truly a file, and not a pipe.

write( @data|\@data|\%data )

Write the passed data to the rdb file. If an array or a reference to an array is passed, it must have the correct number of columns, and must be in the same order as the columns in the rdb file. If a reference to a hash is passed, the data are extracted from the hash.
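The hash form can be pictured as pulling the values out in column order before the row is written. This is only a sketch of that idea, not the module's code; the column list is hypothetical, the tab delimiter is the usual RDB convention, and writing an empty field for a missing key is an assumption made for this illustration.

```perl
use strict;
use warnings;

# Hypothetical column order; with a real object use $rdb->col.
my @cols = qw( name id );

# Hash keys may be given in any order.
my %data = ( id => 7, name => 'gurgle' );

# Extract the values in column order. Assumption: a missing key
# becomes an empty field.
my $line = join( "\t", map { defined( $data{$_} ) ? $data{$_} : '' } @cols ) . "\n";

print $line;
```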

write_hdr( [<filehandle>] )

Write the RDB header to the passed file handle, if present, or to the filehandle associated with the RDB object. Header lines containing header variables will be updated with the most recent value. New header variables are appended to the end of the header. write_hdr is automatically called for you if you try to write or close the object.


Object data methods

Once the object is created, you can access the object's attributes using the following functions:

add_comments( @comments )

Append the passed list of comments to the header comment lines. The comments should neither begin with a leading pound sign nor end with a newline. This method doesn't add any leading white space to the comment, so you may wish to do that for the sake of legibility. If the comment line defines a header variable, the first character must be a :. You can later change its value with setvar() or re-read it with getvar().

col

This returns a list containing the names of the columns in the RDB table if the calling routine is expecting a list, otherwise it returns a reference to the list of columns.

        @cols = $rdb->col;
        $cols_ref = $rdb->col;

comments( [@comments] )

This returns a list containing the header comment lines. The leading pound signs and trailing newline are removed.

        @comments = $rdb->comments;
        $rdb->comments( @replacement_comments );

comments takes as an optional argument a list containing new comments. These will replace the existing ones. This method doesn't add any leading white space to the comment, so you may wish to do that for the sake of legibility. To delete the comments, pass it undef:

        $rdb->comments( undef );

defs( [$name | @names] )

This method is deprecated. Use type instead.

defn( [$name | @names] )

If called with no arguments, defn returns a hash containing the column definitions, keyed off of the column names. The optional arguments are names of columns for which to return a definition. In a scalar context it returns the definition for the first argument; in an array context it returns an array of definitions. If a column doesn't exist, its definition is given as the undefined value.

A definition is returned as a hash reference. The hash has keys type, width, align, and desc. Don't change the contents of the hashes returned!!

fh

This returns the filehandle to which the RDB file is attached, or the undefined value if it hasn't yet been attached. To attach a filehandle to an existing RDB object, use the RDB::open method.

        $fh = $rdb->fh;

file

Returns the filename or handle passed to the new or open method.

ncols

This returns the number of columns in the file.

        $ncols = $rdb->ncols;

pos

If called with no arguments, pos returns a hash relating the column names to their zero-indexed position, keyed off of the column names. The optional arguments are names of columns for which to return a position. In a scalar context it returns the position for the first argument; in an array context it returns an array of positions. If a column doesn't exist, its position is given as the undefined value.

        %pos = $rdb->pos;
        $pos_of_col = $rdb->pos( $col );
        @pos{@cols} = $rdb->pos(@cols);

vars

Returns the names of the header variables.

getvar( $var )

Returns the value of the header variable $var if it exists. Otherwise it returns undef.

getvars

Returns a reference to a hash containing all of the header variables.

setvar( $var, $value )

Set the header variable $var to $value. The variable is created if it doesn't exist.

delvar( $var )

Delete the header variable $var. This also deletes the header comment line which defines it.

type( [ $name | @names ] )

If called with no arguments, type returns a hash containing the column types, keyed off of the column names. The optional arguments are names of columns for which to return a type. In a scalar context it returns the type for the first argument; in an array context it returns an array of types. If a column doesn't exist, its type is given as the undefined value.

        %types = $rdb->type;
        $type_of_col = $rdb->type($col);
        @types{@cols} = $rdb->type(@cols);

expr( reference or string )

This is deprecated and will be removed in the future.

Given an rdb expression (see the rdb_expr documentation), it returns a string which evaluates the expression. If passed a reference, the reference must be to an array of tokens. If passed a scalar, the scalar is parsed into tokens. Note that the whitespace requirements for the passed scalar are the same as those in the rdb_expr documentation.

The returned expression assumes that an input row of the RDB table is available in the @F array.

For example, given the following RDB table,

  Name    Address Zip
  S       S       N
  me      here    20
  you     there   30

the function call

  print $rdb->expr( 'Name eq "Bill plus jane" and Zip eq 3' );

produces the following eval'able string.

  $F[0] eq 'Bill plus jane' && $F[2] == 3
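One way to use such a string is to compile it once into a closure and then apply it to each row. The sketch below assumes only what is stated above, namely that the expression refers to the row through @F. The expression string is copied from the example rather than generated, and the rows are made up so that one of them matches.

```perl
use strict;
use warnings;

# Copied from the example above; in real use this would be the
# string returned by $rdb->expr( ... ).
my $expr = q{$F[0] eq 'Bill plus jane' && $F[2] == 3};

# Compile the expression once into a closure that receives the row as @F.
my $match = eval "sub { my \@F = \@_; $expr }"
  or die "cannot compile expression: $@";

# Made-up rows, as they would look after splitting data lines into fields.
my @rows = (
    [ 'me',             'here',  20 ],
    [ 'Bill plus jane', 'there', 3 ],
);

my @hits = grep { $match->( @$_ ) } @rows;
print scalar(@hits), " matching row(s)\n";    # prints "1 matching row(s)"
```

Compiling once avoids re-parsing the string for every row, which matters when filtering large tables.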


AUTHOR

Diab Jerius ( djerius@cfa.harvard.edu )