Project

General

Profile

DatabaseDocumentation

Database

NOTE: if you are not able to read or write to the database, please contact Jon Paley

Table of Contents

Help! Who to contact?

If at any point you get hung up with a database-related issue, feel free to contact Jon Paley for help.

Available Databases

There are currently three NOvA databases that most people will care about:

  1. development
    1. dbhost: ifdbdev.fnal.gov
    2. dbname: nova_dev
    3. dbport: 5433
  2. production
    1. dbhost: ifdbprod.fnal.gov
    2. dbname: nova_prod
    3. dbport: 5433
  3. hardware
    1. dbhost: ifdbprod.fnal.gov
    2. dbname: nova_hardware
    3. dbport: 5432 (default)

A general reader account is set up on these databases that will allow anyone to connect via a password from anywhere in the world. If all you need to do is perform queries on the database, then you are encouraged to connect via this account. Otherwise, database connections require a valid kerberos ticket for the FNAL.GOV realm. Only those who will be inserting and updating data into tables will be given an account on the nova_prod database. The nova_dev database is pretty open (still need to request write access).

Environment Variables

It is recommended that the following environment variables are defined:

  • $NOVADBHOST
  • $NOVADBNAME
  • $NOVADBPORT
  • $NOVADBUSER
  • $NOVADBPWDFILE

Eg, in the C-shell ([t]csh), one would type:

> setenv NOVADBHOST ifdbprod.fnal.gov
> setenv NOVADBNAME nova_prod
> setenv NOVADBPORT 5433
> setenv NOVADBUSER nova_reader
> setenv NOVADBPWDFILE [path/to/text/file]

The above values give read-only access to the production database.

In bash, instead one would type (for example):

> export NOVADBHOST=ifdbprod.fnal.gov

The $NOVADBPWDFILE variable is the location of password file for read-only access. PLEASE do not hardcode the actual password in any of your code! The contents of the file should be a single line with the password for the read-only account. If you do not know the password, please contact a dB expert. In order to enable read-only access, the following conditions must be met:

  • $NOVADBUSER = nova_reader
  • $NOVAPWDFILE is defined and the file exists with the correct password

Other useful, but perhaps not critical, environment variables are:

  • $NOVADBCACHEDIR: directory where failed SQL queries are cached in a simple text file. The name of the file is "NAME".cache, where "NAME" is the name of the dB table. SQL queries typically fail when there is no connection to the dB server.
  • $NOVADBNCNXTRIES: sets the number of tries allowed to connect to the dB. The default is 10. One can set this by either setting this env. variable, or setting it within the C++ code. Settings in code override environmental variable settings.

Direct dB Access

To directly access the database, one would type (assuming the NOvA environment has been properly set up):

>psql -h $NOVADBHOST -d $NOVADBNAME -p $NOVADBPORT -U nova_reader --password

for read-only access.

If you need direct access with whatever dB write privileges you have been granted, then:

> setenv NOVADBHOST ifdbprod.fnal.gov
> setenv NOVADBNAME nova_prod
> setenv NOVADBPORT 5433
> setenv NOVADBUSER $USER

and

>psql -h $NOVADBHOST -d $NOVADBNAME -p $NOVADBPORT -U $NOVADBUSER

for kerberos-authenticated access.

The scope of this documentation does not cover PostgreSQL commands in detail (eg, there is already excellent PostgreSQL documentation online), however, here are some quick examples:

  • to set the search path correctly for the detector you are interested in (eg. FarDet):
    nova_prod-> set search_path to FarDet
    
  • to see a list of tables:
    nova_prod=> \dt 
                          List of relations
     Schema |              Name              | Type  |   Owner    
    --------+--------------------------------+-------+------------
     public | apd_high_voltages              | table | nova_admin
     public | asic_register_settings         | table | nova_admin
     public | daq_resources_by_run           | table | nova_admin
     public | dcm_application_parameters     | table | nova_admin
     public | dcm_data_dev_parameters        | table | nova_admin
     public | dcm_fpga_firmware_locations    | table | nova_admin
     public | dcm_fpga_parameters            | table | nova_admin
     public | dcm_system_parameters          | table | nova_admin
     public | dcs_fault                      | table | nova_admin
     public | dcs_hardlimit                  | table | nova_admin
     public | dcs_report                     | table | nova_admin
     public | dso_data_regulator_settings    | table | nova_admin
     public | feb_enable_masks               | table | nova_admin
     public | feb_fpga_firmware_locations    | table | nova_admin
     public | feb_pulser_parameters          | table | nova_admin
     public | named_global_configurations    | table | nova_admin
     public | named_subsystem_configurations | table | nova_admin
     public | pixel_enable_masks             | table | nova_admin
     public | pixel_offsets                  | table | nova_admin
     public | pixel_thresholds               | table | nova_admin
     public | pot_by_run                     | table | nova_admin
     public | runs                           | table | nova_admin
     public | subruns                        | table | nova_admin
     public | timing_system_settings         | table | nova_admin
     public | trigger_by_run                 | table | nova_admin
    (25 rows)
    
    
  • to see the details of a table:
    nova_prod=> \d runs;
                       Table "public.runs" 
        Column     |            Type             | Modifiers 
    ---------------+-----------------------------+-----------
     run           | integer                     | not null
     nsubruns      | integer                     | 
     nevents       | integer                     | 
     partition     | integer                     | 
     runtype       | text                        | 
     config        | text                        | 
     configid      | integer                     | not null
     detector      | text                        | not null
     tstart        | timestamp without time zone | not null
     tstop         | timestamp without time zone | 
     shifter       | text                        | 
     begin_comment | text                        | 
     end_comment   | text                        | 
     inserttime    | timestamp without time zone | not null
     insertuser    | text                        | not null
     npot          | real                        | 
     nactivechan   | integer                     | 
    Indexes:
        "runs_pkey" PRIMARY KEY, btree (run, detector)
    
    
  • to execute a "simple" query:
    nova_prod=> select run,nsubruns,config,tstart,nactivechan from runs order by run desc;
    
      run  | nsubruns |       config        |       tstart        | nactivechan 
    -------+----------+---------------------+---------------------+-------------
     11841 |          | InitialGlobalConfig | 2011-03-28 17:46:05 |        8360
     11840 |          | InitialGlobalConfig | 2011-03-28 16:23:24 |        8360
     11839 |        1 | InitialGlobalConfig | 2011-03-28 15:24:48 |        8360
     11838 |       19 | InitialGlobalConfig | 2011-03-27 19:48:27 |        8360
     11837 |       25 | InitialGlobalConfig | 2011-03-26 19:48:03 |        8360
     11836 |        1 | InitialGlobalConfig | 2011-03-26 19:43:51 |        8360
     11835 |        1 | InitialGlobalConfig | 2011-03-26 19:23:18 |        8360
     11834 |          | InitialGlobalConfig | 2011-03-26 17:13:10 |        8360
     11833 |        1 | InitialGlobalConfig | 2011-03-26 16:23:04 |        8372
     11832 |        2 | InitialGlobalConfig | 2011-03-26 15:14:19 |        8372
     11831 |        1 | InitialGlobalConfig | 2011-03-26 15:07:07 |        8372
     11830 |        1 | InitialGlobalConfig | 2011-03-26 14:05:31 |        8372
     11829 |        1 | InitialGlobalConfig | 2011-03-26 13:20:57 |        8372
     11828 |        9 | InitialGlobalConfig | 2011-03-26 04:42:27 |        8210
     11827 |        1 | InitialGlobalConfig | 2011-03-26 04:21:42 |        8210
     11826 |        4 | InitialGlobalConfig | 2011-03-26 01:00:08 |        8210
     11825 |        2 | InitialGlobalConfig | 2011-03-25 23:18:34 |        8210
     11824 |        1 | InitialGlobalConfig | 2011-03-25 22:57:09 |        8210
     11823 |          | InitialGlobalConfig | 2011-03-25 21:47:39 |        8382
     11822 |        1 | InitialGlobalConfig | 2011-03-25 21:15:29 |        8382
     11821 |        1 | InitialGlobalConfig | 2011-03-25 20:31:44 |        8382
     11820 |        3 | InitialGlobalConfig | 2011-03-25 17:26:55 |        8382
     11819 |        1 | InitialGlobalConfig | 2011-03-25 16:27:23 |        7590
     11818 |        1 | InitialGlobalConfig | 2011-03-25 14:54:47 |        7590
     11817 |        1 | InitialGlobalConfig | 2011-03-25 14:31:09 |        8382
     11816 |       15 | InitialGlobalConfig | 2011-03-24 23:17:15 |        8382
     11815 |        1 | InitialGlobalConfig | 2011-03-24 22:54:18 |        8382
     11814 |       28 | InitialGlobalConfig | 2011-03-24 22:13:34 |        8382
     11813 |       18 | InitialGlobalConfig | 2011-03-24 21:59:21 |        8382
     11812 |        2 | InitialGlobalConfig | 2011-03-24 21:50:25 |        8382
     11811 |        2 | InitialGlobalConfig | 2011-03-24 20:14:25 |        8382
     11810 |       23 | InitialGlobalConfig | 2011-03-24 20:01:24 |        8382
     11809 |       30 | InitialGlobalConfig | 2011-03-24 19:42:44 |        8382
     11808 |        6 | InitialGlobalConfig | 2011-03-24 14:05:46 |        8382
     11807 |        9 | InitialGlobalConfig | 2011-03-24 05:02:03 |        8141
     11806 |        1 | InitialGlobalConfig | 2011-03-24 04:27:52 |        8026
     11805 |        1 | InitialGlobalConfig | 2011-03-24 04:22:06 |        8026
    --More--
    

The NOvA C++ DBI

The NOvA Database Interface (DBI) is currently a C++ package designed for both the online and offline software to use to read and write to the database. The interface is supposed to be simple and flexible while also providing optional controls over how data are read and written.

The first step of making use of the DBI is to either create a new table definition file or to mamke use of an existing one. Tables are defined via XML. In general, table definition files are stored in an appropriate NovaDatabase/tables subdirectory. Even if you do not have to create a new table, it is useful to understand how a table is defined when trying to extract data from the database. Specifically, to make use of a table, one must know the names of the columns in the table.

Defining a Table in XML (<dBTable>)

The NOvA DAQ database interface is XML based. The dB XML schema is defined in NovaDAQDatabase/config/NovaDAQDatabase.xsd. An example of a table definition in XML is in NovaDatabase/tables/RunHistory/Runs.xml.

A table definition in XML requires:
  • a table name ("tableName" attribute),
  • a database host ("dbHost" attribute),
  • a database name ("dbName" attribute),
  • a list of columns via a sequence of <dBColumn> tags, and
  • a primary key via the <dBPKey> tag
Optionally, one may also include:
  • a database port using the "dbPort" attribute
  • a list of unique combinations of columns, which would require that each row of the table have unique values for each set of columns
  • a list of checks that a column value must conform to (eg, "runnum>0")
  • a list of rows via a sequence of <dBRow> tags that contain data, and the API allows one to use only that data. This may be useful for testing.

NOTE: the values of the database host, name, port and user in the XML table definition file may be overloaded by the $NOVADB* environment variables.

Within the <dBTable> tag, several other optional attributes may be used to configure the table:

  • Table type "dbType": there are three kinds of tables on may define:
    • RunSubrun: rows are uniquely determined by a range of runs/subruns along with the columns listed in the <dBPKey> tag.
    • Timestamp: rows are uniquely determined by a range of timestamps along with the columns listed in the <dBPKey> tag.
    • Generic: rows are uniquely determined by the columns listed in the <dBPKey> tag.
  • Override dB data with XML-defined rows ("xmlOverDB")
  • Disable connections to the database ("ignoreDB"): if you'd like to disable read/writes to the database
  • Create/Update history: if you'd like to keep track of who and/or when rows were either inserted or updated, add the option attributes addInsertTime="true", addInsertUser="true", addUpdateTime="true" and/or addUpdateUser="true" attributes to <dBTable>. This will result in a corresponding column to be added to the table; the DBI will handle populating these columns whenever a row is added or modified. Note that inserttime may later be used as a useful validity context.

Columns (<dBColumn>)

A column in a table may be of basically any data type. A column must have [unique] name (within the table), defined in the name attribute. The data type is defined in the colType attribute.

  • Supported data types: short, int, long, bigint, float, double, bool, timestamp, string, autoincr. A couple of notes:
    • "bigint" = "long". bigint is used in SQL, long is probably more familiar to those working in C++
    • "autoincr" is a column whose value is automatically incremented upon insert. Do_not set this value, it is done for you (the dbi should not allow this column to be set directly anyway)
    • if there is a new data type that you feel needs to be supported, please contact Jon Paley]
  • Allow NULL: if the value of the column must be set before writing to the database or external file, add the null="false" attribute to the <dBColumn> tag.
  • Foreign key: if the column is a foreign key into another table, give the name of the table to the fKeyTable attribute. [NOTE: this feature is currently disabled.]

Primary Key (<dBPKey>)

A primary key is a set of columns that uniquely define a row in the table. Although Postgresql does not require a table to have a primary key, it is unwise to create a table without one defined. Therefore, the C++ dbi requires that one is defined in the XML definition of the table. A primary key is defined by adding a list of columns (previously defined) between the <dBPKey> tag.

Unique combinations of columns (<dBUnique>)

One may further constrain a table such that one or more combinations of column values be unique in each row of a table. As an example, image that one is using a column of type autoincr as a primary key. But the table has columns for DCM, FEB, APD Pixel and a timestamp that must be unique in each row. One may ensure that this combination of columns is unique for each row by adding the column names to a <dBUnique> tag. Note that the uniqueness is only enforced upon insert/update into the database.

Checks on column values (<dBCheck>)

Say we want to constrain the values of a column to adhere to some rule. For example, we want all run numbers to be greater than 10000. Then we can add a line in the XML table definition file that reads:
<dBCheck constraint="runnum>10000"/>

Rows (<dBRow>)

Table rows may be defined in the XML file as either base values or perhaps for testing purposes. The format is something like:

<dBRow>
  <dBCol name="myInt"       value="929"/>
  <dBCol name="myTimeStamp" value="2010-09-29 09:29:29"/>
  <dBCol name="myFloat"     value="929.929"/>
  <dBCol name="myString"    value="929.929"/>
</dBRow>
A couple of things to note:
  • regardless of the data type of the column, the "value" must be given between quotes
  • if the table is of type "RunSubrun", then <dBRow> should look something like <dBRow runmin="10001" runmax="10423">
  • if the table is of type "Timestamp", then <dBRow> should look something like <dBRow tmin="2010-11-29 09:38:23" tmax="2010-11-30 01:56:12">
  • NOT all columns in a row need to be listed with the <dBRow> tag

Some More Concepts

At this point we assume that you have a valid XML table definition file. Before we go into details of the C++ DBI, there are a few more concepts to explain:

  • "DISTINCT" columns: If you wish to have the DBI return only one row per unique set of column values, add to the list of "DISTINCT" columns via the Table::AddDistinctColumn() methods. "DISTINCT" columns may later be removed from the list via the Table::RemoveDistinctColumn() methods.
  • "SELECT" columns: If you wish to have only a subset of columns of a table returned from a dB query, add to the list of "SELECT" columns. By default this list is empty and dB queries returns all columns in the table.
  • Connections to the dB: By default, one does not need to explicitly create a connection to the database; one is created and destroyed for the user everytime a read or write is done. However, creating and destroying connections is very inefficient if you will be reading or writing frequently. You may therefore override the default behavior by getting a "persistent" connection to the database. It is then the user's responsibility to close the connection (although this is also done when the table is destroyed).

How to...

  • read from the database using the read-only account: make sure the appropriate environment variables are set:
    setenv NOVADBHOST ifdbprod.fnal.gov
    setenv NOVADBNAME nova_prod
    setenv NOVADBPORT 5433
    setenv NOVADBUSER nova_reader
    setenv NOVADBPWDFILE /nova/app/db/nova_reader_pwd
    
  • create a table object: Table* t = new Table("/NovaDAQDatabase/tables/RunHistory/Runs.xml");
  • get a persistent connection to the dB: t->GetConnection() (returns false if no connection is made)
  • close a persistent connection to the dB: t->CloseConnection() (returns false if no connection exists)
  • add a row to a table:
Row* r = t->NewRow();
r->Set("ColA","ColAData");
r->Set("ColB","ColBData");
t->AddRow(r);
delete r;
  • load data from database into table: t->LoadFromDB();
  • write data from table into database: t->WriteToDB(); NOTE: only new or modified rows are written to the dB.
  • load data from comma-separated-value (csv) file into table: t->LoadFromCSV("/path/to/data.csv");
  • write data from table into csv file: t->WriteToCSV("/path/to/data.csv");
  • update an existing entry:
t->LoadFromDB();
t->GetRow(3).Update("ColA","NewColAData");
t->WriteToDB(); 
  • update rows from a csv file and write to dB:
t->UpdateFromCSV("/path/to/data.csv");
t->WriteToDB()
  • only load rows from the dB that fall within some range. The code below would only read in entries from the database that fall within the run range [1001,1010]
t->Clear();
t->SetValidityRange("run","1001","1010"); // "run" is a column in the table
t->LoadFromDB();
  • only load rows from the dB for a distinct set of columns. The code below assumes the table has columns called "dcmid", "febid" and "pixelid". The primary key is some combination of these three columns as_well as some other distinguishing columns (eg, a timestamp).
t->Clear();
t->AddDistinctColumn("dcmid");
t->AddDistinctColumn("febid");
t->AddDistinctColumn("pixelid");
t->LoadFromDB();
  • load rows from the dB in some sort of order:
t->Clear();
t->AddOrderColumn("run");
t->AddOrderColumn("subrun");
t->LoadFromDB();
  • load rows from the dB according to when they were inserted:
t->Clear();
t->AddOrderColumn("inserttime");
t->LoadFromDB();
  • load rows from the dB according to when they were updated:
t->Clear();
t->AddOrderColumn("updatetime");
t->LoadFromDB();

(WARNING: "updatetime" could be NULL if the row was never updated, so this could give unexpected results...)

My offline code won't compile/run when I add dB code!

Be sure to add the following lines near the bottom of the GNUmakefile in your package:

include SoftRelTools/arch_spec_novadb.mk

override LIBLIBS += -lNovaDatabase -lNovaDAQUtilities -L${XERCESCROOT}/lib -lxerces-c -L${PGSQL_DIR}/lib -lpq

Using NovaCSVtoDB to Load Data into a Database Table

  1. Put the data you wish to load into the dB in a comma-separated-value (csv) file (here we name the file "table.csv").
  2. Execute the command: > NovaCSVtoDB table.xml table.csv

Creating a table in the Database

In general, table creation in the production database is restricted to just a few users. Anyone will be granted permissions to create tables in the development database for testing purposes.

Assuming you have a XML table definition file, execute the following:

> createTableInDB "full/path/to/table.xml"