Project

General

Profile

How to get data out of the archive in your own program

Intro

The quick answer to this is that you need to use PostgreSQL in some way. Specific code examples are provided below. First some background.

There are a number of tables in the archiver database. In MicroBooNE, the database is slowmoncon_archive on the server hosted at smc-priv in the LArTF cluster, which is readable by the smcreader user. Eventually we will be making that available on the uboonegpvm cluster too.

The following tables are in the slowmoncon_archive database:

          List of relations
 Schema |     Name      | Type  | Owner 
--------+---------------+-------+-------
 public | array_val     | table | gahs
 public | chan_grp      | table | gahs
 public | channel       | table | gahs
 public | enum_metadata | table | gahs
 public | num_metadata  | table | gahs
 public | retent        | table | gahs
 public | sample        | table | gahs
 public | severity      | table | gahs
 public | smpl_eng      | table | gahs
 public | smpl_mode     | table | gahs
 public | status        | table | gahs

The two most important tables are channel and sample. The channel table has one entry per channel.

        Table "public.channel" 
    Column    |          Type          
--------------+------------------------
 channel_id   | bigint                 
 name         | character varying(100)
 descr        | character varying(100) 
 grp_id       | bigint                 
 smpl_mode_id | bigint                 
 smpl_val     | double precision       
 smpl_per     | double precision       
 retent_id    | bigint                 
 retent_val   | double precision       

The most important fields in the channel table are name and channel_id.

The sample table has one entry per sample.

       Table "public.sample" 
   Column    |            Type             
-------------+-----------------------------
 channel_id  | bigint                      
 smpl_time   | timestamp without time zone 
 nanosecs    | bigint                      
 severity_id | bigint                      
 status_id   | bigint                      
 num_val     | integer                     
 float_val   | real                        
 str_val     | character varying(120)      
 datatype    | character(1)                
 array_val   | bytea                       

The most critical fields in the sample table are channel_id and smpl_time. Numeric values are usually stored in float_val. The status_id and severity_id fields may also be important to you, depending on your application.

Example SQL queries

Query from channel table: select channel_id from channel where name = 'uB_ArPurity_PM01_0/LIFETIME';
Result:

 channel_id 
------------
       4299

Query from sample table: select smpl_time, float_val from sample where channel_id = 4299 and smpl_time >= '2015-08-18 00:00' and smpl_time < '2015-08-19 00:00';
Result:

        smpl_time          | float_val  
----------------------------+------------
 2015-08-18 01:38:24.917632 |        0.1
 2015-08-18 05:43:39.776906 | 0.00416998
 2015-08-18 09:48:49.990459 |        0.1

Query combining the above two queries into one: select smpl_time, float_val from sample where channel_id = (select channel_id from channel where name = 'uB_ArPurity_PM01_0/LIFETIME') and smpl_time > '2015-08-18 00:00' and smpl_time < '2015-08-19 00:00';

         smpl_time          | float_val  
----------------------------+------------
 2015-08-18 01:38:24.917632 |        0.1
 2015-08-18 05:43:39.776906 | 0.00416998
 2015-08-18 09:48:49.990459 |        0.1

There are many good references on SQL in general and PostgreSQL in particular.

Example Python code

import psycopg2
import time
import datetime

def demo():
    #-- channel name
    name = 'uB_ArPurity_PM01_0/LIFETIME'

    #-- get start and stop of 24-hour interval on hour boundary
    now_hour = 3600*int(time.time()/3600)
    time1 = datetime.datetime.fromtimestamp(now_hour-86400)
    time2 = datetime.datetime.fromtimestamp(now_hour)

    #-- connect to db and make the query
    #   If password isn't given, psycopg2 will obtain it from
    #   $HOME/.pgpass if it exists.
    conn = psycopg2.connect(host="smc-priv", user="smcreader",
                            database="slowmoncon_archive")    
    cur = conn.cursor()
    cur.execute("select smpl_time, float_val from sample " 
         "where channel_id = (select channel_id from channel where name = %s)" 
         "and smpl_time > %s and smpl_time < %s;",
                (name, time1, time2) )

    #-- print it out
    print '|_. ', ' |_. '.join( d[0] for d in cur.description), '| '
    for row in cur:
        print '| ', ' | '.join( str(v) for v in row ), ' |'
    print

if __name__ == "__main__":
    demo()

Example C++/ROOT code

#include <TSQLServer.h>
#include <TSQLResult.h>
#include <TSQLRow.h>
#include <iostream>
#include <fstream>

void demo()
{
  const char * name = "uB_ArPurity_PM01_0/LIFETIME";
  string pass_string;
  ifstream("secretfile.txt") >> pass_string;
  // open the server. username and password must be given too.
  // It will use ~/.pgpass if the third argument to Connect is "",
  // but it will crash if the third argument is NULL.
  TSQLServer *srv = TSQLServer::Connect("pgsql://smc-priv/slowmoncon_archive",
                    "smcreader",
                    pass_string.c_str());
  if (!srv) {
    std::cerr << "Connection failed." << endl;
    return;
  }
  // build the request and submit it
  // caution: example below does not protect against "SQL injection",
  // so do not accept the value of the name variable from an untrusted source.
  // (E.g., what if someone gives you name = "'; drop tables all;")
  TSQLResult* res = srv->Query( Form(
     "select smpl_time, float_val from sample " 
     "where channel_id = (select channel_id from channel where name = '%s')" 
     "and smpl_time >= '2015-08-18 00:00' and smpl_time < '2015-08-19 00:00';",
     name ) );
  // get the result
  TSQLRow * row;
  while ( (row = res->Next()) != NULL ) {
    std:: cout << row->GetField(0) << "\t" << row->GetField(1) << std::endl;
  }
}