Project

General

Profile

SQLite help

[ In what follows, all lines prefixed with $ correspond to command-line invocations. ]

Various art services and facilities use SQLite for storing information. Files produced by SQLite are database files, where each database contains tables with instances (rows) of tuples of information (columns).

To support SQLite, the sqlite3 library is packaged as a relocatable UPS product with all bundles of art. Although querying SQLite databases is simple, it may take a little getting used to. SQLite databases can be queried interactively or at the command line. Any database file can be explored interactively by typing:

$ sqlite3 my_database_file.db

SQLite version 3.8.8.2 2015-01-30 14:30:45
Enter ".help" for usage hints.
sqlite>

For the simple commands below, the command-line interface is sufficient, and we will only present those options. However, the interactive sqlite3 facility offers more flexibility for more complicated queries. Note that all commands appearing below in single quotes as 'SOME_SQL_COMMAND;' below can be used in the interactive shell.

[ SQLite is largely case-insensitive. However, for clarity, we use upper-case characters below to indicate SQLite-specific syntax. ]

References

For individuals interested in the ROOT or Python interfaces to SQLite, please see:

Basic commands

The below output has been produced using the sqlite3 database file timeTracker.db, which has been attached to this page for your own perusal.

Print tables in database

$ sqlite3 timeTracker.db '.tables'

TimeEvent TimeModule TimeReport

We will use the TimeEvent table for the rest of the commands below.

Print schema information (columns and types) for a table

A first helpful facility is to know what kind of information is stored in a given table. This can be determined by:

$ sqlite3 -header -column timeTracker.db 'PRAGMA TABLE_INFO(TimeEvent);'

cid name type notnull dflt_value pk
---------- ---------- ---------- ---------- ---------- ----------
0 Run integer 0 0
1 Subrun integer 0 0
2 Event integer 0 0
3 Time numeric 0 0

Each row shows the column ID (cid), name, and type of quantity represented. In addition, the entries notnull, dflt_value, and pk denote whether the column must contain non-null entries ('0' if no), the default value, and whether the specified column represented the primary key of the table ('0' if no), respectively. By default, the primary key is the 'rowid' column which is typically suppressed in table printouts.

Print contents of table

The standard querying command follows the form 'SELECT something FROM someTable;', as shown here:

$ sqlite3 -header -column timeTracker.db 'SELECT * FROM TimeEvent;'

Run Subrun Event Time
---------- ---------- ---------- -----------
1 0 1 0.004208872
1 0 2 0.000320013
1 0 3 0.000262831
1 0 4 0.000260539
1 0 5 0.000260799
1 0 6 0.000260244
. . . .
. . . .
. . . .
1 0 998 0.000258625
1 0 999 0.000258888
1 0 1000 0.000263218

When the '*' wildcard is used in the 'SELECT' statement, the data corresponding to all columns is printed. The primary key 'rowid' can be included through the command: 'SELECT rowid,* FROM someTable;', which can be helpful when mapping entries in one table to those in another. (This mapping is used, for example, when the sqlite messagefacility destination is enabled.)

Print select columns of table

Here is an example where the primary key column and 'Time' column data are presented side-by-side.

$ sqlite3 -header -column timeTracker.db 'SELECT rowid,Time FROM TimeEvent;'

rowid Time
---------- -----------
1 0.004208872
2 0.000320013
3 0.000262831
4 0.000260539
5 0.000260799
6 0.000260244
. .
. .
. .
998 0.000258625
999 0.000258888
1000 0.000263218

Sorting data based on values of one column

Data can also be sorted by using the 'ORDER BY some_column_name' clause. We also order the data in descending order by appending 'DESC' to the end of the statement.

$ sqlite3 -header -column timeTracker.db 'SELECT * FROM TimeEvent ORDER BY Time DESC;'

Run Subrun Event Time
---------- ---------- ---------- -----------
1 0 1 0.004208872
1 0 201 0.003826711
1 0 801 0.003619156
1 0 601 0.00309264
1 0 401 0.002953438
1 0 2 0.000320013
. . . .
. . . .
. . . .
1 0 571 0.000242226
1 0 587 0.000241376
1 0 578 0.000241189

Printing data only for instances with a given set of column values

In this example, we use the TimeModule table. We first print out the data:

$ sqlite3 timeTracker.db 'SELECT * FROM TimeModule;'

Run SubRun Event Path ModuleLabel ModuleType Time
---------- ---------- ---------- ---------- ----------- ----------------------- ----------
1 0 1 p1 prod TestTimeTrackerProducer 2.1139e-05
1 0 1 p1 filt TestTimeTrackerFilter 6.402e-06
1 0 1 p1 TriggerResu TriggerResultInserter 0.00021647
1 0 1 end_path mod1 TestTimeTrackerAnalyzer 0.00014066
1 0 1 end_path mod2 TestTimeTrackerAnalyzer 1.952e-06
1 0 2 p1 prod TestTimeTrackerProducer 2.006e-06
1 0 2 p1 filt TestTimeTrackerFilter 1.604e-06
. . . . . . .
. . . . . . .
. . . . . . .
1 0 1000 end_path mod1 TestTimeTrackerAnalyzer 7.757e-06
1 0 1000 end_path mod2 TestTimeTrackerAnalyzer 1.371e-06

Here, the timing is printed out for the execution of each module event function. (The ModuleLabel value is truncated based on the preconfigured length of the column printout. This can be adjusted when examining the database file interactively.) It is often desirable to see the time for a given set of values corresponding to a set of columns. This can be done through the command:

$ sqlite3 timeTracker.db 'SELECT Run,Subrun,Event,Time FROM TimeModule WHERE Path="p1" AND ModuleLabel="filt" AND ModuleType="TestTimeTrackerFilter";'

Run SubRun Event Time
---------- ---------- ---------- ----------
1 0 1 6.402e-06
1 0 2 1.604e-06
1 0 3 1.58e-06
1 0 4 1.586e-06
1 0 5 1.604e-06
1 0 6 1.547e-06
. . . .
. . . .
. . . .
1 0 999 1.567e-06
1 0 1000 1.481e-06

The output can be ordered as well by appending the 'ORDER BY' clause to the end of the above statement.

Some SQLite statistics helpers

Find minimum/average/maximum value of column

$ sqlite3 timeTracker.db 'SELECT MIN(Time) FROM TimeEvent;' # outputs: 0.000241189
$ sqlite3 timeTracker.db 'SELECT AVG(Time) FROM TimeEvent;' # outputs: 0.000277560085
$ sqlite3 timeTracker.db 'SELECT MAX(Time) FROM TimeEvent;' # outputs: 0.004208872

MemoryTracker and TimeTracker queries

The MemoryTracker and TimeTracker services can be configured to output their information to an SQLite database. As of art 2.04.00, several queries are provided by art for users to obtain summary information from the tables.

In what follows below, the environment variable ART_DIR is set during the ups setup of art.

MemoryTracker queries

Reported values Command (with export SQL_QUERY_DIR=${ART_DIR}/tools/sqlite/memoryTracker)
Peak memory usage sqlite3 memTracker.db < ${SQL_QUERY_DIR}/peak-summary.sql
Events with largest RSS sqlite3 memTracker.db < ${SQL_QUERY_DIR}/max-rss-events.sql
Events with largest Vsize sqlite3 memTracker.db < ${SQL_QUERY_DIR}/max-vsize-events.sql
Events with largest change in RSS sqlite3 memTracker.db < ${SQL_QUERY_DIR}/max-delta-rss-events.sql
Events with largest change in Vsize sqlite3 memTracker.db < ${SQL_QUERY_DIR}/max-delta-vsize-events.sql
Modules with largest RSS sqlite3 memTracker.db < ${SQL_QUERY_DIR}/max-rss-modules.sql
Modules with largest Vsize sqlite3 memTracker.db < ${SQL_QUERY_DIR}/max-vsize-modules.sql
Modules with largest change in RSS sqlite3 memTracker.db < ${SQL_QUERY_DIR}/max-delta-rss-modules.sql
Modules with largest change in Vsize sqlite3 memTracker.db < ${SQL_QUERY_DIR}/max-delta-vsize-modules.sql

TimeTracker queries

Reported values Command (with export SQL_QUERY_DIR=${ART_DIR}/tools/sqlite/timeTracker)
General statistics for events sqlite3 timeTracker.db < ${SQL_QUERY_DIR}/event-summary.sql
General statistics for modules sqlite3 timeTracker.db < ${SQL_QUERY_DIR}/module-summary.sql
Events with the longest wall-clock time sqlite3 timeTracker.db < ${SQL_QUERY_DIR}/max-events.sql