An Introduction to the lar_ci test_runner

The test_runner is part of the lar_ci package, and its goals are to make:
  • running tests
  • adding new tests
  • creating suites of tests

all easy. The test runner is driven by config files, which can live in any number of software packages. In the initial release, any UPS product you have set up (i.e. any product with a $WHATEVER_DIR variable set in the environment) can contain a test/ci/ci_tests.cfg file, which the test_runner reads to find tests and test suites. As of cetbuildtools v4_03, the line install_scripts(AS_TEST) in test/CMakeLists.txt installs the unit tests, ci_tests.cfg, and all binaries under test/ that run those tests and suites into localProductsXYZ/product/<version>/test/, where the CI system will find them.
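
As a rough sketch (the product name here is hypothetical; only the test/ci/ci_tests.cfg path and the install_scripts(AS_TEST) line come from the description above), a product set up for CI might look like:

myproduct/
  test/
    CMakeLists.txt    (contains install_scripts(AS_TEST); needs cetbuildtools v4_03 or later)
    ci/
      ci_tests.cfg    (test and suite definitions read by the test_runner)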

Running Tests and Test Suites

To run a given test or test suite, simply:

cd /some/working/directory
setup software_version_to_test ...
setup lar_ci
test_runner test1 test2 ...

The test runner will run the named tests or test suites and give you summary output.
Each test (and subtest) runs in a subdirectory of the working directory you were in when you started the test_runner;
its standard output and standard error, along with any other input and output files, will be in that subdirectory if you need to
look at them.

Adding a Test

To add a test, you need to either edit an existing ci_tests.cfg file or add a new one to some UPS product.
You could have a UPS product strictly to hold tests, or include them in an existing package.
Adding a test is then a matter of adding a stanza to that config file, something like:

[test my_test_echo]
script=/bin/echo
args=this is a test

which would say to run the command /bin/echo this is a test. So if we've added this test to the config file,
we can then run:
$ test_runner my_test_echo
      Start  1: my_test_echo
 1/ 1 Test  #1: my_test_echo .....................   Passed    0.00 sec   0.00 kVs

100% tests passed, 0 tests failed out of 1

Similarly, we could add a test that we know will fail:

[test my_test_false]
script=/bin/false
args=this is a test

and run them both:

$ test_runner my_test_echo my_test_false
      Start  1: my_test_echo
 1/ 2 Test  #1: my_test_echo .....................   Passed    0.00 sec   0.00 kVs
      Start  2: my_test_false 
 2/ 2 Test  #2: my_test_false ....................   Failed    0.00 sec   0.00 kVs

50% tests passed, 1 tests failed out of 2

Adding a Test Suite

Running individual tests as described in the previous section can lead to long command lines, as above,
so we recommend instead creating a test suite. This also involves editing the config file, and adding
a section like:

[suite my_suite]
testlist=my_test_echo my_test_false

Now we can tell the test runner to run my_suite, and it will run the tests it lists:

$ test_runner my_suite
      Start  1: my_test_echo
 1/ 2 Test  #1: my_test_echo .....................   Passed    0.00 sec   0.00 kVs
      Start  2: my_test_false
 2/ 2 Test  #2: my_test_false ....................   Failed    0.00 sec   0.00 kVs

50% tests passed, 1 tests failed out of 2

Bells and Whistles

We've added features to the test runner to let you easily make tests more stringent.
In general, you turn these features on for your tests by adding config file options
to your test definition. We'll talk about several of these below.

Checking CPU Usage

You can specify a (scaled) CPU usage range for your test, by adding a cpu_usage_range line to your
test config entry, something like:

[test my_test_echo]
script=/bin/echo
args=this is a test
cpu_usage_range= 5:20 

This says the program should take at least 5 and at most 20 scaled CPU seconds (scaled to a 1 kilo-VAX-MIPS CPU; that is the
kVs number the test runner reports). Since /bin/echo uses nowhere near that much CPU, this example test will now fail.

Checking Output file sanity

If your test generates an output file, you can add an outputN = filename line (for N in 1..9), and test_runner
will perform some simple sanity checks on that file (see the example after this list):
  • It will check that the file exists
  • It will check that it is more than 5 bytes long
  • If it is a .root file, it will verify that it starts with 'root'
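
For example, a test with an output check might look like this (the script and file names are purely illustrative):

[test my_copy_test]
script=/bin/cp
args=input.root output.root
output1=output.root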

Checking memory usage

You can specify a memory usage range for your test, by adding a mem_usage_range line to your
test config entry, something like:

[test my_test_echo]
script=/bin/echo
args=this is a test
mem_usage_range= 5:20 

This says the program should use at least 5 kilobytes of maximum resident set size, and at most 20.

[This is currently buggy: we can only tell the maximum memory usage of any test we have run so far, not necessarily
the latest one.]

Histogram comparisons

There is a check_histograms binary which takes three arguments: two ROOT file names and a minimum threshold for test passing. The two files are searched for common histogram names, and the matching histograms are compared in a Kolmogorov-Smirnov sense. If all comparisons give a statistic higher than the threshold, the test passes. One would use this with a ROOT file that emerges from the new Jenkins build and one from an old, canonical build, and look for changes.
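
A typical invocation might look like the following (the file names and threshold value are illustrative; the argument order follows the description above):

check_histograms new_build_hists.root canonical_hists.root 0.05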

Checking Memory Leaks

[not yet implemented] This might re-run the test under valgrind(?) or just parse the output

Reading Art Suite output

If you add a "parse_art_output = True" line to your config, test_runner will read the output and look
for various usage and statistics messages and collect data from it.
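
For example (the script and FHiCL file names here are placeholders; parse_art_output is the option described above):

[test my_art_test]
script=lar
args=-c my_job.fcl
parse_art_output=True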

Reporting results

You can [in the future] report your test run and results to the LArSoft test database, with a [yet to be created] command line option.

Some real life examples of lar_ci test suites

Some sanity checks

We may wish to know if at runtime we'll discover undefined symbols. We may want to simply run a job, generating a few single events in the Monte Carlo. We may wish to do something very useful: determine that the latest build still allows for processing files created in some canonical build. Namely, we want to know if the latest code is backwards compatible. Finally, we might want to ask if the latest build produces results that are compatible with the canonical build's results, in a real statistical sense. We will briefly discuss these four tests here (as applied to MicroBooNE or LBNE, to be concrete). All tests are run as above, by saying test_runner default_uboonecode or test_runner default_lbnecode.

All CI tests must pass for the CI test box in the grid to be green. A failure of any test gives a red box.

check_library

The section of ci_tests.cfg called check_library, in the test/ci directory of the uboonecode repository, runs ldd on each library in a list of libraries provided by the user, and all Unassigned modules are pursued down through all possible libraries. The test is successful if no Unassigned modules remain.

prodsingle

The section of the file ci_tests.cfg called prodsingle in the test/ci directory of either the lbnecode or uboonecode repository just runs a few single muons from scratch. The test is successful if it runs all muons through all modules successfully within some specified runtime and under a designated memory upper limit.

detsim+reco2D+reco3D

The section of the file ci_tests.cfg called ubooneold_detsim in the tests directory of the uboonecode repository picks up an old file, detsims it, and reconstructs it. Similarly, this suite is successful if all events go through all modules successfully in each stage, within some specified runtime and under a designated memory upper limit.

check_histograms

The section of the file ci_tests.cfg called check_histograms in the tests directory of the uboonecode repository runs Kolmogorov-Smirnov tests against pairs of histograms: in this case hit distributions on the three individual wire planes of MicroBooNE for an MCC5 muon file versus those generated/simulated/reconstructed in this test on the fly. This suite actually depends upon a couple other suites which themselves prepare the three histograms from the two samples. Final success depends on all three KS tests surpassing some passed-in minimum tolerance. For now this tolerance is set low to ensure passage, as poor statistics will frequently otherwise fail these tests. This CI test shows the resulting histograms.

Under the hood: django

This is a high-level description of the code in the lar_ci_db repository.

The reporting works like this: the test_runner script runs the CI tests, greps the CI log files for various statistics and names, and passes them as arguments in calls to URIs, via curl. Those URIs are in fact python functions on the fermicloud server, running django. These server-side python functions pass some or all of the received arguments (statistics and variable names) on to other python functions, which are wrappers for SQL INSERTs. All the python wrappers, SQL functions, and the database schema are pre-fabricated once, when django is informed of the data "model."
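
Schematically, a report call from the test_runner side might look like this (the endpoint path and field names are purely hypothetical; the real URIs and parameters live in the lar_ci_db code):

curl --data "test=my_test_echo" --data "status=Passed" --data "cpu=0.00" http://fermicloud346.fnal.gov/dj/some_report_endpoint/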

When you click around on the colored grid of platform vs stage at fermicloud346.fnal.gov/dj/new_builds, you are making queries into that database; the output resulting from that SELECT call is dumped into JSON. A javascript routine then uses that file to produce a Google strip chart and shows it, along with some text, in one of a handful of html layouts.