Short user guide to integration tests in SBND¶
- Table of contents
- Short user guide to integration tests in SBND
- What are integration tests
- What an integration test does
- Introduction to test running
- Running the integration test with the Continuous Integration system
- The reference result files
- Available tests
- Investigating test failures
- Further resources
What are integration tests¶
SBND has two levels of tests:
- unit tests are small tests targeting a single feature or module; you run them with mrb test -j16 or equivalent [1]
- integration tests exercise a complete chain of processing
Before pushing code that has any remote chance of changing existing results, you should run both. This guide is about the latter.
Every time code is pushed to a develop branch, tests are also triggered automatically.
What an integration test does¶
An integration test does what its configuration in source:test/ci/ci_tests.cfg asks it to do.
Each test may:
- run a LArSoft job
- compare the size of the output (in particular, the length of the data product collections) with a reference result
- compare selected histograms from the output with reference results
If any of the steps fails, the test fails and the failure is reported. For failures in running the job, the solution is for the author to just fix the code. For failures in the comparison with the reference results, the mismatch must be studied to determine whether the change is acceptable or not. In the former case the reference results need to be updated, in the latter again the code must be fixed.
Introduction to test running¶
For this introduction we run the tests in the local area (as in the section "Testing of the code in the local working area" below).
test_runner will execute the requested integration tests. Since it relies on the settings from the current UPS environment, no particular setup is needed except for the UPS product containing the script. You do need a certificate proxy, though, since the input is read from dCache. So:
    setup lar_ci
    test_runner develop_test_sbndcode
will run the tests designed to be run during development.
The output of a test in detail¶
The information in this section depends to some extent on both the version of the tests that SBND provides and the version of lar_ci. The following information was compiled using the latest v06_68_00 and the current
We use the detector propagation test as an example.
The list of available tests is:
    $ test_runner -l
    The current parallel limit is: 5
    suite quick_test_sbndcode: (5 tests)
        ci_anatree_regression_quick_test_sbndcode
        ci_detsim_regression_quick_test_sbndcode
        ci_g4_regression_quick_test_sbndcode
        ci_gen_regression_quick_test_sbndcode
        ci_reco_basic_regression_quick_test_sbndcode
    suite seq_test_sbndcode: (5 tests)
        ci_anatree_regression_seq_test_sbndcode
        ci_detsim_regression_seq_test_sbndcode
        ci_g4_regression_seq_test_sbndcode
        ci_gen_regression_seq_test_sbndcode
        ci_reco_basic_regression_seq_test_sbndcode
    suite develop_test_sbndcode: (5 tests)
        ci_anatree_regression_quick_test_sbndcode
        ci_detsim_regression_quick_test_sbndcode
        ci_g4_regression_quick_test_sbndcode
        ci_gen_regression_quick_test_sbndcode
        ci_reco_basic_regression_quick_test_sbndcode
    Tests in no suites:
We have 10 tests, and three suites that may share them. We can ask test_runner to run a single test, a whole suite, or several of the above.
The detector propagation test is ci_g4_regression_quick_test_sbndcode:
    $ setup lar_ci
    $ mkdir -p run_ci
    $ cd run_ci
    $ test_runner --statistics ci_g4_regression_quick_test_sbndcode
    The current parallel limit is: 5
    Test Suite: ci_g4_regression_quick_test_sbndcode
    Expanded: ci_g4_regression_quick_test_sbndcode
    Statistic: ci_g4_regression_quick_test_sbndcode exitcode 5120
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_user_cpu 21.190000
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_scaled_user_cpu 118.647048
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_system_cpu 5.180000
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_scaled_system_cpu 29.003856
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_elapsed 213.580000
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_%cpu 12.000000
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_avgtext 0.000000
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_avgdata 0.000000
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_maxrss 221176.000000
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_inputs 3982312.000000
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_outputs 2368.000000
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_major_faults 17999.000000
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_minor_faults 495733.000000
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_swaps 0.000000
    Statistic: ci_g4_regression_quick_test_sbndcode valerrs 0
    Statistic: ci_g4_regression_quick_test_sbndcode success False
    Fcopy_out_results check point None false None None False
    0 tests passed (0%), 1 tests failed, 0 tests with warnings, 0 tests skipped, out of 1
    Not updating any reference files
(copy_out_results check point None false None None False is debug output that will be removed soon, see issue #19068).
It shows a job failure, and errors.log explains (in its own way) that there is an authentication error... I forgot to get a certificate proxy! After getting one, I reran the test and got:
    $ test_runner --statistics ci_g4_regression_quick_test_sbndcode
    The current parallel limit is: 5
    Test Suite: ci_g4_regression_quick_test_sbndcode
    Expanded: ci_g4_regression_quick_test_sbndcode
    Statistic: ci_g4_regression_quick_test_sbndcode exitcode 0
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_user_cpu 19.650000
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_scaled_user_cpu 110.024280
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_system_cpu 2.750000
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_scaled_system_cpu 15.397800
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_elapsed 25.810000
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_%cpu 86.000000
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_avgtext 0.000000
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_avgdata 0.000000
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_maxrss 285528.000000
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_inputs 1708616.000000
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_outputs 4800.000000
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_major_faults 7626.000000
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_minor_faults 267986.000000
    Statistic: ci_g4_regression_quick_test_sbndcode rusage_swaps 0.000000
    Statistic: ci_g4_regression_quick_test_sbndcode valerrs 2
    Statistic: ci_g4_regression_quick_test_sbndcode success True
    .copy_out_results check point None false None None False
    1 tests passed (100%), 0 tests failed, 0 tests with warnings, 0 tests skipped, out of 1
    Not updating any reference files
This is a winner. A local directory ci_g4_regression_quick_test_sbndcode was created, with:
||would contain rendering of comparison plots; empty because this test does not extract plots|
|| art output (from
|| art output (from
|| art output (from
||main output of LArSoft job on screen (art output from message facility service, as configured in SBND)|
||error output stream of LArSoft job (art output from message facility service, as configured in SBND)|
||output to screen of (most of) the test_runner script, including also "event dumps" for both new and reference results, and some error messages from LArSoft job|
||encoded error information (only on error)|
||error output stream of (most of) the test_runner script|
||test validation output|
|| average CPU times (broken when using
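The Statistic: lines printed by test_runner --statistics follow a regular "Statistic: <test name> <key> <value>" pattern, so a single value can be pulled out of a saved copy of the output. The following is only a sketch: it assumes the output was redirected to a file with one statistic per line, and get_statistic and the temporary log file are hypothetical names that are not part of lar_ci.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of lar_ci): print the value of one statistic.
# Usage: get_statistic <log file> <test name> <statistic key>
get_statistic() {
  awk -v name="$2" -v key="$3" \
    '$1 == "Statistic:" && $2 == name && $3 == key { print $4 }' "$1"
}

# Sample lines copied from the successful run shown in this guide.
log="$(mktemp)"
cat > "$log" <<'EOF'
Statistic: ci_g4_regression_quick_test_sbndcode exitcode 0
Statistic: ci_g4_regression_quick_test_sbndcode rusage_elapsed 25.810000
Statistic: ci_g4_regression_quick_test_sbndcode rusage_maxrss 285528.000000
Statistic: ci_g4_regression_quick_test_sbndcode success True
EOF

get_statistic "$log" ci_g4_regression_quick_test_sbndcode exitcode
get_statistic "$log" ci_g4_regression_quick_test_sbndcode rusage_maxrss
```

Comparing exitcode and rusage_maxrss between two runs this way is often quicker than reading the full listing.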
Example of failure from resource usage¶
This is the output of test_runner after a successful LArSoft job has used more resources (or less!) than expected:
    The current parallel limit is: 5
    Test Suite: ci_g4_regression_quick_test_sbndcode
    Expanded: ci_g4_regression_quick_test_sbndcode
    Memory usage 285648.000000 not in 200000.000000:210000.000000 range
    Statistic: ci_g4_regression_quick_test_sbndcode exitcode 102
    [...]
To obtain it, I artificially decreased the limit of (resident size) memory for the test to about 210 MB.
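The "not in low:high range" message means the test fails both when the job uses too much memory and when it uses too little. The check can be pictured with the sketch below. This is not the lar_ci implementation: in_range is a hypothetical helper, and treating both bounds as inclusive is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical range check (not the lar_ci code): succeed if value lies in "low:high".
# Both bounds are treated as inclusive, which is an assumption.
in_range() {
  local value="$1" range="$2"
  awk -v v="$value" -v lo="${range%%:*}" -v hi="${range##*:}" \
    'BEGIN { exit !(v + 0 >= lo + 0 && v + 0 <= hi + 0) }'
}

# Figures taken from the failure message shown above.
if in_range 285648.000000 200000.000000:210000.000000; then
  echo "memory usage within range"
else
  echo "memory usage out of range"
fi
```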
Running the integration test with the Continuous Integration system¶
There are mainly three ways to run an integration test:
- automatically, by pushing into a develop branch
- locally, testing the code in the local development area
- remotely, testing the code in the remote Git repositories
Currently tests are run on SLF6 and OSX 10.12 ("Sierra").
Automatic testing¶
Whenever a commit is pushed into a develop branch, a test is automatically triggered. The test starts 15 minutes after the first push to develop, to give the user time to push into different repositories as needed.
- if the push is in sbndcode, sbndcode is built and tested
- if the push is (also) in the develop branch of a LArSoft repository, the C.I. system will check out and build all of LArSoft, sbndcode and also the code from the other LArSoft-based experiments; this is to ensure that one experiment's change does not disrupt the others
The result of the test can be checked in the C.I. dashboard (on the top row, you can select LArSoft or SBND to monitor the respective tests).
In this case, the quick test suite is executed (quick_test_sbndcode, chosen by the SBND lar_ci workflow configuration).
Testing of the code in the local working area¶
Before pushing the code anywhere, integration tests may be executed locally from the MRB area [1] where the code has just been compiled (see also the "Introduction to test running" section above). The commands to do so are:
    setup lar_ci
    test_runner --verbose develop_test_sbndcode
Among the useful options:
- --parallel-limit=12 will run at most 12 tests in parallel, instead of one after the other.
[1] In fact, they can be run even without an MRB area, in an environment where sbndcode is already set up.
Remote testing of published code¶
The Continuous Integration system can build and test any publicly available branch. To ask for the SBND integration tests, use the trigger command:
    setup lar_ci
    trigger --build-delay 0 --workflow sbndcode_wf
This will run the "quick" test, just as if it had been triggered automatically. The --build-delay 0 option tells the system to start as soon as possible (instead of waiting for 15 minutes, which would be pointless since we are not pushing anything further).
In general, the supported workflows are listed in the list of supported SBND workflows.
TODO: document how to run on branches
TODO: document how to run other tests
Generating the reference files¶
After having concluded there is the need to update reference files, a single command will do the trick:
    trigger --build-delay 0 --workflow Update_ref_files_SBNDCODE_wf --force-platform slf6
(there is the usual requirement of having a grid proxy and
Update_ref_files_SBNDCODE_wf is a special workflow used for this purpose only, and we use only one reference platform, assuming (wrongly) that all platforms will give the same results.
The reference result files¶
The reference result files are currently stored in dCache (see the XROOTD_REFERENCEFILEDIR_SBNDCODE key in the DEFAULT section of source:test/ci/ci_tests.cfg and the related keys).
Available tests¶
The available tests can be printed with test_runner --list-tests (-l for short).
Test suites should be documented at the beginning of the source:test/ci/ci_tests.cfg configuration file.
A summary of the test suites (which will fall sadly out of date with time):
|test suite name||description||run time|
||tests intended to be run during code development||4300/1200 kVs|
|| tests intended to be run before final push (take longer than
||includes both single particle and data-like event quick tests||4300/1200 kVs|
||includes both single particle and data-like event sequences tests||7600/6200 kVs|
|| single particle (
||data-like neutrino and background (GENIE+CORSIKA) 5-stage chain, each step from reference file||3500/1200 kVs|
|| data-like single particle (
||neutrino and background (GENIE+CORSIKA) 5-stage chain, in sequence||6200/6200 kVs|
|| runs tests related to the gallery examples in
||reruns all the jobs generating output files that can be used as reference1||5800/5800 kVs|
||reruns all tests (used for maintenance only)||12000/6200 kVs|
[1] The test suite generate_reference_sbndcode is used by default by the workflow to update reference files (Update_ref_files_SBNDCODE_wf). To use a different one, trigger should be explicitly provided with a
Most quick tests typically run just one or two events.
The run time in the table is the normalised one as reported by the C.I. scripts, and it is very approximate. The first figure is the integrated time, while the second is the ideal run time when all tests are run in parallel (sequential tests, for example, can't be parallelised). The figures were obtained from sbndbuild01.fnal.gov; for reference, 1000 kVs on that machine take about 3 CPU minutes.
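To get a feeling for the figures in the table, the kVs values can be converted to approximate CPU minutes using the reference given above (1000 kVs is about 3 CPU minutes on sbndbuild01.fnal.gov). The helper name kvs_to_minutes is hypothetical, and the result only holds for a machine of comparable speed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert normalised kVs to approximate CPU minutes,
# using the reference from this guide (1000 kVs ~ 3 CPU minutes on sbndbuild01.fnal.gov).
kvs_to_minutes() {
  awk -v kvs="$1" 'BEGIN { printf "%.1f\n", kvs * 3 / 1000 }'
}

kvs_to_minutes 4300   # integrated time of a "4300/1200 kVs" suite
kvs_to_minutes 1200   # ideal run time of the same suite, fully parallel
```

For a suite listed as 4300/1200 kVs this gives roughly 13 CPU minutes in total, or under 4 minutes if all tests run in parallel.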
Investigating test failures¶
Results are different from the reference¶
The reference files are normally generated with a special trigger, as described above. These special jobs are shown in the dashboard as sbnd_ci builds with workflow Update_ref_files_SBNDCODE, and can be recognised because they have a table header with more columns, including a gen_ref_files column. A completed reference file job replaces the previous reference files, so the current files come from the most recent reference file job.
When investigating a failed test, it may be useful to know precisely how the source code differs. The exact commit of the code used to generate the reference result can be seen by hovering the mouse over the checkout column of the most recent reference file job. For example, sbnd_ci/32, generating reference files on 2018-03-09 09:02, had:
    larana             LARSOFT_SUITE_v06_70_01
    larcore            LARSOFT_SUITE_v06_70_01
    larcorealg         LARSOFT_SUITE_v06_70_01
    larcoreobj         LARSOFT_SUITE_v06_70_01
    lardata            LARSOFT_SUITE_v06_70_01-6-g00d39317
    lardataobj         LARSOFT_SUITE_v06_70_01
    lareventdisplay    LARSOFT_SUITE_v06_70_01
    larevt             LARSOFT_SUITE_v06_70_01
    larexamples        LARSOFT_SUITE_v06_70_01
    larpandora         LARSOFT_SUITE_v06_70_01
    larpandoracontent  LARSOFT_SUITE_v06_70_01
    larreco            LARSOFT_SUITE_v06_70_01-2-g3db95cef
    larsim             LARSOFT_SUITE_v06_70_01
    larsoft            LARSOFT_SUITE_v06_70_01
    larsoftobj         LARSOFT_SUITE_v06_70_01
    larwirecell        LARSOFT_SUITE_v06_70_01
    sbndcode           v06_70_01-3-g28fc0d2
The codes shown there are from git describe, and git usually accepts them where a commit hash or a tag would be required. If the failure was from a remote job, you will get an e-mail with a history of recent commits (for sbndcode only), and by comparing it to the reference tag for sbndcode you can see which additional commits were used in your test. In the example above, v06_70_01-3-g28fc0d2 is the reference tag, which is formed by an actual base tag (v06_70_01), the number of commits beyond it (3) and the hash of the head commit (28fc0d2, introduced by the letter g). In this case, the correct commit is 28fc0d2.
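The decomposition of a git describe string can be sketched with plain shell parameter expansion. The function name parse_describe is hypothetical, and the sketch assumes the base tag itself does not contain a "-g" sequence. A string without the "-N-g<hash>" suffix is taken to be an exact tag.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split a `git describe` string into
# "<base tag> <commits beyond the tag> <abbreviated head commit hash>".
parse_describe() {
  local describe="$1"
  case "$describe" in
    *-[0-9]*-g[0-9a-f]*)
      local hash="${describe##*-g}"   # text after the last "-g"
      local rest="${describe%-g*}"    # "<base tag>-<count>"
      echo "${rest%-*} ${rest##*-} $hash"
      ;;
    *)
      # Exact tag: no commits beyond it, no hash in the string.
      echo "$describe 0 -"
      ;;
  esac
}

parse_describe v06_70_01-3-g28fc0d2
parse_describe LARSOFT_SUITE_v06_70_01-6-g00d39317
parse_describe LARSOFT_SUITE_v06_70_01
```

The hash printed in the third field can then be passed to commands such as git log or git diff to inspect what changed since the reference.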
[1] If the working area is already built, you can run all the tests with:
    cd "$MRB_BUILDDIR"
    ctest -j16
You can also go to the subdirectory of $MRB_BUILDDIR that contains the tests you care about instead, and you will end up running only the tests under that directory.
For questions, ask Gianluca Petrillo.