CAFAna systematics


An analysis is based around the nominal Monte Carlo, that is, our best approximation as to how the world really is. If the MC really is correct, then the analysis should extract the true parameters as its best fit point (given sufficient stats). This can (and should) be checked with various closure tests. If it isn't the case, then there's a problem with your analysis procedures. This is not a systematic, it's something you ought to figure out and fix.

We're concerned with the case where reality isn't exactly like our simulation. A systematic shift/parameter (or loosely "systematic error") is some knob we can turn that modifies the details of the simulation. We set the magnitude of the adjustments allowed so that we believe the true state of the world is "covered" by some value of the parameter.

When making the final fit to the data, both the Near and Far Detector data are held fixed, but we vary the MC and repeat the analysis with differing systematic parameters. The additional uncertainty introduced into the best fit, on top of the statistical uncertainty, is the systematic uncertainty.

In some scenarios, such as evaluating the impact of a single uncertainty, one may want to use MC-derived fake data in the near and far detectors, generated with systematic shifts applied, and processes it through an analysis chain using nominal MC, to see the impact on the best fit point (for example for a "star plot"). This, and the procedure above, are two sides of the same coin, they give similar results and CAFAna supports both.

An analysis should be designed so as to minimize the impact of systematic shifts in these two cases. For example, this is the purpose of the extrapolation step (and having a near detector at all).

CAFAna approach

Systematics in CAFAna are based on altering the details of StandardRecord objects as they are loaded in. These shifted events are then passed on to the rest of the framework, most of which remains unaware it is dealing with modified Monte Carlo. Fits involving systematic shifts are accomplished by interpolating between different systematically-altered versions of the analysis.


This base class is a little like Var or FitVar. Its task is to alter StandardRecord objects passed to it, or to return an altered weight, for reweighting-based systematics. A collection of ISyst-derived objects, plus the number of sigma by which to shift each parameter, can be bundled together into a SystShifts. An optional argument to the Spectrum, Prediction, etc constructors allows the user to use shifted Monte Carlo.

When implementing a new ISyst-based systematic, simply rewrite the fields of the StandardRecord within the Shift() function. If you need to alter the weight applied to the event multiply the weight field (don't just overwrite it).

A large number of systematics are implemented inheriting from ISyst. Two of the most significant are XSecSyst and BeamSyst, which allow for reweighting for cross-section uncertainties and beam flux uncertainties respectively.

You should definitely make shifted versions of all your spectra and look at them to make sure you understand the features before plowing on further.

ISyst on doxygen
SystShifts on doxygen
SystRegistry on doxygen


If you want to fit taking systematics into account, either to find the best-fit value of a systematic parameter, or to include their influence in contours by marginalizing over them, you need a Prediction object that implements PredictSyst(). If you're doing something simple (eg a background scale) it might be worthwhile writing your own Prediction to achieve this. But usually PredictionInterp is a good way to go.

PredictionInterp acts like other Predictions, but you must provide it with a list of systematics it should be ready to interpolate over. Internally, PredictionInterp will evaluate all of its predictions at +/-1,2,3 sigma for each systematic and interpolate the behaviour of each bin using cubic interpolation. These curves are used to reweight the nominal prediction when a shifted prediction is requested.

This procedure means that the decomposition, extrapolation, and regular prediction code doesn't need to know anything about systematics. They do their own thing in their systematically-shifted universe. The use of interpolation means that a finite number of universes is needed, and a lot of work can be done up-front, meaning that the actual fit is fast.

These evaluations have to be done at some particular oscillation parameters, so pick ones near your expected best-fit point and pass them in. For full paranoia, you may want to follow an iterative procedure where the best-fit is passed in as the new expansion point to ensure there is no significant change.

docdb 11887 is a little out of date, but gives a nice overview.

Constructing a PredictionInterp is a little tricky. It needs to be able to create many child shifted Prediction objects, but a Prediction can have an associated extrapolation and decomposition etc. What you must do is provide PredictionInterp with an IPredictionGenerator object which takes a Loader and SystShifts and returns a new Prediction created using them. There are PredictionGenerators for various standard analyses that you can use, or look at to create your own.

PredictionInterp on doxygen
IPredicitionGenerator on doxygen

Fitter and Surface

As briefly mentioned in their sections in the CAFAna overview, Fitter and Surface can each take a list of systematic parameters that they should profile over. The Prediction in the Experiment they're using essentially must be able to make predictions for various combinations of systematic parameters (ie probably PredictionInterp), and it must have been constructed to be aware of the necessary systematics. Otherwise the systematics will have no effect.

A penalty term is automatically applied, sum s_i^2 where s_i is the shift of the i'th systematic in units of sigma.

When drawing a contour from the Surface class an "fc" histogram is required. This is the critical value surface discussed in the FC section below, but for simple cases one should use the Gaussian90Percent2D() etc functions from Surface.h to provide an appropriate level. For nue plots including deltaCP a 1D critical value is often more appropriate, which is really a sign of large non-gaussian corrections, and a strong hint to follow some FC procedure.


The original Feldman-Cousins paper It mostly focuses on some technical properties and shows that delta chisq fulfils them, but to me at least the more interesting aspect was the idea that one can use mock experiments to ensure proper coverage.

The critical values ("up values"), 2.30, 2.71 etc listed in the PDG are assuming various properties that are usually true in the high-stats limit, far from physical boundaries (Wilks' theorem), but are often substantially violated in practical situations. Usually we refer to "non-gaussianicity".

The key to the FC procedure is to empirically determine the correct up value to use at each point in parameter space, forming a critical value surface. At each point in parameter space one generates many mock experiments (spectra based on the expectation at these parameters, with poisson statistical fluctuations) and fits each one with the same procedure that will be used in the final fit to the data. In each of these experiments we see where the best fit would have been put, and we need 90% of these experiments to have included the true point within their contour (if we're going for 90% coverage), so we pick the critical value to be the chisq difference between the best fit point and true point that would have covered 90% of the mock experiments.

It is the responsibility of each individual analysis to put together the scripting to throw and fit the various experiments on a batch system, but CAFAna does provide some tools to help. The FC/ sub-directory contains and FCPoint object to store the results of a single mock experiment, an FCBin object to store collections of these and determine critical values, and an FCSurface object to store a collection of bins and compute the critical value surface.

The FC procedure is able to guarantee exact coverage throughout a 2-dimensional space (eg dmsq vs sinsq or sinsq vs delta) because each point in that space is directly displayed in the results plot and so can be treated appropriately. In higher-dimensional cases (with more oscillation parameters, or with systematic nuisance parameters) each point on the page stands for many different parameter sets, and while the FC procedure can generate a correctly covering volume in the higher-dimensional space, there's no perfect way to project it down. In the First Analysis, when dealing with systematic errors, the mock experiments at each point were thrown with the systematic parameters also picked from a gaussian. This guarantees correct coverage when averaging over our prior expectation of the systematics' values. Hints of Bayesianism here, and it's not so clear how this is supposed to work with oscillation parameters where we don't want to assume a prior. A new procedure, under consideration for the Second Analysis, would keep the systematic parameters for the mock experiments at each point fixed to the best-fit systematics to the data, given these oscillation parameters. This guarantees coverage for the specific systematic pulls the data indicates are most probable. The coverage over a range of true values of these parameters is claimed to have good properties (because the best-fit systematic is most likely to be close to the true one) but this has yet to be demonstrated in a realistic example.