Beginning Roadmap
NOTE: there is a much more recent organization of our goals & tasks developing here
Table of contents
  Beginning Roadmap
  Systematics Estimation
  Startup Work List
Systematics Estimation
We can prioritize work on systematics after we know the relative sizes of each effect.
Five-Parameter Fit Systematics
Precession-frequency systematic shifts in the 5-parameter T wiggle fit:
| Systematic | PPM Shift (Ratio) | PPM Shift (5-param) | Effect | DocDB |
|---|---|---|---|---|
| Pileup | | | Deform (suppress) exponential shape at early times | |
| Gain | | | Inflate/deflate asymmetry due to energy cut? | |
| CBO | | | Degrade sinusoidal fit (equally at early & late times) | |
| Lost Muons | | | Deform (suppress) exponential shape at late times? | |
Characterize the scale of each systematic effect by fitting Toy MC: introduce each effect into the sampling function, do a ratio fit, and quantify the ppm shift in the precession frequency. Check with N_data at various orders of magnitude (10^5 through 10^9). Check the shift in both the ratio formulation and the 5-parameter T wiggle fit.
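A minimal sketch of that procedure, assuming placeholder wiggle parameters and plain numpy/scipy (this is not the toymc or pyfitter code, and only the five-parameter fit is shown; the ratio fit would be checked the same way):

```python
import numpy as np
from scipy.optimize import curve_fit

# Placeholder truth parameters for the sketch (not blinded analysis values).
TAU = 64.4      # dilated muon lifetime [us]
ASYM = 0.37     # asymmetry
OMEGA = 1.439   # anomalous precession frequency [rad/us]
PHI = 2.0       # phase [rad]

def five_param_model(t, n0, tau, a, omega, phi):
    """Five-parameter T wiggle function."""
    return n0 * np.exp(-t / tau) * (1.0 + a * np.cos(omega * t + phi))

def sample_wiggle(n_events, rng):
    """Draw decay times from exp(-t/TAU) * (1 + ASYM cos(OMEGA t + PHI)) by
    sampling exponential times and thinning with the oscillatory factor.
    A systematic effect (pileup, gain, ...) would be injected here."""
    times = np.empty(0)
    while times.size < n_events:
        t = rng.exponential(TAU, size=2 * n_events)
        keep = rng.uniform(0.0, 1.0 + ASYM, size=t.size) < 1.0 + ASYM * np.cos(OMEGA * t + PHI)
        times = np.concatenate([times, t[keep]])
    return times[:n_events]

rng = np.random.default_rng(12345)
times = sample_wiggle(1_000_000, rng)

# Histogram and chi-squared fit (bin width is arbitrary for this sketch).
counts, edges = np.histogram(times, bins=3000, range=(0.0, 300.0))
centers = 0.5 * (edges[:-1] + edges[1:])
mask = counts > 0
popt, pcov = curve_fit(five_param_model, centers[mask], counts[mask],
                       p0=[counts.max(), TAU, ASYM, OMEGA, PHI],
                       sigma=np.sqrt(counts[mask]))

shift_ppm = (popt[3] - OMEGA) / OMEGA * 1e6
print(f"omega_a shift: {shift_ppm:+.2f} ppm")
```

Repeating this with and without the effect injected, at N_data from 10^5 through 10^9, fills in the ppm-shift columns of the table above.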
Startup Work List
| Core Deliverable | Accessories | Critical Checks | Notes, Hints, etc. | Persons working |
|---|---|---|---|---|
| First (Blinded) Fits | Naive fits to data | Residuals; goodness-of-fit | Results are here (with data and instructions to reproduce). No systematics are taken into account. At 54 million positrons, the ratio fit is good, but the five-parameter and exponential fits are not (as judged by chi-squared P values). | |
| Sync pkl file format between toymc and data | | toymc output can be processed by fitting scripts | | James |
| Pileup Check (ToyMC) | Build pileup into the Toy MC sampling of the wiggle distribution, and fill in the 'pileup' blank in the systematics table | Plot of chi^2 (or P value) vs. N_data for all three fit types; extrapolate to >1 billion clusters; vary the fit start time & verify that pileup affects early times more | Start with fiveparam_model in toymc/util.py; add an option to enforce a 'dead time' (see the dead-time sketch at the bottom of this page). NOTE: dead time must be enforced before dataset quartering | Manolis |
| Gain Check (ToyMC) | Build gain into the Toy MC sampling of the wiggle distribution, and fill in the 'gain' blank in the systematics table | Profile fits with an energy threshold (chi^2 or similar for all three fit types) using a couple of gain models | Will need the Toy MC to generate energies, and the fitter will need to handle energy bins / an energy threshold | |
| CBO Check (ToyMC) | Build CBO into the Toy MC wiggle distribution, and fill in the 'CBO' blank in the systematics table | Recover the CBO frequency from the Fourier transform of the fit residuals; check the size of the frequency shift (and chi^2) for each fit type and a few different dataset sizes; does CBO affect the 5-param fit more than the ratio fit? Does CBO affect the exponent at all? (Check chi^2 for fit degradation with stats.) | Start with fiveparam_model in toymc/util.py; add an optional CBO amplitude and frequency (see the CBO sketch at the bottom of this page); start with the assumed exponential decay for CBO decoherence (or pass a function like math.exp to the fiveparam_model call?) | |
| Energy in ToyMC | Sample (t, E) instead of just time; implement an energy threshold cut (see the (t, E) sampling sketch at the bottom of this page) | Fit scans with different values of E_threshold; set E_threshold = 0 to reproduce the original results? | The gain systematic study probably wants this. NOTE: E_threshold from the figure of merit NA^2 is derived assuming the 5-parameter T wiggle fit (see TDR section 3.5) | Manolis |
| Fitter Extensions | Add some features to the existing fitting code | P values from chi^2; residuals analysis, including a Fourier transform (see the diagnostics sketch at the bottom of this page); option for ROOT fitting? (currently using scipy.optimize.leastsq) | Existing code is in gm2ilratio/fitting/pyfitter. NOTE: some of the fitting code is duplicated in the Toy MC (including fitresult.py and util.py). We might need to merge some code in util.py and put it in some other directory accessible by both the fitting code and the Toy MC code. (While data fitting shouldn't affect Toy MC development, it should pick up improvements from it, like CBO terms in the wiggle function.) | |
| Scale data fits to >1 billion clusters | First pass: skim data files, write out clusters, and build histograms without exploding memory & time requirements; think carefully about what information to write with the clusters | First check: how does chi^2 (or the P value) degrade with N_data for all three fit types on real data? | The 60-hour dataset is ~65 runs, each with ~300 1.5 GB subrun files; a subrun has ~150 'events' (fills), each with a few thousand clusters | James |
| 60-hour start-time scans | Fit params (omega_a, phi, A) as a function of fit start time; tau and N0 as a function of start time | Start from 30 µs to 100 µs in 10 µs intervals? | Use data_fit_test.py; it currently overwrites its own fit t_min with the value found in the histogram files, so it needs to be changed | |
| Fit data per-calo | ... | | | |
| Fit data per-fill-type | ... | | | |
| Look for periods of unstable field index (CBO frequency) | ... | | | |
more later... 
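A few illustrative sketches for the Toy MC and fitter tasks above follow; all of them are placeholder code under stated assumptions, not the gm2ilratio or toymc implementations. First, dead-time enforcement for the pileup check might look like this (the apply_dead_time helper and the 5 ns window are made up); per the note in the table, it has to run on each fill's raw hit times before the dataset is quartered:

```python
import numpy as np

def apply_dead_time(hit_times, dead_time):
    """Drop hits arriving within `dead_time` of the last accepted hit in one fill.

    hit_times: hit times for a single fill (same units as dead_time).
    Returns the surviving hit times, sorted. A fuller pileup model would merge
    the lost hit's energy into the surviving pulse instead of discarding it.
    """
    kept = []
    last = -np.inf
    for t in np.sort(hit_times):
        if t - last >= dead_time:
            kept.append(t)
            last = t
    return np.asarray(kept)

# Example: a hypothetical 5 ns dead time on one simulated fill, before quartering.
rng = np.random.default_rng(1)
fill = rng.uniform(0.0, 700_000.0, size=3000)  # ns
surviving = apply_dead_time(fill, dead_time=5.0)
```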
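For the CBO check, one common parameterization multiplies the five-parameter wiggle by a decaying cosine envelope. The sketch below assumes that form (exponential decoherence) and a hypothetical wiggle_with_cbo name; the actual change would be an option on fiveparam_model in toymc/util.py:

```python
import numpy as np

def wiggle_with_cbo(t, n0, tau, a, omega_a, phi,
                    a_cbo=0.0, omega_cbo=0.0, phi_cbo=0.0, tau_cbo=np.inf):
    """Five-parameter wiggle multiplied by an (assumed) exponentially
    decohering CBO modulation; a_cbo = 0 recovers the plain wiggle."""
    wiggle = n0 * np.exp(-t / tau) * (1.0 + a * np.cos(omega_a * t + phi))
    cbo = 1.0 + a_cbo * np.exp(-t / tau_cbo) * np.cos(omega_cbo * t + phi_cbo)
    return wiggle * cbo
```

Generating with a nonzero a_cbo and fitting with the plain five-parameter model should then reproduce the residual peak at the CBO frequency mentioned in the critical checks.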
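For sampling (t, E) instead of just t, one possible shape of the interface is below. The flat energy spectrum and the linear A(E) are placeholders, not the physical distributions; the point is only that events are generated as (t, E) pairs and the threshold is applied afterwards, with E_threshold = 0 reproducing the time-only sample as the table suggests:

```python
import numpy as np

E_MAX = 3.1     # GeV, approximate positron energy endpoint
TAU = 64.4      # us
OMEGA = 1.439   # rad/us
PHI = 2.0

def asymmetry(E):
    """Placeholder energy-dependent asymmetry (NOT the physical A(E))."""
    return 0.4 * (2.0 * E / E_MAX - 1.0)

def sample_t_E(n_events, rng):
    """Draw (t, E) pairs: E flat in [0, E_MAX] (placeholder spectrum), and t
    from exp(-t/TAU) * (1 + A(E) cos(OMEGA t + PHI)) by thinning."""
    ts, es = [], []
    while len(ts) < n_events:
        E = rng.uniform(0.0, E_MAX, size=2 * n_events)
        t = rng.exponential(TAU, size=E.size)
        a = asymmetry(E)
        keep = rng.uniform(0.0, 1.0 + np.abs(a)) < 1.0 + a * np.cos(OMEGA * t + PHI)
        ts.extend(t[keep])
        es.extend(E[keep])
    return np.array(ts[:n_events]), np.array(es[:n_events])

# Threshold cut after generation; a gain model would rescale E before this cut.
t, E = sample_t_E(50_000, np.random.default_rng(7))
E_threshold = 1.7  # GeV, placeholder
t_above = t[E >= E_threshold]
```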
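Finally, for the fitter extensions, P values from chi^2 and a Fourier transform of the fit residuals are cheap additions on top of whatever the existing pyfitter already returns; a sketch, assuming the binned counts, the fitted model values at the bin centers, and the number of free parameters are in hand (the fit_diagnostics name is made up):

```python
import numpy as np
from scipy import stats

def fit_diagnostics(counts, model, n_params, bin_width):
    """Chi^2, P value, and the power spectrum of the normalized fit residuals.

    counts:    observed bin contents
    model:     fitted model evaluated at the bin centers
    n_params:  number of free fit parameters
    bin_width: histogram bin width (sets the units of the frequency axis)
    """
    sigma = np.sqrt(np.clip(counts, 1.0, None))   # crude Poisson errors; empty bins get sigma = 1
    residuals = (counts - model) / sigma
    chi2 = float(np.sum(residuals[counts > 0] ** 2))
    ndf = int(np.count_nonzero(counts > 0)) - n_params
    p_value = stats.chi2.sf(chi2, ndf)            # survival function: P(chi^2 >= observed)

    # A leftover CBO (or other unmodeled term) shows up as a peak in this spectrum.
    power = np.abs(np.fft.rfft(residuals)) ** 2
    freqs = np.fft.rfftfreq(residuals.size, d=bin_width)
    return chi2, ndf, p_value, freqs, power
```

The same residual spectrum is what the CBO critical check above would use to recover the CBO frequency.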