Running the pipeline¶
Quick Start¶
cd your-trunk-directory
cd Wrapper
python run_pipeline.py
The Glue¶
Commandline Options¶
Run control¶
--list
Lists the Modules and exits. Example output:
[0]: Target_Selection
[1]: Survey_Strategy
[2]: Fiber_Allocation
[3]: Throughput
[4]: Sim_Gal_Spec_Clean
[5]: Sim_Gal_Spec_Observed
[6]: Measure_Redshift
[7]: Bin_Redshift
[8]: Estimate_Cosmo_Params
--runUntil Module-Number
Runs the pipeline from module 0 through module Module-Number (e.g. 2 for Fiber_Allocation). After the last module has finished, it takes a snapshot of the data bank. A later run can then resume from this snapshot with --runFrom or --runModule.
--runUntil can be combined with --runFrom.
--runFrom Module-Number
Starts the pipeline run at Module-Number. It requires a snapshot of the data bank at the stage of the previous module to be present (created with either --runUntil or --runModule).
--runFrom can be combined with --runUntil
--runModule Module-Number
Runs only the Module Module-Number. It requires a snapshot of the previous module and saves a snapshot after the module has finished.
--noCopy
Prevents the creation of a snapshot after the run. Only useful in combination with --runUntil or --runModule.
--noConvert
Do not run convert before the pipeline runs.
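The interplay of the run-control options above can be summarized in a short sketch. This is not code from run_pipeline.py: `plan_run` and `RunPlan` are hypothetical names, and the sketch assumes that --runUntil is inclusive, that --runModule takes precedence over the other two options, and that starting at module 0 needs no earlier snapshot.

```python
from collections import namedtuple

# Module order as printed by --list.
MODULES = [
    "Target_Selection", "Survey_Strategy", "Fiber_Allocation", "Throughput",
    "Sim_Gal_Spec_Clean", "Sim_Gal_Spec_Observed", "Measure_Redshift",
    "Bin_Redshift", "Estimate_Cosmo_Params",
]

# Hypothetical summary of one run: which modules execute, which snapshot
# must already exist, and whether a snapshot is written at the end.
RunPlan = namedtuple("RunPlan", "indices needs_snapshot_of saves_snapshot")


def plan_run(n_modules, run_from=None, run_until=None, run_module=None,
             no_copy=False):
    """Translate the run-control options into a RunPlan (illustrative only)."""
    if run_module is not None:
        # Assumption: --runModule wins if several options are given.
        first = last = run_module
    else:
        first = 0 if run_from is None else run_from
        last = n_modules - 1 if run_until is None else run_until
    if not 0 <= first <= last < n_modules:
        raise ValueError("module numbers out of range")
    # A run that does not start at module 0 needs the previous module's snapshot.
    needs = first - 1 if first > 0 else None
    # Snapshots are written for --runUntil and --runModule unless --noCopy is set.
    saves = (run_until is not None or run_module is not None) and not no_copy
    return RunPlan(list(range(first, last + 1)), needs, saves)


if __name__ == "__main__":
    plan = plan_run(len(MODULES), run_from=3, run_until=6)
    print([MODULES[i] for i in plan.indices])
```

With these assumptions, `--runFrom 3 --runUntil 6` executes Throughput through Measure_Redshift, requires the snapshot taken after Fiber_Allocation, and writes a new snapshot at the end.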
Reporting¶
--runId
Assigns a runId and saves a report and the parameters of the run to the data directory. For example, runId=5 creates two files: data/report5.pdf and data/run5.h5.
--report
Forces the creation of a report, even if the pipeline run is not complete.
Parameters and configuration¶
--config
Pass an alternative configuration file to use in place of glue.ini or glue.mine.ini, e.g. --config glue.new.ini
--param
Pass a custom param.ini. Its values override those in the module param files; the module files are still ingested and supply any values that are not set in the given param.ini. E.g. --param spokes.ini
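The override behaviour of --param can be pictured as layered reads: module files first, the custom file last. The following is a minimal sketch using Python's standard configparser, not the pipeline's actual loader. `load_params` is a hypothetical helper, and the parameter names `efficiency` and `exposure` are made up for illustration.

```python
import configparser


def load_params(module_param_texts, override_text=None):
    """Merge module param files, then apply an optional override file.

    Later reads win, so values in the override replace module values while
    everything else is kept. Illustrative only.
    """
    parser = configparser.ConfigParser(
        inline_comment_prefixes=("#",),  # allow "value # comment"
        interpolation=None,              # treat "%" literally
    )
    parser.optionxform = str             # keep the case of parameter names
    for text in module_param_texts:
        parser.read_string(text)
    if override_text is not None:
        parser.read_string(override_text)
    return {section: dict(parser[section]) for section in parser.sections()}


if __name__ == "__main__":
    module_file = "[Throughput]\nefficiency = 0.8\nexposure = 900\n"
    custom_file = "[Throughput]\nexposure = 1200\n"
    print(load_params([module_file], custom_file))
```

Here `exposure` comes from the custom file, while `efficiency` falls back to the module's own parameter file.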
Deprecated Section¶
General Steps to Set Up and Run¶
- set parameters in the .ini file(s)
- set program paths in glue.ini and specify modules to run in glue.ini, [run] section
- execute run_pipeline.py
Parameter files¶
These can be found in the subfolder of each module (*.ini), one parameter file per module. They contain the physical parameters with which the module will run. If the parameter file (whose location you specify in glue.ini) is of the form
[Module_Name]
# comment
quantity = 42 # in supernatural units
then this quantity appears in the data_bank under '/Module_Name/quantity'.
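The mapping from a parameter file to data bank paths can be sketched with the standard configparser module. `data_bank_keys` is a hypothetical helper that only computes the paths; how the pipeline actually writes them into the HDF5 file is not shown here, and values are left as strings.

```python
import configparser


def data_bank_keys(param_text):
    """Map each 'key = value' in a param file to '/Section/key' (illustrative)."""
    parser = configparser.ConfigParser(
        inline_comment_prefixes=("#",),  # strip trailing "# ..." comments
        interpolation=None,
    )
    parser.optionxform = str             # keep the case of parameter names
    parser.read_string(param_text)
    return {
        "/%s/%s" % (section, key): value
        for section in parser.sections()
        for key, value in parser.items(section)
    }


if __name__ == "__main__":
    text = "[Module_Name]\n# comment\nquantity = 42 # in supernatural units\n"
    print(data_bank_keys(text))
```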
Glue¶
This is the overall parameter file for the pipeline, located in trunk/Wrapper/glue.ini. It contains information on where the module programs and parameter files can be found, and which modules are actually to be run. Let's take a look:
[Convert]
name = Convert
directory = Convert
param = param_convert.ini
command = ipython-2.7 converters.py
#profile = false

[Target_Selection]
name = Target_Selection
directory = Target_Selection/Target_Selection_Hambrecht_4
param = param_target_selection.ini
command = ipython-2.7 target_selection.py

[Survey_Strategy_Serrano]
name = Survey_Strategy
directory = Survey_Strategy/Survey_Strategy_Serrano
param = param.ini
command = ipython-2.7 ObsSim.py

[Survey_Strategy_Random]
name = Survey_Strategy
directory = Survey_Strategy/Survey_Strategy_Random
param = param.ini
command = ipython-2.7 survey_strategy_random.py

[Fiber_Allocation_Forero-Romero]
name = Fiber_Allocation
param = param.ini
directory = Fiber_Allocation/Fiber_Allocation_Forero-Romero
#command = ipython-2.7 fiber_allocation.py
command = ipython-2.7 FiberAllocation.py

[Fiber_Allocation_Random]
name = Fiber_Allocation
param = param.ini
directory = Fiber_Allocation/Fiber_Allocation_Random
command = ipython-2.7 fiber_allocation_random.py

[Throughput]
name = Throughput
directory = Throughput/Throughput_Nord
command = ipython-2.7 throughput.py

[Sim_Gal_Spec_Clean]
name = Sim_Gal_Spec_Clean
directory = Sim_Gal_Spec_Clean/Sim_Gal_Spec_Clean_Busha/
param = param.ini
command = echo "imported Sim_Gal_Spec_Clean parameters"

[Sim_Gal_Spec_Observed_Cunha]
name = Sim_Gal_Spec_Observed
directory = Sim_Gal_Spec_Observed/Sim_Gal_Spec_Observed_Cunha
param = param.ini
command = echo "imported Sim_Gal_Spec_Observed parameters"

[Sim_Gal_Spec_Observed_Random]
name = Sim_Gal_Spec_Observed
directory = Sim_Gal_Spec_Observed/Sim_Gal_Spec_Observed_Random
param = param.ini
command = echo "imported Sim_Gal_Spec_Observed parameters"

[Sim_Gal_Spec_Observed_Hambrecht]
name = Sim_Gal_Spec_Observed
directory = Sim_Gal_Spec_Observed/Sim_Gal_Spec_Observed_Hambrecht
param = param.ini
command = ipython-2.7 Sim_Gal_Spec_Observed.py

[Measure_Redshift_Hambrecht]
name = Measure_Redshift
directory = Measure_Redshift/Measure_Redshift_Hambrecht
param = param_measure_redshift.ini
command = ipython-2.7 measure_redshift.py

[Measure_Redshift_Hambrecht_2]
name = Measure_Redshift
directory = Measure_Redshift/Measure_Redshift_Hambrecht_2
param = param_measure_redshift.ini
command = ipython-2.7 measure_redshift.py

[Bin_Redshift]
name = Bin_Redshift
directory = Bin_Redshift/Bin_Redshift_Nord
param = param.ini
command = ipython-2.7 Bin_Redshift.py

[Estimate_Cosmo_Params]
name = Estimate_Cosmo_Params
directory = Estimate_Cosmo_Params/Estimate_Cosmo_Params_Nicola_2
command = ipython-2.7 shell_script_cosmoparams.py

[Generate_Report]
name = Generate_Report
directory = Generate_Report/
command = ipython-2.7 Generate_Report.py

[run]
modules = ['Target_Selection',
    'Survey_Strategy_Random',
    'Fiber_Allocation_Random',
    'Throughput',
    'Sim_Gal_Spec_Clean',
    'Sim_Gal_Spec_Observed_Cunha',
    'Measure_Redshift_Hambrecht']
    #'Bin_Redshift',
    #'Estimate_Cosmo_Params']

[util]
convert = Convert
report = Generate_Report

[data_bank]
#datafile1 = data_bank.h5
#datafile1 = cosmos.h5
#datafile2 = emission.h5
#datafile3 = sensitivity.h5

[options]
base_dir = ../
make = 1
run = 1
data_dir = ../../data/
data_bank = data_bank.h5
ptrepack = ptrepack
Each module has a fixed name and a directory in which its executable file lies. These can be changed in case there are competing versions of a module. Notice how most modules are in Python, but the glue can also invoke Matlab.
In the [run] section, you specify which modules you want to run (list of strings). Lines can be commented out with #.
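To illustrate how the entries of the [run] list refer to section names, and how each section carries its own name, directory and command, here is a minimal sketch. It assumes the list can be read as a Python literal on a single line; `resolve_run_list` is a hypothetical helper, and the real run_pipeline.py may parse the file differently.

```python
import ast
import configparser

# Two sections copied from the glue.ini listing, plus a shortened [run] list.
GLUE_EXAMPLE = """
[Target_Selection]
name = Target_Selection
directory = Target_Selection/Target_Selection_Hambrecht_4
param = param_target_selection.ini
command = ipython-2.7 target_selection.py

[Survey_Strategy_Random]
name = Survey_Strategy
directory = Survey_Strategy/Survey_Strategy_Random
param = param.ini
command = ipython-2.7 survey_strategy_random.py

[run]
modules = ['Target_Selection', 'Survey_Strategy_Random']
"""


def resolve_run_list(glue_text):
    """Return (name, directory, command) for each section listed in [run]."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(glue_text)
    # Assumption: the list is a plain Python list literal of section names.
    section_names = ast.literal_eval(parser.get("run", "modules"))
    return [
        (parser.get(s, "name"), parser.get(s, "directory"), parser.get(s, "command"))
        for s in section_names
    ]


if __name__ == "__main__":
    for name, directory, command in resolve_run_list(GLUE_EXAMPLE):
        print(name, directory, command)
```

Note that the section [Survey_Strategy_Random] resolves to the fixed module name Survey_Strategy, which is how several implementations of one module can coexist in the same file.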
Execute pipeline¶
Executable location: trunk/Wrapper/run_pipeline.py
This file is called as:
python run_pipeline.py
or inside a Python session with
execfile('run_pipeline.py')
What this does for each module:
- read the parameter file into data_bank.h5
- execute the module
- execute the module
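The two steps per module can be sketched as a simple loop. This is an illustration rather than the contents of run_pipeline.py: `run_modules`, `ingest_params` and `execute` are hypothetical names, and the sketch assumes that a module without a `param` entry in glue.ini (such as Throughput) skips the ingest step.

```python
import shlex
import subprocess


def run_modules(modules, ingest_params, execute=None):
    """Run modules in order: ingest the parameter file, then run the command.

    `modules` is a list of dicts with the glue.ini keys 'name', 'directory',
    'command' and optionally 'param'. `ingest_params` is a caller-supplied
    function standing in for the step that writes parameters to the data bank.
    """
    if execute is None:
        # Default: run the module's command inside its own directory.
        def execute(command, directory):
            subprocess.check_call(shlex.split(command), cwd=directory)

    for module in modules:
        if module.get("param"):
            ingest_params(module["name"], module["directory"], module["param"])
        execute(module["command"], module["directory"])


if __name__ == "__main__":
    demo = [{"name": "Throughput", "directory": ".", "command": "echo throughput"}]
    # Dry run: print the command instead of executing it.
    run_modules(demo, ingest_params=lambda *args: None,
                execute=lambda command, directory: print(command))
```

Passing `execute` as a parameter keeps the loop testable without launching any external program.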
Command line options¶
--config¶
Use this option to select an alternative config file instead of glue.ini or glue.mine.ini. Usage: python run_pipeline.py --config anotherglue.ini
--runModule¶
Use this option to run one module separately. Be aware that a data bank snapshot from the preceding module needs to be in place. Usage: python run_pipeline.py --runModule 0
--runFrom¶
Use this option to start the pipeline run at a certain module. Be aware that a data bank snapshot from the preceding module needs to be in place. Usage: python run_pipeline.py --runFrom 3
--runUntil¶
Use this option to stop the pipeline run at a certain module. When this option is used, a data bank snapshot is saved after the designated module has run. Usage: python run_pipeline.py --runUntil 6
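For readers who want to see the documented flags side by side, the following sketch declares them with Python's standard argparse module. It is an assumption about how such a command line could be declared, not the parser used by run_pipeline.py. The flag names come from this page, while the argument types and help texts are guesses.

```python
import argparse


def build_parser():
    """Declare the documented options (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(prog="run_pipeline.py")
    # Run control
    parser.add_argument("--list", action="store_true", help="list the modules and exit")
    parser.add_argument("--runUntil", type=int, metavar="Module-Number")
    parser.add_argument("--runFrom", type=int, metavar="Module-Number")
    parser.add_argument("--runModule", type=int, metavar="Module-Number")
    parser.add_argument("--noCopy", action="store_true", help="do not save a snapshot")
    parser.add_argument("--noConvert", action="store_true", help="skip the convert step")
    # Reporting
    parser.add_argument("--runId", help="kept as a string here (assumption)")
    parser.add_argument("--report", action="store_true", help="always create a report")
    # Parameters and configuration
    parser.add_argument("--config", help="alternative glue file")
    parser.add_argument("--param", help="custom param file")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--runFrom", "3", "--runUntil", "6"])
    print(args.runFrom, args.runUntil)
```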