Working with the Production Website

In the development model for the production website, a website manager maintains a local copy of the software that manages the production website and makes intended changes locally. When satisfied, the manager should commit the changes to the repository and then pull them down to the official location with an SVN update command. It is possible to edit the official location in situ, but this is not recommended.

Installing the Website

The script for installing the website is part of the novaproduction package. It takes a single argument: a configuration file that defines where your local copy of the web-page scripts should reside, along with a few other locations the system needs to know about. Check out the novaproduction package and execute:

novaproduction/bin/install_prod_valid <config.sh>

The file config.sh is actually a bash shell script which is sourced by the install script. At a minimum it should define the following variables:
  • NOVAPRODVALID_TOP: The top level directory containing the production website software as well as several other custom packages. A sensible value is
    /nova/app/users/${USER}/prod_valid
  • NOVAPRODVALID_ANA: A directory where the package will store data-files, such as pickle files that contain the results of SAM definition queries. A sensible value is
    /nova/ana/users/${USER}/prod_valid
  • NOVAPRODVALID_WEB: A directory where the web-pages generated by this package will reside. A sensible value for a user is
    /nova/app/users/${USER}/prod_web
    but for the official site it is
    /nusoft/app/web/htdoc/nova/production/

In all three cases the directory named by the environment variable will be created for you. Their parent directories will not, however, so you must make sure those exist. Optionally, you can also define GITMODE=gitsvn (other values may be supported in the future). This uses the git-svn bridge so that you can work with the SVN repo using git.
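As an illustration of the variables described above, a minimal config.sh might look like the following sketch. The three paths are the "sensible values" suggested for a user. The use of export and the parent-directory check at the end are assumptions added for illustration, not something the install script is known to require.

```shell
# config.sh: sourced by install_prod_valid, so keep it to variable
# definitions and avoid "exit" (it would also end the sourcing script).

# Top-level directory for the website software and custom packages.
# (export is a precaution here; a plain assignment may be enough.)
export NOVAPRODVALID_TOP="/nova/app/users/${USER}/prod_valid"

# Data files, e.g. pickle files holding SAM definition query results.
export NOVAPRODVALID_ANA="/nova/ana/users/${USER}/prod_valid"

# Generated web-pages (a personal location, not the official site).
export NOVAPRODVALID_WEB="/nova/app/users/${USER}/prod_web"

# Optional: work with the SVN repository through the git-svn bridge.
# export GITMODE=gitsvn

# Illustrative extra (an assumption, not part of the install script):
# warn about missing parent directories, which are not created for you.
for dir in "$NOVAPRODVALID_TOP" "$NOVAPRODVALID_ANA" "$NOVAPRODVALID_WEB"; do
    parent=$(dirname "$dir")
    [ -d "$parent" ] || echo "warning: parent directory missing: $parent" >&2
done
```

If the warning appears, create the parent directory by hand before running the install script.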

The script will check out a copy of the Validation/production package from SVN into $NOVAPRODVALID_TOP/python. It will also install external software packages (anaconda and psutil; note that we use rather old versions of these packages) into $NOVAPRODVALID_TOP/software, and create a local preferences file that will be used in the future for setting up a new shell.

Setting Up a New Shell

Once you have installed the production web software, you can start to work with it. To do so, go to the $NOVAPRODVALID_TOP/python directory and execute:

source setup.sh

This should define all the environment variables you need to start working with the prod_valid scripts.

Adding New Samples to the Website

Getting a SAM definition onto the production website occurs in two steps. In the first step, a cron job that runs every few hours issues SAM queries to fetch the summary information for each dataset (number of files, number of events, etc.). This information is stored on disk in pickle files. How often this step runs is your choice, and you will want to make different choices for different samples. You can also run this step by hand, down to the level of individual datasets, to get the latest information. This step also takes automatic snapshots, to speed up the process of starting SAM projects for any jobs that might use these datasets. The second step runs once a minute: it reads the pickle files produced in the first step and generates the web-pages that display this information.

The command to retrieve data for a dataset definition is:

./scripts/downloadDefinitionData.py --pageConfig /path/to/some_page.cfg

(Note that you should issue this command from the directory $NOVAPRODVALID_TOP/python.) The file some_page.cfg is a configuration file whose format is described below. For it to be picked up automatically by the page-making cron jobs, its name should end in "_page.cfg". By convention these files are kept in the directory structure under:
$NOVAPRODVALID_TOP/python/web
This is where the cron jobs look for them. Examine the second_ana subdirectory for an example of how these files are organized. There are other usage patterns for downloadDefinitionData.py that offer finer-grained control (use the --help option for more details).
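To refresh several pages by hand, the naming convention above can be used to locate the configuration files. The sketch below is only an illustration: the helper names list_page_configs and refresh_all are hypothetical, and this is not necessarily how the cron jobs themselves find the files. It uses only the --pageConfig option shown above.

```shell
# List every file ending in "_page.cfg" under a directory tree,
# sorted so the order is stable from run to run.
list_page_configs() {
    find "$1" -type f -name '*_page.cfg' | sort
}

# Download the definition data for each page config under a tree.
# Must be called from $NOVAPRODVALID_TOP/python, after sourcing setup.sh.
refresh_all() {
    list_page_configs "$1" | while IFS= read -r cfg; do
        ./scripts/downloadDefinitionData.py --pageConfig "$cfg" ||
            echo "download failed for: $cfg" >&2
    done
}
```

For example, after changing to $NOVAPRODVALID_TOP/python, running `refresh_all web/second_ana` would process only the configuration files under the second_ana subdirectory.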

The same configuration file is used to make the corresponding page, by passing it to the scripts/makeDatasetsPage.py script:

/path/to/some_page.cfg

Again, use the --help option to see additional arguments that provide more control or debugging information.

Note that it is possible to cache the pickle files for datasets, allowing the download script to use the results of an old query rather than downloading the information with every run. Possible usages are:

./scripts/cacheDefinitionData.py --pageConfig /path/to/some_page.cfg
./scripts/cacheDefinitionData.py --definition some_definition_name

Page Config File Format

You can look at a real-world example configuration file here: https://cdcvs.fnal.gov/redmine/projects/novaart/repository/entry/trunk/Validation/production/web/second_ana/nominal/fd_ana_genie_mc_epoch_page.cfg. The format is the Microsoft .ini-inspired format used by the Python ConfigParser module (in practice we use SafeConfigParser). The config file has several sections:

  • A "main" section containing:
    • the page file name and location (relative to NOVAPRODVALID_WEB)
    • the page title
    • some descriptive text for the top of the page
  • A "chain" section ([chain chainName]) listing some descriptive information about the chain. Note that every section of the form "chain FOO" will be interpreted as defining a new chain "FOO".
  • A "tiers" section ([chainName/tiers]) listing the individual tiers within a chain, and the corresponding definition names. Note that every chain defined as in the previous bullet must have a corresponding tiers section.
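The rule that every chain needs a matching tiers section can be checked before a page config is committed. The following is a minimal sketch, not part of the package: check_tiers is a hypothetical helper, and the key names inside the example file (page, title, description, reco) are placeholders invented for illustration. Only the section headers follow the layout described above; consult the linked real-world file for the actual keys.

```shell
# Report chains that lack a matching [NAME/tiers] section.
# Returns 0 when every chain has one, non-zero otherwise.
check_tiers() {
    cfg=$1
    # Pull NAME out of each "[chain NAME]" header, then look for the
    # exact line "[NAME/tiers]"; collect the names that have no match.
    missing=$(sed -n 's/^\[chain \(.*\)]$/\1/p' "$cfg" |
        while IFS= read -r chain; do
            grep -qxF "[${chain}/tiers]" "$cfg" || echo "$chain"
        done)
    if [ -n "$missing" ]; then
        echo "chains without a tiers section: $missing" >&2
        return 1
    fi
    return 0
}

# Hypothetical example file: section headers follow the documented
# layout, but every key name below is a placeholder.
example_cfg=$(mktemp)
cat > "$example_cfg" <<'EOF'
[main]
page = second_ana/example.html
title = Example page
description = Placeholder text for the top of the page

[chain nominal]
description = Placeholder description of the chain

[nominal/tiers]
reco = some_definition_name
EOF

check_tiers "$example_cfg" && echo "all chains have tiers sections"
```

The check only compares section headers, so it does not confirm that the definition names listed in a tiers section exist in SAM.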