Version 38 (Andrew Norman, 05/18/2018 03:46 PM) → Version 39/48 (Andrew Norman, 05/18/2018 03:48 PM)

{{>toc}}

h1. Using Docker with NOvA Software

Contact *Pengfei* or join the *nova-docker* Slack channel with any questions.

h2. Step 1. Install OSX Fuse (https://osxfuse.github.io) if you are using macOS. Skip this step for Linux.

h2. Step 2. Install CVMFS

Follow the instructions at https://cernvm.cern.ch/portal/filesystem/quickstart to install CVMFS on macOS or Linux (Windows is not supported yet). After installation, run the following to configure CVMFS properly.

<pre>
sudo wget http://home.fnal.gov/~dingpf/cvmfs.tar.gz
sudo rm -rf /etc/cvmfs/*
sudo tar zxvf cvmfs.tar.gz; sudo mv cvmfs/* /etc/cvmfs/
sudo mkdir /cvmfs/nova.opensciencegrid.org
sudo mkdir /cvmfs/fermilab.opensciencegrid.org
sudo cvmfs_config reload
sudo mount -t cvmfs nova.opensciencegrid.org /cvmfs/nova.opensciencegrid.org
sudo mount -t cvmfs fermilab.opensciencegrid.org /cvmfs/fermilab.opensciencegrid.org
</pre>

Alternatively, [[Using CernVM-FS (CVMFS)|here]] is a Wiki page with more detailed instructions.
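
Before moving on, it can help to confirm that both repositories are actually mounted. The following is a minimal sketch and not part of the original instructions: @cvmfs_repo_ready@ is a hypothetical helper name, and it assumes that a mounted repository shows up as a non-empty directory.

<pre>
# Hypothetical helper: succeed if REPO under ROOT is a non-empty directory.
# ROOT defaults to /cvmfs; it is a parameter only so the check can be tried elsewhere.
cvmfs_repo_ready() {
    repo="$1"
    root="${2:-/cvmfs}"
    # ls -A lists all entries except . and ..; empty output means nothing is visible.
    [ -d "$root/$repo" ] && [ -n "$(ls -A "$root/$repo" 2>/dev/null)" ]
}

# Report the state of the two repositories used on this page.
for repo in nova.opensciencegrid.org fermilab.opensciencegrid.org; do
    if cvmfs_repo_ready "$repo"; then
        echo "$repo: mounted"
    else
        echo "$repo: NOT mounted"
    fi
done
</pre>

If a repository is reported as not mounted, rerun the corresponding @mount@ command above before starting Docker.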

h2. Step 3. Install Docker

# For macOS, refer to https://store.docker.com/editions/community/docker-ce-desktop-mac
# After the installation has finished, open Docker Preferences, add “/cvmfs” to the “File Sharing” list, and click “Apply & Restart” to restart Docker.
# For Linux (e.g. Ubuntu), refer to https://docs.docker.com/install/linux/docker-ce/ubuntu/ for installation instructions. You can choose a different Linux distribution from the menu on the left side of that page.

h2. Step 4. Run the docker container

<pre>
docker run --rm -it -v /cvmfs:/cvmfs:cached -v $HOME:/scratch dingpf/slf6.7
</pre>
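
The second @-v@ option decides which host directory appears as /scratch inside the container. If you would rather expose a dedicated working directory than all of $HOME, only that part of the command changes. A small sketch, assuming a hypothetical helper named @nova_docker_cmd@ and an example directory @$HOME/nova_work@; it only prints the command so you can inspect it before running it:

<pre>
# Hypothetical helper: print the Step 4 command with a chosen host directory
# mounted at /scratch. It only prints the command; nothing is started.
nova_docker_cmd() {
    host_dir="${1:-$HOME}"
    printf 'docker run --rm -it -v /cvmfs:/cvmfs:cached -v %s:/scratch dingpf/slf6.7\n' "$host_dir"
}

# Example: use a dedicated working directory instead of all of $HOME.
nova_docker_cmd "$HOME/nova_work"
</pre>

Copy the printed line into your terminal to start the container.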

h2. Step 5. Set up the NOvA software environment, e.g.

<pre>
source /cvmfs/nova.opensciencegrid.org/novasoft/slf6/novasoft/setup/setup_nova.sh \
-e /cvmfs/nova.opensciencegrid.org/externals \
-5 /cvmfs/nova.opensciencegrid.org/novasoft/slf5/novasoft \
-6 /cvmfs/nova.opensciencegrid.org/novasoft/slf6/novasoft \
-r S18-02-25 -b maxopt
</pre>

h2. Step 6. Get a valid voms proxy

<pre>
kinit YOUR_USER_NAME@FNAL.GOV # replace YOUR_USER_NAME with your Fermilab user name
# kx509 is installed in the image; it is the recommended way to get the certificate, rather than calling cigetcert directly.
# If kx509 fails, use: cigetcert -i "Fermi National Accelerator Laboratory"
kx509
voms-proxy-init --rfc --voms=fermilab:/fermilab/nova/Role=Analysis --noregen
</pre>

h2. Step 7. Make addpkg_svn work properly

<pre>
mkdir -p ~/.ssh
# Create ~/.ssh/config with the following contents (replace YOUR_USER_NAME with your Fermilab user name):
host cdcvs.fnal.gov
User YOUR_USER_NAME
ForwardX11 = no
GSSAPIAuthentication yes
GSSAPIDelegateCredentials yes
</pre>
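
If you recreate the container often, writing this file by hand gets tedious. The sketch below wraps the entry above in a function; @write_cdcvs_ssh_config@ is a hypothetical name, and the @chmod 600@ is an addition here (ssh rejects config files that other users can write to).

<pre>
# Hypothetical helper: append the cdcvs.fnal.gov entry for USER to FILE
# (default ~/.ssh/config), creating the parent directory if needed.
write_cdcvs_ssh_config() {
    user="$1"
    file="${2:-$HOME/.ssh/config}"
    mkdir -p "$(dirname "$file")"
    # Append, so any existing entries in the file are preserved.
    cat >> "$file" <<EOF
host cdcvs.fnal.gov
User $user
ForwardX11 = no
GSSAPIAuthentication yes
GSSAPIDelegateCredentials yes
EOF
    chmod 600 "$file"
}
</pre>

Call it once inside the container as @write_cdcvs_ssh_config YOUR_USER_NAME@.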

h2. Step 8. Create a test release and build a package

Create your working directory under /scratch (note that only files under a path mounted into the container will be kept after the container shuts down).
<pre>
newrel -t S18-02-25 testrel_s180225
cd testrel_s180225
addpkg_svn CAFAna S18-02-25
make all
# The build will fail due to missing shared library links (we will fix this in the development release).
# As a workaround, add the following line to "CAFAna/Core/GNUmakefile":
# override CPPFLAGS += -I$(BOOST_INC)
# Add a similar line to "CAFAna/XSec/GNUmakefile":
# override CPPFLAGS += -I$(NUTOOLS_INC) -I$(GENIE_INC)/GENIE/ -I$(BOOST_INC)
</pre>
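
The two makefile edits can also be applied from the shell. This is a sketch under assumptions: @append_once@ is a hypothetical helper, and appending at the end of each GNUmakefile is a guess, since the instructions above only say to add the line. Run it from the top of the test release.

<pre>
# Hypothetical helper: append LINE to FILE unless an identical line is already there.
append_once() {
    file="$1"
    line="$2"
    # -x matches whole lines, -F treats LINE as a fixed string (it contains $ and parentheses).
    grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Single quotes keep $(...) literal so that make, not the shell, expands it.
if [ -f CAFAna/Core/GNUmakefile ]; then
    append_once CAFAna/Core/GNUmakefile 'override CPPFLAGS += -I$(BOOST_INC)'
fi
if [ -f CAFAna/XSec/GNUmakefile ]; then
    append_once CAFAna/XSec/GNUmakefile 'override CPPFLAGS += -I$(NUTOOLS_INC) -I$(GENIE_INC)/GENIE/ -I$(BOOST_INC)'
fi
</pre>

Because the helper checks for an existing identical line, running it twice does not duplicate the flags. Rerun @make all@ afterwards.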

h2. Step 9. Running NOvA EventDisplay.

# If you want to open an event display, start the docker container with:
<pre>
docker run --rm -it -p 5900:5900 -v /cvmfs:/cvmfs:cached -v $HOME:/scratch dingpf/slf6.7
</pre>
# Then run the following script to start the vnc server in the container:
<pre>
/home/me/start-xvnc.sh &
</pre>
# Once the vnc server is up and running, you can connect to the VNC session from your host machine via any VNC client. The address is *vnc://localhost:5900*. The password is *password*.
# On a Mac, you can start a VNC viewer by pressing *Cmd+K* in *Finder* and connecting to *vnc://localhost:5900*.
# In the VNC session, an xterm is opened for you. There you can do your usual software setup with "setup_nova" and get a valid voms proxy. You can use *xrdcp* to copy any file from dCache to your container. For example, to run EventDisplay on the file *fardet_r00014054_s38_t00_R16-03-03-prod2reco.d_v1_data.pid.root* from one of the official datasets, do the following:
<pre>
# setup nova
source /cvmfs/nova.opensciencegrid.org/novasoft/slf6/novasoft/setup/setup_nova.sh \
-e /cvmfs/nova.opensciencegrid.org/externals \
-5 /cvmfs/nova.opensciencegrid.org/novasoft/slf5/novasoft \
-6 /cvmfs/nova.opensciencegrid.org/novasoft/slf6/novasoft \
-r S18-02-25 -b maxopt

# get the xrootd file access url
fpath=`samweb get-file-access-url --schema=xroot fardet_r00014054_s38_t00_R16-03-03-prod2reco.d_v1_data.pid.root`

# Get a valid voms-proxy
kinit YOUR_USER_NAME@FNAL.GOV # replace YOUR_USER_NAME with your fermilab user name
kx509
voms-proxy-init --rfc --voms=fermilab:/fermilab/nova/Role=Analysis --noregen

# Do a xrootd copy of the file to your local disk in the container
xrdcp $fpath ./

# Start the EventDisplay on this file. The following command needs to be run
# INSIDE the xterm in the VNC session (together with the setup_nova).
nova -c evd.fcl fardet_r00014054_s38_t00_R16-03-03-prod2reco.d_v1_data.pid.root

</pre>

h1. Feldman-Cousins Corrections in Docker

If you're crazy enough to want to run the 2017 Analysis Feldman-Cousins corrections on a local machine, here's how you would do it. You cannot do batch submissions to the grid this way, but this illustrates how you might run a single FC job using the docker image *dingpf/slf6.7*.

h2. Download Prerequisites

The 2017 FC script requires some input ROOT files. These need to be downloaded to the local machine and can be scp'd from @/pnfs/nova/persistent/users/ddoyle/localnova.tar.gz@
Unpack the tarball to some /path/to/localnova.

h2. Enter Docker

See previous instructions for setting up Docker on your local machine. Once installed, run:
<pre>
sudo docker run --rm -it -v /path/to/localnova:/scratch -v /cvmfs:/cvmfs:cached dingpf/slf6.7

# set environment variable used in FC script
export FCHELPERANA2017_LIB_PATH=/scratch/FCHELPERANA2017_LIB_PATH
</pre>

The @-v /path/to/localnova:/scratch@ option mounts the localnova directory at /scratch inside the container. It is important that localnova is mounted there so that the code can find the required files. Feel free to use this volume for storage, since changes made elsewhere in the container are not persistent.

h2. Setup test release

Create a test release from S18-02-25 then add CAFAna from the FC-at-NERSC branch.

<pre>
newrel -t S18-02-25 <mylocalfc>
cd <mylocalfc>
srt_setup -a
addpkg_svn -b CAFAna FC-at-NERSC
novasoft_build -t
</pre>

h2. Run the script

h3. CAFAna/nue/Ana2017/joint_fit_2017_make_fc_surf.C

<pre>
void joint_fit_2017_make_fc_surf(int NPts, int bin, bool nh, int N,
std::string plot)
</pre>

This takes @NPts@, the number of experiments to throw at a bin; @bin@, the bin to throw to; a bool specifying the mass hierarchy; @N@, a bookkeeping parameter; and a string @plot@ specifying either "ssth23dmsq32" or "deltassth23" contours.

Example:
<pre>
cafe -bq joint_fit_2017_make_fc_surf.C 10 10 true 0 ssth23dmsq32
</pre>

h3. CAFAna/nue/Ana2017/joint_fit_2017_make_fc_slice.C

<pre>
void joint_fit_2017_make_fc_slice(int NPts, int bin, bool nh, int N,
std::string plot="delta")
</pre>

The procedure for slices is similar. The options for @plot@ are "delta", "ssth23", or "dmsq32".

Example:
<pre>
cafe -bq joint_fit_2017_make_fc_slice.C 10 10 true 0 ssth23
</pre>
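
Each invocation handles a single bin, so covering several bins locally means repeating the command with a different @bin@ argument. The sketch below prints one command per bin; @fc_slice_cmds@ is a hypothetical helper, and the bin range in the example is an assumption, since the bin numbering is not described here.

<pre>
# Hypothetical helper: print one cafe command per bin in [FIRST, LAST].
# The arguments follow the macro signature above: NPts, bin, nh, N, plot.
fc_slice_cmds() {
    npts="$1"; first="$2"; last="$3"; nh="$4"; plot="$5"
    bin="$first"
    while [ "$bin" -le "$last" ]; do
        # N (the bookkeeping parameter) is left at 0, as in the example above.
        echo "cafe -bq joint_fit_2017_make_fc_slice.C $npts $bin $nh 0 $plot"
        bin=$((bin + 1))
    done
}

# Example: 10 experiments for each of bins 0, 1 and 2.
fc_slice_cmds 10 0 2 true ssth23
</pre>

Review the printed commands first; piping the output to @sh@ then runs them one after another.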

h1. Feldman-Cousins Corrections in Docker without cvmfs

Follow these instructions to run Feldman-Cousins corrections in Docker if you do not have cvmfs on your local system.

<pre>
# Get the standalone docker image with the software and spectra for running FC corrections
scp novagpvm02.fnal.gov:/pnfs/nova/persistent/users/dingpf/dingpf--fc-on-nersc--S18-02-25-maxopt-Ana2017.tar .

# Register it with docker daemon
docker load -i dingpf--fc-on-nersc--S18-02-25-maxopt-Ana2017.tar

# Get the script to run the docker image
wget http://home.fnal.gov/~dingpf/run_fc.tar.gz
tar zxvf run_fc.tar.gz
cd run_fc

# run FC experiments
./run_fc.sh --macro=CAFAna/nue/Ana2017/joint_fit_2017_make_fc_slice.C --npoints=2 --bin=1 --hierarchy=true --id=0 --plot=ssth23
./run_fc.sh --macro=CAFAna/nue/Ana2017/joint_fit_2017_make_fc_surf.C --npoints=2 --bin=1 --hierarchy=true --id=0 --plot=ssth23dmsq32
</pre>

h1. Building docker images for NERSC

Currently the image building procedure includes the following steps:
# Prepare a test release of CAFAna locally with the version of the code you want.
# Run a generic docker container, mount the local test release and the cvmfs repo into it, and build the test release in the container.
# Also in the container, compile the CAFAna macro you want to run at NERSC.
# Gather the files needed by the CAFAna package, and pull them from the cvmfs repo to a local directory.
# Build the docker image with the local test release and the local cvmfs directory.
# Push the image to NERSC and put it on Cori or Edison.

I've created scripts to locate and pull files from cvmfs, as well as to build the docker image. They are under the following directory on docker-bd.fnal.gov.
<pre>
/home/dingpf/cvmfs_dev
</pre>

Under this directory, you will see:

<pre>
cvmfs_dev
├── build.sh
├── copy_cvmfs_dir.py
├── copy_cvmfs_file.py
├── dirs.list
├── Dockerfile_nova-fc-2018:fc-on-nersc:development-maxopt-base
├── Dockerfile_nova-fc-2018:fc-on-nersc:development-maxopt-v0.0
├── image
│   ├── cvmfs
│   └── development
├── libs.list
└── make_libs.sh
</pre>

The test release is under
<pre>
/home/dingpf/cvmfs_dev/image/development
</pre>
Keep this directory structure so that the build script works.

h2. Rebuilding the Test Release

At the top level execute the "build_dev.sh" script which will run the following docker command:
<pre>
IMAGE=<path to image subdir in build directory>
sudo docker run --rm -it -v $IMAGE/development:/development -v /cvmfs:/cvmfs dingpf/slf6.7 /development/run.sh make CAFAna.all
</pre>

This is built against the full CVMFS repo.

Now you need to rebuild the CAFAna macro. To do this you need to run the macro once.

<pre>
IMAGE=<path to image subdir in build directory>
sudo docker run --rm -it -v $IMAGE/development:/development -v $IMAGE/cvmfs:/cvmfs dingpf/slf6.7 /development/run.sh cafe -br <fullpath in image>/<macro_name> <options for macro>
</pre>

Example Syntax:
<pre>
sudo docker run --rm -it -v $IMAGE/development:/development -v /cvmfs:/cvmfs registry.services.nersc.gov/nova-fc-2018/fc-on-nersc:fcinput-cvmfs dingpf/slf6.7 /development/run.sh cafe -bq /development/CAFAna/nue/Ana2018/make_fc_slices_nersc_2018_stats.C 1 0 0 true 0 10 5 ssth23 both false false
</pre>

h2. Rebuilding the Images and pushing to NERSC

To pull the files from cvmfs, run the following:
<pre>
./copy_cvmfs_dir.py dirs.list image/cvmfs; ./copy_cvmfs_file.py libs.list image/cvmfs
</pre>

To build the image, copy "Dockerfile_nova-fc-2018:fc-on-nersc:development-maxopt-v0.0" to "Dockerfile_nova-fc-2018:fc-on-nersc:development-maxopt-v${NEW_VERSION}", where ${NEW_VERSION} is your desired version number, and then run:
<pre>
sudo ./build.sh Dockerfile_nova-fc-2018:fc-on-nersc:development-maxopt-v${NEW_VERSION}
</pre>
This script builds the new image and tags it. At the end, it prompts you to push the image to the NERSC registry. Run the command it gives you to do the push. This must be done AS ROOT (not via sudo):
<pre>
ksu # become root

# Push the image
docker push registry.services.nersc.gov/nova-fc-2018/fc-on-nersc:development-maxopt-v0.3
</pre>

If you forget to increment the version number, the image can be retagged via:
<pre>
sudo docker images # List the images

# Then Retag with
sudo docker tag ba7c53be347e registry.services.nersc.gov/nova-fc-2018/fc-on-nersc:development-maxopt-<New Version>
</pre>
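
The Dockerfile name and the registry tag both embed the version number, so deriving both from a single variable makes a mismatch less likely. A minimal sketch following the naming pattern used above; the helper names @dockerfile_for_version@ and @image_tag_for_version@ are hypothetical, and 0.3 is only an example value.

<pre>
# Both names are built from the same version string, following the pattern used above.
dockerfile_for_version() {
    echo "Dockerfile_nova-fc-2018:fc-on-nersc:development-maxopt-v$1"
}
image_tag_for_version() {
    echo "registry.services.nersc.gov/nova-fc-2018/fc-on-nersc:development-maxopt-v$1"
}

NEW_VERSION=0.3   # example value only
dockerfile_for_version "$NEW_VERSION"
image_tag_for_version "$NEW_VERSION"
</pre>

For example, @cp "$(dockerfile_for_version 0.0)" "$(dockerfile_for_version "$NEW_VERSION")"@ creates the new Dockerfile, and the output of @image_tag_for_version@ is the name to use with @docker tag@ and @docker push@.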

Once the image is pushed to the NERSC private registry, log in to Cori and/or Edison and run the following to make the image available on the supercomputers:
<pre>
# ON CORI Load the Shifter Module
$ module load shifter-registry

# Login to the Registry
$ shifterimg-beta login registry.services.nersc.gov
registry.services.nersc.gov username: <NERSC username>
registry.services.nersc.gov password: <Application Token>

# Pull the Image in to the Shifter Registry
$ shifterimg-beta pull registry.services.nersc.gov/nova-fc-2018/fc-on-nersc:development-maxopt-v0.0
</pre>

h1. Run the container interactively at NERSC

Once the image is pulled to Cori, you can do the following to run it in the interactive queue.

<pre>
dingpf@cori02:~> salloc -N 1 -C haswell -q interactive --image=docker:registry.services.nersc.gov/nova-nus18/median-sensitivity:development-maxopt-v0.5 -t 04:00:00
salloc: Pending job allocation 12513427
salloc: job 12513427 queued and waiting for resources
salloc: job 12513427 has been allocated resources
salloc: Granted job allocation 12513427
salloc: Waiting for resource configuration
salloc: Nodes nid00050 are ready for job

dingpf@nid00050:~>
dingpf@nid00050:~> shifter --volume=$CSCRATCH:/output /development/run.sh cafe -bq -nr /development/CAFAna/nus/Nus18/MakeSurfaceMedian.C fhc th24vsdm41 systs both
Running cafe -bq -nr /development/CAFAna/nus/Nus18/MakeSurfaceMedian.C fhc th24vsdm41 systs both
** NOvA Common Analysis Format Executor **
root -l -n -b -q /development/CAFAna/load_libs.C /development/CAFAna/nus/Nus18/MakeSurfaceMedian.C+("fhc","th24vsdm41","systs","both")
</pre>