15-February-2016

Attending: Satish, Paul, Justin, Enrique, Susan, Joe, Tom

Computing Issues (all)

LEMServer had difficulties over the weekend, but we hope the issues are now resolved. Full deployment of the fix will require a new tag, which Paul is working on.

Offsite jobs have been having two flavors of problems:

  • Jobs at MWT2 and UChicago (which tend to get a lot of our jobs) have been failing with errors about missing libraries associated with our modules. Enrique suspects that these nodes are not properly seeing CVMFS. He will follow up with the site administrators there, and possibly file a GOC ticket. These jobs also apparently exit with status code 0, so they are not reliably flagged as failures. Enrique will investigate that.
  • At SU-OG and SMU, there are issues with missing system libraries such as libXmu. Fixing this will require the shim library, which Gavin is working on.
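The exit-status problem in the first item could be worked around along these lines. This is only a sketch: the function name, the fallback exit code of 1, and the idea of scanning the job log are assumptions for illustration, not how our job wrappers currently work. The two patterns match the usual wording of the Linux dynamic loader when a shared library cannot be found; the actual messages in the failing jobs may differ.

```python
import re

# Assumed patterns: typical dynamic-loader wording for a missing shared
# library. The real job logs may use different messages.
MISSING_LIB_PATTERNS = [
    re.compile(r"error while loading shared libraries"),
    re.compile(r"cannot open shared object file"),
]


def effective_exit_code(exit_code, log_text):
    """Return a non-zero code if the log shows a missing-library error.

    A non-zero exit code from the job is passed through unchanged.
    A zero exit code is overridden with 1 (an arbitrary choice) when
    any line of the log matches one of the patterns above.
    """
    if exit_code != 0:
        return exit_code
    for line in log_text.splitlines():
        if any(p.search(line) for p in MISSING_LIB_PATTERNS):
            return 1
    return 0
```

A wrapper script could call this on the captured job log and exit with the returned value, so that the batch system sees the failure.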

SW Tags, especially latest prod2calib tag (Paul S)

The new release R16-01-27-prod2calib.f is currently being built and should be available in a few hours. Last week Paul built prod2genie.a to be used for genie generation.

Nightly builds are ok.

Paul has been in communication with Jonathan. Apparently access to Jenkins will be limited tomorrow. The reasons are unclear at present. Paul will find out the details and notify us.

Processing Status

ND NuMI / FD Cosmic preview

These samples are paused, pending the release of a new version of prod2calib.

Rock nu generation (Felipe)

Felipe has been attempting to run these jobs over the weekend but has run into some difficulties. In particular the jobs are producing large output, which is causing jobs to go into a held state when disk consumption gets too large. The reasons for the large disk consumption are not understood. Paul Rojas is investigating.
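One way to see where the disk space is going would be to list the largest files in a job's working area. The sketch below assumes the output sits in an ordinary directory tree; the function names are hypothetical and this is not an existing tool of ours.

```python
import os


def file_sizes(root):
    """Yield (size_in_bytes, path) for every regular file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so linked files are not counted here.
            if not os.path.islink(path):
                yield os.path.getsize(path), path


def directory_size_bytes(root):
    """Total size of all regular files under root."""
    return sum(size for size, _path in file_sizes(root))


def largest_files(root, n=5):
    """The n largest files under root, biggest first."""
    return sorted(file_sizes(root), reverse=True)[:n]
```

Running largest_files on the scratch area of a held job would show whether a single output file or many small ones account for the consumption.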

FD Genie real conditions samples (Enrique)

Generation of the nonswap samples is about 85% done. The flux swap and tau files are about 95% done. Enrique is waiting for the new tag and plans to submit reco jobs with it later today.

FD Genie ideal conditions samples (Joe)

Joe hasn’t started yet, as he has some questions. He will send around his questions and hopefully get started this morning.

raw2root keepup (Vito)

These jobs are proceeding smoothly.

reco keepup (Felipe)

These jobs have been crashing because we recently switched to a tag (S16-02-02) with the bad v08.00 version of calibcsv. We considered switching to S16-02-09, which should work, but at Bruno’s suggestion we decided to revert to S15-12-07 until we have the resources available to backprocess older data.

AOB

There is a concern, raised by email last week, that the early run numbers in the period and epoch naming dataset are not correct. Joe is digging into his old emails to try to understand the issue.

Tom had some questions about why nova jobs do not get directed to SMU even though we have dedicated job slots and technical support there. Satish agreed that jobs should be going there preferentially. We will attempt to sort this out. A phone meeting with Alex, Satish, Enrique and Amit may be needed. Satish requested that Enrique keep statistics on where production jobs in the wild end up running, which he will do once some of the technical issues at SMU are sorted out.
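The site statistics Satish asked for could start from something as simple as the tally below. It assumes the site name for each finished job has already been extracted (for example from the job logs); how to obtain that list is not covered here, and the function name is hypothetical.

```python
from collections import Counter


def tally_sites(job_sites):
    """Count jobs per site and compute each site's share of the total.

    job_sites is an iterable of site names, one entry per job.
    Returns {site: (job_count, fraction_of_all_jobs)}, ordered from the
    most-used site to the least-used.
    """
    counts = Counter(job_sites)
    total = sum(counts.values())
    return {site: (n, n / total) for site, n in counts.most_common()}
```

Comparing the fraction for SMU against the others over time would show whether jobs start going there preferentially once the technical issues are sorted out.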