10/09/14

Attendance

Matthew, Jonathan, Bruno, Dominick, Susan, Nick, Pavan, Gareth, Satish, Ryan, Nate, Evan, Chris, Adam, Gavin, Jeny, Paola

News (Matthew)

Since last week, v06 beam flux files have been generated which address the problem with the triangle of missing neutrinos. The problem was that the flux window was slightly too small, so the back face of the detector wasn't entirely included. We'll use these new files to generate the FD & ND genie samples.

The production test problem reported last week turned out to be another "feature" of the testing environment, whereby a certain job fails when called by python but not when called from the shell. Why this happens isn't understood, but this failure mode is again not a problem for actual production. The lesson learned here is that we're getting false negatives from the production tests (tests failing where actual production would succeed). Chris suggested that it could be useful to also run the tests in debug mode.
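One way to narrow down a python-versus-shell discrepancy of this kind is to run the same command both ways and compare exit codes. The sketch below is only a diagnostic aid under stated assumptions: the function name `compare_invocations` is hypothetical, the command shown is a placeholder rather than the actual production test job, and a POSIX shell is assumed.

```python
import shlex
import subprocess
import sys


def compare_invocations(cmd):
    """Run cmd directly from Python and again through a shell.

    Returns a dict holding the exit code from each invocation so that
    any difference between the two is easy to spot.
    """
    # Direct invocation: no shell involved, arguments passed as a list.
    direct = subprocess.run(cmd, capture_output=True, text=True)
    # Shell invocation: the same command, quoted and handed to /bin/sh.
    via_shell = subprocess.run(
        shlex.join(cmd), shell=True, capture_output=True, text=True
    )
    return {"direct": direct.returncode, "shell": via_shell.returncode}


if __name__ == "__main__":
    # PLACEHOLDER command; substitute the job that fails under test.
    print(compare_invocations([sys.executable, "-c", "print('ok')"]))
```

If the two exit codes differ, comparing the environment variables seen by each invocation would be a natural next step.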

Production testing was also run on FA14-10-09 and all looks well. The results can be seen here.

First analysis production status (Nate)

Nate submitted FD cry jobs (20,000 jobs * 200 events) on Thursday/Friday last week and these have been running continuously since then. At the last check there were about 9,000 files. He's had issues running as novapro and will follow up with Paola and Jeny. He's also suffering from low priority on-site. Offsite (non-OSC) jobs have run very well. He noted that his jobs are taking longer than he thought they should, showing a bimodal distribution with peaks at 800 & 1,300 minutes. A factor of 2 or 3 of this can be attributed to the debug-mode slowdown, as he accidentally submitted in debug mode. The ND rock files haven't been submitted yet. The plan is now for Nate to submit the remaining MC jobs.
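As a rough cross-check of the numbers above, the following sketch works through the arithmetic. It assumes one output file per job, which the minutes do not state, and all variable and function names are illustrative only.

```python
# Back-of-the-envelope check of the FD cry production numbers quoted above.
# ASSUMPTION: each job writes exactly one output file.
N_JOBS = 20_000
EVENTS_PER_JOB = 200
FILES_SO_FAR = 9_000

total_events = N_JOBS * EVENTS_PER_JOB
fraction_done = FILES_SO_FAR / N_JOBS

print(f"Total events requested: {total_events:,}")
print(f"Fraction complete:      {fraction_done:.0%}")

# Observed run-time peaks (minutes) and the quoted debug-mode slowdown range.
PEAKS_MIN = (800, 1300)
SLOWDOWN_FACTORS = (2, 3)


def expected_runtime(observed_minutes, slowdown):
    """Estimate the non-debug run time given a debug-mode slowdown factor."""
    return observed_minutes / slowdown


for peak in PEAKS_MIN:
    estimates = [expected_runtime(peak, f) for f in SLOWDOWN_FACTORS]
    print(
        f"{peak} min in debug -> roughly "
        f"{min(estimates):.0f} to {max(estimates):.0f} min otherwise"
    )
```

Under these assumptions the request totals 4,000,000 events and is about 45% complete, and the 800-minute peak would correspond to somewhere between about 267 and 400 minutes outside debug mode.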

Action Item: Has the ND MC mix issue with meta-data been fixed? Matthew will chase Adam and Nate up to find out.

It was noted that jobs on the OSC were all failing. Nate has a ticket in to find out why. Alex Sousa was also following this up.

Craig pointed out that we only ever seem to get 1k nodes onsite, whereas our quota should be 1.3k. He is chasing this up to find out what is going on.

Reco validation samples (Ryan, Gavin)

Reco validation will be run on the SIM validation samples. Gavin will submit this.

New keep-up reco files (Dominick)

He has processed a large fraction of the files back to the end of the neutrino hunt (1.7 TB, 32k files). These have been announced to the collaboration. All is well.

PC hits keep-up & validation (Paola, Evan)

Paola has made validation samples of pclist, pcliststop and timecal, including the entire statistics of the ND cosmic MC. She has sent these to the appropriate experts. Alex signed off on the pclist. Luke is in the process of signing off the pcliststop. Evan identified a problem with the timecal files due to a missing commit. He has put together backport details and will communicate these to Jonathan. A second round of validation may be needed. It was noted that we can easily make more ND cosmic stats if needed; I expect we might need to.

Auto sim validation & meta-data (Gareth et al)

There is a plan on how to do this. The changes to make_sim_fcl will go in now. Including metadata in the TFileService output, which is needed for the output histogram files to be archived to SAM, will be implemented later when Dominick has time.

New job sub transition, what needs to happen before we can switch (Jeny, Paola, Matthew)

Paola met Ken and the other SCD folks along with Andrew. Andrew noted that running from cron jobs and using the group account are both unavailable right now in the new jobsub, so we are not in a position to switch. The cron job feature will be implemented in a couple of weeks but the group account will take longer. It was decided that the best course of action is likely to invite Ken to one of our future meetings to discuss all this.

Slow pnfs (Nate et al)

PNFS has been slow. SCD seem to have a fix for this involving changing the mount type and/or patching. The ticket indicates that the patch might not be deployed for a week; we'd like it sooner. This problem manifested itself as slow FTSs and as Zukai's pnfs transfers taking ages. There is activity on this ticket, so hopefully this will resolve things.

More releases to CVMFS & policy

It was decided that it would be best to push more releases to CVMFS, but that we should check the disk space available first. Depending on this, we need to consider a policy for which releases to keep, including notifying potential users before any deletion.

AOB

  • Disk space on /nova/ana has been an issue, and there was some discussion on how best to stop this happening in future. No firm conclusions were reached; this needs further follow-up, including touching base with the various analysis conveners (nu_e are using lots of space).
  • Gavin noted that there was a problem with the data-logger in the overnight run that started around midnight. This manifested as FD files being produced with duplicate stream and file type extensions. This messed up the nearline. Jeny will work to rename these files.
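A rename of this kind could be planned with a small dry-run helper before any files are touched. The sketch below is illustrative only: the minutes do not give the real file names, so the example names and the functions `strip_duplicated_suffix` and `plan_renames` are hypothetical, and the sketch assumes the duplication appears as an immediately repeated trailing chunk of the file name.

```python
def strip_duplicated_suffix(name, min_len=3):
    """Return name with an immediately repeated trailing chunk removed once.

    Looks for the longest suffix s (at least min_len characters) such that
    name ends with s + s, and drops the second copy. Names without such a
    repeat are returned unchanged.
    """
    for size in range(len(name) // 2, min_len - 1, -1):
        suffix = name[-size:]
        if name[-2 * size:-size] == suffix:
            return name[:-size]
    return name


def plan_renames(names):
    """Map each affected name to its corrected form (dry run, no disk access)."""
    plan = {}
    for old in names:
        new = strip_duplicated_suffix(old)
        if new != old:
            plan[old] = new
    return plan


if __name__ == "__main__":
    # HYPOTHETICAL file names, for illustration only.
    examples = ["run_t02.raw_t02.raw", "run_t02.raw"]
    for old, new in plan_renames(examples).items():
        print(f"{old} -> {new}")
```

Reviewing the planned mapping by eye before applying it guards against names that happen to end in a legitimately repeated string.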