20-July-2015

Attending
Paul, Chris, David, Alex, Bruno, Joseph, Enrique, Gavin, Jonathan

News (Satish)

  • The Production workshop is now scheduled for Aug 24-25.
  • We have a request for a tutorial on pnfs usage at the collaboration meeting. We need a volunteer to give this talk.
  • As of July 16, nova logins to OASIS are disabled. This means we can no longer publish to OASIS, but our content is still there. The content is scheduled to be removed on Aug 11.
  • We have a new version of NovaGridUtils that includes art_sam_wrap.sh. The modified version changes the process description reported to the SAM project monitor. Satish had delayed requesting that it be declared current and published, but he will now ask David to go ahead and publish the new version.
  • We have gotten a batch of requests from the ND physics group. Many of them require some additional work to get the tools in place, but at least there is a request for an additional 10e20 POT of standard genie files.

SW Tags (David/Jonathan)

  • The SL5 nightly build is not working. David has determined that we need an updated version of novaddt built and installed. He should be getting help from Brian to ensure that is done. The SL6 build is fine.
  • David is hoping to update to new versions of art and nutools in a few days. He may need help from Gavin or Jonathan to avoid the build issues that arose with the last art/nutools version update. Gavin had tracked that problem to a version skew with respect to other external packages. He suggests taking this opportunity to also update the versions of those externals.
  • Brian has made some changes to novasoft to support running on OSX. This caused some issues, now resolved, and Jonathan reports that it is now possible to run, e.g., the event display on a Mac.

Generation (Ruth/Paul)

  • Paul is supposed to be taking over ND CRY generation from Ruth, but has not heard from Ruth. Paul will email Ruth to get details.
  • Ruth (not present) is attempting to generate some samples with alternate geometries. She has run into difficulties with accessing new (v7) flux files. Gavin is working to get these published on pnfs and blue-arc so they are available for use.

Alt Intensity Samples (Enrique/Chris/Gavin)

  • Reco for the last (high intensity) file is almost done, and should be completed today. Reco of low intensity files is completely done.
  • 100% of the low intensity files and 100% of the available high intensity files have been lem'd.
  • There was some confusion over the weekend as to whether Chris had processed the correct files. It was confirmed that he had.
  • Gavin has been waiting for this confirmation before submitting mix/caf jobs. He's received confirmation and will launch jobs shortly.

Raw2root Keepup (Paola)

raw2root has been running smoothly, although there have been issues with missing files for about 2 weeks. Paola will ping Pengfei again.

Calibration Keepup (Paola)

Calibration has been having some issues with jobs failing at OSC. Some nodes were not able to access the nova CVMFS repository. This was fixed as of Thursday. Subsequent jobs failed with a run history issue, which Jon identified as the result of database replication failing. This should have been fixed, but Paola has seen the problem recur since then. She will confirm whether it is still happening, and then Satish will follow up.

There have also been the usual secman 2007 errors (indicating a too-busy scheduler) that recur from time to time. The jobsub folks are aware of the problem and are transitioning to a new architecture that should prevent it from recurring, but no timeline has been reported. Paola will file a ticket to try to get a timeline.

Jobsub Priorities (all)

The general discussion was that any proposal is more or less fine, as long as the priorities span a sufficient range with reasonable granularity.

It was suggested that we do need a production keepup queue with a small quota. Paola reports that keepup generally requires 500 jobs/day, each typically taking no more than 90 minutes. Hence it was decided that a quota of ~100 is probably appropriate.
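
As a rough check on the keepup figures above, the arithmetic can be sketched as follows. This is only an illustration, not something presented at the meeting: it assumes every job takes the full 90 minutes (an upper bound) and that submissions are either spread evenly over the day or all arrive at once. The function names are hypothetical.

```python
import math


def average_concurrent_jobs(jobs_per_day: float, minutes_per_job: float) -> float:
    """Average number of busy slots if jobs are spread evenly over 24 hours."""
    minutes_per_day = 24 * 60
    return jobs_per_day * minutes_per_job / minutes_per_day


def hours_to_drain(jobs: int, minutes_per_job: float, quota: int) -> float:
    """Wall-clock hours to finish `jobs` submitted at once with at most `quota` running."""
    waves = math.ceil(jobs / quota)  # number of successive batches of size `quota`
    return waves * minutes_per_job / 60


if __name__ == "__main__":
    # Figures quoted in the minutes: 500 jobs/day, <= 90 minutes/job, quota ~100.
    print(average_concurrent_jobs(500, 90))  # average slots needed
    print(hours_to_drain(500, 90, 100))      # hours to clear one day's jobs
```

Under these assumptions the average load is about 31 slots, and a full day's keepup submitted in one go would clear in 7.5 hours with a quota of 100, so ~100 leaves roughly a factor of three of headroom over the average demand.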

It was also suggested that novaana should have a quota of ~1000 to ensure that users are always able to get some jobs through, even when production is in the midst of a big crunch. There was some general confusion about what the total quota across groups should be constrained to, and whether that number includes off-site resources.

AOB

The novapro OSG certificate will expire in 2 months and needs to be renewed (Satish to contact A. Norman, cc Anna and Tanya).