08-February-2016

Attending: Satish, Susan, Alex, Paul S, Chris, Paul, Gavin, Joe, Bruno

SW Issues (All)

Reco jobs over the weekend were not able to proceed because of retry and timeout errors in the LEMServer/LEMClient system. Bruno is attempting to run some small-scale tests, as he suspects a scaling problem. However, neither Chris nor Bruno sees any evidence of server-side problems. Chris also raised the possibility that firewalls may be blocking traffic. They are investigating.

There was some discussion of ideas Chris has for further optimizing the LEMServer architecture, but we decided to table that discussion for now.

There were also a number of crashes associated with Calibrator over the weekend, although the code and databases appear to be fixed now.

There was some general discussion about the need for developers to test code before committing to the repository and certainly before handing off to production. In general the sense was that the tools to do this exist, but we need to be more disciplined about using them.

There was a request to make submit_nova_art.py support interactive tests at a small scale. The sense was that it should be easy, but we need a volunteer to do the work.

SW Tags (Paul S)

Development is ok, and there have been no nightly build issues. There have been some disk space issues, and so Paul is working on cleaning out old releases.

Last week, Paul cut tag S16-02-02, which has been used for reco keepup and ND raw2root (keepup and backprocessing).
He also cut several calibration releases in the prod2calib series: .a, .b, .c and .d. Today he needs to cut the .e release, which will contain the fix to calibrator and Kanika’s adjustments to the calibcsv’s.

This week we will also need the first prod2genie tag, once we have nutools built against the new genie.

Chris wants to know the cutoff date for reco, which Alex will supply by this afternoon’s reco meeting.

Production Assignments (Satish/Alex)

These are the production assignments for this week. Instructions will be coming soon, but they all require the new prod2genie tag, so work won’t be able to start just yet.

Rock nus: OPOS
ND Genie: no reweighting: Bruno
FD Genie Real: Enrique
FD Genie Ideal: Joe

We also need to back process ND DDActivity triggers with S16-02-02 raw2root, and run the calibration chain on the resulting files. A service desk ticket for the first part has already been filed, and the second ticket will be submitted once the first is complete.

Status Reports

Raw2root backprocessing (Felipe)

This went smoothly and is nearly done. One file is missing, and will be resubmitted shortly.

ND data PCHits (Qiulan)

Period 1 and epoch 3c are done. Period 2 and epoch 3b are nearly done (12 and 2 missing files, respectively). There were some errors, and Qiulan will resubmit. As noted above this does not include DDActivity files, which OPOS was not asked to process. Those will be coming soon.

Qiulan also had some concerns about the information available in sam project monitors not reflecting the complete state of submitted jobs. This is ultimately because the sam project monitors are focused explicitly on monitoring the SAM side of things only. A tool like POMS is more appropriate for the information Qiulan wants.

FD data PCHits (Vito)

Vito could not attend, but looking at dataset definitions, periods 1 and 2 are pretty much done, while epochs 3b and 3c need to be redone to pick up changes in calibrator.

FD CRY and PCHits (Enrique)

Progress here has been held up by calibrator.

ND CRY and PCHits (Paul R)

The generation step is nearly done with only 200 files left to go. The PCHits stage has been held up by the same calibrator issues causing trouble for Enrique.

Reco Preview (Joe/Bruno)

Both sets of jobs have been held up by LEMServer.

Raw2root Keepup (Vito)

Vito could not attend, but will send an update.

Reco Keepup (Qiulan)

This has been going smoothly.