29 June 2015

Attending:

Paul, David, nue-workshop (Chris, Gavin, Tian, Kanika), Enrique, Susan, Nick, Dominick, Neha, Vito, Ryan, Bruno

Computing Issues (all)

FTS was in a strange state this morning and was not picking up files. Bruno could not see the FTS monitor pages, so he restarted it twice. After the second restart, FTS seemed to be healthy again, but it has developed a backlog. Reco and pidpart files were not being uploaded; they are now catching up.

Over the weekend, users submitted jobs that accessed BlueArc and monopolized locks, making grid jobs inefficient and causing difficulties for other users who needed BlueArc locks. There was some discussion of updating submit_nova_art.py to prevent this, but as it currently does not support input from BlueArc, such a modification is unlikely to help.

Enrique conveyed a request for us to run additional jobs at SMU. There was a question of whether this refers to the new mainframe cluster or the old cluster. Apparently Enrique was referring to the mainframe cluster, but jobs there still crash. The old SMU cluster, meanwhile, will be disappearing soon.

SMU also has 90 TB of disk space and is seeking to understand the best way for NOvA to make use of it. Enrique will contact Andrew and Pengfei for suggestions.

SW Tags (David/Jonathan)

David will backport revisions r15434 and r15436 (from Chris) to:

  • S15-05-04
  • S15-05-04a
  • S15-05-04b
  • S15-05-22
  • S15-05-22a

These revisions force RemoveBeamSpills to crash if the DB connection fails, and CAFMaker to crash if it sees an empty event. They will ensure that the reprocessing of the files with the bad metadata does not fall prey to the same issue again.
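The fail-fast behavior described above can be illustrated with a minimal sketch. This is not the RemoveBeamSpills or CAFMaker code (those are art modules); the names EmptyEventError and process_events are hypothetical, and an event is assumed to be a plain list of hits.

```python
class EmptyEventError(RuntimeError):
    """Raised when an event has no content to process (hypothetical name)."""


def process_events(events):
    """Return one summary per event, aborting on the first empty event."""
    summaries = []
    for index, event in enumerate(events):
        if not event:
            # Fail fast: stopping the job keeps an output file with
            # silently missing content from being written and declared.
            raise EmptyEventError(f"event {index} is empty")
        summaries.append({"index": index, "n_hits": len(event)})
    return summaries
```

The design choice is that a crashed job is visible and can be resubmitted, whereas a job that skips the bad input quietly produces a file that looks valid.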

David will also produce two new tags: one tomorrow, using art v1.14.02 and nutools v1.12.00, and a second tomorrow or Wednesday, using art v1.14.03 and nutools v1.13.

Simulation (Ruth/Paul)

The ND Birks Mod C generation is complete, and Birks Mod B is nearly complete. The ND low intensity samples have just finished, and Paul is waiting for files to show up in SAM. The ND high intensity samples are running now.

FD Top Up MC jobs were launched this evening.

Reco/PIDPart (Bruno for Joseph/Satish)

The FD data top up jobs are running well. There have been some errors, but because FTS is backlogged it will be roughly a day before Bruno can attempt to drain the datasets. About 400 files remain to be processed.

Satish has been keeping up with the generation of ND Birks samples. About 100 files remain to be processed for Birks C and 1k files for Birks B. Enrique will be taking over managing the final runthroughs for Birks samples. Satish has not yet launched the alternate intensity jobs.

LEM (Chris)

The calibration shifted files are ~90% done. FD Birks B is done. Next up is FD Birks C, then MRBrem. Chris will start on FD top up when ready.

There is one ND Stagger file with the wrong location that Chris will sweep up at Bruno's request.

Mix/CAF (Bruno/Gavin)

Data

  • ND data is done.
  • FD top-up is waiting for inputs but can start running now.
  • FD Cosmics is waiting for the backport; this can wait until tomorrow.

Monte Carlo

  • FD Birks Mod B is 100% done; waiting for Birks Mod C inputs.
  • The ND calib jobs are keeping up with LEM; each of the 4 sets is 50%-65% done.
  • All the other samples are waiting for the go-ahead from Chris.

Raw2Root Keepup (Paola)

This is running smoothly. There were some issues with dCache failing. Four jobs were affected; they were resubmitted and completed fine.

Calibration Keepup (Paola)

Paola has switched to a new release for this. Calibration has many duplicated files due to overlapping run ranges with different raw2root releases. Calibration has been turned off until we sort through that. This should be acceptable, as the raw2root process is very stable, with no differences between the different releases.
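As an aid to sorting through the duplicates, overlapping run ranges between releases can be found with a short script. The following is a sketch only: the function name find_overlaps, the release names, and the run numbers are all hypothetical, and run ranges are assumed to be inclusive (first_run, last_run) pairs.

```python
def find_overlaps(release_ranges):
    """Return (release_a, release_b, first_run, last_run) for each overlap.

    release_ranges maps a release name to an inclusive (first_run, last_run)
    tuple. Both the names and the ranges are illustrative inputs.
    """
    items = sorted(release_ranges.items(), key=lambda kv: kv[1][0])
    overlaps = []
    for i, (name_a, (_first_a, last_a)) in enumerate(items):
        for name_b, (first_b, last_b) in items[i + 1:]:
            if first_b > last_a:
                break  # sorted by first run, so no later range can overlap
            overlaps.append((name_a, name_b, first_b, min(last_a, last_b)))
    return overlaps


# Made-up release names and run numbers, for illustration only.
example = {"relA": (100, 200), "relB": (180, 300), "relC": (301, 400)}
overlapping = find_overlaps(example)
```

Runs inside a reported overlap are the ones that may have been calibrated twice and would need one copy retired.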

Reco Keepup (Vito)

Last week, this was stopped because of the bad-diblock mask. We will shortly be updating to a new tag and resuming. Bruno has requested a 1-2 week overlap with the old samples to ensure the stability of results.