Matthew, Jeny, Paola, Michael, Chris, Susan, Nick, Jon, Brian, Satish, Dominick, Gareth, Nate, Ryan, Gavin, Craig, Kanika
- Send Matthew your FTEs.
- This is the last meeting of the year; the next meeting is January 5th.
- Matthew and Dominick have decided that we will process all data regardless of DQ and then, at the end, define additional datasets that also require good DQ. It was noted that the number of sub-runs failing DQ is relatively small, and thus so are the overheads associated with this choice. Dominick will likely build the creation of the additional datasets into defman. Chris asked whether we need an "is unprocessable" metadata flag. It was decided that we would revisit this if/when it is needed.
- Some gaps in raw2root processing were caught by Dominick and Susan. Jeny determined that these were due to sub-runs that span midnight. She has fixed this, and the affected sub-runs will be reprocessed this afternoon.
- LEM is grinding on slowly but surely. Chris overcame some bugs, and consistent datasets are now coming out of the other side to be PIDed. Chris estimates that the FD NuMI files will take about 9 days to process once the cosmic stream is done. Matthew requested that some LEM metrics be collected before the collaboration meeting to show to everyone. Chris can provide these with a few days' notice.
- Dominick noted that strange, non-production files look to have shown up in FTS. Matthew will dig into this.
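The minutes do not describe the actual selection logic behind the raw2root gaps, so the following is only a sketch of one way sub-runs spanning midnight can be dropped. It assumes a day-by-day selection that requires a sub-run to start and end on the same day; the run numbers, times and function names are all hypothetical.

```python
from datetime import date, datetime

# Hypothetical sub-run records: (run, subrun, start, end).
SUBRUNS = [
    (18000, 0, datetime(2014, 12, 1, 22, 30), datetime(2014, 12, 1, 23, 30)),
    (18000, 1, datetime(2014, 12, 1, 23, 30), datetime(2014, 12, 2, 0, 30)),  # spans midnight
    (18000, 2, datetime(2014, 12, 2, 0, 30), datetime(2014, 12, 2, 1, 30)),
]


def contained_in_day(day):
    """Naive selection: keep sub-runs that both start and end on `day`."""
    return [(run, sub) for run, sub, start, end in SUBRUNS
            if start.date() == day and end.date() == day]


def started_on_day(day):
    """Assign every sub-run to exactly one day, using its start time only."""
    return [(run, sub) for run, sub, start, end in SUBRUNS
            if start.date() == day]


days = [date(2014, 12, 1), date(2014, 12, 2)]
naive = [sr for d in days for sr in contained_in_day(d)]
fixed = [sr for d in days for sr in started_on_day(d)]

# Sub-runs that the naive day-by-day selection never picks up.
missed = sorted(set(fixed) - set(naive))
print("missed by naive selection:", missed)
```

Keying each sub-run on its start time alone means it belongs to exactly one day, so nothing is lost or processed twice.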
Rapid turn around v5
Rapid turn around v5 is primarily waiting on a new calibration (v5.2?) to become available. This will include cell-by-cell calibrations. Alex is working on this now and we expect it will become available before the end of the week. Jon noted that there were some missing runs in the DB, which he has now recovered. He wants to reprocess bad channels and then tag (v6.0?). He expects this will get going today.
Production tests update (Matthew)
These now run nightly. Everything was broken when the tests started last week. Since then a lot of issues have been fixed; the remaining ones are:
- Detector service in RAW2ROOT in FA - needs Kanika’s changes merged across - done.
- Bad Channels with non-real run numbers - done.
- ND RHC still can’t be generated - known feature.
- ND rock generation breaking - known (complicated, unresolved) issue.
Post-shutdown ND artdaq generation (Nate)
Nathan submitted ~9k files and only 409 files remain to be processed. He noted that there was a problem with his datasets that he'll look into and resolve soon.
During this discussion it was noted that a lot of our metadata is still stored as strings rather than integers (or other appropriate types). Matthew will email Robert and Andrew about this this week.
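A minimal sketch of why the metadata type matters, assuming a run number stored as a string; the record layout and field names here are illustrative and are not the real metadata schema. Strings compare character by character, so a numeric cut on a string field can silently select the wrong files.

```python
# Hypothetical metadata records; the field names are illustrative only.
records = [
    {"file": "a.root", "run": "9999"},
    {"file": "b.root", "run": "10000"},
    {"file": "c.root", "run": "10500"},
]

# Intended cut: run >= 10000.
# String comparison is lexicographic, so "9999" >= "10000" is True
# (because "9" sorts after "1") and a.root wrongly passes the cut.
as_strings = [r["file"] for r in records if r["run"] >= "10000"]

# Converting to int gives the intended numeric comparison.
as_ints = [r["file"] for r in records if int(r["run"]) >= 10000]

print("string cut:", as_strings)
print("integer cut:", as_ints)
```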
FD MC RHC processing (Satish, Gavin)
Done and ready for LEM. But then they all crashed in LEM... Chris is investigating and will coordinate with Gavin.
FD MC cosmics processing (Satish)
Done and ready for PID - pending results of above reco check.
Post-shutdown reco keep up (Jeny)
Some files were missed in the first pass and Jeny is reprocessing these now. She is also getting errors from runs missing in the DAQ database. She will work with Jon P on this. The problem is seen only for the ND, where around 20% of jobs have failed.
Bruno found that cosmic rejection variables in these files are not being properly filled, so we will need a new version for sure. The release used is S14-11-25. As reconstruction wasn't frozen until November 27th, this bug likely affects the ND rapid turn around v4 samples.
Log file consolidation (Paola)
This is up and running. For now she will continue to use gzip, even though it doesn't achieve the compression factor that bzip does, as the compressed files are only of order tens of GB.
MC PoT weighted subrun by sub run (Ryan et al.)
Generation of MC run by run works well for the FD, but not for the ND given the run length. The consensus is that generating sub run by sub run takes care of this. To implement this change we would need to revisit the ChannelInfo/BadChannels code, which distributes times across an entire run, so that it does so across sub runs instead. Jon will add the bad channel feature and hide it behind a fcl switch; we will need to remember to turn it on when the time comes.
On a related note, in order to handle MC run numbers, Jon will default the bad channels not to fail for run numbers >10e6 and Ryan will change make_sim_fcl to default to 10e6.
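As a sketch of the intended behaviour only: the function names and the dictionary standing in for the database below are hypothetical, not the real BadChannels interface. Note that 10e6 in floating-point notation is ten million (1e7), not one million.

```python
# 10e6 == 1e7 == 10,000,000 (ten million, not one million).
MC_RUN_THRESHOLD = int(10e6)


def is_mc_run(run):
    """Treat run numbers above the threshold as MC (assumed convention)."""
    return run > MC_RUN_THRESHOLD


def bad_channels_for_run(run, db):
    """Look up bad channels for `run` in `db` (a plain dict used as a stand-in).

    For MC-range run numbers that are absent from the database, return an
    empty list instead of failing; for data runs, a missing entry is an error.
    """
    if run in db:
        return db[run]
    if is_mc_run(run):
        return []
    raise KeyError(f"no bad channel entry for data run {run}")


fake_db = {18000: [101, 202]}
print(bad_channels_for_run(18000, fake_db))       # known data run
print(bad_channels_for_run(10_000_001, fake_db))  # MC run: no failure
```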
Review of ART issues (all.)
This is a good forum to look at relevant art issues and assess the relative priority for these. Such a discussion will focus on this list:
Kanika volunteered to be responsible for this. She will attend stakeholders meetings, communicate with the ARTISTs and be aware of tickets not issued by NOvA that could affect NOvA.
Folder naming schemes for CAF files on bluearc (all.)
There was some discussion about what to do with our CAF directories on bluearc. The problem is that files from multiple different datasets fall into the same directory, and careless wildcarding can lead to wrong results. The best answer is to use SAM to get your input files; however, that can be slower. Apparently Chris has requested an IFDH feature which will speed this up. No conclusion was reached on what to do, and this discussion will be followed up offline.
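To illustrate the wildcarding hazard, here is a small sketch using Python's standard `fnmatch` module. The file names and the second release tag are invented for the example and do not reflect the real CAF naming scheme.

```python
from fnmatch import fnmatch

# Hypothetical contents of one shared CAF directory (names are invented).
files = [
    "fd_numi_S14-11-25_r00018000_s01.caf.root",
    "fd_cosmic_S14-11-25_r00018000_s01.caf.root",
    "fd_numi_S14-10-28_r00018000_s01.caf.root",
]

# A careless pattern selects on the run number only, so it also picks up
# the cosmic stream and a file made with a different release.
careless = [f for f in files if fnmatch(f, "fd_*_r00018000_*.caf.root")]

# A tighter pattern pins both the stream and the release.
careful = [f for f in files if fnmatch(f, "fd_numi_S14-11-25_r*_s*.caf.root")]

print(len(careless), "files matched carelessly")
print(len(careful), "file matched carefully")
```

Even the tighter pattern relies on the naming convention being followed, which is why taking the file list from a dataset definition is the more robust option.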
Christmas processing wish list (all)
- ND rapid turn around v5 - include all pre and post shutdown MC
- FD data and MC to CAF
- If the pieces are ready in time: subrun-by-subrun PoT weighted generation.
- A nice feature would be the ability to define a random subsample of an existing dataset. Dominick will communicate with Robert about how best to do this.
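One possible approach to the random subsample, sketched here purely as an assumption (how best to do this is still to be discussed with Robert): hash each file name and keep the file if the hash falls below the requested fraction. The function name, salt and file names are hypothetical.

```python
import hashlib


def in_subsample(file_name, fraction, salt="subsample-v1"):
    """Return True if `file_name` belongs to the subsample.

    The first 8 bytes of a SHA-256 digest are read as an integer in
    [0, 2**64) and compared against fraction * 2**64, so roughly
    `fraction` of all files are kept.
    """
    digest = hashlib.sha256(f"{salt}:{file_name}".encode()).digest()
    return int.from_bytes(digest[:8], "big") < fraction * 2**64


# Hypothetical dataset of 10000 file names.
files = [f"file_{i:05d}.root" for i in range(10000)]
subsample = [f for f in files if in_subsample(f, 0.1)]

print(len(subsample), "of", len(files), "files selected")
```

Because membership depends only on the file name and the salt, the selection is reproducible and does not change for existing files when new files are added to the parent dataset. Changing the salt gives an independent subsample.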