Matthew, Jonathan, Michael G., Dmitry, Craig, Dominick, Chris, Bruno, Satish, Evan, Gavin, Gareth, Ryan, Nate, Paola.
- Robert I. couldn't make this meeting but Paola and Nate both report that things are looking good.
- Michael G. says Robert has made significant changes which are likely responsible for the observed improvement.
PNFS status (Dmitry L.)¶
- Eliminated a source of slowdown on the server side, related to dh’s.
- Using log files, some other features have been identified; another fix will follow this week.
- He understands some changes have been made to the FTS including some decoupling.
- A major slowdown was observed on the client side when going from one node to another. He’ll work on this.
- Additional monitoring implemented to catch problems earlier.
dCache dashboard: http://fndca.fnal.gov
An example plot: http://www-stken.fnal.gov/enstore/transactions_on_stkensrv1n.jpg
The take-home message is that no smoking gun responsible for the slowdowns has been found, but lots of minor improvements have been made along the way.
New release summary text discussion (Jonathan et al.)¶
We discussed the releases S14-10-28 & FA14-10-28, guided by docdb-12282. Since the last tag, 281 SVN commits have been made. Dominick provided an executive summary:
- Multipoint fit updated
- ShowerLID retuned - points to new UPS
- Bad channels update in the database
- Minor changes to energy handling in LEM.
After some discussion it was concluded that going through all commits is unworkable. An alternative proposal to set up a voting system was put forward. This involves creating an email list to which people can forward commits they think are important. The release discussion can then use these as a seed. The mailing list has since been set up and is: email@example.com
ND number of events bug & ND genie validation plan (all)¶
There was a bug in the ND genie mixing step whereby the number of events was hardcoded to 1,000. The jobs were run requesting 2,000 events at the detector stage, so the post-mixing files included only 1,000 events but carried the PoT information for 2,000 events. This has been fixed and a new set of files is in production.
The analysis groups requested that even minor problems (as production thought this one was) be communicated to them in future, since this would have aided in diagnosing the problem.
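The PoT mismatch described above can be illustrated with a little arithmetic. The following is a minimal sketch, not the production fix: it assumes PoT scales linearly with the number of events kept, and the function name and the example PoT value are hypothetical.

```python
def corrected_pot(recorded_pot, events_requested, events_in_file):
    """Scale the recorded PoT down to the events actually present.

    Assumption (not from the minutes): every event carries the same
    share of the PoT, so the exposure is proportional to event count.
    """
    if events_requested <= 0:
        raise ValueError("events_requested must be positive")
    if not 0 <= events_in_file <= events_requested:
        raise ValueError("events_in_file must lie in [0, events_requested]")
    return recorded_pot * events_in_file / events_requested


# Hypothetical recorded PoT for a file that requested 2,000 events
# but kept only 1,000 after mixing.
recorded = 2.0e16
fixed = corrected_pot(recorded, events_requested=2000, events_in_file=1000)
print(f"recorded PoT = {recorded:.3e}, corrected PoT = {fixed:.3e}")
```

Under this assumption the affected files overstate their exposure by a factor of two, so any PoT-normalised rate computed from them would come out a factor of two too low.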
Simulation generation status (Nate)¶
ND files are discussed above. Nate is also working on the geo-jittered v2 samples now and on getting the metadata for these correctly incorporated into the naming. His plan is to run these on the DAQ files that are already in place so that the geo-jittered samples are identical to these.
He'll run the random runs samples tonight.
Calibration summary (Paola)¶
- She processed all the ND cry and has turned on the cron jobs.
Reco plans (All)¶
- Satish to start first with the FD MC reco, followed by Bruno & Gavin doing PID + CAF.
- Dominick will define the FD data dataset using Ryan's list of files and a fancy sam option.
Rapid turn around files (Dominick, Gavin, Evan)¶
- Dominick ran ND reco, PID, LEM & CAF: a good experience overall. Some PID failures are under investigation, and he will repeat a few of these in the next couple of days.
During this discussion it was noted that all our metadata fields are strings rather than the appropriate types. A feature request should be put in to address this.
- Dominick would like to run another pass but is waiting on new bad channels.
- Gavin is also making a set without bad channels for a study of Evan's.
Ryan notes that we should focus on making things that are crucial to understanding data/Monte Carlo differences.
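To illustrate why the string-typed metadata noted in this discussion is worth a feature request, here is a small sketch. The field names and the type map are hypothetical and do not reflect the actual metadata schema. It shows how string values sort incorrectly and how a declared type map could coerce them.

```python
# Hypothetical field names and types, for illustration only.
FIELD_TYPES = {"run_number": int, "event_count": int, "pot": float}


def coerce_metadata(raw, field_types=FIELD_TYPES):
    """Return a copy of raw with known fields cast to their declared type.

    Fields not listed in field_types are left as strings.
    """
    typed = {}
    for key, value in raw.items():
        caster = field_types.get(key, str)
        typed[key] = caster(value)
    return typed


# String comparison is lexicographic, so run "10" sorts before run "9".
runs_as_strings = ["9", "10", "100"]
print(sorted(runs_as_strings))
print(sorted(int(r) for r in runs_as_strings))

print(coerce_metadata({"run_number": "10", "pot": "1.5e16", "tier": "caf"}))
```

With typed fields, range selections such as "runs greater than 9" behave numerically rather than alphabetically.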
Log file discussion (All)¶
We currently have nearly 2TB of log files! The plan is to set up some cron jobs which first zip files more than 2 weeks old, then remove those more than 1 month old. Satish has scripts which do this and he'll send them to Paola, who will implement this. It was noted that the new job sub doesn’t download the logs, so this will be a problem going forward.
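The two-stage cleanup described above could look roughly like the following. This is a minimal sketch and not Satish's scripts: the function name, the 14-day and 30-day thresholds (standing in for "2 weeks" and "1 month"), and the use of file modification time as the age are all assumptions.

```python
import gzip
import os
import shutil
import time

DAY = 24 * 60 * 60  # seconds


def clean_logs(log_dir, now=None, zip_after_days=14, remove_after_days=30):
    """Gzip files older than zip_after_days and delete files older than
    remove_after_days. Returns (compressed_paths, removed_paths)."""
    now = time.time() if now is None else now
    compressed, removed = [], []
    for root, _dirs, files in os.walk(log_dir):
        for name in files:
            path = os.path.join(root, name)
            mtime = os.path.getmtime(path)
            age_days = (now - mtime) / DAY
            if age_days > remove_after_days:
                os.remove(path)
                removed.append(path)
            elif age_days > zip_after_days and not name.endswith(".gz"):
                gz_path = path + ".gz"
                with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
                    shutil.copyfileobj(src, dst)
                # Keep the original timestamp so the archive still ages
                # out at the one-month mark.
                os.utime(gz_path, (mtime, mtime))
                os.remove(path)
                compressed.append(gz_path)
    return compressed, removed
```

A cron job could call `clean_logs` once a day on the log area. Preserving the modification time on the `.gz` file means the removal threshold is measured from when the log was written, not from when it was compressed.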
Increasing message verbosity to “INFO”¶
Chris requested that we remove the module-by-module trace and instead increase the message service level to INFO. Chris and Dominick will follow up on this.
- Susan emailed the collaboration about old releases and no one commented.