Matthew, Jonathan, Michael, Jeny, Paola, Gavin, Joseph, Alex, Anna, Satish, Ryan, Craig, Jon P, Dominick, Chris
- Matthew has revamped the official datasets page (including a version of Chris's dataset explorer); you can find it here: http://nusoft.fnal.gov/nova/production/datasets/overview.html. Please send comments and suggestions.
- The FTS error cleaning script written by Chris has been implemented by Jonathan. During this process Jonathan implemented a system for versioning crontabs: https://cdcvs.fnal.gov/redmine/projects/novaart/wiki/Production_Tools_and_Procedures#Crontabs
- Matthew opened a ticket to switch the raw2root processing to use the sub-run end time rather than the start time, as the start time can sometimes be incorrectly filled. During the discussion of this Anna raised the point that CD would like to use "official scripts" going forwards, that is to say scripts provided, versioned and validated by NOvA. Matthew will talk to Jeny and Paola to get their scripts and start version controlling them.
- We plan to tag tomorrow (31/03/15).
Novapro/production role audit
There are currently 50 users in the nova VOMS with the production role. They are summarised here:
Matthew will email nova offline to identify who can be removed from the above list. Anna will do the same for SCD.
Birks suppression discussion
Discussions of how to treat this are ongoing. Production will pay close attention to how this pans out.
Bad channels status
Jon reported that this was all done. He is now waiting for validation to be run by Kanika. Jon noted that one run failed and has been manually dropped from the good runs list. He also noted that bad-channels keep-up is temporarily disabled, but will come back in the next day or so.
RAW file renaming
This is now not really within SCD's remit. Matthew will email Jeny to get the scripts and pass them on to someone in the group. During the discussion it was pointed out that it would be good to have a complete explanation of how all these failures arose, so that we can address their causes if needed.
Calibration status and tag
Calibration looks to be complete and ready to tag if desired. Alex R. asked whether there is any point in tagging now, given the ongoing discussion about Birks suppression noted above. Chris thinks there is value in tagging, as it represents an intermediate step in calibration that might be useful down the line. The plan is to include this in tomorrow's tag. It was also noted that we are expecting some minor changes to calibrator; these are in the closing stages of validation and should make the impending tag.
Simulation update (Gavin)
Gavin handed his scripts over to Ruth on Thursday. Once a minor bug was fixed she ran a whole host of jobs, although these appear to be mostly idle at the moment. Gavin worries that the novapro priority might be degraded, given that the new paradigm means that all jobs are run as novapro. We could request that novapro gets a higher priority. If we can confirm that this is the issue then Craig can push to get this resolved.
Matthew to contact Ruth to see if he can establish how she is getting on.
Paola has included the lower run limit files, has finished processing them, and has modified the dimension to include them. She notes that in the first-pass processing 14 files were produced with incomplete metadata; she reprocessed them and their metadata is now complete. She will retire the old files on request.
This is all done, except that the dataset definition is not ideal as it has to query on the file name. It was noted that this is currently the only correct solution, so Paola has done exactly the right thing here.
This has been populated with some of the releases and all of the externals, handled by Andrew. Jonathan reports that the I/O errors he observed when logging in last week have been resolved. He can also see the externals published from the gvpm machines. However, this server is currently not under his control; it is under Andrew's, and he is waiting on Andrew's go-ahead.
Chris noted that offsite users need to be able to set up this new repository, for which they need the URL; Jonathan will circulate this when he's ready. It was also noted that we will likely need a parallel period to verify that the new server is working.
It turns out that the discussions last week about how to handle this were based on a misconception. Flux files are not delivered via scripts but by something deeper in the code. This means the interim solution discussed last week is not viable. Instead, it looks as if we're going to need to implement the genie helper. This could be tricky, as it likely involves a roll forward of the nu-tools version in an old tag. Gavin and Jonathan will contact Robert Hatcher and start working towards a solution.
PyCurl and other errors (Satish)
This is thought to arise from a conflict between authentication methods. Satish has a workaround that uses the samweb command line to start the project. The project is started as your own user, and the jobs are then configured so that this user is used within the jobs. He notes that this is not the Fermilab vision: they want things to run as novapro. Satish pushed back on this as he isn't convinced that that's the best option, but has received no feedback.
Hopefully this won’t lead to problems down the line.
Offsite keep-up (Jeny)
After debugging the jobsub client and the old jobsub submission, Jeny found that her release seems to be missing runNovaSAM.py. She'll contact the list to try to establish where it went.
- No meeting next week