Attending: Bruno, Alex, Cristiana, Alex R., Evan, Kanika, Satish, Enrique, Felipe, Adam
Diblock masking problems
We’ve decided to drop, at the analysis level, all runs with two chunks of detector. The issue is that our FA rule is to take a chunk of at least 4 diblocks and, if there’s a tie, the most downstream one. It turns out that when we introduced subrun-level masking, RunHistory was not applying the analysis masking rules at the subrun level. That’s why it’s showing up now. Evan has made a spill cut that is part of the standard spill cuts, so these runs are removed.
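As a rough illustration of the FA rule described above, here is a minimal sketch. It is not the RunHistory implementation. It assumes the mask is a list of booleans ordered from upstream to downstream, that a "chunk" is a contiguous run of active diblocks, and that a "tie" means two chunks of equal length. The names find_chunks and select_chunk are hypothetical.

```python
from typing import List, Optional, Tuple

Chunk = Tuple[int, int]  # (first, last) diblock index, inclusive


def find_chunks(active: List[bool]) -> List[Chunk]:
    """Return every contiguous run of active diblocks."""
    chunks: List[Chunk] = []
    start = None
    for i, on in enumerate(active):
        if on and start is None:
            start = i  # a new chunk begins here
        elif not on and start is not None:
            chunks.append((start, i - 1))  # chunk ended at the previous diblock
            start = None
    if start is not None:
        chunks.append((start, len(active) - 1))  # chunk runs to the last diblock
    return chunks


def select_chunk(active: List[bool], min_diblocks: int = 4) -> Optional[Chunk]:
    """Pick the longest chunk with at least min_diblocks diblocks.

    Assumption: on a tie in length, the most downstream chunk
    (highest starting index) wins.
    """
    usable = [c for c in find_chunks(active) if c[1] - c[0] + 1 >= min_diblocks]
    if not usable:
        return None
    return max(usable, key=lambda c: (c[1] - c[0] + 1, c[0]))
```

Under these assumptions, a subrun with two chunks is one for which find_chunks returns more than one entry, which is the kind of configuration the new spill cut is meant to remove.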
SAM is still slow; Robert continues to find workarounds. Our database is big for SAM. Snapshotting on the website has been working for a week now.
There will be a new snapshot after Jeremy and Brian fix the issues with GENIE reweight. There will probably be a new prod2reco once we figure out the IFDbeam issues.
There are issues with file counts, so Bruno will check what happened. Apparently retiring didn’t work out 100%.
The draining datasets are down to just a few tens of files, except for period 1, where we have about 40 000 missing. Enrique usually kills jobs that have been running for longer than 10 hours in order for it to work. About 3000 nodes seems to be the optimum to avoid hitting dCache too hard. Probably a few more days to finish period 1.
Done, because SCD gave us huge priority and killed a lot of glide-ins. A few files are missing; we need to know whether they are in the same category as the standard FD sample errors. Three draining files, but 37 CAFs without deCAFs.
Kanika is counting files right now to determine whether she’s done with all the FD MC datasets. She is now running on the FD numi datasets, and finally cosmics (limited CAFs), but not data yet.
We will need to remove the FD Monte Carlo files because they are wrong (or produce them again after the fix).
Paola has seen 170 jobs failing while running keepup, related to the mask. She will send more details later by email.
Alex made a script to clear errors in the FTS. It looks like it has cleared a lot of them. If the number remains stable after running the script a couple more times, the remaining ones are probably real errors.
Joseph asks for input on defgen. Alex asks to clarify when using data or MC.
MRE is ready to go now that we have the ND flux swap. As usual, use maxConcurrent and adjust it manually until you find the right level.
No request for MR Brem so far, but we should figure out what's going to happen. RHC and ND top-up are higher in priority.
Adam will produce RHC Monte Carlo with a different amount of rock overlay.
Alex points out that once you start a query, it goes to the server even if you then kill it. Also, the advice for draining is to snapshot the parent database and make the child as simple as possible (such as ‘prod2reco.e’).
Gavin asks for permission to make CAF + deCAF concats for steriles. The grid is pretty free, so Gavin has the green light. It should be done quickly, and Gavin will put it on the ECL.