Task #18575

Release v06_60_00 of dunetpc

Added by David Adams about 3 years ago. Updated about 3 years ago.

Status: Closed
Priority: Normal
Assignee:
Start date: 12/15/2017
Due date:
% Done: 0%
Estimated time:
Duration:

Description

We should make dunetpc release v06_60_00 soon. The first v06_60_00 release of larsoft is available. There are plans to make a second one based on a new version of art.

We plan to use v06_60_00 or one of its descendants for MC production.

We are waiting for TPC validation from Dorota and photon detector changes from Alex. Please report here when these are complete and if there are any other requirements.

Do we have a preference for the art version for MC production?

History

#1 Updated by David Adams about 3 years ago

Preparation for MCC10 is being tracked at #18529. We will wait for this before cutting the MCC10 production release.

Note that we can make v06_60_00 before we have everything for production and then make v06_60_00_01, etc. with additional changes for production.
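The tag scheme described above (a base release such as v06_60_00, followed by v06_60_00_01 and so on) can be sketched in a few lines of Python. This is only an illustration: the helper names `parse_tag` and `is_patch_of` are hypothetical, and the field-by-field ordering is an assumption about the naming convention, not the comparison that the packaging tools themselves perform.

```python
import re

# A tag is 'v' followed by underscore-separated numeric fields,
# e.g. v06_60_00 or v06_60_00_01 (assumed convention).
TAG_RE = re.compile(r"^v(\d+(?:_\d+)*)$")


def parse_tag(tag):
    """Split a tag such as 'v06_60_00_01' into a tuple of integers."""
    match = TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"not a release tag: {tag!r}")
    return tuple(int(field) for field in match.group(1).split("_"))


def is_patch_of(tag, base):
    """True if `tag` extends `base` with extra trailing fields."""
    t, b = parse_tag(tag), parse_tag(base)
    return len(t) > len(b) and t[:len(b)] == b


tags = ["v06_60_00_01", "v06_60_00", "v06_59_00"]
# Tuples compare field by field, and a shorter prefix sorts first,
# so the base release comes before its patch releases.
ordered = sorted(tags, key=parse_tag)
print(ordered)
print(is_patch_of("v06_60_00_01", "v06_60_00"))
```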

#2 Updated by Dorota Stefan about 3 years ago

At the moment, I don't see any preference for the art version for MC production. We can use the current version of art. The most critical part is the changes from Alex.

#3 Updated by Alexander Himmel about 3 years ago

Good idea. I'm fairly certain I fixed the issue with the photon library last night. I am rechecking it this morning.

I will also be introducing some minor fhicl changes to allow for more flexible photon detector studies down the road with official files and double-checking the photon detector reconstruction in the protoDUNE fhicls. I should be able to have this all done this morning.

#4 Updated by Christoph Alt about 3 years ago

I will wait for confirmation from Robert concerning cosmics simulation and from Alex concerning the photon libraries before I update the dunetpc product dependencies and perform a test build. David can then start tagging and building the new release (v06_60_00).

I saw that Dorota already bumped the larsoft version.

#5 Updated by Tingjun Yang about 3 years ago

I want to tweak a few fcl parameters for the FD. I will be done this morning.

#6 Updated by Alexander Himmel about 3 years ago

OK -- my changes are all in. Let's wait for the output of CI and then proceed?

#7 Updated by Robert Sulej about 3 years ago

The update of the cosmic t0 reconstruction requires a change in larreco. Everything is tested and looks OK, but larsoft v06_60_00 is already tagged, so we cannot introduce the t0 changes in dunetpc v06_60_00 alone. For us it is OK to tag dunetpc now.

The changes to the t0 reconstruction are, however, good progress. If there is a plan to make the next larsoft tag soon, we can push the changes to larreco and dunetpc at any moment.

#8 Updated by Christoph Alt about 3 years ago

Okay. I just started the test build.

@Tingjun: do you want your changes to be in the new release?

#9 Updated by Dorota Stefan about 3 years ago

I see changes from Alex in fcl files. I will push a similar change in mcc10_detsim_protoDune_beam_cosmics_noROI.fcl. This is not very serious at this moment since this file will not be used in mcc10 production.

#10 Updated by Christoph Alt about 3 years ago

No problems with the test build. I have updated and pushed the remaining product dependencies for duneutil and cetbuildtools (the latter was bumped in LArSoft as well). Alex already bumped dune_pardata and Dorota bumped larsoft. CI test results will follow soon.

@Tingjun: should we wait for your fcl changes?

If not: @David: you can go ahead and tag dunetpc v06_60_00

#11 Updated by Tingjun Yang about 3 years ago

Sorry for the delay. Can I have 10 minutes to make the changes?

#12 Updated by Alexander Himmel about 3 years ago

@Robert -- note that there were changes at the detsim and reco stages, as well as G4 changes that happened in services_dune. Basically, if you see "35t" anywhere in the fhicls it is wrong. I want to make sure there aren't other fhicls equivalent to that MCC10 one that also need updating.
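A quick way to check for leftover "35t" references is to scan the fhicl files line by line. The sketch below is a minimal illustration, assuming the configurations live under a single directory tree and use the `.fcl` suffix; the function name `find_suspect_lines` and the directory passed to it are hypothetical. A hit still needs a human look, since the rule above applies to the configurations touched by these changes.

```python
from pathlib import Path


def find_suspect_lines(root, needle="35t", suffix=".fcl"):
    """Return (path, line number, line) for every line containing `needle`.

    The match is case-insensitive, so '35T' is reported as well.
    """
    hits = []
    wanted = needle.lower()
    for path in sorted(Path(root).rglob(f"*{suffix}")):
        with open(path, encoding="utf-8", errors="replace") as handle:
            for lineno, line in enumerate(handle, start=1):
                if wanted in line.lower():
                    hits.append((str(path), lineno, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # "." is a placeholder; point this at the checkout to be scanned.
    for path, lineno, text in find_suspect_lines("."):
        print(f"{path}:{lineno}: {text}")
```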

#13 Updated by Dorota Stefan about 3 years ago

@Alex: that is OK. Only the gen stage is specially prepared for mcc10; the standard G4, detsim and reco stages will be used for mcc10. It should be fine then. Thanks!

#14 Updated by Tingjun Yang about 3 years ago

I am done with my changes. d91d8dcc9bc644037133f9c7a16186cf531b2ec6.

#15 Updated by Christoph Alt about 3 years ago

The CI test reports errors at the reco and the mergeana stages for DUNE FD, DUNE 35T and protoDUNE SP: http://dbweb5.fnal.gov:8080/LarCI/app/ns:dune/build_detail/phase_details?build_id=dune_ci_slf/305&platform=Linux%202.6.32-696.1.1.el6.x86_64&phase=ci_tests&buildtype=slf6%20e14:prof

Most of them seem to be related to pandora.

There are also 5 warnings that I have not looked into yet.

#16 Updated by Tingjun Yang about 3 years ago

Adding Elizabeth to this issue.

I think we need to think about it more for the FD production.

If the name of pandora products changes, it will have an impact on downstream reconstruction/analysis, e.g. shower reconstruction, neutrino energy reconstruction.

I don't think it's a good idea to start FD production without fully testing the reco chain first.

#17 Updated by Alexander Himmel about 3 years ago

The 5 warnings, which are at G4 and Detsim, are all related to intentional changes in photon simulation. They are not an issue.

#18 Updated by Tingjun Yang about 3 years ago

15-Dec-2017 10:06:14 CST  Opened output file with pattern "prodgenie_nue_dune10kt_1x2x6_reco_Current.root" 
15-Dec-2017 10:06:14 CST  Closed input file "xroot://fndca1.fnal.gov:1094/pnfs/fnal.gov/usr/dune/persistent/users/vito/ci_tests_inputfiles/DUNEFD/detsim/prodgenie_nue_dune10kt_1x2x6_detsim_Reference.root" 
Malformed TimeTracker database.  The TimeEvent table is empty, but
the TimeModule table is not.  This can happen if an exception has
been thrown from a module while processing the first event.  Any saved
database file is suspect and should not be used.

====================================================================================================
MemoryTracker summary (base-10 MB units used)

  Peak virtual memory usage (VmPeak)  : 4093.68 MB
  Peak resident set size usage (VmHWM): 650.969 MB
====================================================================================================
%MSG-s ArtException:  PostEndJob 15-Dec-2017 10:06:14 CST ModuleEndJob
cet::exception caught in art
---- OtherArt BEGIN
  ---- EventProcessorFailure BEGIN
    EventProcessor: an exception occurred during current event processing
    ---- ScheduleExecutionFailure BEGIN
      Path: ProcessingStopped.
      ---- ProductNotFound BEGIN
        getByLabel: Found zero products matching all criteria
        Looking for type: std::vector<recob::Track>
        Looking for module label: pandora
        Looking for productInstanceName: 
        cet::exception going through module EMShower/emshower run: 20000002 subRun: 0 event: 1
      ---- ProductNotFound END
      Exception going through path reco
    ---- ScheduleExecutionFailure END
  ---- EventProcessorFailure END
  ---- OtherArt BEGIN
    ---- FatalRootError BEGIN
      Fatal Root Error: @SUB=TTree::SetEntries
      Tree branches have different numbers of entries, with 1 maximum.
    ---- FatalRootError END
  ---- OtherArt END
---- OtherArt END
%MSG
Art has completed and will exit with status 1.

CI MSG BEGIN
 Script: ci_regression_test_dunetpc.sh
 Function: data_production - error at line 158
 Stage: reco
 Task: data_production
 exit status: 1
CI MSG END

This is from the latest CI test:
http://dbweb5.fnal.gov:8080/LarCI/app/ns:dune/storage/docs/2017/12/15/stdout_yel7ROq.log

So we need to change the pandora label for emshower. This can be fixed. But there may be other issues.
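To see at a glance which module asked for which missing product, the ProductNotFound blocks in a log like the one above can be pulled out with a short script. This is a rough sketch: the function name `missing_products` is hypothetical, and the patterns assume that the log keeps the wording shown in this excerpt.

```python
import re

# Patterns for the fields of interest inside a ProductNotFound block.
FIELD_RE = {
    "type": re.compile(r"Looking for type:\s*(.*\S)"),
    "label": re.compile(r"Looking for module label:\s*(.*\S)"),
    "module": re.compile(r"going through module\s+(\S+)"),
}


def missing_products(log_text):
    """Return one dict per ProductNotFound block found in `log_text`."""
    records, current = [], None
    for line in log_text.splitlines():
        if "ProductNotFound BEGIN" in line:
            current = {}
        elif "ProductNotFound END" in line:
            if current is not None:
                records.append(current)
            current = None
        elif current is not None:
            for key, pattern in FIELD_RE.items():
                found = pattern.search(line)
                if found:
                    current[key] = found.group(1)
    return records


# Excerpt copied from the CI log quoted in this ticket.
SAMPLE = """\
      ---- ProductNotFound BEGIN
        getByLabel: Found zero products matching all criteria
        Looking for type: std::vector<recob::Track>
        Looking for module label: pandora
        cet::exception going through module EMShower/emshower run: 20000002 subRun: 0 event: 1
      ---- ProductNotFound END
"""

for record in missing_products(SAMPLE):
    print(record)
```

For this excerpt the single record names EMShower/emshower as the module requesting std::vector<recob::Track> with label pandora, which is the label that needs changing.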

#19 Updated by Tingjun Yang about 3 years ago

Just committed 5c56e2f3dff6da89f53c85e90fb2a5c52eb9cca4 to fix the emshower problem. Waiting to see what the CI test shows.

#20 Updated by Robert Sulej about 3 years ago

@Christoph:

The error message (in the ProtoDUNE tests) appears after the reco or mergeana job is done, and it says:

Cannot convert recob::Vertex::pos_ from type: ROOT::Math::PositionVector3D<ROOT::Math::Cartesian3D<double>,ROOT::Math::DefaultCoordinateSystemTag> to type: ROOT::Math::PositionVector3D<ROOT::Math::Cartesian3D<double>,ROOT::Math::GlobalCoordinateSystemTag>, skip elemen

So isn't the problem only in comparing the type stored in the old reference file with the new type that the reconstruction now uses to save the vertex position? In that case everything is fine with the reco code, and only the type comparison is not possible?

We ran the reco after merging the pandora branches, and the result looks reasonable.
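The two type names in that message are long, and it is easy to miss where they differ. The sketch below splits each name into its template pieces and reports only the pieces that changed. It is an illustration only: the helper names are hypothetical, and the pattern assumes the message wording quoted above.

```python
import re

MESSAGE_RE = re.compile(
    r"Cannot convert (\S+) from type: (.+?) to type: (.+?), skip"
)


def template_parts(type_name):
    """Split a C++ type name on '<', '>' and ',' and drop empty pieces."""
    return [p.strip() for p in re.split(r"[<>,]", type_name) if p.strip()]


def differing_parts(message):
    """Return (member, [(old_part, new_part), ...]) for a conversion message.

    Pieces are compared position by position; extra pieces on either
    side are ignored by zip().
    """
    match = MESSAGE_RE.search(message)
    if match is None:
        raise ValueError("not a 'Cannot convert' message")
    member, old_type, new_type = match.groups()
    old_parts = template_parts(old_type)
    new_parts = template_parts(new_type)
    return member, [(o, n) for o, n in zip(old_parts, new_parts) if o != n]


# Message text copied from this ticket (it is truncated in the report).
MESSAGE = (
    "Cannot convert recob::Vertex::pos_ from type: "
    "ROOT::Math::PositionVector3D<ROOT::Math::Cartesian3D<double>,"
    "ROOT::Math::DefaultCoordinateSystemTag> to type: "
    "ROOT::Math::PositionVector3D<ROOT::Math::Cartesian3D<double>,"
    "ROOT::Math::GlobalCoordinateSystemTag>, skip elemen"
)

member, changes = differing_parts(MESSAGE)
print(member)
for old, new in changes:
    print(f"  {old} -> {new}")
```

Here the only piece that differs is the coordinate-system tag (Default versus Global); the vector class and the Cartesian3D<double> storage are the same in both names.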

#21 Updated by Dorota Stefan about 3 years ago

Yes, we tested the full reco chain, and for ProtoDUNE-SP everything looked fine.

#22 Updated by Tingjun Yang about 3 years ago

This message looks like it comes from art, not from the script that compares products.

#23 Updated by Robert Sulej about 3 years ago

This is interesting: I see that message only after art first reports, at the end of the job:
Art has completed and will exit with status 0.

and then there is a second report, after the error, with status 1.

#24 Updated by Christoph Alt about 3 years ago

I added Vito to the ticket.

#25 Updated by Tingjun Yang about 3 years ago

This may be from lar -c eventdump.fcl on the reference file. Some data product has changed and no conversion method was provided.

#26 Updated by Tingjun Yang about 3 years ago

I still see a few issues.

I don't want this to hold up protoDUNE production. Maybe we can tag v06_60_00 for protoDUNE now and another release later for FD?

#27 Updated by Tingjun Yang about 3 years ago

Just had a meeting with Anna. The production team would like to start with protoDUNE and supernova samples.

David, could you cut the v06_60_00 release now?

Thanks.

#28 Updated by David Adams about 3 years ago

Build is underway.

#29 Updated by Tingjun Yang about 3 years ago

I believe the current version is good for FD production. I got confused by my earlier test.

#30 Updated by David Adams about 3 years ago

Release is tagged and builds are underway on Jenkins.

#31 Updated by David Adams about 3 years ago

Slf6 builds are installed on scisoft and CVMFS. The others are rebuilding on Jenkins.

#32 Updated by David Adams about 3 years ago

The non-slf6 builds are failing consistently. It looks like the problem may be in dunetpc. It is very difficult to access the Jenkins logs outside of FNAL. I will try to reproduce the problem on my mac.

Does anyone have an slf7 machine I can use?

#33 Updated by David Adams about 3 years ago

I found and fixed one problem--a missing library in the geometry tests--on d16 but there are more.

#34 Updated by Tingjun Yang about 3 years ago

dunesl7gpvm01 is slf7.

#35 Updated by David Adams about 3 years ago

False alarm. The additional problems went away when I built from scratch.

I have now created tag v06_60_00_01 with the missing library fix. Builds are running on Jenkins.

#36 Updated by David Adams about 3 years ago

The slf6 and slf7 builds for dunetpc v06_60_00_01 have been installed on CVMFS and SciSoft.

#37 Updated by David Adams about 3 years ago

  • Status changed from Assigned to Closed

Release v06_60_00_01 is now available for all platforms.
