Bug #14794

Event viewer crashing with "NULL run" error

Added by Will Foreman almost 4 years ago. Updated over 3 years ago.

Status: Closed
Priority: High
Assignee:
Category: Infrastructure
Target version:
Start date: 12/09/2016
Due date:
% Done: 100%
Estimated time:
Spent time:
Occurs In:
Scope: Internal
Experiment: LArIAT
SSI Package: art
Duration:

Description

Hi,

When I try to run the LArIAT event viewer (evd_lariat.fcl) over an art-ROOT file (raw sliced data, no TPC reconstruction), it crashes after only a couple of events with a "NULL run" error. Here's what I am doing.

(after setting up lariatsoft v06_16_00):
lar -c evd_lariat.fcl /lariat/app/users/wforeman/pickyPbars_161208.root

I am able to see the first two events, but when I reach the third it exits with this error:

evd [0] %MSG-w InfoTransfer: PostSource 09-Dec-2016 14:08:11 CST run: 8777 subRun: 427 event: 66986
failed to get handle to std::vector<recob::Hit> from ffthit
%MSG
%MSG-i RawDataDrawer: PostProcessPath end_path 09-Dec-2016 14:08:12 CST run: 8777 subRun: 427 event: 66986
Region of interest for C:0 T:0 P:0 detected to be within wires 1 to 236 (plane has 240 wires)
%MSG
evd [0] %MSG-s ArtException: PostPathEndRun end_path 09-Dec-2016 14:08:15 CST PostEndRun
cet::exception caught in art
---- EventProcessorFailure BEGIN
An exception occurred during current event processing
---- NullPointerError BEGIN
Tried to obtain a NULL run.
---- NullPointerError END
cet::exception caught in EventProcessor and rethrown
---- EventProcessorFailure END
%MSG
Art has completed and will exit with status 65.

I know there are more than 2 events in this file, so I'm not sure what is going on. The log shows a failure to get a handle to the hit objects, which may be what triggers the crash, though I don't see why that should be a problem (in fact, this data has not yet been sent through any TPC reco).

Any idea what might be going on?

Thanks,
Will

History

#1 Updated by Will Foreman almost 4 years ago

  • Occurs In deleted (v0_00_04)

#2 Updated by Tingjun Yang almost 4 years ago

If you do

lar -c evd_lariat.fcl /lariat/app/users/wforeman/pickyPbars_161208.root --nskip 2 -n 1

you can see the 3rd event. I think evd crashes when it tries to display an event from a new run. I suspect this is an art problem.

#3 Updated by Gianluca Petrillo almost 4 years ago

  • Category set to Event Display
  • Status changed from New to Accepted
  • Occurs In v06_16_00 added

The output of:

count_events --hr /lariat/app/users/wforeman/pickyPbars_161208.root
/lariat/app/users/wforeman/pickyPbars_161208.root    665 runs, 63963 subruns, 32 events, and 0 results.

tells me that the file was obtained from filtering others, and some of the runs (in fact, most of them) have been completely depopulated.
This reminds me of a similar problem whose details I can't recall, but an artist like Kyle Knoepfel (added to the watchers) may remember it.

I could reproduce the problem with lariatsoft v06_16_00 (LArSoft v06_16_00, art v2_05_00).

#4 Updated by Will Foreman almost 4 years ago

This only appears to happen when events from multiple runs are together in the same art-ROOT file. When running evd_lariat.fcl over a file list of subruns from several different runs, it makes the transition to the new run just fine when looping through the events. But when I then merge those same files together (using a simple "copy.fcl" workflow) it crashes at the end of the events from the first run.

To reproduce the problem, here is the test file list used: /lariat/app/users/wforeman/test.list

These files are merged together in this file: /lariat/app/users/wforeman/test_merge.root

#5 Updated by Kyle Knoepfel almost 4 years ago

I believe this is an art problem. I will request that it be moved to the art issues tracker.

#6 Updated by Lynn Garren almost 4 years ago

  • Project changed from LArSoft to art
  • Category deleted (Event Display)
  • Target version deleted (v06_16_00)

Moved to the art tracker.

#7 Updated by Kyle Knoepfel almost 4 years ago

  • Status changed from Accepted to Assigned
  • Assignee set to Kyle Knoepfel

#8 Updated by Kyle Knoepfel almost 4 years ago

  • Category set to Infrastructure
  • Status changed from Assigned to Resolved
  • % Done changed from 0 to 100
  • Scope set to Internal
  • SSI Package art added
  • SSI Package deleted ()

The problem is understood. Some significant cleanups were implemented for art 2.01.00. One of them accidentally removed the step that sets the underlying Run principal for a SubRun principal whenever the random-access functionality of RootInput is used. As a consequence, whenever SubRun::getRun() is called, perhaps indirectly through Event::getRun(), the Run principal is not accessible.

The line of code that triggered the "NULL run" exception for this particular issue is lariatsoft/Utilities/DetectorPropertiesServiceLArIAT_service.cc:87, specifically:

fProp->Update(evt.getRun().run());

Although I have implemented the fix so that the above line of code will correctly retrieve the Run object, the fix will not be available until the next release of art. In this particular case, since only the run number is needed, you can avoid the exception by retrieving the run number directly from the event:

- fProp->Update(evt.getRun().run());
+ fProp->Update(evt.run());
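
For readers who want to see why the one-line change sidesteps the exception, here is a minimal sketch of the failure mode described above. It is a toy model, not art's code: ToyRun, ToySubRun, ToyEvent, setRun and getRunThrows are hypothetical names invented for illustration. The sketch assumes only what this comment states, namely that the run number can be read from the event itself while getRun() has to go through the SubRun's link to its Run.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <utility>

// Hypothetical stand-ins for illustration only; these are NOT art's classes.
struct ToyRun {
  unsigned number;
  unsigned run() const { return number; }
};

class ToySubRun {
public:
  // Models the linking step that the comment says was accidentally dropped
  // on the random-access path.
  void setRun(std::shared_ptr<ToyRun const> r) { run_ = std::move(r); }

  // Fails when the Run was never linked, mirroring the reported error text.
  ToyRun const& getRun() const {
    if (!run_) throw std::runtime_error("Tried to obtain a NULL run.");
    return *run_;
  }

private:
  std::shared_ptr<ToyRun const> run_;
};

class ToyEvent {
public:
  ToyEvent(unsigned runNumber, ToySubRun const* subRun)
    : runNumber_{runNumber}, subRun_{subRun} {
    assert(subRun_ != nullptr); // in this toy model an event always has a SubRun
  }

  // Run number stored with the event itself: no Run object is needed.
  unsigned run() const { return runNumber_; }

  // Goes through the SubRun, so it depends on the Run having been linked.
  ToyRun const& getRun() const { return subRun_->getRun(); }

private:
  unsigned runNumber_;
  ToySubRun const* subRun_;
};

// Reports whether asking the event for its Run object throws.
inline bool getRunThrows(ToyEvent const& evt) {
  try {
    (void)evt.getRun();
    return false;
  } catch (std::runtime_error const&) {
    return true;
  }
}
```

In this toy model, an event built on a ToySubRun that never had setRun() called still answers run(), while getRun() throws. That is the same asymmetry the workaround relies on.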

Thank you for your patience, and my apologies for the error.

Fix implemented with commit art:e4be2ce7.

#9 Updated by Kyle Knoepfel almost 4 years ago

Added Gianluca as a watcher so he can track changes.

#10 Updated by Tingjun Yang almost 4 years ago

Hi Kyle,

Thanks for the information. I have implemented your suggested fix in lariatsoft.

Tingjun

#11 Updated by Kyle Knoepfel over 3 years ago

  • Status changed from Resolved to Closed
  • Target version set to 2.06.00

