Bug #10490

Print contents of exception thrown from within art module constructor

Added by John Freeman about 5 years ago. Updated almost 5 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: Infrastructure
Target version:
Start date: 10/12/2015
Due date:
% Done: 100%
Estimated time:
Spent time:
Occurs In:
Scope: Internal
Experiment: -
SSI Package: art
Duration:

Description

It appears that when an exception is thrown from within an art module constructor, the message that always gets printed is the following:

Mon Oct 12 13:20:20 -0500 2015: terminate called after throwing an instance of 'cet::coded_exception<art::errors::ErrorCodes, &art::ExceptionDetail::translate>'
Mon Oct 12 13:20:20 -0500 2015:   what():  ---- ServiceNotFound BEGIN
Mon Oct 12 13:20:20 -0500 2015:   Service  no ServiceRegistry has been set for this thread
Mon Oct 12 13:20:20 -0500 2015: ---- ServiceNotFound END

The issue here is that an exception can carry information that's effectively swallowed. For example, if I execute throw std::runtime_error("John's engineered error"); in the constructor, "John's engineered error" never appears on stderr; we get the ServiceRegistry complaint above instead. By contrast, when the same exception is thrown in the analyze() function of the art module, the message does appear. Would it be possible for art to be modified so that the contents of exceptions thrown from art module constructors are shown?
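The behavior being requested can be illustrated with a minimal, self-contained sketch. It does not use the art API: ToyModule and run_job are hypothetical stand-ins, and the sketch only assumes that the framework catches exceptions at the point where it constructs and runs modules. Under that assumption, the text of the exception is reported the same way whether it comes from the constructor or from analyze().

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a framework module; this is NOT the art interface.
struct ToyModule {
  explicit ToyModule(bool throw_in_ctor) {
    if (throw_in_ctor) throw std::runtime_error("John's engineered error");
  }
  void analyze(bool throw_in_analyze) {
    if (throw_in_analyze) throw std::runtime_error("John's engineered error");
  }
};

// Hypothetical driver: constructs the module, runs it once, and returns
// the text of any standard exception that escaped either step.
std::string run_job(bool throw_in_ctor, bool throw_in_analyze) {
  try {
    ToyModule module(throw_in_ctor);
    module.analyze(throw_in_analyze);
  } catch (std::exception const& e) {
    return std::string("caught: ") + e.what();
  }
  return "ok";
}
```

Calling run_job(true, false) and run_job(false, true) yields the same report in this sketch, which is the symmetry the description asks for.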

Associated revisions

Revision 806e920a (diff)
Added by Kyle Knoepfel about 5 years ago

Implement fix for #10490; small clean-up

History

#1 Updated by Christopher Green about 5 years ago

  • Category set to Infrastructure
  • Status changed from New to Feedback
  • Assignee set to Christopher Green
  • Target version set to 1.17.00

Can you identify where the first exception is being caught (catch catch in gdb) and the second exception is being thrown (catch throw)? At least in the art exec proper, it is not intended that any exception be able to cause a call to terminate() by not being caught.

This sounds like it ought to be easy to fix if we can tie down exactly where it is happening.

#2 Updated by Christopher Green about 5 years ago

Also, please let us know what release of art is exhibiting this issue, and whether it is still present in 1.16.02.

Thanks,
Chris.

#3 Updated by Christopher Green about 5 years ago

Note: a quick test with art 1.16.02 throwing an art::Exception from a constructor results in that particular exception being seen without any trouble. I also verified that changing that to a std::runtime_error makes no difference.

#4 Updated by John Freeman about 5 years ago

Very interesting - thanks for looking into this. In hindsight it would have probably been relevant for me to note that I witnessed this phenomenon when running artdaq-based DAQ systems, so while the difference in what we've seen might simply be the result of some changes made on your end between art v1_15_02 (on which artdaq is currently based) and art v1_16_02, it seems likely that this has to do with the way the art thread is embedded within a larger process in the case of artdaq. I'll investigate.

#5 Updated by John Freeman about 5 years ago

Looking into this, it appears this issue is actually related to Bug #8891, which I filed in May; in a nutshell, it has to do with the destructor of an art module requesting a service object that no longer exists. If I wrap the contents of the destructor for the NetMonInputDetail class in artdaq v1_12_12a (defined in artdaq/ArtModules/NetMonInput_source.cc) in a try-catch block:

try {
  ServiceHandle<NetMonTransportService> transport;
  transport->disconnect();
} catch (...) {
  artdaq::ExceptionHandler(artdaq::ExceptionHandlerRethrow::no,
                           "Exception swallowed in ~NetMonInputDetail; possible cause is exception thrown within an art module constructor");
}

and then run artdaq-demo using this modified version of artdaq, using an art module in the second aggregator which intentionally throws std::runtime_error("John's engineered error") from its constructor, then I see the following:

Thu Oct 15 15:23:33 -0500 2015: %MSG-e ExceptionHandler:  ToyDump:toyDump@Construction ModuleConstruction
Thu Oct 15 15:23:33 -0500 2015: Exception swallowed in ~NetMonInputDetail; possible cause is exception thrown within an art module constructor
Thu Oct 15 15:23:33 -0500 2015: %MSG
Thu Oct 15 15:23:33 -0500 2015: %MSG-e ExceptionHandler:  ToyDump:toyDump@Construction ModuleConstruction
Thu Oct 15 15:23:33 -0500 2015: art::Exception object caught: returnCode = 30, categoryCode = 30, category = ServiceNotFound
Thu Oct 15 15:23:33 -0500 2015: %MSG
Thu Oct 15 15:23:33 -0500 2015: %MSG-e ExceptionHandler:  ToyDump:toyDump@Construction ModuleConstruction
Thu Oct 15 15:23:33 -0500 2015: art::Exception object stream:---- ServiceNotFound BEGIN
Thu Oct 15 15:23:33 -0500 2015:   Service  no ServiceRegistry has been set for this thread
Thu Oct 15 15:23:33 -0500 2015: ---- ServiceNotFound END
Thu Oct 15 15:23:33 -0500 2015: %MSG
Thu Oct 15 15:23:33 -0500 2015: %MSG-s StdLibException:  ToyDump:toyDump@Construction ModuleConstruction
Thu Oct 15 15:23:33 -0500 2015: Standard library exception caught in art
Thu Oct 15 15:23:33 -0500 2015: John's engineered error
Thu Oct 15 15:23:33 -0500 2015: %MSG

This is interesting, since artdaq v1_12_12a uses art v1_15_02, which contains the code implemented to resolve Bug #8891. Essentially, it appears that while #8891 was indeed resolved, I've found a new circumstance under which a service object isn't guaranteed to exist in an art module's destructor.
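The effect of the try-catch wrapper can be reproduced in a small standalone sketch. Nothing here is artdaq or art code: ToySource, require_transport, the log_lines vector, and the thread-local pointer are hypothetical stand-ins, and the pointer is deliberately left null to mimic a service registry that is no longer set when the destructor runs. Because the destructor catches its own failure, stack unwinding can finish and the original error reaches the outer handler.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for the message logger; collects lines in the order they are emitted.
std::vector<std::string> log_lines;

struct Transport {
  void disconnect() { log_lines.push_back("transport disconnected"); }
};

// Hypothetical per-thread service slot. It is never set in this sketch,
// mimicking "no ServiceRegistry has been set for this thread".
thread_local Transport* current_transport = nullptr;

Transport& require_transport() {
  if (!current_transport)
    throw std::runtime_error("no ServiceRegistry has been set for this thread");
  return *current_transport;
}

// Hypothetical input source whose destructor needs a service.
struct ToySource {
  ~ToySource() {
    try {
      require_transport().disconnect();
    } catch (std::exception const& e) {
      // Swallow and log: nothing may escape a destructor during unwinding.
      log_lines.push_back(std::string("swallowed in ~ToySource: ") + e.what());
    }
  }
};

void run_job() {
  try {
    ToySource source;
    // Stands in for a module constructor that throws.
    throw std::runtime_error("John's engineered error");
  } catch (std::exception const& e) {
    log_lines.push_back(std::string("caught: ") + e.what());
  }
}
```

After run_job() the log holds the swallowed service failure first and the engineered error second, the same ordering as in the output above.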

#6 Updated by Kyle Knoepfel about 5 years ago

  • Assignee changed from Christopher Green to Kyle Knoepfel

John, can you provide a feature-branch of artdaq-demo, or perhaps just a patch so that I can recreate your error? As a note, and in general, no exceptions should ever be allowed to propagate outside a destructor. By catching the throw, you prevented a resource leak and were thus able to see the full exception stack.

I am confused, however, that you refer to an exception being thrown in a module's destructor, whereas the printout above says ModuleConstruction...
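The point about destructors can be checked with standard C++ alone. The sketch below assumes a C++17 compiler for std::uncaught_exceptions(); Probe and the two helper functions are hypothetical names used only for illustration. It shows that a destructor is noexcept by default, so an exception escaping it ends in std::terminate(), and that a destructor can detect whether it is running as part of stack unwinding.

```cpp
#include <exception>
#include <stdexcept>
#include <type_traits>

// Records how many exceptions are in flight at the moment the destructor runs.
struct Probe {
  int* seen;
  explicit Probe(int* out) : seen(out) {}
  ~Probe() { *seen = std::uncaught_exceptions(); }
};

// Destructors are implicitly noexcept, so an escaping exception calls std::terminate().
static_assert(std::is_nothrow_destructible<Probe>::value,
              "destructors are noexcept by default");

// Destructor runs at normal scope exit: no exception is in flight.
int in_flight_during_normal_exit() {
  int seen = -1;
  { Probe p(&seen); }
  return seen;
}

// Destructor runs during unwinding: exactly one exception is in flight.
int in_flight_during_unwinding() {
  int seen = -1;
  try {
    Probe p(&seen);
    throw std::runtime_error("failure while constructing something else");
  } catch (std::exception const&) {
    // The exception is handled here, after Probe's destructor has run.
  }
  return seen;
}
```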

#7 Updated by Kyle Knoepfel about 5 years ago

  • Status changed from Feedback to Assigned

I am able to reproduce the error described above solely within art. Specifically, if the input source tries to construct an art::ServiceHandle in its destructor and a module throws during construction, then during the cleanup triggered by the first exception the input source d'tor is called, and for some reason the requested service is not available. It is not obvious to me why this is the case. I am investigating with gdb.

#8 Updated by Kyle Knoepfel about 5 years ago

  • Tracker changed from Feature to Bug
  • Status changed from Assigned to Resolved
  • % Done changed from 0 to 100
  • SSI Package art added
  • SSI Package deleted ()

This issue has been resolved. The difficulty was in understanding when the services are valid for use. In principle, they are to be available whenever a source or module destructor is called. However, our implementation did not adequately take into account the scenario of the (e.g.) source destructor being called due to an exception being thrown by a module constructor. This has been fixed, and more tests have been implemented to verify that the behavior you encountered does not occur.

Implemented with art:ac9249e.
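The lifetime question described in this comment can be sketched with plain RAII. This is only an illustration of the general idea, not the change made in art:ac9249e: ServicesScope, ToySource, ToyModule, and both build functions are hypothetical, and a thread-local flag stands in for the service registry. Objects in a scope are destroyed in reverse order of construction, so whether the source destructor still sees the services depends on which was set up first.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

std::vector<std::string> events;  // records what happened, in order

// Stand-in for "a service registry is set for this thread".
thread_local bool services_available = false;

struct ServicesScope {
  ServicesScope() { services_available = true; }
  ~ServicesScope() { services_available = false; }
};

struct ToySource {
  ~ToySource() {
    events.push_back(services_available ? "source d'tor: service found"
                                        : "source d'tor: service missing");
  }
};

struct ToyModule {
  ToyModule() { throw std::runtime_error("John's engineered error"); }
};

// Services are set up first, so they are torn down last and outlive the source.
std::vector<std::string> build_services_first() {
  events.clear();
  try {
    ServicesScope services;
    ToySource source;
    ToyModule module;  // throws; unwinding runs ~ToySource, then ~ServicesScope
  } catch (std::exception const& e) {
    events.push_back(std::string("caught: ") + e.what());
  }
  return events;
}

// Source is set up first, so the services are already gone when it is destroyed.
std::vector<std::string> build_source_first() {
  events.clear();
  try {
    ToySource source;
    ServicesScope services;
    ToyModule module;  // throws; unwinding runs ~ServicesScope, then ~ToySource
  } catch (std::exception const& e) {
    events.push_back(std::string("caught: ") + e.what());
  }
  return events;
}
```

In both orderings the engineered error is still caught, because the toy destructor never throws; only the first ordering lets the source destructor find its service.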

#9 Updated by John Freeman about 5 years ago

OK, thanks for looking into this. We'll look for this improvement in art v1_17_00; in the meantime we'll make sure to account for exception throws in artdaq related to this phenomenon.

#10 Updated by Kyle Knoepfel almost 5 years ago

  • Status changed from Resolved to Closed
