Support #19599

art job influenced by random files

Added by Andrei Gaponenko about 2 years ago. Updated 3 months ago.

Status: Feedback
Priority: Normal
Assignee: -
Category: -
Target version: -
Start date: 04/06/2018
Due date:
% Done: 0%
Estimated time:
Scope: Internal
Experiment: Mu2e
SSI Package:
Duration:

Description

Hi,

It looks like the behavior of an art job can be influenced by "random"
rogue files on disk that are not findable via PATH, LD_LIBRARY_PATH,
etc. This happens even when there is no '.' anywhere in the
environment. This is a bug because jobs should be defined by their
release area and explicit inputs, and not affected by other random
files.

Andrei

1) Prepare a test file:

   ssh mu2egpvm01.fnal.gov
   setup mu2e
   source /cvmfs/mu2e.opensciencegrid.org/Offline/v6_5_2/SLF6/prof/Offline/setup.sh
   mkdir -p /mu2e/app/users/$(whoami)/20180406-dict-breakage
   cd /mu2e/app/users/$(whoami)/20180406-dict-breakage
   cp -pr /mu2e/app/users/gandr/20180406-dict-breakage/inputs .
   mu2e -c inputs/testjob.fcl  # takes a couple of minutes

2) Verify that data products can be listed without complaints:

   mu2e -c Print/fcl/dumpDataProducts.fcl carbon_muons_hits.art  > /dev/null

(no complaints)

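3) Drop a bomb by moving the RecoDataProducts directory (which contains headers) into the current working directory:

   mv inputs/RecoDataProducts .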

4) Re-run the data product listing:

   mu2e -c Print/fcl/dumpDataProducts.fcl carbon_muons_hits.art  > /dev/null
   In file included from libmu2e_RecoDataProducts_dict dictionary payload:27:
   ./RecoDataProducts/inc/StereoHit.hh:5:20: error: typedef redefinition with different types ('mu2e::ComboHit' vs 'mu2e::StereoHit')
     typedef ComboHit StereoHit;
     ...
     ...
     fatal error: too many errors emitted, stopping now [-ferror-limit=]
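
The telltale sign in the output above is a diagnostic whose file path is relative ("./RecoDataProducts/..."), meaning it was resolved against the working directory. A minimal sketch of how such paths could be pulled out of a job log is shown here; the function name `relative_diagnostic_paths` and the regular expression are illustrative assumptions, not part of art or ROOT, and the pattern only covers the "path:line:col: kind: message" layout seen in this ticket.

```python
import re

# Matches "path:line:col: kind: message" lines as printed by the interpreter.
# Assumption: the path itself contains no whitespace or colons.
DIAG = re.compile(
    r"^(?P<path>[^\s:]+):(?P<line>\d+):(?P<col>\d+): "
    r"(?P<kind>fatal error|error|warning|note): (?P<msg>.*)$"
)

def relative_diagnostic_paths(log_text):
    """Return sorted, de-duplicated relative file paths named in diagnostics."""
    paths = set()
    for line in log_text.splitlines():
        match = DIAG.match(line.strip())
        # Absolute paths point into a known area; relative ones depend on the cwd.
        if match and not match.group("path").startswith("/"):
            paths.add(match.group("path"))
    return sorted(paths)
```

Feeding it the captured stderr of the `mu2e` command would list the headers that were picked up relative to the current directory, which is a quick way to confirm that a rogue file is involved.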

History

#1 Updated by Kyle Knoepfel almost 2 years ago

Paul Russo wrote:

Oh,

This is ROOT behavior, check the definition of your ROOT_INCLUDE_PATH environment variable.

You are picking up this header file through it:

./RecoDataProducts/inc/StereoHit.hh:5:20: error: typedef redefinition with different types ('mu2e::ComboHit' vs 'mu2e::StereoHit')
typedef ComboHit StereoHit;

#2 Updated by Kyle Knoepfel almost 2 years ago

Rob then wrote:

Thanks Paul,

Can you explain a bit more about what is going on? Does the action happen in the JIT? At dictionary-load-time? At dictionary-use-time?

I take it that the problem is that the jit can’t find the header file because it has been moved?

Or am I way out in left field? Waveland Avenue? Wisconsin?

Rob

#3 Updated by Kyle Knoepfel almost 2 years ago

  • Status changed from New to Feedback
  • Assignee set to Paul Russo

#4 Updated by Andrei Gaponenko almost 2 years ago

I checked Paul's suggestion. The offending file cannot be found via ROOT_INCLUDE_PATH. The value of that variable is shown below. Note that it does not include the /mu2e/app/users/gandr/20180406-dict-breakage directory or any of its parents. I also checked that there is no '/mu2e/app' anywhere in the environment (besides PWD), and no dot anywhere.

Andrei

20180406-dict-breakage$ echo $ROOT_INCLUDE_PATH | tr ':' '\n'
/cvmfs/mu2e.opensciencegrid.org/Offline/v6_5_2/SLF6/prof/Offline
/cvmfs/mu2e.opensciencegrid.org/artexternals/cry/v1_7i/Linux64bit+2.6-2.12-e15-prof/cry_v1.7/include
/cvmfs/mu2e.opensciencegrid.org/artexternals/mu2e_artdaq_core/v1_02_01e/include
/cvmfs/mu2e.opensciencegrid.org/artexternals/art/v2_10_02/include
/cvmfs/mu2e.opensciencegrid.org/artexternals/fhiclcpp/v4_06_05/include
/cvmfs/mu2e.opensciencegrid.org/artexternals/cetlib/v3_02_00/slf6.x86_64.e15.prof/include
/cvmfs/mu2e.opensciencegrid.org/artexternals/boost/v1_66_0/Linux64bit+2.6-2.12-e15-prof/include
/cvmfs/mu2e.opensciencegrid.org/artexternals/artdaq_core/v3_01_05/include
/cvmfs/mu2e.opensciencegrid.org/artexternals/artdaq_core/v3_01_05/include
/cvmfs/mu2e.opensciencegrid.org/artexternals/TRACE/v3_13_04/include
/cvmfs/mu2e.opensciencegrid.org/artexternals/art/v2_10_02/include
/cvmfs/mu2e.opensciencegrid.org/artexternals/canvas_root_io/v1_01_02/include
/cvmfs/mu2e.opensciencegrid.org/artexternals/xrootd/v4_8_0a/Linux64bit+2.6-2.12-e15-prof/include
/cvmfs/mu2e.opensciencegrid.org/artexternals/mysql_client/v5_5_58/Linux64bit+2.6-2.12-e15/include
/cvmfs/mu2e.opensciencegrid.org/artexternals/postgresql/v9_6_6a/Linux64bit+2.6-2.12-p2714b/include
/cvmfs/mu2e.opensciencegrid.org/artexternals/pythia/v6_4_28j/Linux64bit+2.6-2.12-gcc640-prof/include
/cvmfs/mu2e.opensciencegrid.org/artexternals/gsl/v2_4/Linux64bit+2.6-2.12-prof/include
/cvmfs/mu2e.opensciencegrid.org/artexternals/fftw/v3_3_6_pl2/Linux64bit+2.6-2.12-prof/include
/cvmfs/mu2e.opensciencegrid.org/artexternals/canvas/v3_02_02/include
/cvmfs/mu2e.opensciencegrid.org/artexternals/range/v3_0_3_0/include
/cvmfs/mu2e.opensciencegrid.org/artexternals/clhep/v2_3_4_5c/Linux64bit+2.6-2.12-e15-prof/include
/cvmfs/mu2e.opensciencegrid.org/artexternals/messagefacility/v2_01_06/include
/cvmfs/mu2e.opensciencegrid.org/artexternals/tbb/v2018_2/Linux64bit+2.6-2.12-e15-prof/include
/cvmfs/mu2e.opensciencegrid.org/artexternals/fhiclcpp/v4_06_05/include
/cvmfs/mu2e.opensciencegrid.org/artexternals/cetlib/v3_02_00/slf6.x86_64.e15.prof/include
/cvmfs/mu2e.opensciencegrid.org/artexternals/boost/v1_66_0/Linux64bit+2.6-2.12-e15-prof/include
/cvmfs/mu2e.opensciencegrid.org/artexternals/cetlib_except/v1_01_06/include
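
The environment checks described in this comment can be scripted so that others can repeat them. The sketch below is only an illustration: the helper names `cwd_relative_entries` and `variables_mentioning` and the default list of variables are assumptions, and treating an empty entry as "current directory" follows the usual PATH-style convention rather than anything confirmed for ROOT_INCLUDE_PATH.

```python
# Path-list variables to inspect (an assumed starting set; extend as needed).
PATH_VARS = ("PATH", "LD_LIBRARY_PATH", "ROOT_INCLUDE_PATH")

def cwd_relative_entries(env, names=PATH_VARS):
    """Return (variable, entry) pairs that are empty, '.', or otherwise not absolute."""
    hits = []
    for name in names:
        value = env.get(name)
        if not value:
            continue  # unset or empty variable: nothing to split
        for entry in value.split(":"):
            if not entry.startswith("/"):
                hits.append((name, entry))
    return hits

def variables_mentioning(env, needle, ignore=("PWD",)):
    """Return names of variables whose value contains `needle`, skipping `ignore`."""
    return sorted(k for k, v in env.items() if needle in v and k not in ignore)
```

Calling `cwd_relative_entries(os.environ)` and `variables_mentioning(os.environ, "/mu2e/app")` (after `import os`) in the job's shell environment should both return empty lists if the statements above hold.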

#5 Updated by Paul Russo almost 2 years ago

This is an example of auto-parse behavior. We should re-examine the question of whether or not this should be disabled in ROOT at art startup, and configurable by fhicl.

#6 Updated by Rob Kutschke almost 2 years ago

Hi Paul,

Can you explain "auto-parse behaviour"?
Thanks,
Rob

#7 Updated by Kyle Knoepfel almost 2 years ago

  • Tracker changed from Bug to Support

#8 Updated by Kyle Knoepfel almost 2 years ago

Based on discussions with Rob and Andrei, the next step is to experiment with disabling ROOT's auto-parsing behavior. This must be done before opening the first input file: disabling auto-parsing at module or service construction would take effect before the first input file is opened.

We are willing to experiment with this, but your testing of disabling auto-parsing would be more elucidating for your use case.

#9 Updated by Raymond Culbertson almost 2 years ago

You probably already discussed this with Rob and Andrei, but could
you reply with a summary of what ROOT is trying to achieve with auto-parsing
and what functionality might be lost now and in future versions of ROOT
if it is disabled? Thanks

#10 Updated by Kyle Knoepfel 4 months ago

  • Assignee deleted (Paul Russo)

#11 Updated by Philippe Canal 3 months ago

Note you can try running with Autoparsing disabled by calling:

gInterpreter->SetClassAutoparsing( false );

However, I don't think we ever test this mode 'carefully/completely' so some corner of ROOT itself may be relying on it.

#12 Updated by Philippe Canal 3 months ago

On a side note:

./RecoDataProducts/inc/StereoHit.hh:5:20: error: typedef redefinition with different types ('mu2e::ComboHit' vs 'mu2e::StereoHit')
  typedef ComboHit StereoHit;

Are there really two pieces of code that use the same typedef name for two different types? Or is that a change made from one release to the next?

#13 Updated by Andrei Gaponenko 3 months ago

I created a service that makes the

gInterpreter->SetClassAutoparsing( false );

call in its constructor. If I add that service to
Print/fcl/dumpDataProducts.fcl then instead of the long list
of errors with "fatal error: too many errors emitted, stopping now"
in the original ticket I get a shorter crash:
$ mu2e -c Print/fcl/dumpDataProductsNoAutoparse.fcl ../carbon_muons_hits.art   > /dev/null
In file included from libmu2e_RecoDataProducts_dict dictionary payload:27:
./RecoDataProducts/inc/StereoHit.hh:5:20: error: typedef redefinition with different types ('mu2e::ComboHit' vs 'mu2e::StereoHit')
  typedef ComboHit StereoHit;
                   ^
Forward declarations from /mu2e/app/users/gandr/autoparsing/Offline.autoparsing/lib/libmu2e_RecoDataProducts_dict.rootmap:1:828: note: previous definition is here
  ...mu2e { class StrawHitFlagDetail; }namespace mu2e { class HelixHit; }namespace mu2e { class ComboHit; }namespace mu2e { class StrawHitPosition; }namespace mu2e { class StereoHit; }namespace mu2e ...
                                                                                                                                                                            ^
Segmentation fault

If the "exploit" file ./RecoDataProducts/inc/StereoHit.hh is removed everything works as before.
So the attempt to disable autoparsing did not break things, but also did not prevent ROOT from
peeking at files in "." that it should not be looking at.

Are there other calls to try to stop ROOT from messing itself up with unnecessary files?
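
Until the lookup itself can be switched off, one defensive option is to check the working directory for files that could shadow release headers before launching a job. The following is a rough sketch under the assumption, suggested but not established by this ticket, that the header is looked up by its release-relative path (for example "RecoDataProducts/inc/StereoHit.hh") against the current directory; the function name `shadowing_files` and the suffix list are hypothetical.

```python
import os

def shadowing_files(workdir, include_dirs, suffixes=(".hh", ".h")):
    """Return (relative_path, include_dir) pairs for headers under workdir
    that also exist at the same relative path under an include directory."""
    shadows = []
    for root, _dirs, files in os.walk(workdir):
        for fname in files:
            if not fname.endswith(suffixes):
                continue
            rel = os.path.relpath(os.path.join(root, fname), workdir)
            for inc in include_dirs:
                # Same relative path in the release area: a potential clash.
                if os.path.isfile(os.path.join(inc, rel)):
                    shadows.append((rel, inc))
                    break
    return sorted(shadows)
```

Here `include_dirs` would typically be the entries of ROOT_INCLUDE_PATH split on ':' and `workdir` the directory the job is started from. A non-empty result means the working directory contains a file with the same relative path as a release header. The check is not meaningful when the working directory is itself one of the include directories, since every header would then match.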


