Support #20507
Should dictionary building in lardataobj happen in parallel?
Description
It appears to me that while building lardataobj the different dictionaries are built in sequence rather than in parallel.
In my very incomplete understanding, a dictionary does not depend on any other dictionary, and no library depends on a dictionary other than the dictionary itself.
If this is true, I would have expected the build to run in parallel, but my observation (v07_00_00 e15:prof) is that they are compiled one at a time.
Is that the optimal behaviour?
Currently dictionary compilation is the slowest part of the lardataobj build.
History
#1 Updated by Kyle Knoepfel over 2 years ago
- Tracker changed from Bug to Support
- Status changed from New to Feedback
cetbuildtools/CMake treats dictionary targets in the same way as it treats regular libraries. If it can build them in parallel, it will do so. What makes you think the dictionaries are being built sequentially?
#2 Updated by Gianluca Petrillo over 2 years ago
htop while compiling lardataobj.
Given what you write about cetbuildtools, I would think that either:
- the cetbuildtools logic to detect what can be run in parallel does not work well with dictionaries in lardataobj
- there are actual dependencies between dictionaries (but I would have expected that should not happen)
- my observation is wrong or just a local accident
#3 Updated by Lynn Garren over 2 years ago
I see that lardataobj uses cet_make() and a separate call to art_dictionary(). Also, each dictionary is built against the related library. And I see that, for instance, lardataobj_RecoBase depends on lardataobj_RawData. It appears that there are real dependencies in play.
#4 Updated by Gianluca Petrillo over 2 years ago
lardataobj_RecoBase depends on lardataobj_RawData. That is a real dependency.
lardataobj_RecoBase_dict depends on lardataobj_RecoBase (check?).
lardataobj_RawData_dict depends on lardataobj_RawData (also: check?).
I would say that lardataobj_RecoBase_dict does not depend on lardataobj_RawData_dict (only on lardataobj_RawData), so I would expect the two of them to run in parallel.
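The argument above can be checked mechanically. The following is a minimal Python sketch, not anything cetbuildtools or CMake actually runs: it encodes the four targets and the dependencies as stated in this comment (two of which are marked "check?" and are therefore assumptions), and tests whether one target has to wait for another. The DEPS table and the helper names transitive_deps and can_overlap are hypothetical and exist only for this illustration.

```python
# Hypothetical dependency map built from the relations listed in this comment.
# Keys are targets; values are the targets they must wait for.
# The two *_dict entries are the ones marked "(check?)", i.e. assumptions.
DEPS = {
    "lardataobj_RawData": set(),
    "lardataobj_RecoBase": {"lardataobj_RawData"},
    "lardataobj_RawData_dict": {"lardataobj_RawData"},
    "lardataobj_RecoBase_dict": {"lardataobj_RecoBase", "lardataobj_RawData"},
}


def transitive_deps(target, deps=DEPS):
    """Return every target that must finish before `target` can start."""
    seen = set()
    stack = list(deps[target])
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(deps[dep])
    return seen


def can_overlap(a, b, deps=DEPS):
    """True if neither target waits, directly or indirectly, for the other."""
    return a not in transitive_deps(b, deps) and b not in transitive_deps(a, deps)


if __name__ == "__main__":
    print(can_overlap("lardataobj_RecoBase_dict", "lardataobj_RawData_dict"))
    print(can_overlap("lardataobj_RecoBase", "lardataobj_RawData"))
```

Under these assumed dependencies the two dictionary targets share a prerequisite (lardataobj_RawData) but neither waits for the other, so a parallel build is free to run them at the same time once their own prerequisites are done.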
#5 Updated by Lynn Garren over 2 years ago
I do see some parallel builds of libraries with other parts of the build when I test, but it may depend on whether or not the other parts of the build are already complete. If you want to pursue this, would you please provide a set of steps that will allow us to reproduce what you see, along with your evidence?
#6 Updated by Kyle Knoepfel over 2 years ago
Which machine are you building on, Gianluca? That's an important data point.
#7 Updated by Gianluca Petrillo over 2 years ago
I was building on icarusbuild01.fnal.gov.
In an attempt to satisfy Lynn's request, I looked more closely at all the steps being taken. It turns out that my observation was wrong: I had missed that GenReflex was being run almost at the beginning of the build.
That one lasts so long for RecoBase that it is the last command to finish in the build. So while I had the impression that the compilation of RecoBase_dict was waiting for AnalysisBase_dict to finish (that one also ends very late in the build chain for the same reason), it was actually still waiting for its own GenReflex.
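To make the effect concrete, here is a small Python sketch of the situation described above. It assumes unlimited parallel jobs, and every step name and duration in it is hypothetical and chosen only for illustration; none of the numbers are measurements from the lardataobj build. Each dictionary compilation depends only on its own GenReflex step, and the sketch reports when a step can start and which prerequisite it is really waiting for.

```python
from functools import lru_cache

# Hypothetical durations in seconds, for illustration only (not measurements).
DURATION = {
    "RecoBase_GenReflex": 300,
    "RecoBase_dict_compile": 60,
    "AnalysisBase_GenReflex": 200,
    "AnalysisBase_dict_compile": 40,
}

# Each dictionary compilation waits only for its own GenReflex step.
DEPS = {
    "RecoBase_GenReflex": (),
    "RecoBase_dict_compile": ("RecoBase_GenReflex",),
    "AnalysisBase_GenReflex": (),
    "AnalysisBase_dict_compile": ("AnalysisBase_GenReflex",),
}


@lru_cache(maxsize=None)
def start_time(step):
    """Earliest start: when the slowest prerequisite has finished."""
    return max((finish_time(dep) for dep in DEPS[step]), default=0)


def finish_time(step):
    """Earliest finish, assuming the step starts as soon as it can."""
    return start_time(step) + DURATION[step]


def blocker(step):
    """Return the prerequisite that finishes last, or None if there is none."""
    deps = DEPS[step]
    return max(deps, key=finish_time) if deps else None


if __name__ == "__main__":
    for step in DEPS:
        print(step, start_time(step), finish_time(step), blocker(step))
```

With these made-up numbers the RecoBase dictionary compilation cannot start until its GenReflex step ends, which is after the AnalysisBase dictionary has already finished. Watching a process monitor, that looks like one dictionary waiting for the other even though no such dependency exists.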
Probably RecoBase should be split into multiple dictionaries, but when I tried, it proved to be more complicated than I had hoped. Maybe an updated recommendation on the use of the #ifndef __GCCXML__ directive might help.
Sorry for the distraction, this request can be closed.
#8 Updated by Lynn Garren over 2 years ago
- Status changed from Feedback to Closed
Thanks for the analysis. Closing as per request.