Bug #5431

Move data products into packages containing only data products

Added by Brian Rebel almost 6 years ago. Updated about 3 years ago.

Status: Rejected
Priority: High
Assignee: -
Category: -
Start date: 02/12/2014
Due date:
% Done: 0%
Estimated time:
Duration:

Description

There are several circular dependencies in the NOvA code, some of which are due to data products being in the same packages as algorithms, modules and services. The resolution to these dependencies is to move the definition of data products and other similar objects into separate packages.
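As a rough illustration of the problem, the sketch below finds one cycle in a package dependency graph with a depth-first search. The package names (PackageA, PackageB, ProductsAB) and the edges between them are placeholders made up for this example, not actual NOvA dependencies; the real graph would have to be extracted from the build files.

```python
from typing import Dict, List, Optional

def find_cycle(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of package names, or None."""
    UNSEEN, ACTIVE, DONE = 0, 1, 2
    state: Dict[str, int] = {pkg: UNSEEN for pkg in deps}
    stack: List[str] = []  # packages on the current DFS path

    def visit(pkg: str) -> Optional[List[str]]:
        state[pkg] = ACTIVE
        stack.append(pkg)
        for dep in deps.get(pkg, []):
            dep_state = state.get(dep, UNSEEN)
            if dep_state == ACTIVE:
                # dep is already on the current path, so we closed a loop
                return stack[stack.index(dep):] + [dep]
            if dep_state == UNSEEN:
                found = visit(dep)
                if found is not None:
                    return found
        stack.pop()
        state[pkg] = DONE
        return None

    for pkg in list(deps):
        if state[pkg] == UNSEEN:
            found = visit(pkg)
            if found is not None:
                return found
    return None

# Hypothetical layout: each package uses a data product defined in the other.
mixed = {"PackageA": ["PackageB"], "PackageB": ["PackageA"]}

# Same packages after the shared products move to a products-only package.
split = {"PackageA": ["ProductsAB"], "PackageB": ["ProductsAB"], "ProductsAB": []}

print(find_cycle(mixed))
print(find_cycle(split))
```

With the products moved out, both algorithm packages depend on the products-only package and no longer on each other.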

These packages currently contain only data products (no modules or services), so they need not be changed in any way:

CAFReweight (Need to ensure that this one only ever contains data products, currently only CAFReweight product)
CalibrationDataProducts
RawData
RecoBase
Simulation
StandardRecord
SummaryData

The following packages currently mix data products with modules, algorithms or services.

CellToCell:

data product is CellSummary

ChannelInfo:

defines dictionary for the following objects:

nova::dbi::RunHistory
nova::dbi::RunHistory::FEB
nova::dbi::RunHistory::DCM
nova::dbi::RunHistory::BNEVB
nova::dbi::RunHistory::DiBlock
nova::dbi::BadChan_t
nova::dbi::ChanInfo_t
std::map<geo::OfflineChan,std::vector<nova::dbi::BadChan_t> >
std::map<geo::OfflineChan,std::vector<nova::dbi::ChanInfo_t> >

Commissioning:

defines dictionary for the following data products:

comi::CanaNt
comi::LeanaNt

CosRej:

defines dictionary for cosrej::CosRejObj

FEBFlashFilter:

defines dictionary for:

febflash::FEBFlashInfo
febflash::AssociatedSliceInfo

Geometry:

defines dictionary for geo::OfflineChan (need to just move the classes_def file to GeometryObjects)

HMatrixE:

defines dictionary for hme::HMPid

HoughTransform:

defines dictionary for hough::HValidateNt

KalmanFilter3D:

defines dictionaries for:

kalm::KTrackSimple
kalm::KTrackStateSimple
kalm::KTrackStateBase
kalm::KTrackRecoInfo
kalm::KShowerRecoInfo
kalm::KIDInfo
kalm::KShower
kalm::KID
kalm::KIDNames

LEM:

defines dictionaries for:

lem::PIDDetails
lem::MatchSummary
lem::LibrarySummary
lem::MatchList
lem::PIDExtraVars
lem::EventSummary
lem::LiteHit
lem::LEMInput

LIDBuilder:

defines dictionary for:

slid::LID
slid::ShowerLID

MCCheater:

defines dictionary for:

cheat::SimHit
cheat::SimTrack

MichelEFilter:

defines dictionary for:

me::MichelEInfo
me::MichelECluster

MuonRemove:

defines dictionary for murem::MRCCParent

NueSandbox:

defines dictionary for nuesand::NueSandObj

NuMuBDTSelector:

defines dictionary for numubdtsel::EID

NumuEnergy:

defines dictionary for numue::NumuE

NumuSandbox:

defines dictionary for numusand::NumuSandObj

QEEventFinder:

defines dictionary for qeef::QePId

RecoJMShower:

defines dictionary for:

jmshower::JMShower
jmshower::EID

RecVarPID:

defines dictionary for rvp::RVP

ReMId:

defines dictionary for remid::ReMId

Slicer:

defines dictionary for slicer::SlicerAnaNt

TimingFit:

defines dictionary for tf::TimingFitResult

These products should probably go into Simulation:

cheat::SimHit
cheat::SimTrack
murem::MRCCParent

I believe we can accommodate the rest of the data products in a few new packages.

PIDDataProducts - to hold all data products associated with particle ID algorithms:

hme::HMPid
kalm::KIDInfo
kalm::KID
kalm::KIDNames
lem::PIDDetails
lem::MatchSummary
lem::LibrarySummary
lem::MatchList
lem::PIDExtraVars
lem::EventSummary
lem::LiteHit
lem::LEMInput
numubdtsel::EID
slid::LID
slid::ShowerLID
qeef::QePId
jmshower::EID
rvp::RVP
remid::ReMId

NtupleDataProducts - holds data products associated with ntuple output

comi::CanaNt
comi::LeanaNt
hough::HValidateNt
slicer::SlicerAnaNt

DetPerformanceDataProducts - holds data products associated with the detector state

c2c::CellSummary
nova::dbi::RunHistory
nova::dbi::RunHistory::FEB
nova::dbi::RunHistory::DCM
nova::dbi::RunHistory::BNEVB
nova::dbi::RunHistory::DiBlock
nova::dbi::BadChan_t
nova::dbi::ChanInfo_t
std::map<geo::OfflineChan,std::vector<nova::dbi::BadChan_t> >
std::map<geo::OfflineChan,std::vector<nova::dbi::ChanInfo_t> >
febflash::FEBFlashInfo
febflash::AssociatedSliceInfo
tf::TimingFitResult

AnalysisDataProducts - data products associated with an analysis result

cosrej::CosRejObj
nuesand::NueSandObj
numusand::NumuSandObj
numue::NumuE

RecoDataProducts - data products that are not RecoBase but still used for reconstruction

jmshower::JMShower
me::MichelEInfo
me::MichelECluster
kalm::KTrackSimple
kalm::KTrackStateSimple
kalm::KTrackStateBase
kalm::KTrackRecoInfo
kalm::KShowerRecoInfo
kalm::KShower

We will have to verify that the dependencies of all these data products are limited to Simulation, RecoBase, Geometry and other very foundational packages, and that there are no circular dependencies among the packages above. Separating the data products from the algorithms should make that last check easy.
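The first of those checks could be automated along the following lines. This is only a sketch: the allowed set contains just the three packages named above (the "other very foundational packages" would need to be added), and the dependency lists are invented for illustration rather than taken from the code.

```python
from typing import Dict, Set

# Packages the new data product packages may depend on. Only the three
# named in this ticket are listed; extend as needed.
FOUNDATIONAL: Set[str] = {"Simulation", "RecoBase", "Geometry"}

def disallowed_dependencies(
    deps: Dict[str, Set[str]], allowed: Set[str] = FOUNDATIONAL
) -> Dict[str, Set[str]]:
    """Map each package to the dependencies it has outside the allowed set."""
    offenders: Dict[str, Set[str]] = {}
    for pkg, pkg_deps in deps.items():
        extra = set(pkg_deps) - allowed
        if extra:
            offenders[pkg] = extra
    return offenders

# Hypothetical dependency lists for two of the proposed packages.
proposed = {
    "PIDDataProducts": {"RecoBase"},
    "RecoDataProducts": {"RecoBase", "Geometry", "KalmanFilter3D"},
}

offenders = disallowed_dependencies(proposed)
print(offenders)
```

Any package that shows up in the result still pulls in something beyond the foundational set and would need to be untangled before the move.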

History

#1 Updated by Christopher Backhouse almost 6 years ago

Things that I think we can just remove instead of moving, and things I think may have been misclassified:

CAFReweight (Need to ensure that this one only ever contains data products, currently only CAFReweight product)

I have no idea why any of these things are data products. CAFReweight is a simple interface to NuReweight, with nothing that should ever be serialized.

CellToCell

This is Tim Kutnick's old drift corrections code; we don't use it any more.

ChannelInfo

Sounds like Jon just wants to get rid of all these products

MCCheater:

defines dictionary for:

cheat::SimHit
cheat::SimTrack

These were obsoleted long ago when BackTracker was first written. I don't believe anyone is using them anymore.

These products should probably go into Simulation:

murem::MRCCParent

Really? This is mostly reco information. The "parent" is the slice that had the muon removed.

DetPerformanceDataProducts - holds data products associated with the detector state

tf::TimingFitResult

I don't think this is associated with detector performance. This is a timing fit for the direction of a single track; it probably belongs in RecoDataProducts.

#2 Updated by Christopher Backhouse almost 6 years ago

General thoughts and questions:

The risk of circular dependencies must be much lower for reco and analysis code, where the modules tend to be basically independent and always run in a specific order in the reco job, whereas the basic services have a tendency to get all tangled up.

When these products have matching cxx files (I think all of them must have at least constructors, and some have pretty significant logic), are those left in the original package to be built into their library? Only the header (and default constructor?) is needed to generate the dict, right?

Some of these proposed packages are pretty broad and mix code from a lot of different sources with pretty different goals. Is it possible to do something like have packages have a prods/ directory where they put data product definitions, and do something in the makefiles to get these built early and the main packages built later?
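To make the prods/ idea concrete, here is a small sketch of the build ordering it implies, using Python's standard graphlib (3.9+). The package names and edges are placeholders, and "PackageA/prods" is just a label for the products part of a package, not an existing build target.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Optional

def build_order(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return a build order with dependencies first, or None if there is a cycle."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError:
        return None

# Hypothetical: each package needs a data product defined in the other.
mixed = {"PackageA": ["PackageB"], "PackageB": ["PackageA"]}

# Same packages with the products split into separately built prods/ parts.
split = {
    "PackageA/prods": [],
    "PackageB/prods": [],
    "PackageA": ["PackageA/prods", "PackageB/prods"],
    "PackageB": ["PackageA/prods", "PackageB/prods"],
}

print(build_order(mixed))
print(build_order(split))
```

This only works if the prods/ parts themselves depend on nothing but foundational packages; a product header that includes algorithm code from its parent package would reintroduce the cycle.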

What happens to the svn history when files move between packages?

#3 Updated by Brian Rebel almost 6 years ago

Actually, there is still a high risk of circular dependencies between reconstruction and analysis code. We saw that happen in LArSoft - people see something useful in one package and link to it in another and you get these dependencies. That is especially true with data products.

Putting data products in a few packages makes it possible to find all the analysis or reconstruction products easily, rather than having to hunt for them, and it also prevents possible circular dependencies. I don't think there is a reason to keep data products in very specific packages. Data products should be available for multiple algorithms and modules.

The svn history moves with the file if one uses the "svn mv" command to move the file.
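For a bulk move, the commands could be generated from a table of source and destination paths, for example as below. The file names here are guesses for illustration only; the actual header and source file names in each package would need to be checked.

```python
from typing import Dict, List

def svn_mv_commands(moves: Dict[str, str]) -> List[List[str]]:
    """Build one 'svn mv SRC DST' argument list per file to move."""
    return [["svn", "mv", src, dst] for src, dst in moves.items()]

# Hypothetical file locations; only the package names come from this ticket.
moves = {
    "MichelEFilter/MichelEInfo.h": "RecoDataProducts/MichelEInfo.h",
    "TimingFit/TimingFitResult.h": "RecoDataProducts/TimingFitResult.h",
}

commands = svn_mv_commands(moves)
for cmd in commands:
    print(" ".join(cmd))
```

Each argument list can be passed to subprocess.run(cmd, check=True) from inside a working copy once the paths have been verified.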

#4 Updated by Brian Rebel almost 6 years ago

Christopher Backhouse wrote:

Things that I think we can just remove instead of moving, and things I think may have been misclassified:

CAFReweight (Need to ensure that this one only ever contains data products, currently only CAFReweight product)

I have no idea why any of these things are data products. CAFReweight is a simple interface to NuReweight, with nothing that should ever be serialized.

OK, we should be sure of whether it needs to be a data product or not

CellToCell

This is Tim Kutnick's old drift corrections code; we don't use it any more.

That is what I thought, so I am fine with removing it.

ChannelInfo

Sounds like Jon just wants to get rid of all these products

Also fine by me.

MCCheater:

defines dictionary for:

cheat::SimHit
cheat::SimTrack

These were obsoleted long ago when BackTracker was first written. I don't believe anyone is using them anymore.

Yes, well, Steve M might disagree with you.

These products should probably go into Simulation:

murem::MRCCParent

Really? This is mostly reco information. The "parent" is the slice that had the muon removed.

It is? Then it should move into the RecoDataProducts.

DetPerformanceDataProducts - holds data products associated with the detector state

tf::TimingFitResult

I don't think this is associated with detector performance. This is a timing fit for the direction of a single track; it probably belongs in RecoDataProducts.

Fine

#5 Updated by Alexander Himmel about 3 years ago

  • Status changed from New to Rejected

