Prelude

One of the main goals of all these design guidelines is to minimize coupling. Some of the recommendations we provide will make it more difficult to write the initial version of a class. However, minimizing coupling will make it easier to modify these classes in the future, and easier for others to use them. The initial investment in more robust code pays off over the expected lifetime of the code. Since it is impossible to predict accurately which classes will need modification, we recommend a small up-front investment in robustness. Minimizing coupling also helps to avoid circular coupling, which causes significant difficulties in builds.

For extended discussion of some of these issues (and for many additional good suggestions), see C++ Coding Standards by Herb Sutter and Andrei Alexandrescu. This is available in the Fermilab library (http://www-spires.fnal.gov/spires/find/books/www?cl=QA76.73.C153S85::2005).

Data Product Design Guide

Separate algorithm code from data products.

In the general case, object-oriented paradigms (and we) encourage the design and use of "objects that do with what they have" rather than "objects that have and are done to." In the particular case of data persistency in art / ROOT, however, we have an exception caused by the need to transport each product as an entity through different layers and modules of the program. There is a separation-of-concerns issue here: nothing in art's infrastructure code or ROOT needs to know anything whatsoever about the algorithms which operate on the data they are moving and making available. Similarly, algorithms often need to reference multiple products that have no particular reason to be coupled together.

In addition, the data description will change much less frequently than the algorithms. Separation of these concerns improves robustness, compile times and library (and dictionary library) sizes. Also, many different algorithms may need to deal with the same data product but are (and should be) unaware of each other, such as alternative cone-finding or clustering algorithms.

Finally, one should be free to program an algorithm without being bound by the constraints of the persistency / dictionary system.

To summarize: the primary function of a data product is the transport of data between different algorithms (a.k.a. modules) and this should be the principal driver of its structure and couplings. Algorithms should be separated from and use such data products, not be part of them.
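
As a minimal sketch of this separation (the names Hit, HitCollection and totalCharge are hypothetical and do not come from art), the product below holds only data, while the algorithm is a free function that reads the product through a const reference:

```cpp
#include <vector>

namespace mynm {

  // Hypothetical data product: plain value aggregates, no algorithms.
  struct Hit {
    int channel;
    double charge;
  };

  struct HitCollection {
    std::vector<Hit> hits;
  };

  // Hypothetical algorithm, kept outside the product. The persistency
  // layer never needs to know that this function exists.
  inline double totalCharge(HitCollection const & coll)
  {
    double sum = 0.0;
    for (Hit const & h : coll.hits) {
      sum += h.charge;
    }
    return sum;
  }

}
```

A second algorithm operating on HitCollection can be added later without touching the product header or its dictionary.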

Pointers and references are forbidden.

A data product may not have pointers or references (or containers thereof) as data members. Bare pointers as data members in classes are poor class design in any case, as described in Guidelines for the Use of Pointers. References as data members are still more problematic. We suggest containment by value or use of the art class templates art::Ptr<T> or art::PtrVector<T>, as necessary.

Prefer containment to inheritance.

Inheritance is one of the strongest couplings in C++. Prefer containment to inheritance to help reduce the coupling between classes. See Items 34--36 in Sutter & Alexandrescu. Discussions of this subject frequently reference the Liskov Substitution Principle; Wikipedia has a useful description of this at http://en.wikipedia.org/wiki/Liskov_substitution_principle.

For persistent data products, there are possible schema-evolution complications, especially with collections of such items. See Facilitating Schema Evolution for Data Products.
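
The following sketch illustrates containment under assumed names (Point and Trajectory are hypothetical). The class holds its points by value and exposes only the operations it chooses to support, instead of inheriting the whole interface of a container:

```cpp
#include <cstddef>
#include <vector>

namespace mynm {

  struct Point {
    double x;
    double y;
    double z;
  };

  // Discouraged alternative (not compiled here):
  //   class Trajectory : public std::vector<Point> { ... };
  // It would expose the entire std::vector interface and couple every
  // client to the base class.

  // Containment: Trajectory *has* points, it *is not* a vector.
  class Trajectory {
  public:
    Trajectory() { }
    void addPoint(Point const & p);
    std::size_t nPoints() const;
    Point const & point(std::size_t i) const;

  private:
    std::vector<Point> points_;
  };

  inline void Trajectory::addPoint(Point const & p)
  {
    points_.push_back(p);
  }

  inline std::size_t Trajectory::nPoints() const
  {
    return points_.size();
  }

  inline Point const & Trajectory::point(std::size_t i) const
  {
    return points_.at(i); // Throws std::out_of_range on a bad index.
  }

}
```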

Use struct for value aggregates.

If your "class" would have no interesting member functions other than "getters" and "setters", prefer the use of a struct to represent what is actually a value aggregate (and not an abstraction). See Item 41 in Sutter & Alexandrescu.
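
A value aggregate of this kind might look like the sketch below; CaloHit and its members are hypothetical names chosen for illustration:

```cpp
namespace mynm {

  // Hypothetical value aggregate: public data, no getters or setters,
  // and no invariant that would need protecting.
  struct CaloHit {
    int    cellID;
    double energy;
    double time;
  };

}
```

Client code reads and writes the members directly, for example `hit.energy`.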

If member functions are necessary, restrict them to simple accessors if possible.

If you are using a struct to represent a value aggregate, you do not need accessors for data members. Complex behaviors should be avoided in data products. Algorithms should not be embedded in the constructors of data products; they should be external to the data, for the decoupling reasons described above.

Ensure that the meaning of all comparison operators is unambiguous.

Imagine a data class, RawData, that represents the raw data from some subsystem and whose data members are a channel id, a tdc value and an adc value. For such a class there may be up to 15 possible sorting criteria: by channel id, by tdc value, by adc value, by the six possible ordered pairs and by the six possible ordered triplets. Therefore we recommend that, for this class, you do NOT provide bool operator<( const RawData& ) const;. Instead, provide as many of the 15 comparisons as make sense by using appropriately named free functions in the same namespace as the class; for example:


bool lessByChannelID ( const RawData& a, const RawData& b );
bool lessByTDC       ( const RawData& a, const RawData& b );
bool lessByADC       ( const RawData& a, const RawData& b );
bool lessByAll       ( const RawData& a, const RawData& b );

where the last function implements a comparison based on a well documented ordering of the three sort keys. Only if that ordering is also widely understood, even by beginners, should you implement it as bool operator<( const RawData& ) const;.
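
A possible implementation of some of these functions is sketched below. The member names of RawData and the ordering used by lessByAll (channel id, then tdc, then adc) are assumptions made for illustration, and sortByTDC is a hypothetical helper showing how a caller states the criterion explicitly:

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

namespace mynm {

  // Assumed layout; the guideline only names the three quantities.
  struct RawData {
    int channelID;
    int tdc;
    int adc;
  };

  inline bool lessByChannelID ( const RawData& a, const RawData& b )
  {
    return a.channelID < b.channelID;
  }

  inline bool lessByTDC ( const RawData& a, const RawData& b )
  {
    return a.tdc < b.tdc;
  }

  // Assumed documented ordering: channel id first, then tdc, then adc.
  // std::tie builds tuples of references, compared lexicographically.
  inline bool lessByAll ( const RawData& a, const RawData& b )
  {
    return std::tie(a.channelID, a.tdc, a.adc) <
           std::tie(b.channelID, b.tdc, b.adc);
  }

  // Hypothetical helper: the sort criterion is visible at the call site.
  inline void sortByTDC ( std::vector<RawData>& data )
  {
    std::sort(data.begin(), data.end(), lessByTDC);
  }

}
```

Because each comparison has a descriptive name, a reader of the calling code does not have to look up which of the 15 orderings is in use.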

Allow compiler-generated destructor, copy, move where possible.

If your data product is simple, as it should be (i.e. no pointers or references), then explicit provision of destructor, copy constructor, copy assignment, move constructor, and move assignment is not necessary ("Rule of Five"). If for any reason (see footnote 1) you provide one of them, the compiler may no longer generate all of the others automatically (a user-declared destructor, for example, suppresses the implicit move operations), so you should provide all five. This will allow the framework to be as efficient as possible when moving your data products around, using move semantics wherever possible.

If you do need to provide any or all of the above-mentioned member functions, it should be sufficient to ask the compiler explicitly to generate them:

 /* virtual */ ~MyProduct() { }
#ifndef __GCCXML__
 MyProduct(MyProduct const &) = default; // Copy constructor.
 MyProduct(MyProduct &&) = default; // Move constructor.
 MyProduct & operator = (MyProduct const &) = default; // Copy assignment.
 MyProduct & operator = (MyProduct &&) = default; // Move assignment.
#endif

Note that the destructor must be visible to gccxml (see below). The other methods may be hidden from gccxml in their entirety. It is far easier, therefore, to avoid inheritance and other complications and rely on the compiler to implement these functions without being asked.
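
The effect of declaring only one of the five functions can be checked at compile time. The sketch below uses two hypothetical classes; it relies on std::vector's move constructor being noexcept, which the standard guarantees from C++17 and which common library implementations already provided earlier:

```cpp
#include <type_traits>
#include <vector>

namespace mynm {

  // Hypothetical product declaring none of the five special members:
  // the compiler generates all of them, including the move operations.
  class SimpleProduct {
  public:
    SimpleProduct() { }

  private:
    std::vector<double> values_;
  };

  // Hypothetical product declaring only a destructor. The implicit move
  // operations are suppressed, so a "move" silently becomes a copy.
  class DtorOnlyProduct {
  public:
    DtorOnlyProduct() { }
    ~DtorOnlyProduct() { }

  private:
    std::vector<double> values_;
  };

}

// Moving SimpleProduct uses the generated move constructor (non-throwing).
static_assert(std::is_nothrow_move_constructible<mynm::SimpleProduct>::value,
              "SimpleProduct should have a non-throwing move constructor");

// "Moving" DtorOnlyProduct falls back to the copy constructor, which may throw.
static_assert(!std::is_nothrow_move_constructible<mynm::DtorOnlyProduct>::value,
              "DtorOnlyProduct is expected to fall back to copying");
```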

Issues mostly related to ROOT.

Minimize the memory footprint of dictionaries.

ROOT requires a public default constructor and a public destructor.

  • If the correct implementation of your destructor would be empty, it is safe to allow the compiler to generate the destructor for you (although see above).
  • While the compiler can, under the correct circumstances, generate a default constructor, in order to avoid possible difficulties for classes which have additional (non-default) constructors, we suggest explicitly implementing a default constructor.

Hide all methods except the default constructor and the destructor from gccxml. The art framework never uses the dictionary functions that would be generated if gccxml saw the other methods. To avoid wasting memory, we recommend not generating dictionaries for any methods except those needed for persistence (which is why the default constructor and the destructor must have dictionary entries).

Inline function implementations should be outside the class definition.

This makes it possible to hide inline definitions from gccxml en bloc, including any header includes necessary for these definitions.

#include "MyContainedClass.h" 
#include "MyPath/Thing.h" // Should also be simple.

#include <vector>

class MyProduct {
public:
  MyProduct() { }
  // The compiler-generated destructor is correct, so
  // we do not write one.
#ifndef __GCCXML__
  void addThing(mynm::Thing const & thing);

  void addThing(mynm::Thing && thing);

  ...
#endif

private:
  std::vector<mynm::Thing> things_;
};

#ifndef __GCCXML__
#include "pkg/MyHeader.hh"
#include <utility> // For std::move.

inline
void
MyProduct::addThing(mynm::Thing && thing)
{
  things_.emplace_back(std::move(thing));
}

inline
void
MyProduct::addThing(mynm::Thing const & thing)
{
  things_.emplace_back(thing);
}
#endif

Keep header inclusions in the data product header to an absolute minimum.

  • Use forward declarations wherever possible (pointer or reference use in interfaces requires only knowledge of the name of a type, not its definition).
    #ifndef __GCCXML__
    namespace mynm {
      // Forward declaration of class MyType. GCCXML won't see the
      // function, so it doesn't need to see the type, either.
      class MyType;
    }
    #endif
    
    class MyProd {
    public:
      ...
    #ifndef __GCCXML__
      void fillMyType(mynm::MyType &) const;
    #endif
      ...
    };
    
    #ifndef __GCCXML__
    // Include the full header for MyType since we now need to know
    // the actual definition in order to fill it.
    #include "MyPath/MyType.h"
    
    inline
    void
    MyProd::fillMyType(mynm::MyType & mt) const
    {
      mt.method(...);
      ...
    }
    #endif

Prefer façade classes to transient data members.

If derivative information is expensive to calculate and should be cached as transient data, this should be done via a façade class. This is actually the most robust way to hide transient members from the I/O system (and the only way for non-ROOT I/O systems currently in development). This is a clean way of separating concerns, and leaves one with a "clean" and easy to read data product header.

  • Reading:
    class ReadFacade {
    public:
      explicit ReadFacade(Product const & prod) : prod_(prod) { }
      ...
    
    private:
      Product const & prod_;
    
      // Other cached derivative members.
    };
  • Writing:
    class WriteFacade : public ReadFacade {
    public:
      explicit WriteFacade(Product & prod) : ReadFacade(prod), prod_(prod) { }
      ...
    
    private:
      Product & prod_;
    
      ...
    };

These façades are cheap to construct and destroy on the stack, and can completely isolate whole chains of headers (e.g. from the geometry service) from gccxml, thereby helping to minimize coupling to the data products themselves in addition to allowing for transient derivative data. Note that cases in which one might need a writing façade are extremely rare; the example is included for completeness.
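
To make the caching idea concrete, here is a minimal sketch of a read façade. Cluster, ClusterReadFacade and totalEnergy are hypothetical names, and the derived quantity is deliberately trivial; in practice it would be something expensive to compute:

```cpp
#include <vector>

namespace mynm {

  // Hypothetical persistent product: data only, no transient members.
  struct Cluster {
    std::vector<double> energies;
  };

  // Hypothetical read façade. The cache lives here, not in the product,
  // so the I/O system never sees it. The façade is not itself a data
  // product, so holding a reference member is acceptable.
  class ClusterReadFacade {
  public:
    explicit ClusterReadFacade(Cluster const & cluster);
    double totalEnergy() const;

  private:
    Cluster const & cluster_;
    mutable bool cached_;   // Has total_ been computed yet?
    mutable double total_;  // Cached derivative value.
  };

  inline
  ClusterReadFacade::ClusterReadFacade(Cluster const & cluster)
    : cluster_(cluster), cached_(false), total_(0.0)
  {
  }

  inline
  double
  ClusterReadFacade::totalEnergy() const
  {
    if (!cached_) {
      // Computed on first request only; later calls reuse the result.
      for (double e : cluster_.energies) {
        total_ += e;
      }
      cached_ = true;
    }
    return total_;
  }

}
```

The façade must not outlive the product it refers to, which is naturally the case when it is created on the stack inside a module's event-processing function.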

Dealing with C++ 2011 with the current ROOT tools.

ROOT is not magic: it has many limitations in its ability to produce dictionaries for data products. These limitations are more pronounced now that we are able to use C++ 2011 features. gccxml, the utility that produces the syntax tree used by genreflex, does not understand C++ 2011 (or C99) features. Therefore any code seen by gccxml must either avoid these modern features or wrap their use with:

#ifndef __GCCXML__
#endif

Because headers are nested, it is important to follow the guidelines above, especially those on minimizing header includes and on using #ifndef __GCCXML__ to hide from gccxml any code and includes it does not need to see.
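
As a small illustration of such wrapping, the hypothetical class below keeps the parts gccxml must see free of C++ 2011 syntax and hides everything else. Samples, add and sum are assumed names:

```cpp
#include <vector>

namespace mynm {

  class Samples {
  public:
    Samples() { } // Visible to gccxml: needed for persistence.

#ifndef __GCCXML__
    // Hidden from gccxml: not needed for persistence.
    void add(int adc);
    int sum() const;
#endif

  private:
    std::vector<int> adc_; // Visible to gccxml: this is the persistent data.
  };

#ifndef __GCCXML__
  inline
  void
  Samples::add(int adc)
  {
    adc_.push_back(adc);
  }

  inline
  int
  Samples::sum() const
  {
    int total = 0;
    // C++ 2011 features (auto, range-based for) are safe inside the guard.
    for (auto const & adc : adc_) {
      total += adc;
    }
    return total;
  }
#endif

}
```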


1 For example, an explicit virtual destructor in a base class; but see the guideline above regarding inheritance.