Cluster::AllCells returns a copy, Cluster::XCells and Cluster::YCells frequently called in a way to require a copy
Frequently Cluster::X(Y)Cells is called in such a way that a copy of the request vector has to be produced rather than using the provided const reference. That is
art::PtrVector<rb::Hit> cells = cluster.XCells();
art::PtrVector<rb::Hit> const& cells = cluster.XCells();
The latter would not require a copy. It turns out that Cluster::X(Y)Cells() is used in only 8 places throughout the code and these changes should be relatively easy.
On the other hand, Cluster::AllCells() has to return a copy of the art::PtrVector<rb::Hit> because it loops over the X and Y cells in the cluster and fills a new vector. If, instead, we made the AllCells vector at the time of the Cluster construction and made use of std::partition to then get the AllCells vector partitioned such that all XCells are at the beginning of the vector and all YCells are at the end. The Cluster object could then return the iterators that bound the 2 partitions as a reference to an std::pair that could be used by the calling code to loop over the X or Y cells. AllCells is called in 135 places in the code, and if it is called inside of nested loops it is seriously slowing our code down.
This change would require some significant digging through the code. It is easy to identify where the methods are called with ack, for instance, but it will take some effort to make the changes and it will require an IO rule to not break files already written to disk.
One option to achieve some performance improvement would be to change the return type of XCells and YCells to be copies of the vectors as those methods are not called frequently, and instead make AllCells return a const reference. Some one would still have to go through the code to ensure that copies are not being made in the code that calls the AllCells method. No change would be required to calls to XCells and YCells since none of those calls actually make use of the const reference anyway. Ug.
#1 Updated by Christopher Backhouse over 5 years ago
This is right in principle. In practice I don't think it matters. I've never seen these calls actually show up in profiles (whatever module's inner loop over the cells themselves tends to dominate). This may be because a PtrVector is pretty lightweight, I think each new Cell is just one more integer index into the product.
Do you have a case where this is shown to cause a significant slowdown?
#3 Updated by Christopher Backhouse over 5 years ago
You might be missing quite a few calls to the individual views via Cell(view, int), which is better than CellX/CellY if you're doing the same thing for each view, since it's easier to include in a loop. A bit harder to grep for though.
I think that's everything now. Apart from to mention that I did consider other schemes when implementing this but decided on the simplest one so long as we didn't have data showing it was causing real performance problems.
#4 Updated by Brian Rebel over 5 years ago
I think it is worth considering making the change to provide good examples for people how to use the code. These bad practices are being copied a lot. It is also probably worth at least adding some comments to the code based on your proposed best practices, i.e. iterating using the NCells and Cells methods.
It is also worth noting that the use case appears to be somewhat different from the one you expected, i.e. people tend to call AllCells more frequently than XCells or YCells, so going behind the scenes to make AllCells return a const reference and the others do the copies would improve the situation without altering the user experience much at all.
Just because we are getting away with bad practices, doesn't mean we shouldn't try to clean them up when we find them. That being said, I am not volunteering to do this just now.
#5 Updated by Christopher Backhouse over 5 years ago
I looked through all the instances of AllHits() and fixed the easy cases. Many of the callers are passing the hits off to BackTracker, or creating a new rb product from them, so there's nothing to be done right now.
I saw some legitimate uses of getting the full concrete list, including wanting to sort the hits in a custom order, and wanting to edit the hit list.