Project

General

Profile

Feature #21717

It would be nice for ContainerFragmentLoader/Fragment/QuickVec to use fewer memcpy's when adding lots of fragments to a Container

Added by Kurt Biery about 1 month ago. Updated 18 days ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
-
Target version:
Start date:
01/18/2019
Due date:
% Done:

0%

Estimated time:
Experiment:
-
Co-Assignees:
Duration:

Description

This Issue is very much related to Issues 21267, 21640, and 21641. It was during the testing of the fixes for those three Issues that I noticed some residual slowness in building the Container fragment when using the circular_buffer_mode_example simple_test_config.

Actually, I was simply trying to measure the time taken when building Containers when I stumbled onto this issue.

In my testing, I used the following software:
  • artdaq_utilities branch feature/AggregateMetricsInMetricManager
  • artdaq_core branch bugfix/ContainerFragmentLoader_MemoryLeak
  • artdaq branch bugfix/CFG_ExplicitlyTrackDataBufferSizeBytes
the following command:
  • sh ./run_demo.sh --config circular_buffer_mode_example --bootfile `pwd`/artdaq-utilities-daqinterface/simple_test_config/circular_buffer_mode_example/boot.txt --comps component01 component03 --runduration 100 --no_om --partition=4
I enabled the following extra TRACE levels:
  • component03_CommandableFragmentGenerator 9, 20
  • ContainerFragmentLoader 4, 20
  • QuickVec 13
and I often used the following tshow command:
  • tshow | egrep 'component03|QuickVec|ContainerFragmentLoader' | tdelta -ct 1 | more
In the circular_buffer_mode_example config, I modified the following files:
  • DataLogger1.fcl to turn off ROOT compression for the data files (compressionLevel: 0)
  • component03_hw_cfg.fcl to modify the size of the fragments (nADCcounts: 10) and the size of the request window (request_window_offset: 2000 request_window_width: 2000)
    [Note that the default event rate with this configuration is 1 Hz.]

Without modifying any software, I found that the maximum nADCcounts that could be used, without causing warning messages about timeouts waiting to close request windows, was around 650-700 (keeping the request window at 2000).

With nADCcounts set to 600, I found that the time to build the ContainerFragment in component03 with 2000 sub-fragments was about 700 msec.

[biery@mu2edaq01 331Demo]$ tshow | egrep 'component03|QuickVec|ContainerFragmentLoader' | tdelta -ct 1 | egrep 'BEGIN|Finish' | head -33
   1741 01-18 07:55:37.380171 31824 15515   9           component03_DataSenderManager dbg . Shutting down DataSenderManager BEGIN
   1961 01-18 07:55:36.878857 31824 15515   9 component03_CommandableFragmentGenerato d09 . Finished Processing Event 105 for fragment_id 2.
   5166 01-18 07:55:36.439562 31824 21398  22 component03_CommandableFragmentGenerato dbg . Stopping Request receiver thread BEGIN
   5167 01-18 07:55:36.439530 31824 21398  22             component03_BoardReaderCore dbg . Stopping CommandableFragmentGenerator BEGIN
  14020 01-18 07:55:36.139011 31824 15515   9 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
  15213 01-18 07:55:35.887201 31824 15515   6 component03_CommandableFragmentGenerato d09 . Finished Processing Event 104 for fragment_id 2.
  27262 01-18 07:55:35.138784 31824 15515   6 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
  28457 01-18 07:55:34.864581 31824 15515   7 component03_CommandableFragmentGenerato d09 . Finished Processing Event 103 for fragment_id 2.
  40504 01-18 07:55:34.138895 31824 15515   7 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
  41694 01-18 07:55:33.871851 31824 15515   9 component03_CommandableFragmentGenerato d09 . Finished Processing Event 102 for fragment_id 2.
  53746 01-18 07:55:33.138925 31824 15515   9 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
  54933 01-18 07:55:32.881329 31824 15515   7 component03_CommandableFragmentGenerato d09 . Finished Processing Event 101 for fragment_id 2.
  66987 01-18 07:55:32.138963 31824 15515   7 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
  68177 01-18 07:55:31.880698 31824 15515  11 component03_CommandableFragmentGenerato d09 . Finished Processing Event 100 for fragment_id 2.
  80232 01-18 07:55:31.138965 31824 15515  11 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
  81422 01-18 07:55:30.872444 31824 15515   8 component03_CommandableFragmentGenerato d09 . Finished Processing Event 99 for fragment_id 2.
  93473 01-18 07:55:30.138895 31824 15515   8 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
  94666 01-18 07:55:29.869052 31824 15515   6 component03_CommandableFragmentGenerato d09 . Finished Processing Event 98 for fragment_id 2.
 106713 01-18 07:55:29.138894 31824 15515   6 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
 107905 01-18 07:55:28.877343 31824 15515  11 component03_CommandableFragmentGenerato d09 . Finished Processing Event 97 for fragment_id 2.
 119955 01-18 07:55:28.138993 31824 15515  11 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
 121147 01-18 07:55:27.898356 31824 15515  10 component03_CommandableFragmentGenerato d09 . Finished Processing Event 96 for fragment_id 2.
 133197 01-18 07:55:27.138900 31824 15515  10 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
 134387 01-18 07:55:26.893648 31824 15515  11 component03_CommandableFragmentGenerato d09 . Finished Processing Event 95 for fragment_id 2.
 146437 01-18 07:55:26.138838 31824 15515  11 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
 147628 01-18 07:55:25.884141 31824 15515   8 component03_CommandableFragmentGenerato d09 . Finished Processing Event 94 for fragment_id 2.
 159680 01-18 07:55:25.138925 31824 15515   8 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
 160867 01-18 07:55:24.872740 31824 15515   7 component03_CommandableFragmentGenerato d09 . Finished Processing Event 93 for fragment_id 2.
 172920 01-18 07:55:24.139693 31824 15515   7 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
 174109 01-18 07:55:23.878669 31824 15515   7 component03_CommandableFragmentGenerato d09 . Finished Processing Event 92 for fragment_id 2.
 186162 01-18 07:55:23.138960 31824 15515   7 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
 187352 01-18 07:55:22.874284 31824 15515   7 component03_CommandableFragmentGenerato d09 . Finished Processing Event 91 for fragment_id 2.
 199403 01-18 07:55:22.138823 31824 15515   7 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN

If I modified the CommandableFragmentGenerator code to pre-allocate sufficient capacity in the Fragment to avoid re-allocations, this went down to about 70-75 msec.

[biery@mu2edaq01 Application]$ git diff .
diff --git a/artdaq/Application/CommandableFragmentGenerator.cc b/artdaq/Application/CommandableFragmentGenerator.cc
index 67fd0e62..605284d8 100644
--- a/artdaq/Application/CommandableFragmentGenerator.cc
+++ b/artdaq/Application/CommandableFragmentGenerator.cc
@@ -951,6 +951,7 @@ void artdaq::CommandableFragmentGenerator::applyRequestsWindowMode(artdaq::Fragm
                        TLOG(TLVL_DEBUG) << "applyRequests: Creating ContainerFragment for Window-requested Fragments";
                        frags.emplace_back(new artdaq::Fragment(req->first, fragment_id()));
                        frags.back()->setTimestamp(ts);
+                       frags.back()->reserve(350000);
                        ContainerFragmentLoader cfl(*frags.back());

                        // In the spirit of NOvA's MegaPool: (RS = Request start (min), RE = Request End (max))

[biery@mu2edaq01 331Demo]$ tshow | egrep 'component03|QuickVec|ContainerFragmentLoader' | tdelta -ct 1 | egrep 'BEGIN|Finish' | head -33
   1784 01-18 08:05:46.253652 18172  2712   2           component03_DataSenderManager dbg . Shutting down DataSenderManager BEGIN
   1805 01-18 08:05:46.236301 18172  7720  23 component03_CommandableFragmentGenerato dbg . Stopping Request receiver thread BEGIN
   1806 01-18 08:05:46.236281 18172  7720  23             component03_BoardReaderCore dbg . Stopping CommandableFragmentGenerator BEGIN
   2722 01-18 08:05:45.593431 18172  2712   0 component03_CommandableFragmentGenerato d09 . Finished Processing Event 103 for fragment_id 2.
  12732 01-18 08:05:45.518591 18172  2712   0 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
  13963 01-18 08:05:44.591715 18172  2712   1 component03_CommandableFragmentGenerato d09 . Finished Processing Event 102 for fragment_id 2.
  23974 01-18 08:05:44.518759 18172  2712   1 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
  25203 01-18 08:05:43.591909 18172  2712   1 component03_CommandableFragmentGenerato d09 . Finished Processing Event 101 for fragment_id 2.
  35214 01-18 08:05:43.519516 18172  2712   1 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
  36446 01-18 08:05:42.593782 18172  2712   1 component03_CommandableFragmentGenerato d09 . Finished Processing Event 100 for fragment_id 2.
  46459 01-18 08:05:42.518755 18172  2712   1 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
  47687 01-18 08:05:41.589687 18172  2712   4 component03_CommandableFragmentGenerato d09 . Finished Processing Event 99 for fragment_id 2.
  57699 01-18 08:05:41.518690 18172  2712   4 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
  58927 01-18 08:05:40.592205 18172  2712  15 component03_CommandableFragmentGenerato d09 . Finished Processing Event 98 for fragment_id 2.
  68939 01-18 08:05:40.518732 18172  2712   3 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
  70168 01-18 08:05:39.591333 18172  2712   2 component03_CommandableFragmentGenerato d09 . Finished Processing Event 97 for fragment_id 2.
  80178 01-18 08:05:39.518719 18172  2712   2 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
  81409 01-18 08:05:38.591847 18172  2712   5 component03_CommandableFragmentGenerato d09 . Finished Processing Event 96 for fragment_id 2.
  91420 01-18 08:05:38.518705 18172  2712   5 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
  92648 01-18 08:05:37.589725 18172  2712   3 component03_CommandableFragmentGenerato d09 . Finished Processing Event 95 for fragment_id 2.
 102660 01-18 08:05:37.518601 18172  2712   3 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
 103890 01-18 08:05:36.592303 18172  2712   3 component03_CommandableFragmentGenerato d09 . Finished Processing Event 94 for fragment_id 2.
 113900 01-18 08:05:36.518650 18172  2712   3 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
 115131 01-18 08:05:35.590101 18172  2712   5 component03_CommandableFragmentGenerato d09 . Finished Processing Event 93 for fragment_id 2.
 125142 01-18 08:05:35.518630 18172  2712   5 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
 126370 01-18 08:05:34.591598 18172  2712  16 component03_CommandableFragmentGenerato d09 . Finished Processing Event 92 for fragment_id 2.
 136382 01-18 08:05:34.518663 18172  2712  16 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
 137610 01-18 08:05:33.591858 18172  2712  16 component03_CommandableFragmentGenerato d09 . Finished Processing Event 91 for fragment_id 2.
 147620 01-18 08:05:33.518730 18172  2712   4 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
 148849 01-18 08:05:32.591677 18172  2712   2 component03_CommandableFragmentGenerato d09 . Finished Processing Event 90 for fragment_id 2.
 158862 01-18 08:05:32.518576 18172  2712   2 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
 160089 01-18 08:05:31.589865 18172  2712  15 component03_CommandableFragmentGenerato d09 . Finished Processing Event 89 for fragment_id 2.
 170100 01-18 08:05:31.518659 18172  2712   3 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN

I have some sample changes to ContainerFragmentLoader/Fragment/QuickVec that I'll describe next.


Related issues

Related to artdaq - Bug #21267: Problems seen with large, and non-unique, request windowsReviewed2019-01-08

Related to artdaq - Bug #21640: Memory leak in ContainerFragmentLoaderClosed2019-01-08

Related to artdaq - Feature #21641: Improve performance of metric reporting subsystemReviewed2019-01-08

History

#1 Updated by Kurt Biery about 1 month ago

  • Related to Bug #21267: Problems seen with large, and non-unique, request windows added

#2 Updated by Kurt Biery about 1 month ago

  • Related to Bug #21640: Memory leak in ContainerFragmentLoader added

#3 Updated by Kurt Biery about 1 month ago

  • Related to Feature #21641: Improve performance of metric reporting subsystem added

#4 Updated by Kurt Biery about 1 month ago

With the following candidate code changes in artdaq_core:

[biery@mu2edaq01 artdaq-core]$ git diff .
diff --git a/artdaq-core/Core/QuickVec.hh b/artdaq-core/Core/QuickVec.hh
index aeeabc3..f70fdd3 100644
--- a/artdaq-core/Core/QuickVec.hh
+++ b/artdaq-core/Core/QuickVec.hh
@@ -265,6 +265,18 @@ struct QuickVec
         */
        void resize(size_t size);

+       /**
+        * \brief Resizes the QuickVec and requests additional capacity
+        * \param size New size of the QuickVec
+        * \param growthFactor Factor to use when allocating additional capacity
+        * 
+        * This method updates the size of the QuickVec.  If the new size is within the current
+        * capacity, no realloction takes place.  If not, then the reallocation reserves
+        * additional capacity as a cushion against future needs to reallocate, based
+        * on the specified growth factor.
+        */
+       void resizeWithCushion(size_t size, double growthFactor = 1.3);
+
        /**
         * \brief Resizes the QuickVec, initializes new elements with val
         * \param size New size of the QuickVec
@@ -404,8 +416,8 @@ inline void QUICKVEC::reserve(size_t size)
                //data_ = new TT_[size];
                data_ = (TT_*)QV_MEMALIGN(QV_ALIGN, size * sizeof(TT_));
                memcpy(data_, old, size_ * sizeof(TT_));
-               TRACEN("QuickVec", 13, "QUICKVEC::reserve after memcpy this=%p old=%p data_=%p" 
-                         , (void*)this, (void*)old, (void*)data_);
+               TRACEN("QuickVec", 13, "QUICKVEC::reserve after memcpy this=%p old=%p data_=%p capacity=%d" 
+                       , (void*)this, (void*)old, (void*)data_, (int)size);
                free(old);
                capacity_ = size;
        }
@@ -421,13 +433,29 @@ inline void QUICKVEC::resize(size_t size)
                TT_* old = data_;
                data_ = (TT_*)QV_MEMALIGN(QV_ALIGN, size * sizeof(TT_));
                memcpy(data_, old, size_ * sizeof(TT_));
-               TRACEN("QuickVec", 13, "QUICKVEC::resize after memcpy this=%p old=%p data_=%p" 
-                         , (void*)this, (void*)old, (void*)data_);
+               TRACEN("QuickVec", 13, "QUICKVEC::resize after memcpy this=%p old=%p data_=%p size=%d" 
+                       , (void*)this, (void*)old, (void*)data_, (int)size);
                free(old);
                size_ = capacity_ = size;
        }
 }

+QUICKVEC_TEMPLATE
+inline void QUICKVEC::resizeWithCushion(size_t size, double growthFactor)
+{
+       if (size > capacity_)
+       {
+               size_t new_size = std::round(capacity_ * growthFactor);
+               if (new_size < size) {new_size = size;}
+               if (new_size < 512) {new_size = 512;}
+               else if (new_size < 2048) {new_size = 2048;}
+               else if (new_size < 4096) {new_size = 4096;}
+               else if (new_size < 8192) {new_size = 8192;}
+               reserve(new_size);
+       }
+       resize(size);
+}
+
 QUICKVEC_TEMPLATE
 inline void QUICKVEC::resize(size_type size, TT_ val)
 {
diff --git a/artdaq-core/Data/ContainerFragmentLoader.hh b/artdaq-core/Data/ContainerFragmentLoader.hh
index b9f9ead..8e95180 100644
--- a/artdaq-core/Data/ContainerFragmentLoader.hh
+++ b/artdaq-core/Data/ContainerFragmentLoader.hh
@@ -143,9 +143,9 @@ inline size_t artdaq::ContainerFragmentLoader::words_to_frag_words_(size_t nWord
 inline void artdaq::ContainerFragmentLoader::addSpace_(size_t bytes)
 {
        auto currSize = sizeof(artdaq::Fragment::value_type) * artdaq_Fragment_.dataSize(); // Resize takes into account header and metadata size
-       artdaq_Fragment_.resizeBytes(bytes + currSize);
+       artdaq_Fragment_.resizeBytesWithCushion(bytes + currSize, 1.3);
        reset_index_ptr_(); // Must reset index_ptr after resize operation!
-       TLOG(TLVL_TRACE,"ContainerFragmentLoader") << "addSpace_: dataEnd_ is now at " << (void*)dataEnd_();
+       TLOG(TLVL_TRACE,"ContainerFragmentLoader") << "addSpace_: dataEnd_ is now at " << (void*)dataEnd_() << " (oldSizeBytes/deltaBytes: " << currSize << "/" << bytes << ")";
 }

 inline void artdaq::ContainerFragmentLoader::addFragment(artdaq::Fragment& frag)
diff --git a/artdaq-core/Data/Fragment.hh b/artdaq-core/Data/Fragment.hh
index bd5edc0..ac45777 100644
--- a/artdaq-core/Data/Fragment.hh
+++ b/artdaq-core/Data/Fragment.hh
@@ -426,6 +426,16 @@ public:
         */
        void resizeBytes(std::size_t szbytes);

+       /**
+        * \brief Resize the data payload to hold szbytes bytes (padded by the
+        * 8-byte RawDataTypes, so, e.g., requesting 14 bytes will actually
+        * get you 16) and request additional capacity in the underlying storage
+        * (to help avoid extra reallocations)
+        * \param szbytes The new size of the payload portion of the Fragment, in bytes
+        * \param growthFactor The requested growth factor in the capacity of storage
+        */
+       void resizeBytesWithCushion(std::size_t szbytes, double growthFactor=1.3);
+
        /**
        * \brief Resize the data payload to hold sz bytes (padded by the
         * 8-byte RawDataTypes, so, e.g., requesting 14 bytes will actually
@@ -994,6 +1004,15 @@ artdaq::Fragment::resizeBytes(std::size_t szbytes)
        resize(nwords);
 }

+inline void
+artdaq::Fragment::resizeBytesWithCushion(std::size_t szbytes, double growthFactor)
+{
+       RawDataType nwords = ceil(szbytes / static_cast<double>(sizeof(RawDataType)));
+       vals_.resizeWithCushion(nwords + fragmentHeader()->metadata_word_count +
+                               fragmentHeader()->num_words(), growthFactor);
+       updateFragmentHeaderWC_();
+}
+
 inline
 void
 artdaq::Fragment::resizeBytes(std::size_t szbytes, byte_t v)

the time to build the ContainerFragment in component03 only increased slightly from the situation in which we pre-allocated all of the needed memory (time to build the Container now 70-80 msec):

[biery@mu2edaq01 331Demo]$ tshow | egrep 'component03|QuickVec|ContainerFragmentLoader' | tdelta -ct 1 | egrep 'BEGIN|Finish' | head -33
   1781 01-18 08:28:07.027691 10999 27273   1           component03_DataSenderManager dbg . Shutting down DataSenderManager BEGIN
   1801 01-18 08:28:07.011773 10999   696  19 component03_CommandableFragmentGenerato dbg . Stopping Request receiver thread BEGIN
   1803 01-18 08:28:07.011740 10999   696  19             component03_BoardReaderCore dbg . Stopping CommandableFragmentGenerator BEGIN
   2944 01-18 08:28:06.159095 10999 27273   2 component03_CommandableFragmentGenerato d09 . Finished Processing Event 104 for fragment_id 2.
  12974 01-18 08:28:06.083494 10999 27273   2 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
  14205 01-18 08:28:05.157217 10999 27273   1 component03_CommandableFragmentGenerato d09 . Finished Processing Event 103 for fragment_id 2.
  24234 01-18 08:28:05.083488 10999 27273   1 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
  25463 01-18 08:28:04.160515 10999 27273   2 component03_CommandableFragmentGenerato d09 . Finished Processing Event 102 for fragment_id 2.
  35494 01-18 08:28:04.083512 10999 27273   2 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
  36722 01-18 08:28:03.161145 10999 27273   3 component03_CommandableFragmentGenerato d09 . Finished Processing Event 101 for fragment_id 2.
  46752 01-18 08:28:03.083540 10999 27273   3 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
  47984 01-18 08:28:02.160590 10999 27273   1 component03_CommandableFragmentGenerato d09 . Finished Processing Event 100 for fragment_id 2.
  58014 01-18 08:28:02.083539 10999 27273   1 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
  59244 01-18 08:28:01.159539 10999 27273   0 component03_CommandableFragmentGenerato d09 . Finished Processing Event 99 for fragment_id 2.
  69275 01-18 08:28:01.083456 10999 27273   0 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
  70503 01-18 08:28:00.162934 10999 27273   4 component03_CommandableFragmentGenerato d09 . Finished Processing Event 98 for fragment_id 2.
  80534 01-18 08:28:00.083466 10999 27273   4 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
  81761 01-18 08:27:59.155817 10999 27273   4 component03_CommandableFragmentGenerato d09 . Finished Processing Event 97 for fragment_id 2.
  91791 01-18 08:27:59.083576 10999 27273   4 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
  93021 01-18 08:27:58.159049 10999 27273   1 component03_CommandableFragmentGenerato d09 . Finished Processing Event 96 for fragment_id 2.
 103051 01-18 08:27:58.083625 10999 27273   1 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
 104281 01-18 08:27:57.156822 10999 27273   5 component03_CommandableFragmentGenerato d09 . Finished Processing Event 95 for fragment_id 2.
 114310 01-18 08:27:57.083519 10999 27273   5 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
 115541 01-18 08:27:56.155521 10999 27273   5 component03_CommandableFragmentGenerato d09 . Finished Processing Event 94 for fragment_id 2.
 125570 01-18 08:27:56.083546 10999 27273   5 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
 126800 01-18 08:27:55.157505 10999 27273   2 component03_CommandableFragmentGenerato d09 . Finished Processing Event 93 for fragment_id 2.
 136830 01-18 08:27:55.083622 10999 27273   2 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
 138057 01-18 08:27:54.159200 10999 27273   2 component03_CommandableFragmentGenerato d09 . Finished Processing Event 92 for fragment_id 2.
 148089 01-18 08:27:54.083547 10999 27273   2 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
 149318 01-18 08:27:53.154880 10999 27273   0 component03_CommandableFragmentGenerato d09 . Finished Processing Event 91 for fragment_id 2.
 159346 01-18 08:27:53.083540 10999 27273   0 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN
 160576 01-18 08:27:52.156038 10999 27273   3 component03_CommandableFragmentGenerato d09 . Finished Processing Event 90 for fragment_id 2.
 170607 01-18 08:27:52.083512 10999 27273   3 component03_CommandableFragmentGenerato d09 . applyRequestsWindowMode BEGIN

The number of memcpy's per event in this situation was around 20:

[biery@mu2edaq01 artdaq-core]$ tshow | egrep 'component03|QuickVec|ContainerFragmentLoader' | tdelta -ct 1 | egrep memcpy | more
   4992 01-18 08:28:06.133516 10999 27273   2                                QuickVec d13 . QUICKVEC::reserve after memcpy this=0x7f6e740017b0 old=0x7f6e7429fa00 data_=0x7f6e7400be00 capacity=322587
   6831 01-18 08:28:06.115278 10999 27273   2                                QuickVec d13 . QUICKVEC::reserve after memcpy this=0x7f6e740017b0 old=0x7f6e7412ac00 data_=0x7f6e7429fa00 capacity=248144
   8242 01-18 08:28:06.103962 10999 27273   2                                QuickVec d13 . QUICKVEC::reserve after memcpy this=0x7f6e740017b0 old=0x7f6e7400be00 data_=0x7f6e7412ac00 capacity=190880
   9333 01-18 08:28:06.097094 10999 27273   2                                QuickVec d13 . QUICKVEC::reserve after memcpy this=0x7f6e740017b0 old=0x7f6e74138400 data_=0x7f6e7400be00 capacity=146831
  10169 01-18 08:28:06.092735 10999 27273   2                                QuickVec d13 . QUICKVEC::reserve after memcpy this=0x7f6e740017b0 old=0x7f6e7408e800 data_=0x7f6e74138400 capacity=112947
  10810 01-18 08:28:06.089952 10999 27273   2                                QuickVec d13 . QUICKVEC::reserve after memcpy this=0x7f6e740017b0 old=0x7f6e7400be00 data_=0x7f6e7408e800 capacity=86882
  11306 01-18 08:28:06.088152 10999 27273   2                                QuickVec d13 . QUICKVEC::reserve after memcpy this=0x7f6e740017b0 old=0x7f6e74094a00 data_=0x7f6e7400be00 capacity=66832
  11687 01-18 08:28:06.086972 10999 27273   2                                QuickVec d13 . QUICKVEC::reserve after memcpy this=0x7f6e740017b0 old=0x7f6e74047600 data_=0x7f6e74094a00 capacity=51409
  11983 01-18 08:28:06.086160 10999 27273   2                                QuickVec d13 . QUICKVEC::reserve after memcpy this=0x7f6e740017b0 old=0x7f6e7400be00 data_=0x7f6e74047600 capacity=39545
  12209 01-18 08:28:06.085602 10999 27273   2                                QuickVec d13 . QUICKVEC::reserve after memcpy this=0x7f6e740017b0 old=0x7f6e7404a400 data_=0x7f6e7400be00 capacity=30419
  12380 01-18 08:28:06.085235 10999 27273   2                                QuickVec d13 . QUICKVEC::reserve after memcpy this=0x7f6e740017b0 old=0x7f6e74027000 data_=0x7f6e7404a400 capacity=23399
  12516 01-18 08:28:06.084958 10999 27273   2                                QuickVec d13 . QUICKVEC::reserve after memcpy this=0x7f6e740017b0 old=0x7f6e7400be00 data_=0x7f6e74027000 capacity=17999
  12617 01-18 08:28:06.084739 10999 27273   2                                QuickVec d13 . QUICKVEC::reserve after memcpy this=0x7f6e740017b0 old=0x7f6e74029600 data_=0x7f6e7400be00 capacity=13845
  12698 01-18 08:28:06.084595 10999 27273   2                                QuickVec d13 . QUICKVEC::reserve after memcpy this=0x7f6e740017b0 old=0x7f6e74019400 data_=0x7f6e74029600 capacity=10650
  12829 01-18 08:28:06.084380 10999 27273   2                                QuickVec d13 . QUICKVEC::reserve after memcpy this=0x7f6e740017b0 old=0x7f6e74011200 data_=0x7f6e74019400 capacity=8192
  12895 01-18 08:28:06.084273 10999 27273   2                                QuickVec d13 . QUICKVEC::reserve after memcpy this=0x7f6e740017b0 old=0x7f6e7400d000 data_=0x7f6e74011200 capacity=4096
  12946 01-18 08:28:06.084198 10999 27273   2                                QuickVec d13 . QUICKVEC::reserve after memcpy this=0x7f6e740017b0 old=0x7f6e7400be00 data_=0x7f6e7400d000 capacity=2048
  12962 01-18 08:28:06.084164 10999 27273   2                                QuickVec d13 . QUICKVEC::reserve after memcpy this=0x7f6e740017b0 old=0x7f6e74002600 data_=0x7f6e7400be00 capacity=512
  12966 01-18 08:28:06.083554 10999 27273   2                                QuickVec d13 . QUICKVEC::resize after memcpy this=0x7f6e740017b0 old=0x7f6e74009c00 data_=0x7f6e74002600 size=6
  12967 01-18 08:28:06.083552 10999 27273   2                                QuickVec d13 . QUICKVEC::reserve after memcpy this=0x7f6e740017b0 old=0x7f6e7400aa00 data_=0x7f6e74009c00 capacity=5

Changing the growth factor to 2.0 decreased the number of memcpy's to 11. (this is mainly informational; I'm not sure we want to run the risk of allocating twice as much memory as needed [minus epsilon])

#5 Updated by Kurt Biery about 1 month ago

I've create a branch called feature/ContainerFragmentLoader_FewerMemCpy in the artdaq_core repo to save this code. This branch was forked from bugfix/ContainerFragmentLoader_MemoryLeak, so it already contains the fixes on that branch.

[BTW, I'm not sure that resizeWithCushion() is the most clear name... Maybe resizeWithExtraCapacity()?]

I'm going to run some tests of this new branch at protoDUNE, if I get the chance.

#6 Updated by Kurt Biery about 1 month ago

At protoDUNE, with the earlier software (none of the changes from this Issue, 21267, 21640, and 21641), a Trigger fragment size of 300-550 bytes, and the number of Trigger fragments per Container at 4, the time to fill the Container fragment was about 800 usec.

With the same (old) software and 1200 fragements per container, and the size of the fragments still around 300-500 bytes, the time to fill the Container was on the order of 100s of milliseconds. (Hard to tell exactly because the trigger rate was ~10 Hz, so when the time per Container was more than 100 msec, the system bogged down.)

With the new code from this Issue and Issues 21267 and 21640, the time per Container was 35-50 msec and the number of memcpy calls was around 14

[np04daq@np04-srv-012 dune-artdaq-KurtTest4]$ tshow | egrep 'trigger|QuickVec|ContainerFragmentLoader' | tdelta -ct 1 | egrep 'Finish|BEGIN' | head -33
    272 01-18 18:16:25.546648 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
    284 01-18 18:16:25.546478 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . Finished Processing Event 3128 for fragment_id 0.
   1504 01-18 18:16:25.522982 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
   1564 01-18 18:16:25.441016 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . Finished Processing Event 3127 for fragment_id 0.
   2784 01-18 18:16:25.417591 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
   2851 01-18 18:16:25.313490 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . Finished Processing Event 3125 for fragment_id 0.
   4071 01-18 18:16:25.290172 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
   4082 01-18 18:16:25.290029 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . Finished Processing Event 3126 for fragment_id 0.
   5318 01-18 18:16:25.266550 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
   5361 01-18 18:16:25.189117 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . Finished Processing Event 3124 for fragment_id 0.
   6581 01-18 18:16:25.165616 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
   6650 01-18 18:16:25.061987 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . Finished Processing Event 3123 for fragment_id 0.
   7870 01-18 18:16:25.038439 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
   7881 01-18 18:16:25.038293 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . Finished Processing Event 3122 for fragment_id 0.
   9101 01-18 18:16:25.014740 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
   9150 01-18 18:16:24.961262 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . Finished Processing Event 3120 for fragment_id 0.
  10370 01-18 18:16:24.937603 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
  10381 01-18 18:16:24.937412 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . Finished Processing Event 3121 for fragment_id 0.
  11601 01-18 18:16:24.914011 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
  11658 01-18 18:16:24.836139 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . Finished Processing Event 3119 for fragment_id 0.
  12878 01-18 18:16:24.812631 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
  12949 01-18 18:16:24.708778 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . Finished Processing Event 3118 for fragment_id 0.
  14169 01-18 18:16:24.685411 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
  14180 01-18 18:16:24.685287 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . Finished Processing Event 3117 for fragment_id 0.
  15414 01-18 18:16:24.661767 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
  15458 01-18 18:16:24.608070 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . Finished Processing Event 3115 for fragment_id 0.
  17910 01-18 18:16:24.560919 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
  17943 01-18 18:16:24.507070 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . Finished Processing Event 3113 for fragment_id 0.
  19163 01-18 18:16:24.483373 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
  19174 01-18 18:16:24.483202 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . Finished Processing Event 3114 for fragment_id 0.
  20410 01-18 18:16:24.459727 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
  20452 01-18 18:16:24.382255 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . Finished Processing Event 3112 for fragment_id 0.
  21686 01-18 18:16:24.358792 119834 123853  11 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN

#7 Updated by Kurt Biery 28 days ago

Coming at this from the other direction, I was able to run at a maximum rate of approximately 40 Hz at protoDUNE with the new code from this Issue and Issues 21267 and 21640. This was with 1200 Trigger fragments per event and an average size of each fragment of about 380 bytes (approx 460,000 bytes per Trigger Container fragment).

I didn't run the same test (see what the maximum rate without errors would be) with the old/existing code, but we know that it is less than 10 Hz because I saw that the system was struggling with 10 Hz.

To get a sense of the effect of the code changes in this issue, I look at the max rate that the system could achieve without errors using just the fixes from Issues 21267 and 21640 (I backed out the changes from this Issue). That was around 15-18 Hz.

#8 Updated by Kurt Biery 28 days ago

To follow up on the time to build a container fragment for the Trigger system at protoDUNE with 1200 fragments but without the speed improvements, this looked like it was around 150-180 msec in tests that I ran today:

[np04daq@np04-srv-012 dune-artdaq_artdaq_v3_03_01_plus]$ tshow | egrep 'trigger_0|QuickVec|ContainerFragmentLoader' | tdelta -ct 1 | egrep 'BEGIN|Finish' | head -120 | tail -22
  68398 01-22 18:46:19.995210 45579 46923  14 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
  69081 01-22 18:46:18.489774 45579 46923  14 trigger_0_CommandableFragmentGenerator d09 . Finished Processing Event 538 for fragment_id 0.
  70287 01-22 18:46:18.328223 45579 46923  14 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
  70294 01-22 18:46:18.323036 45579 46923  14 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
  70852 01-22 18:46:17.134191 45579 46923  14 trigger_0_CommandableFragmentGenerator d09 . Finished Processing Event 537 for fragment_id 0.
  72058 01-22 18:46:16.973497 45579 46923  14 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
  72065 01-22 18:46:16.967478 45579 46923  14 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
  72464 01-22 18:46:16.175442 45579 46923  14 trigger_0_CommandableFragmentGenerator d09 . Finished Processing Event 536 for fragment_id 0.
  73670 01-22 18:46:16.016226 45579 46923  14 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
  74070 01-22 18:46:15.222939 45579 46923  14 trigger_0_CommandableFragmentGenerator d09 . Finished Processing Event 535 for fragment_id 0.
  75276 01-22 18:46:15.060446 45579 46923  14 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
  75675 01-22 18:46:14.193306 45579 46923  14 trigger_0_CommandableFragmentGenerator d09 . Finished Processing Event 534 for fragment_id 0.
  76881 01-22 18:46:14.108449 45579 46923  14 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
  77522 01-22 18:46:12.711935 45579 46923  14 trigger_0_CommandableFragmentGenerator d09 . Finished Processing Event 533 for fragment_id 0.
  78728 01-22 18:46:12.550862 45579 46923  14 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
  79047 01-22 18:46:11.956771 45579 46923  14 trigger_0_CommandableFragmentGenerator d09 . Finished Processing Event 532 for fragment_id 0.
  80253 01-22 18:46:11.798033 45579 46923  14 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
  80572 01-22 18:46:11.207246 45579 46923  14 trigger_0_CommandableFragmentGenerator d09 . Finished Processing Event 531 for fragment_id 0.
  81778 01-22 18:46:11.045750 45579 46923  14 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
  82338 01-22 18:46:09.852530 45579 46923  14 trigger_0_CommandableFragmentGenerator d09 . Finished Processing Event 530 for fragment_id 0.
  83544 01-22 18:46:09.692481 45579 46923  14 trigger_0_CommandableFragmentGenerator d09 . applyRequestsWindowMode BEGIN
  83863 01-22 18:46:09.098313 45579 46923  14 trigger_0_CommandableFragmentGenerator d09 . Finished Processing Event 529 for fragment_id 0.

#9 Updated by Eric Flumerfelt 22 days ago

Notes from code review:
QuickVec.hh:

void resizeWithCushion(size_t size, double growthFactor = 1.3);

1.3 seems to be a bit of a "magic number" here. Is this something we should define as a constant somewhere? e.g.

const double QUICKVEC_RECOMMENDED_GROWTH_FACTOR = 1.3;
void resizeWithCushion(size_t size, double growthFactor = QUICKVEC_RECOMMENDED_GROWTH_FACTOR);

ContainerFragmentLoader.hh:

artdaq_Fragment_.resizeBytesWithCushion(bytes + currSize, 1.3);

Since resizeBytesWithCushion has a default for the second argument, and we're using that default, recommend removing second argument.

Fragment.hh:

void resizeBytesWithCushion(std::size_t szbytes, double growthFactor=1.3);

Same as comment in QuickVec.hh

#10 Updated by Eric Flumerfelt 22 days ago

clang-tidy with default checks shows no issues when run with following command line (compile_commands.json generated by MRB build):
clang-tidy -p $MRB_BUILDDIR -header-filter=.* $MRB_SOURCE/artdaq_core/test/Data/ContainerFragment_t.cc

#11 Updated by Eric Flumerfelt 21 days ago

demo run with request_based_dataflow_example shows that Container Fragments are still being filled and transported through the system.

#12 Updated by Eric Flumerfelt 20 days ago

  • Status changed from Assigned to Resolved
  • Co-Assignees Eric Flumerfelt added

#13 Updated by Eric Flumerfelt 20 days ago

  • Status changed from Resolved to Reviewed

#14 Updated by Eric Flumerfelt 18 days ago

  • Target version set to artdaq_core v3_04_09
  • Status changed from Reviewed to Closed


Also available in: Atom PDF