Feature #3978

Rework the handling of enqueue timeouts

Added by Kurt Biery over 6 years ago. Updated over 5 years ago.

Status: Closed
Priority: Normal
Assignee: -
Target version: -
Start date: 06/04/2013
Due date:
% Done: 0%
Estimated time:
Duration:

Description

There are a number of places in the artdaq and ds50daq code (e.g. EventStore.cc and Aggregator.cc) that push events onto the GlobalQueue, which provides events to the art thread. At the moment, the size of the queue is generally set to 20 and the timeout for pushing a new event onto this queue is set to 5 seconds. However, we don't have a good model for what to do when the timeout expires: currently the event is dropped on the floor and a MessageFacility message is generated. It would be much better not to drop data, but we also want to avoid getting into a retry loop that we can't break out of (for example, if the run is ended).

Resolving this issue should include surveying the artdaq and ds50daq code bases for all places where we enqueue events, and making sure that failures and timeouts are handled well in each of them.
--Kurt
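
One possible shape for the "retry without dropping, but stay interruptible" behavior described above is sketched here. This is only an illustration, not the artdaq or ds50daq implementation: `BoundedQueue`, `tryPushFor`, `tryPop`, `enqueueUntilStopped`, and the `stopRequested` flag are all hypothetical names, and the real GlobalQueue interface may differ. The idea is to split one long timeout into short slices and check a stop flag between slices, so that the event is kept until it is enqueued or the run is ended.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>

// Hypothetical stand-in for a bounded event queue (NOT the GlobalQueue API).
template <typename T>
class BoundedQueue {
public:
  explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {
    assert(capacity > 0);
  }

  // Waits up to `timeout` for free space. Returns false on timeout, in which
  // case `item` is left untouched so the caller can retry with it.
  bool tryPushFor(T& item, std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(mutex_);
    bool hasSpace = notFull_.wait_for(
        lock, timeout, [this] { return items_.size() < capacity_; });
    if (!hasSpace) return false;
    items_.push_back(std::move(item));
    return true;
  }

  // Non-blocking pop; returns false if the queue is empty.
  bool tryPop(T& out) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (items_.empty()) return false;
    out = std::move(items_.front());
    items_.pop_front();
    notFull_.notify_one();  // wake one producer waiting for space
    return true;
  }

  std::size_t size() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return items_.size();
  }

private:
  const std::size_t capacity_;
  mutable std::mutex mutex_;
  std::condition_variable notFull_;
  std::deque<T> items_;
};

enum class EnqueueResult { Enqueued, Aborted };

// Retries in short time slices until the push succeeds or a stop is
// requested (assumed to be set when the run ends). The event is never
// dropped here: on Aborted the caller still owns `item`.
template <typename T>
EnqueueResult enqueueUntilStopped(BoundedQueue<T>& queue, T& item,
                                  const std::atomic<bool>& stopRequested,
                                  std::chrono::milliseconds slice) {
  while (!stopRequested.load()) {
    if (queue.tryPushFor(item, slice)) return EnqueueResult::Enqueued;
    // A real caller would probably log a rate-limited warning here.
  }
  return EnqueueResult::Aborted;
}
```

With this shape, a caller would pass the event, a shared stop flag, and a slice much shorter than the current 5 second timeout. Temporary backpressure is then absorbed by retrying, while an end-of-run request breaks the loop after at most one slice. What to do with the event on `Aborted` (flush, log, or discard) is left to the caller and is not decided by this sketch.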

History

#1 Updated by Kurt Biery over 6 years ago

Just a comment on this issue: in a long test run (~45 hours) of V1495 firmware 2b, there were instances in which events failed to be enqueued. The run continued fine after these errors, so the backpressure seems to have been temporary. Still, this shows that the issue comes up even in cases where there has been no catastrophic failure.

#2 Updated by Kurt Biery over 5 years ago

  • Status changed from New to Closed

This is a duplicate of many other Issues that discuss this problem, and it is fixed in artdaq v1_07_00 and ds50daq v1_01_00.


