Feature #6315

Detect stopped FEBs and recover with Sync

Added by Peter Shanahan over 5 years ago. Updated over 5 years ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Start date: 05/21/2014
Due date:
% Done: 0%
Estimated time:
Duration:

Description

Background:

FEBs with too high a triggered hit rate can overflow their output buffers, which stops the production of hit data. Data flow can be recovered by issuing a sync from the timing system (which I think issues a start DAQ; is that the critical part?).

We do not want to issue periodic syncs blindly, since there is a non-negligible probability of a sync tripping up a DCM.

We therefore need to issue a sync specifically when we detect that an FEB has shut off.

The path forward that seems to involve the least new coding is:

  1. DCMApp checks bit 4 in the microslice header for "FEB off".
  2. If the bit is set, the DCM issues a warning message to the effect of "FEB Buffer Overflow Shutoff Detected".
  3. MessageAnalyzer has a condition to detect this message, and a rule to request that Run Control issue a sync.
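
Steps 1 and 2 above can be sketched roughly as follows. This is only an illustration in Python, not DCMApp code: the microslice header layout is not given in this ticket, so the sketch assumes the header is available as a single integer word and that "bit 4" is counted from the least significant bit, starting at 0. The names `FEB_OFF_BIT`, `feb_off_flag_set`, and `check_microslice` are hypothetical.

```python
from typing import Optional

# Assumption: "bit 4" is counted from the least significant bit, starting at 0.
FEB_OFF_BIT = 4
FEB_OFF_MASK = 1 << FEB_OFF_BIT

# Warning text as proposed in step 2 of this ticket.
FEB_OFF_WARNING = "FEB Buffer Overflow Shutoff Detected"


def feb_off_flag_set(header_word: int) -> bool:
    """Return True if the assumed "FEB off" bit is set in the header word."""
    return (header_word & FEB_OFF_MASK) != 0


def check_microslice(header_word: int) -> Optional[str]:
    """Return the warning text when the flag is set, otherwise None."""
    if feb_off_flag_set(header_word):
        return FEB_OFF_WARNING
    return None
```

If the real header uses a different word or bit ordering, only `FEB_OFF_MASK` and the way `header_word` is extracted would need to change.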

Potential Issues:

  1. We believe this is implemented in the 2E/2D FEB V4 firmware, but are not 100% sure.
  2. This requires the 11Dec13 DCM firmware, which is in use at NDOS but not yet at FarDet.
  3. Run Control needs to hold off for several seconds after issuing a sync before responding to Message Analyzer, or else we risk an infinite sync loop. This is in HEAD, but we're using a branch on FarDet.
    • Maybe not, though, since the infinite loop comes from rules that trigger a sync based on the corruption messages that often follow a sync. Leaving those rules off may avoid this problem.
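
The hold-off described in issue 3 could look something like the sketch below. It is not the Run Control implementation in HEAD, which is not shown in this ticket. The class name `SyncHoldoff`, its method, and the 5-second default are all assumptions chosen for illustration ("several seconds" is the only figure given here). The clock is injectable so the behaviour can be checked without waiting.

```python
import time
from typing import Callable, Optional


class SyncHoldoff:
    """Reject sync requests that arrive too soon after the previous sync."""

    def __init__(self, holdoff_s: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        # holdoff_s = 5.0 is a placeholder for "several seconds".
        self.holdoff_s = holdoff_s
        self._clock = clock
        self._last_sync: Optional[float] = None

    def request_sync(self) -> bool:
        """Return True if a sync should be issued now, False if held off."""
        now = self._clock()
        if self._last_sync is not None and now - self._last_sync < self.holdoff_s:
            return False  # still inside the hold-off window, ignore the request
        self._last_sync = now
        return True
```

With a hold-off like this, corruption messages that arrive shortly after a sync cannot immediately trigger another sync, which is the loop described above.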

History

#1 Updated by Peter Shanahan over 5 years ago

Another detail on DCM app side:

We don't want to be perpetually requesting syncs if there's an FEB that's too hot to stay live, so we would need a configurable parameter:

  • Minimum time between successive warnings (and therefore syncs); default 5 minutes?
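
A minimal sketch of such a parameter, in Python for illustration only. The ticket does not say whether the interval should apply per FEB or per DCM, so this sketch keys it on an arbitrary source identifier and either choice fits. `WarningThrottle`, `should_warn`, and `DEFAULT_MIN_INTERVAL_S` are hypothetical names; the 5-minute default is the tentative value suggested above.

```python
import time
from typing import Callable, Dict, Hashable

# Tentative default from this ticket: 5 minutes.
DEFAULT_MIN_INTERVAL_S = 5 * 60


class WarningThrottle:
    """Allow at most one warning per source within a configurable interval."""

    def __init__(self, min_interval_s: float = DEFAULT_MIN_INTERVAL_S,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.min_interval_s = min_interval_s
        self._clock = clock
        self._last_warned: Dict[Hashable, float] = {}

    def should_warn(self, source: Hashable) -> bool:
        """Return True if a warning for this source may be sent now."""
        now = self._clock()
        last = self._last_warned.get(source)
        if last is not None and now - last < self.min_interval_s:
            return False  # too soon after the previous warning for this source
        self._last_warned[source] = now
        return True
```

Because each warning leads to a sync request, throttling the warning also bounds how often a persistently hot FEB can cause a sync.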

