Support #17992

Don't send DPM_PEND right away.

Added by Richard Neswold about 3 years ago. Updated almost 3 years ago.

Status: Closed
Priority: Normal
Category: Data Pool Manager
Target version:
Start date: 10/20/2017
Due date:
% Done: 0%
Estimated time:
Duration:

Description

Back when Andrey was working on the new DPM, he added code that would immediately return DPM_PEND status for every device being read. This was to mimic the behavior of CLIB, where the status was DPM_PEND until the first reading arrived. It forces Java, Javascript, Erlang, and Python clients to ignore those initial status values. Rather than complicate everyone's life, we should change DPM to do the following:

  • When a job is started, no DPM_PENDs are sent out but a timer is started.
  • When the timer expires, DPM_PEND is sent for any devices that haven't yet sent a reading.
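
The two steps above can be sketched in Python. This is only an illustration of the proposed behavior, not DPM's actual code: the `Job` class, its method names, and the string used for `DPM_PEND` are all made up for the example, and time is passed in explicitly so the logic can be exercised without a real timer.

```python
from dataclasses import dataclass, field

DPM_PEND = "DPM_PEND"  # placeholder; stands in for the real status code


@dataclass
class Job:
    """Hypothetical job record: tracks devices that have not yet sent a reading."""
    ref_ids: list
    timeout: float            # seconds; the ticket leaves the actual value open
    started_at: float = 0.0
    waiting: set = field(default_factory=set)
    pend_sent: bool = False

    def __post_init__(self):
        # Starting the job sends nothing; it only records which devices we await.
        self.waiting = set(self.ref_ids)

    def on_reading(self, ref_id, value):
        """A reading arrived: stop waiting on that device and forward the data."""
        self.waiting.discard(ref_id)
        return (ref_id, value)

    def on_tick(self, now):
        """Return the DPM_PEND replies that are due at time `now` (at most once)."""
        if self.pend_sent or now - self.started_at < self.timeout:
            return []
        self.pend_sent = True
        return [(ref_id, DPM_PEND) for ref_id in sorted(self.waiting)]
```

With three devices and a 10 second timer, a device that reports before the timer expires never sees a DPM_PEND; only the silent ones do.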

CLIB should pre-stuff its DPM entries with a DPM_PEND status (it might already do this) which will get overwritten with data when it arrives. Other clients will simply get the data or a DPM_PEND when some time has passed, which is a more natural interface.

The timer could be set to 10 seconds? 30 seconds?

History

#1 Updated by Dennis Nicklaus about 3 years ago

yes, please! If you want to be fancy, you could look at the ftd for a clue as to how long the timer should be.

#2 Updated by Richard Neswold about 3 years ago

Dennis Nicklaus wrote:

If you want to be fancy, you could look at the ftd for a clue as to how long the timer should be.

Sure! Periodic events are easy. For TCLK events, maybe waiting one full supercycle would be appropriate (there may be a set of TCLK events we could attach some knowledge to so we could come up with a better time-out. For instance, $8F would simply be a 1 second timeout.) State events may just have a fixed, longer timeout, like a minute.
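
A rough sketch of how such a time-out could be picked from the request's event, in Python. Everything here is hypothetical: the function name, the event-type strings, and the choice of one period for periodic events are assumptions. Only the 1 second value for $8F and the one minute value for state events come from the comment above, and the supercycle length has to be supplied by the caller.

```python
# Hypothetical table of TCLK events with a known, short repeat interval (seconds).
KNOWN_TCLK_TIMEOUTS = {0x8F: 1.0}

# Fixed, longer timeout for state events, as suggested above.
STATE_TIMEOUT = 60.0


def pend_timeout(event_type, period_s=None, tclk_event=None, supercycle_s=None):
    """Pick how long to wait before reporting DPM_PEND for a request."""
    if event_type == "periodic":
        # Assumption: one full period is enough for a periodic request.
        return period_s
    if event_type == "tclk":
        # Use event-specific knowledge when we have it, else one supercycle.
        return KNOWN_TCLK_TIMEOUTS.get(tclk_event, supercycle_s)
    if event_type == "state":
        return STATE_TIMEOUT
    raise ValueError(f"unknown event type: {event_type!r}")
```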

#3 Updated by Beau Harrison about 3 years ago

Richard Neswold wrote:

Dennis Nicklaus wrote:

If you want to be fancy, you could look at the ftd for a clue as to how long the timer should be.

Sure! Periodic events are easy. For TCLK events, maybe waiting one full supercycle would be appropriate (there may be a set of TCLK events we could attach some knowledge to so we could come up with a better time-out. For instance, $8F would simply be a 1 second timeout.) State events may just have a fixed, longer timeout, like a minute.

I think the supercycle is a potentially arbitrary length. Especially consider NML, which doesn't subscribe to the HEP timeline.
Sorry for the naive question but is it necessary to send a DPM_PEND? From my standpoint I can wait forever without issue. Possibly, wait forever on event and timeout after the event with DPM_PEND?

#4 Updated by Michael Wren about 3 years ago

A question: from the perspective of the Java, Javascript, Erlang, and Python clients, is the data always delivered via a callback function? If so, to me this makes them push-based, rather than pull-based like the CLIB clients.

If they all are, then what makes sense to me is to not send the initial DPM_PEND. Instead, what would be useful to a pull-based programmer would be for DPM to watch for the requested FTD and, if that FTD passes plus 10? seconds without a response, pass an error to the callback function. This would mean that if an event is requested that isn't happening, the callback function would never be called, and for a rare event the callback function would be called at either FTD event+delay or, if the data times out, FTD event+delay+timeout.
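
The timing in this proposal can be written down as a small Python sketch. The function and parameter names are invented for illustration, and the 10 second default simply mirrors the "10?" above; it is not a settled value.

```python
def error_deadline(event_time, delay_s, timeout_s=10.0):
    """Time at which a missing reading would be reported as an error.

    Returns None when the event has not fired, in which case the callback
    is never invoked.
    """
    if event_time is None:
        return None
    return event_time + delay_s + timeout_s
```

For an event seen at t=100.0 with a 0.5 second delay, the data is expected at 100.5 and the error would be delivered at 110.5 if nothing arrives.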

#5 Updated by Richard Neswold about 3 years ago

Beau Harrison wrote:

Sorry for the naive question but is it necessary to send a DPM_PEND? From my standpoint I can wait forever without issue. Possibly, wait forever on event and timeout after the event with DPM_PEND?

I think the intention is to give the programmer extra information so everyone doesn't have to re-invent the wheel. If you had a set of devices and one never returned any data, you'd probably end up writing some time-out detection in your application to react to missing data. If DPM sends you a PEND status associated with your device's ref_id, then the hard work is done.

So, do we let clients write the time-out code over and over? Or make them write the code to ignore non-fatal status values? I'm not sure what the right answer is. Maybe an option in the DPM protocol (and, therefore, in the client API) indicating how you want slow data to be handled.

#6 Updated by Richard Neswold about 3 years ago

Michael Wren wrote:

A question: from the perspective of the Java, Javascript, Erlang, and Python clients, is the data always delivered via a callback function? If so, to me this makes them push-based, rather than pull-based like the CLIB clients.

Yes, Charlie King and I believe event-driven APIs are more powerful than polling APIs, so all the client libraries we've written are push-based:

  • In Erlang, DPM data is appended to the task's message queue.
  • In Python, the DPM API returns a generator object which returns DPM results.
  • In Javascript, you associate your device with a callback.

(Full disclosure requires me to say we did create a polling API in Python for those combining data acquisition with a Python GUI.)
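
To make the generator style concrete, here is a self-contained Python sketch. It does not use the real Python DPM library: `fake_dpm_results` is a stand-in that yields made-up `(ref_id, payload)` pairs, and `DPM_PEND` is a placeholder string. It also shows the filtering every client currently has to write to skip the initial pending statuses.

```python
DPM_PEND = "DPM_PEND"  # placeholder; stands in for the real status code


def fake_dpm_results():
    """Stand-in for a DPM result generator: initial PENDs, then readings."""
    yield (1, DPM_PEND)
    yield (2, DPM_PEND)
    yield (1, 72.5)
    yield (2, 3.1)


def readings(results):
    """Pass through real readings and drop pending statuses."""
    for ref_id, payload in results:
        if payload == DPM_PEND:
            continue
        yield ref_id, payload


for ref_id, value in readings(fake_dpm_results()):
    print(ref_id, value)
```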

If they all are, then what makes sense to me is to not send the initial DPM_PEND. Instead, what would be useful to a pull-based programmer would be for DPM to watch for the requested FTD and, if that FTD passes plus 10? seconds without a response, pass an error to the callback function. This would mean that if an event is requested that isn't happening, the callback function would never be called, and for a rare event the callback function would be called at either FTD event+delay or, if the data times out, FTD event+delay+timeout.

It's hard to tell if most clock events haven't fired. The BOOSTER events are easy, as is $8F.

#7 Updated by Beau Harrison about 3 years ago

Richard Neswold wrote:

Beau Harrison wrote:

Sorry for the naive question but is it necessary to send a DPM_PEND? From my standpoint I can wait forever without issue. Possibly, wait forever on event and timeout after the event with DPM_PEND?

So, do we let clients write the time-out code over and over? Or make them write the code to ignore non-fatal status values? I'm not sure what the right answer is. Maybe an option in the DPM protocol (and, therefore, in the client API) indicating how you want slow data to be handled.

I think you are on the right path. Having the option to specify gives the developer the power to decide without off-loading the work. The next question is, what is a reasonable default? I think receiving the PEND is the simplest scenario and maybe the one that is expected. Otherwise, if I wanted to write a persistent application that waited forever on an event that rarely happens, I could go through the effort of handling that.
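
One way such an option might look, sketched in Python. The enum, its member names, and the function are hypothetical and not part of the DPM protocol or any client API; the default follows the suggestion above that receiving the PEND is the expected behavior.

```python
from enum import Enum

DPM_PEND = "DPM_PEND"  # placeholder; stands in for the real status code


class SlowDataPolicy(Enum):
    """Hypothetical per-request option for devices that have not replied."""
    SEND_PEND = "send_pend"        # report DPM_PEND once the timer expires
    WAIT_FOREVER = "wait_forever"  # stay silent until real data arrives


def replies_on_timeout(waiting_ref_ids, policy=SlowDataPolicy.SEND_PEND):
    """Replies to send when the timer expires, given the client's chosen policy."""
    if policy is SlowDataPolicy.WAIT_FOREVER:
        return []
    return [(ref_id, DPM_PEND) for ref_id in sorted(waiting_ref_ids)]
```

A persistent application waiting on a rare event would pass `SlowDataPolicy.WAIT_FOREVER`; everyone else gets the PEND without writing any time-out code.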

#8 Updated by Richard Neswold about 3 years ago

  • Status changed from New to Assigned
  • Target version set to DPM v1.6

#9 Updated by Richard Neswold about 3 years ago

  • Target version changed from DPM v1.6 to DPM v1.7

Move to v1.7 so we don't delay the release of bug fixes in v1.6.

#10 Updated by Richard Neswold almost 3 years ago

  • Status changed from Assigned to Closed

Done: 288a4a4f

The only time we were sending DPM_PENDs was when we saw V:FEUP report that a front-end had rebooted and we had requests going to it. Unfortunately, the Java Engines sometimes get impatient with front-ends and mark them down and eventually back to "up" using V:FEUP! So you could have a parameter page happily displaying data and then, for no apparent reason, a bunch of devices flip momentarily to DPM_PEND.

So I removed the code entirely. Any DPM_PENDs you see on a console are generated locally by CLIB.

This feature will be available when DPM v1.7 is released.
