Bug #15629

PA4187 (R52, I52) DPM_PEND using new DPM

Added by Kyle Hazelwood almost 3 years ago. Updated about 2 years ago.

Status:
Closed
Priority:
Normal
Category:
Data Pool Manager
Target version:
Start date:
02/22/2017
Due date:
% Done:

0%

Estimated time:
Duration:

Description

I52/R52 are not able to use the new DPM. The program consistently throws a DPM_PEND for the bpm_get_data_c routine. The program has no problem on the old DPM.

2017-02-22.14.13.13.382.png (11.6 KB) DPM_PEND. Kyle Hazelwood, 02/22/2017 02:16 PM

Related issues

Related to DPM - Bug #13703: Inconsistent behavior with D105 (Assigned, 08/29/2016)

Is duplicate of DPM - Support #14699: I52 was having problems with DPM (Closed, 11/29/2016)

History

#1 Updated by Richard Neswold almost 3 years ago

  • Is duplicate of Support #14699: I52 was having problems with DPM added

#2 Updated by Richard Neswold almost 3 years ago

  • Status changed from New to Assigned
  • Assignee set to Richard Neswold

We need to know which underlying devices are being accessed to figure this out. Since it's a BPM, it may be a non-linearly addressed device. A database edit may fix this problem (Brian and I recently went through DPM's length/offset calculations for non-linear devices, so we think it's working -- the database entry may need an update, however.)

#3 Updated by Richard Neswold almost 3 years ago

From Kyle Hazelwood:

As many of you are aware the Operators have been reporting that the MI and RR loss plots have been failing (SA4001 and SA4022). I thought I’d list what I know in hopes of finding a fix.

  • I originally thought this was a DPM issue
    • Toggling the DPM to old seemed to fix the displays on a couple occasions
    • Yesterday both the MI loss plot and RR loss plot died at around the same time. The plots were on different consoles with one on the new DPM and the other on the old DPM. It is probably not a DPM issue.
  • I believe that in the majority of instances the plots are not actually dying but rather showing stale BLM readings; the majority of stack traces I get are from Operators aborting the displays
  • The stacktraces from instances when the display was not aborted are not consistent.
  • The plots run without issue for a day or so before “dying”, though sometimes they “die” immediately upon restart.
    • Occasionally, when the plots are restarted they show one pulse's losses with an old timestamp (sometimes from hours ago) and never show further updates. I believe this suggests the BLM profile buffer is stale.
    • Twice now the Operators have resorted to restarting all the BLM nodes to get the displays working and this does fix the problem

I’m adding some error checking to the displays to try and capture what is actually happening.
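The kind of check described above could look roughly like the following. This is only a sketch under assumptions: the class name, thresholds, and the idea of comparing reading timestamps between polls are illustrative and are not taken from the SA4001/SA4022 source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StaleDetector:
    """Flags a reading as stale when its timestamp is too old or stops advancing.

    All names and thresholds here are hypothetical.
    """
    max_age_s: float              # oldest acceptable reading, in seconds
    max_repeats: int = 3          # identical timestamps tolerated in a row
    _last_ts: Optional[float] = None
    _repeats: int = 0

    def check(self, reading_ts: float, now: float) -> Optional[str]:
        """Return a description of the problem, or None if the reading looks fresh."""
        # Count how many consecutive polls returned the same timestamp.
        if reading_ts == self._last_ts:
            self._repeats += 1
        else:
            self._repeats = 0
        self._last_ts = reading_ts

        age = now - reading_ts
        if age > self.max_age_s:
            return f"old timestamp: reading is {age:.0f} s old"
        if self._repeats >= self.max_repeats:
            return f"timestamp unchanged for {self._repeats} consecutive polls"
        return None
```

Logging the returned message, rather than letting the display silently redraw old data, would help separate "plot died" from "plot is showing a stale buffer".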

#4 Updated by Richard Neswold about 2 years ago

  • Category set to Data Pool Manager

Is this still a problem?

#5 Updated by Richard Neswold about 2 years ago

  • Status changed from Assigned to Closed

The resolution of this will be documented in issue #15629.

#6 Updated by Richard Neswold about 2 years ago

  • Status changed from Closed to Assigned

Oops. Apparently when an issue is marked as "duplicate", closing one issue closes both! (Who knew?) So I'm re-opening it. Sorry for the extra emails.

#7 Updated by Kyle Hazelwood about 2 years ago

Richard Neswold wrote:

Is this still a problem?

I think it may be. I'll check once we get beam into MI/RR. Thanks.

#8 Updated by Richard Neswold about 2 years ago

  • Target version set to DPM v1.6

This fix: 945cf54f

Fixed a bug where a request that timed out wasn't restarted (so DPM_PENDs were reported forever.)
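To illustrate why a timed-out request that is never restarted stays pending forever, here is a toy model. It is not DPM's code, which this thread does not show; the `Request` class, its fields, and the `restart_on_timeout` flag are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical model of one outstanding data request."""
    name: str
    timeout_s: float
    sent_at: float = 0.0
    in_flight: bool = False
    status: str = "PENDING"

    def send(self, now: float) -> None:
        """Issue (or reissue) the request."""
        self.sent_at = now
        self.in_flight = True
        self.status = "PENDING"

    def deliver_reply(self) -> bool:
        """A reply only counts if an attempt is still in flight."""
        if not self.in_flight:
            return False
        self.in_flight = False
        self.status = "OK"
        return True

    def poll(self, now: float, restart_on_timeout: bool = True) -> str:
        """Check for a timeout and optionally reissue the request."""
        if self.in_flight and now - self.sent_at >= self.timeout_s:
            self.in_flight = False      # the timed-out attempt is abandoned
            if restart_on_timeout:
                self.send(now)          # reissue so a later reply can clear PENDING
        return self.status
```

With `restart_on_timeout=False`, nothing is in flight after the first timeout, so no reply can ever move the status away from `PENDING`. With the restart, the next reply clears it.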

These fixes: 50673006 and 60705a39

Fixed a bug that occurred when requests were being removed from a combined request.

Please test your application against FGATE by entering the following on your console and then running your app (the title bar of the window should show FGATE as the name of the DPM being used.)

lnmctl NEWDPM_NODE=FGATE

Let me know if this seems to fix your problem.

#9 Updated by Richard Neswold about 2 years ago

  • Status changed from Assigned to Feedback

#10 Updated by Kyle Hazelwood about 2 years ago

Using the FGATE DPM node seems to have fixed the problem. I no longer get DPM_PEND errors. Thanks!

#11 Updated by Richard Neswold about 2 years ago

Excellent! There's another Redmine issue that I believe is resolved using these commits. If I get positive feedback from them as well, I'll push out these changes to the operational DPMs.

#12 Updated by Richard Neswold about 2 years ago

  • Related to Bug #13703: Inconsistent behavior with D105 added

#13 Updated by Richard Neswold about 2 years ago

  • Status changed from Feedback to Closed

This issue is resolved. The code will be pushed out when I release v1.6 of DPM.


