Bug #21882

inappropriate messages getting to ftp plt cont_main process

Added by Dennis Nicklaus about 1 year ago. Updated 11 months ago.

Status: Closed
Priority: Normal
Category: ACSys/FE Framework
Target version:
Start date: 02/11/2019
Due date:
% Done: 0%
Estimated time:
Duration:

Description

On PRAC18 today, we noticed two instances of bad messages reaching the FTP plot task. A separate instance of this task is spawned for each FTP request, so throwing an exception and dying is a reasonable thing to do, since it affects only that one FTP request. However, both of these are internally known messages that shouldn't be directed to the FTP tasks.

==== Mon Feb 11 11:00:24 CST 2019
=WARNING REPORT==== 11-Feb-2019::11:00:24.618597 ===
FTP cont_main <0.220.0>: terminated with throw exception
{unhandled_message,
    {fastreading,#Ref<0.3442310942.4251189249.196816>,
        {device_reply,0,
            {1549,904424,388645},
            <<40,55,0,0,122,188,60,55,0,128,137,188,80,55,0,0,150,188,100,
              55,0,0,150,188,120,55,0,0,150,188,140,55,0,0,150,188,160,55,0,
              0,150,188,180,55,0,128,137,188,200,55,0,128,137,188,220,55,0,
              0,150,188,240,55,0,128,137,188,4,56,0,128,137,188,24,56,0,0,
              150,188,44,56,0,0,122,188,64,56,0,128,137,188,84,56,0,128,137,
              188,104,56,0,128,137,188,124,56,0,0,150,188,144,56,0,0,122,
              188,164,56,0,0,122,188,184,56,0,0,122,188,204,56,0,0,122,188,
              224,56,0,128,137,188,244,56,0,0,150,188,8,57,0,0,150,188,28,
              57,0,0,150,188,48,57,0,128,137,188,68,57,0,128,137,188,88,57,
              0,128,137,188,108,57,0,128,137,188,128,57,0,128,137,188,148,
              57,0,0,150,188,168,57,0,128,162,188,188,57,0,0,150,188,208,57,
              0,0,122,188,228,57,0,128,137,188,248,57,0,128,137,188,12,58,0,
              128,137,188,32,58,0,0,150,188,52,58,0,0,150,188,72,58,0,0,150,
              188,92,58,0,0,150,188,112,58,0,128,137,188,132,58,0,0,150,188,
              152,58,0,0,150,188,172,58,0,128,137,188,192,58,0,128,162,188,
              212,58,0,0,150,188,232,58,0,0,150,188,252,58,0,128,137,188,16,
              59,0,128,137,188,36,59,0,0,150,188,56,59,0,128,162,188,76,59,
              0,128,162,188,96,59,0,0,150,188,116,59,0,0,150,188,136,59,0,
              128,137,188,156,59,0,128,162,188,176,59,0,128,137,188,196,59,
              0,128,137,188,216,59,0,0,150,188,236,59,0,128,137,188,0,60,0,
              0,150,188,20,60,0,128,137,188,40,60,0,0,150,188,60,60,0,0,150,
              188,80,60,0,0,150,188,100,60,0,0,150,188,120,60,0,0,150,188,
              140,60,0,0,150,188,160,60,0,0,150,188,180,60,0,0,150,188,200,
              60,0,128,137,188,220,60,0,0,150,188,240,60,0,128,137,188,4,61,
              0,128,137,188,24,61,0,0,150,188,44,61,0,0,150,188,64,61,0,128,
              137,188,84,61,0,128,137,188,104,61,0,128,162,188,124,61,0,0,
              150,188,144,61,0,128,162,188,164,61,0,0,150,188,184,61,0,128,
              137,188,204,61,0,128,162,188,224,61,0,128,137,188,244,61,0,0,
              150,188,8,62,0,0,150,188,28,62,0,0,150,188,48,62,0,0,150,188,
              68,62,0,0,150,188,88,62,0,0,150,188,108,62,0,128,137,188,128,
              62,0,128,137,188,148,62,0,0,150,188,168,62,0,128,162,188,188,
              62,0,0,150,188,208,62,0,0,150,188,228,62,0,128,162,188,248,62, 
              0,0,150,188,12,63,0,0,150,188,32,63,0,0,150,188,52,63,0,128,
              137,188,72,63,0,0,150,188,92,63,0,0,150,188>>}}}
[{plt,contPlt_loop,6,[{file,"plt.erl"},{line,218}]},
 {plt,cont_main,6,[{file,"plt.erl"},{line,157}]}]

=WARNING REPORT==== 11-Feb-2019::14:05:19.826423 ===
FTP cont_main <0.503.0>: terminated with throw exception
{unhandled_message,
    {fastreading,#Ref<0.3442310942.4251713537.100225>,
        {device_reply,0,
            {1549,915519,492558},
            <<10,60,0,0,150,188,30,60,0,0,122,188,50,60,0,0,150,188,70,60,0,
              0,150,188,90,60,0,0,150,188,110,60,0,128,137,188,130,60,0,128,
              137,188,150,60,0,0,150,188,170,60,0,0,150,188,190,60,0,0,150,
              188,210,60,0,0,150,188,230,60,0,0,150,188,250,60,0,0,150,188,
              14,61,0,0,150,188,34,61,0,128,137,188,54,61,0,0,150,188,74,61,
              0,0,150,188,94,61,0,128,162,188,114,61,0,0,150,188,134,61,0,0,
              150,188,154,61,0,0,150,188,174,61,0,128,137,188,194,61,0,0,
              150,188,214,61,0,0,150,188,234,61,0,0,150,188,254,61,0,0,150,
              188,18,62,0,0,150,188,38,62,0,128,162,188,58,62,0,0,150,188,
              78,62,0,0,150,188,98,62,0,128,162,188,118,62,0,0,150,188,138,
              62,0,0,150,188,158,62,0,128,162,188,178,62,0,0,150,188,198,62,
              0,0,150,188,218,62,0,0,150,188,238,62,0,128,162,188,2,63,0,
              128,137,188,22,63,0,128,162,188,42,63,0,128,137,188,62,63,0,
              128,162,188,82,63,0,0,150,188,102,63,0,0,150,188,122,63,0,0,
              150,188,142,63,0,0,150,188,162,63,0,0,150,188,182,63,0,0,150,
              188,202,63,0,128,137,188,222,63,0,0,150,188,242,63,0,0,175,
              188,6,64,0,0,175,188,26,64,0,128,137,188,46,64,0,0,150,188,66,
              64,0,0,150,188,86,64,0,0,150,188,106,64,0,128,137,188,126,64,
              0,128,137,188,146,64,0,128,162,188,166,64,0,128,162,188,186,
              64,0,0,150,188,206,64,0,0,150,188,226,64,0,0,150,188,246,64,0,
              128,137,188,10,65,0,0,150,188,30,65,0,0,150,188,50,65,0,0,150,
              188,70,65,0,0,150,188,90,65,0,128,137,188,110,65,0,0,150,188,
              130,65,0,0,150,188,150,65,0,0,150,188,170,65,0,128,162,188,
              190,65,0,0,150,188,210,65,0,0,150,188,230,65,0,128,162,188,
              250,65,0,128,162,188,14,66,0,0,150,188,34,66,0,0,150,188,54,
              66,0,0,150,188,74,66,0,0,150,188,94,66,0,0,150,188,114,66,0,0,
              150,188,134,66,0,0,150,188,154,66,0,128,137,188,174,66,0,0,
              150,188,194,66,0,128,162,188,214,66,0,0,150,188,234,66,0,0,
              150,188,254,66,0,128,137,188,18,67,0,0,150,188,38,67,0,0,150,
              188,58,67,0,0,150,188,78,67,0,0,150,188,98,67,0,0,150,188>>}}}
[{plt,contPlt_loop,6,[{file,"plt.erl"},{line,218}]},
 {plt,cont_main,6,[{file,"plt.erl"},{line,157}]}]

History

#1 Updated by Kevin Martin about 1 year ago

I've noticed that the busier the FE is, the more likely this failure is to occur.

#2 Updated by Richard Neswold about 1 year ago

  • Assignee set to Richard Neswold
  • Status changed from New to Assigned

Weird. For some reason, I added a clause (c045c8ba) to throw an exception when an unhandled message is received.

I removed the clause so the process won't terminate anymore (260b49c1). The code is back to the way Jerry originally had it. I don't know if there's an outer receive loop that will handle these unexpected messages or whether they'll cause a slow memory leak.
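One way to check whether unmatched messages are piling up is to watch the process's mailbox length with erlang:process_info/2. The following is only a sketch: the module name, the helper queue_len/1, and the demo process are hypothetical and are not taken from plt.erl.

```erlang
-module(mailbox_check).
-export([queue_len/1, demo/0]).

%% Number of messages waiting in Pid's mailbox,
%% or 'undefined' if the process is no longer alive.
queue_len(Pid) ->
    case erlang:process_info(Pid, message_queue_len) of
        {message_queue_len, N} -> N;
        undefined -> undefined
    end.

%% Spawn a process that only ever matches 'stop', send it three
%% messages it never matches, and report how many are still queued.
demo() ->
    Pid = spawn(fun() -> receive stop -> ok end end),
    [Pid ! {fastreading, make_ref(), N} || N <- [1, 2, 3]],
    Len = queue_len(Pid),
    Pid ! stop,
    Len.
```

If the number reported for a long-running plot process keeps growing, nothing is consuming the stray messages. If it stays near zero, some receive is picking them up.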

#3 Updated by Kevin Martin about 1 year ago

I tested this change out and the problem of my plots terminating is gone. :)

As for whether it causes a memory leak, I had two plots with 4 channels each running for around 30 minutes, and from what I could see the main erlang acsys process increased its memory size by around 20 bytes. Even if this was caused by this issue, I can live with it.

#4 Updated by Kevin Martin about 1 year ago

Correction: 20k bytes not 20 bytes

#5 Updated by Richard Neswold about 1 year ago

  • Status changed from Assigned to Resolved
  • Description updated (diff)

Richard Neswold wrote:

Weird. For some reason, I added a clause (c045c8ba) to throw an exception when an unhandled message is received.

For the record, I call it "weird" because we usually add a catch-all clause to remove spurious messages from the queue and then log them. We don't normally throw an exception so I'm not sure what I was thinking at the time.
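For illustration, a catch-all clause of the kind described above might look like the sketch below. This is not the code from plt.erl. The module name, the message shapes {reading, Data} and {stop, From}, and the counter are assumptions made up for the example.

```erlang
-module(catch_all_sketch).
-export([loop/1, demo/0]).

%% Hypothetical receive loop. Count is the number of readings handled.
loop(Count) ->
    receive
        {reading, _Data} ->
            loop(Count + 1);
        {stop, From} ->
            From ! {stopped, Count};
        Unexpected ->
            %% Catch-all: take the stray message out of the mailbox,
            %% log it, and keep running instead of throwing.
            error_logger:warning_msg("dropping unexpected message: ~p~n",
                                     [Unexpected]),
            loop(Count)
    end.

%% Send two expected messages and one stray one, then ask for the count.
demo() ->
    Pid = spawn(?MODULE, loop, [0]),
    Pid ! {reading, <<1, 2>>},
    Pid ! {fastreading, make_ref(), stray},
    Pid ! {reading, <<3, 4>>},
    Pid ! {stop, self()},
    receive
        {stopped, N} -> N
    after 1000 ->
        timeout
    end.
```

The stray message is logged and discarded, so the process survives and the mailbox does not grow.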

Kevin Martin wrote:

As for whether it causes a memory leak ... I can live with it.

These messages aren't huge and they happen every once in a while, so it shouldn't impact your system much. I'm marking this "Resolved" since it fixed your immediate problem but we still need to know whether these messages are getting sent to the wrong process. We'll "Close" it once we understand it fully.

#6 Updated by Richard Neswold 11 months ago

  • Target version set to ACSys/FE v1.7

#7 Updated by Richard Neswold 11 months ago

  • Assignee changed from Richard Neswold to Jerry Firebaugh
  • Status changed from Resolved to Assigned

Now that I restored the code to the way Jerry wrote it, the issue is re-assigned to him so he can determine whether these spurious messages will be handled properly.

#8 Updated by Jerry Firebaugh 11 months ago

  • Status changed from Assigned to Closed

Receiving a plot data message sometimes times out, at which point an
error is returned to the calling loop. The now-unusable message may
arrive later and be received by the calling loop as an unexpected
message. It has to be received so that the message buffer doesn't
accumulate orphaned messages. Warnings are now logged when the receive
times out and when the calling loop receives such an unexpected message.
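
The sequence described above can be sketched roughly as follows. This is an assumed reconstruction for illustration, not the actual change: the module, await/2, drain/1, the fake device process, and the timeout values are all hypothetical. Only the {fastreading, Ref, Reply} message shape comes from the log in the description.

```erlang
-module(late_reply_sketch).
-export([demo/0]).

%% Wait up to Timeout ms for the reply tagged with Ref.
await(Ref, Timeout) ->
    receive
        {fastreading, Ref, Reply} -> {ok, Reply}
    after Timeout ->
        error_logger:warning_msg("timed out waiting for plot data~n", []),
        {error, timeout}
    end.

%% Calling loop side: receive a late reply so it cannot sit in the
%% mailbox forever, and warn about it.
drain(Timeout) ->
    receive
        {fastreading, _StaleRef, _Reply} = Msg ->
            error_logger:warning_msg("late reply dropped: ~p~n", [Msg]),
            dropped
    after Timeout ->
        nothing
    end.

demo() ->
    Self = self(),
    Ref = make_ref(),
    %% Fake device: replies after 100 ms, longer than the 20 ms wait below.
    spawn(fun() ->
                  timer:sleep(100),
                  Self ! {fastreading, Ref, late_data}
          end),
    First = await(Ref, 20),
    Dropped = drain(500),
    {First, Dropped}.
```

Here the first wait gives up before the reply exists, and the reply is later consumed by the outer receive rather than being left orphaned.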


