Bug #21796

ps_lxi_driver problems with init -- undef

Added by Dennis Nicklaus 8 months ago. Updated 8 months ago.

Status: Assigned
Priority: Normal
Category: Sorenson XG/SG Power Supply Driver
Target version: -
Start date: 01/29/2019
Due date:
% Done: 0%
Estimated time:
Duration:

Description

Darren called me from NML saying they were having problems with the quad magnet power supply devices on N14/magnets/18 giving 57 -10 errors.
In the log, things were working fine until 28-Jan-2019::20:10:16, when this message appears:

 
=WARNING REPORT==== 28-Jan-2019::20:10:16.467743 ===
ACNET handle 'SETS32@CLX30E' terminated -- {bad_return_value,not_unique_key}
=ERROR REPORT==== 28-Jan-2019::20:10:16.483250 ===
** State machine sets32 terminating 
** Last message in was {udp,#Port<0.175>, 
                            {127,0,0,1},
                            6802,
                            <<2,0,128,46,13,81,12,93,156,119,8,124,13,0,27,9,
                              62,0,1,1,0,13,81,1,0,0,36,0,0,0,1,0,0,0,2,0,105,
                              32,4,139,4,13,37,4,0,0,0,0,0,0,4,0,0,0,0,0,0,0,
                              41,156,161,66>>}
** When State == running
**      Data  == {state,
                     {{state,#Port<0.174>,#Port<0.175>},
                      #Fun<acnet_local.0.108004090>,
                      #Fun<acnet_local.1.108004090>,
                      #Fun<acnet_local.2.108004090>,
                      #Fun<acnet_local.3.108004090>},
                     2080929692,3539276984,
                     {handler,<0.28619.2>,
                         fun sets32_protocol:marshal_reply/1,
                         fun sets32_protocol:unmarshal_request/1},
                     #Ref<0.772063878.2534146050.25774>,
                     #{11074 =>
                           {#Ref<0.772063878.2532835331.226041>,11074,

(followed by a much longer state dump.)

Someone tried restarting clx30e at 29-Jan-2019::13:08, but when it started, it just printed a series of messages like this:

=ERROR REPORT==== 29-Jan-2019::13:08:20.402756 ===
Driver 'ps_lxi_driver' threw an exception in init/1: 'undef'
Stack:[{ps_lxi_driver,init,[[{node,[110,45,105,113,97,49,114,105]},{current_limit,{0.0,120.0}},{init_voltage,12.5}]],[]},{gen_server,init_it,2,[{file,[103,101,110,95,115,101,114,118,101,114,46,101,114,108]},{line,374}]},{gen_server,init_it,6,[{file,[103,101,110,95,115,101,114,118,101,114,46,101,114,108]},{line,342}]},{proc_lib,init_p_do_apply,3,[{file,[112,114,111,99,95,108,105,98,46,101,114,108]},{line,249}]}]
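
The integer lists in the stack trace are ordinary Erlang strings printed as character codes. For example, [103,101,110,95,115,101,114,118,101,114,46,101,114,108] is "gen_server.erl" and the node value [110,45,105,113,97,49,114,105] is "n-iqa1ri". Below is a minimal sketch for rendering one frame in readable form. The module name trace_decode and the function frame_to_string/1 are hypothetical helpers for illustration, not part of the front-end code.

```erlang
-module(trace_decode).
-export([frame_to_string/1]).

%% Render one stack frame {Module, Function, ArityOrArgs, Location}
%% as "mod:fun/arity (file:line)".
%% The third element is either an arity or the actual argument list;
%% in the latter case only the number of arguments is shown.
frame_to_string({M, F, A, Loc}) ->
    Arity = if is_list(A) -> length(A); true -> A end,
    %% Location is a proplist that may be empty; fall back to placeholders.
    File = proplists:get_value(file, Loc, "?"),
    Line = proplists:get_value(line, Loc, 0),
    lists:flatten(
      io_lib:format("~s:~s/~b (~s:~b)", [M, F, Arity, File, Line])).
```

Applied to the second frame of the report above, this gives "gen_server:init_it/2 (gen_server.erl:374)".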

Apparently the drivers couldn't start up properly.

It was after this point that Darren called me.
I restarted the frontend a second time at about 18:00 and it seems to be working fine.
Perhaps the driver couldn't connect to the supplies at the first restart and whatever hangup was preventing that cleared in the meantime?

History

#1 Updated by Richard Neswold 8 months ago

  • Status changed from New to Assigned
  • Category set to Sorenson XG/SG Power Supply Driver

The driver starts a 15-second timer before trying to communicate with the power supply, so I don't think communication was the problem. I'll look further into it.

The termination of the SETS32 handle is a little troubling because it would imply a bug in Erlang's ACNET client library. Admittedly, it's a rare bug because I don't think I've seen it before. But rarely occurring bugs are hard to fix...

#2 Updated by Richard Neswold 8 months ago

The second portion of the issue is that ps_lxi_driver:init/1 threw an undef exception which means it couldn't find a function. Unfortunately it doesn't tell us which one it was trying to call. Looking at the code, that particular function only calls queue_worker_start/0, proplists:get_value/2, proplists:get_value/3, and array:from_list/1.

queue_worker_start/0 is defined in the module, so that can't be a problem. The others are in the standard library and shouldn't be a problem unless the search path wasn't set up right.
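
One way to test the search-path theory on a running node is to ask the code server whether each module can be loaded and whether the function is exported. The sketch below assumes it is compiled and run on the front-end node; the module name undef_check and its functions are hypothetical. It uses only code:ensure_loaded/1, code:which/1 and erlang:function_exported/3 from the standard library. One possible reading of the trace, offered as an assumption: for an undef error, the first stack frame normally names the function that could not be found, carrying the argument list in place of the arity and an empty location. That matches the ps_lxi_driver:init frame in the report, so ps_lxi_driver itself is included in the check.

```erlang
-module(undef_check).
-export([check/3, check_all/0]).

%% Report whether Module:Function/Arity can be called on this node.
check(M, F, A) ->
    case code:ensure_loaded(M) of
        {module, M} ->
            %% Module is loaded; confirm the function is exported.
            case erlang:function_exported(M, F, A) of
                true  -> ok;
                false -> {error, {not_exported, {M, F, A}}}
            end;
        {error, Reason} ->
            %% code:which/1 returns non_existing when the module is
            %% not found anywhere on the code path.
            {error, {not_loaded, M, Reason, code:which(M)}}
    end.

%% Check the driver's init/1 and the library calls named in this ticket.
check_all() ->
    [{MFA, check(M, F, A)}
     || {M, F, A} = MFA <- [{ps_lxi_driver, init, 1},
                            {proplists, get_value, 2},
                            {proplists, get_value, 3},
                            {array, from_list, 1}]].
```

Calling undef_check:check_all() returns one result per function, so a missing module or a bad search path shows up as a not_loaded entry rather than as an undef at driver start.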

I'd like to restart this front-end sometime using my new scripts because they do more error checking when they build the front-end and they set up the environment such that you know right away if code is missing.


