DAQ Expert Troubleshooting

RC - Expert Troubleshooting Guide

Failed to start the new run

%MSG-i WorkerThread: Seb 08-Jun-2015 04:57:54 CDT MF-online
StateMachineEventProcessor::processEvent started running.
%MSG
DDS domain participant is not connected.
Unable to connect DomainParticipant in DDSConnection constructor.

Problems with SEB02 provided by Wes:

Solution:

uboonedaq was unable to start DDS. Not sure why we have this problem on this machine now, but the following is the normal routine to reset it (what I did):

(1) Log in to machine with problem as uboonedaq (here: ssh uboonedaq@seb02)
(2) source $UBOONEDAQ_DIR/slf6.x86_64.e7.debug/bin/setup_daq.sh
(3) Do: configure-online-daq-prod (important; do not skip this step).
(4) Try: ospl stop (Probably returns a "Ready" status super fast, when it should take a second or two).
(5) Try: ospl start (If it says "Splice System with domain name "uboonedaq uboone DAQ prod DDS Domain" is found running, ignoring command" then it's bad.)
(6) Do: ipcs. Should see output like this:


------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x00000000 0          root       777        262144     1          dest
0x5303000e 3047425    uboonedaq  600        104857600  3

(7) Look for segments owned by uboonedaq of size 104857600. For each one, do: ipcrm -m <shmid>, where <shmid> is the shmid number listed for the entry.
(8) Do: rm /tmp/spddskey_*. rm may complain that removing some of those files is not permitted (due to ownership); that is fine, but every file owned by uboonedaq must be gone (you can do an ls to check).
(9) Try: ospl stop. Should do nothing (no output).
(10) Try: ospl start. Should take a few seconds to set up, and then say "Ready" and print log locations.
(11) Try: ospl stop. Should take a few seconds to stop it, and then say "Ready".

Then, you're all good!
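
Steps (6) and (7) can be sketched as a small helper that reads the ipcs output and lists the shmids to remove. This is only an illustration, not part of the uboonedaq tools: the function name stale_segments is hypothetical, and the column layout is assumed to match the sample output shown in step (6). It only prints the ipcrm commands; it does not run them.

```python
def stale_segments(ipcs_output, owner="uboonedaq", size=104857600):
    """Return shmids of shared memory segments with the given owner and size.

    Assumes the column order from the sample above:
    key shmid owner perms bytes nattch [status]
    """
    shmids = []
    in_shm_section = False
    for line in ipcs_output.splitlines():
        # Section banners look like "------ Shared Memory Segments --------".
        if line.startswith("------"):
            in_shm_section = "Shared Memory" in line
            continue
        fields = line.split()
        # Skip the column header, blank lines, and rows from other sections.
        if not in_shm_section or len(fields) < 6 or not fields[0].startswith("0x"):
            continue
        shmid, seg_owner, seg_bytes = fields[1], fields[2], fields[4]
        if not (shmid.isdigit() and seg_bytes.isdigit()):
            continue
        if seg_owner == owner and int(seg_bytes) == size:
            shmids.append(int(shmid))
    return shmids


# Sample taken from the ipcs output in step (6).
sample = """\
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x00000000 0          root       777        262144     1          dest
0x5303000e 3047425    uboonedaq  600        104857600  3
"""

for shmid in stale_segments(sample):
    print(f"ipcrm -m {shmid}")
```

On a real machine the text would come from running ipcs as uboonedaq; review the printed commands before running any of them.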

Run crashed owing to PMT HV off

NOTE: Since uboonedaq_datatypes v6_21_00, this shouldn't be an issue. The addition of the block listed below in projects/datatypes/ub_PMT_WindowDataCreatorHelperClass.cpp prevents it from occurring.

if(curr_rawData.size() < 2) {
    return;
}

If a run starts but crashes after about 1 minute, and no event is assembled or recorded, check the err log file of seb10. One possibility is that the PMT HV is off, so there is no input to the PMT FEM in slot 6, which has the cosmic low-gain configuration. An example err log for this case is shown below; it contains many messages of the form "Channel 0 with 1 windows Channel 1 with 1 windows Channel 2 with 1 windows Channel 3 with 1 windows ..."
Our DAQ code cannot process a channel input that contains no data, which is why it crashes.

Caught exception in ub_PMT_WindowDataCreatorHelperClass::populateChannelDataVector() Message:  datatypes_exception Message: Junk data: Left with a PMT window header that is too small..

Raw Card Data
Buffer size is 2 bytes, or 1 elements 2 bytes each.

c000
Object gov::fnal::uboone::datatypes::ub_PMT_CardHeader_v6 const*.
Module[6], ID[0], Marker[ffff], RAW[0xf006ffff]
WordCount[0], RAW[0xf001f000]
Event[866], RAW[0xf866f000]
Frame[45fb1], RAW[0xffb1f045]
Checksum[4000], RAW[0xf000f004]
TrigSample[3c7], RAW[0xf0c7f023]
TrigFrameMod16[2], RAW[0xf0c7f023]
DataStartMarker[4000], RAW[0x4000]

Object gov::fnal::uboone::datatypes::ub_PMT_CardTrailer_v6 const*.
DataEndMarker[c000], RAW[0xc000]

Exception: datatypes_exception Message: Junk data: Left with a PMT window header that is too small..
Object gov::fnal::uboone::datatypes::ub_MarkedRawCrateData<gov::fnal::uboone::datatypes::ub_PMT_CardData_v6, gov::fnal::uboone::datatypes::ub_XMITEventHeader, gov::fnal::uboone::datatypes::ub_XMITEventTrailer> const*.
Object gov::fnal::uboone::datatypes::ub_XMITEventHeader const*.
00  RAW[ffffffff]
Object gov::fnal::uboone::datatypes::ub_XMITEventTrailer const*.
00  RAW[e0000000]
 *Found 3 cards.
Card 1
Object gov::fnal::uboone::datatypes::ub_MarkedRawCardData<gov::fnal::uboone::datatypes::ub_PMT_ChannelData_v6, gov::fnal::uboone::datatypes::ub_PMT_CardHeader_v6, gov::fnal::uboone::datatypes::ub_PMT_CardTrailer_v6> const*.
Object gov::fnal::uboone::datatypes::ub_PMT_CardHeader_v6 const*.
Module[4], ID[0], Marker[ffff], RAW[0xf004ffff]
WordCount[eb1c], RAW[0xfb1df00e]
Event[866], RAW[0xf866f000]
Frame[45fb1], RAW[0xffb1f045]
Checksum[a326c9], RAW[0xf6c9fa32]
TrigSample[3c7], RAW[0xf0c7f023]
TrigFrameMod16[2], RAW[0xf0c7f023]
DataStartMarker[4000], RAW[0x4000]
Object gov::fnal::uboone::datatypes::ub_PMT_CardTrailer_v6 const*.
DataEndMarker[c000], RAW[0xc000]
 *Found 48 channels.
  Channel 0 with 1 windows   Channel 1 with 1 windows   Channel 2 with 1 windows   Channel 3 with 1 windows   Channel 4 with 1 windows   Channel 5 with 1 windows   Channel 6 with 1 windows   Channel 7 with 1 windows   Channel 8 with 1 windows   Channel 9 with 1 windows   Channel 10 with 1 windows   Channel 11 with 1 windows   Channel 12 with 1 windows   Channel 13 with 1 windows   Channel 14 with 1 windows   Channel 15 with 1 windows   Channel 16 with 1 windows   Channel 17 with 1 windows   Channel 18 with 1 windows   Channel 19 with 1 windows   Channel 20 with 1 windows   Channel 21 with 1 windows   Channel 22 with 1 windows   Channel 23 with 1 windows   Channel 24 with 1 windows   Channel 25 with 1 windows   Channel 26 with 1 windows   Channel 27 with 1 windows   Channel 28 with 1 windows   Channel 29 with 1 windows   Channel 30 with 1 windows   Channel 31 with 1 windows   Channel 32 with 1 windows   Channel 33 with 1 windows   Channel 34 with 1 windows   Channel 35 with 1 windows   Channel 36 with 1 windows   Channel 37 with 1 windows   Channel 38 with 1 windows   Channel 39 with 1 windows   Channel 40 with 0 windows   Channel 41 with 0 windows   Channel 42 with 0 windows   Channel 43 with 0 windows   Channel 44 with 0 windows   Channel 45 with 0 windows   Channel 46 with 0 windows   Channel 47 with 0 windows Card 2
Object gov::fnal::uboone::datatypes::ub_MarkedRawCardData<gov::fnal::uboone::datatypes::ub_PMT_ChannelData_v6, gov::fnal::uboone::datatypes::ub_PMT_CardHeader_v6, gov::fnal::uboone::datatypes::ub_PMT_CardTrailer_v6> const*.
Object gov::fnal::uboone::datatypes::ub_PMT_CardHeader_v6 const*.
Module[5], ID[0], Marker[ffff], RAW[0xf005ffff]
WordCount[eb14], RAW[0xfb15f00e]
Event[866], RAW[0xf866f000]
Frame[45fb1], RAW[0xffb1f045]
Checksum[9ec787], RAW[0xf787f9ec]
TrigSample[3c7], RAW[0xf0c7f023]
TrigFrameMod16[2], RAW[0xf0c7f023]
DataStartMarker[4000], RAW[0x4000]
Object gov::fnal::uboone::datatypes::ub_PMT_CardTrailer_v6 const*.
DataEndMarker[c000], RAW[0xc000]
 *Found 48 channels.
  Channel 0 with 1 windows   Channel 1 with 1 windows   Channel 2 with 1 windows   Channel 3 with 1 windows   Channel 4 with 1 windows   Channel 5 with 1 windows   Channel 6 with 1 windows   Channel 7 with 1 windows   Channel 8 with 1 windows   Channel 9 with 1 windows   Channel 10 with 1 windows   Channel 11 with 1 windows   Channel 12 with 1 windows   Channel 13 with 1 windows   Channel 14 with 1 windows   Channel 15 with 1 windows   Channel 16 with 1 windows   Channel 17 with 1 windows   Channel 18 with 1 windows   Channel 19 with 1 windows   Channel 20 with 1 windows   Channel 21 with 1 windows   Channel 22 with 1 windows   Channel 23 with 1 windows   Channel 24 with 1 windows   Channel 25 with 1 windows   Channel 26 with 1 windows   Channel 27 with 1 windows   Channel 28 with 1 windows   Channel 29 with 1 windows   Channel 30 with 1 windows   Channel 31 with 1 windows   Channel 32 with 1 windows   Channel 33 with 1 windows   Channel 34 with 1 windows   Channel 35 with 1 windows   Channel 36 with 1 windows   Channel 37 with 1 windows   Channel 38 with 1 windows   Channel 39 with 1 windows   Channel 40 with 0 windows   Channel 41 with 0 windows   Channel 42 with 0 windows   Channel 43 with 0 windows   Channel 44 with 0 windows   Channel 45 with 0 windows   Channel 46 with 0 windows   Channel 47 with 0 windows Card 3
Object gov::fnal::uboone::datatypes::ub_MarkedRawCardData<gov::fnal::uboone::datatypes::ub_PMT_ChannelData_v6, gov::fnal::uboone::datatypes::ub_PMT_CardHeader_v6, gov::fnal::uboone::datatypes::ub_PMT_CardTrailer_v6> const*.
Object gov::fnal::uboone::datatypes::ub_PMT_CardHeader_v6 const*.
Module[6], ID[0], Marker[ffff], RAW[0xf006ffff]
WordCount[0], RAW[0xf001f000]
Event[866], RAW[0xf866f000]
Frame[45fb1], RAW[0xffb1f045]
Checksum[4000], RAW[0xf000f004]
TrigSample[3c7], RAW[0xf0c7f023]
TrigFrameMod16[2], RAW[0xf0c7f023]
DataStartMarker[4000], RAW[0x4000]
Object gov::fnal::uboone::datatypes::ub_PMT_CardTrailer_v6 const*.
DataEndMarker[c000], RAW[0xc000]
 *Found 0 channels.
Object gov::fnal::uboone::datatypes::ub_MarkedRawDataBlock<gov::fnal::uboone::datatypes::ub_XMITEventHeader, gov::fnal::uboone::datatypes::ub_XMITEventTrailer> const*.
  RAW Data: Buffer size is 240832 bytes, or 120416 elements 2 bytes each.

Solution:

Remove the pmt6 block from the DAQ configuration so that the DAQ does not run that FEM.
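
To spot the empty card quickly in a long err log, a scan like the following may help. It is a rough sketch and not part of the DAQ code: the function name empty_modules is hypothetical, and it assumes the log keeps the "Module[...]" and "*Found N channels." line formats shown in the example above. Module numbers are returned as printed in the log, without interpreting their base.

```python
import re

# Matches the module field of a card header line, e.g. "Module[6], ID[0], ...".
MODULE_RE = re.compile(r"Module\[([0-9a-fA-F]+)\]")


def empty_modules(log_text):
    """Return module numbers (as printed) whose card reports 0 channels."""
    modules = []
    current = None
    for line in log_text.splitlines():
        match = MODULE_RE.search(line)
        if match:
            # Remember the most recent card header seen.
            current = match.group(1)
        elif "Found 0 channels" in line and current is not None:
            if current not in modules:
                modules.append(current)
    return modules


# Abbreviated excerpt of the err log shown above.
sample_log = """\
Card 2
Module[5], ID[0], Marker[ffff], RAW[0xf005ffff]
WordCount[eb14], RAW[0xfb15f00e]
 *Found 48 channels.
Card 3
Module[6], ID[0], Marker[ffff], RAW[0xf006ffff]
WordCount[0], RAW[0xf001f000]
 *Found 0 channels.
"""

print(empty_modules(sample_log))
```

In the example log the only card with 0 channels is module 6, which matches the pmt6 block named in the solution.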

Circular buffer occupancy high

  • One possibility is that readfragment cannot interpret the data and does not know what to do with it. The data is then not transferred to the next step (outbound to the assembler for the NU stream data), and the circular buffer occupancy starts increasing.

High memory use

Here is an example (elog #54980) of how to troubleshoot an issue when the memory use is high.

runConsoleDAQ won't start

See here for the problem: elog 55664
And here for the solution: elog 55665

GPS_Satellite_Status alarm

GPS_Satellite_Status goes into alarm with a status of 7 after SEB10 has been powered down and back up.

Solution: If restarting the run does nothing, reboot SEB10.