RCE Troubleshooting

There are two commands on the gateway:

  • rce_status_gui: lets you monitor the RCE status
  • rce_control_gui: same, but also lets you control the RCEs

To find out what is going on with the RCEs, shifters should normally run rce_status_gui.

If initialization reports errors indicating that an RCE is not communicating with the FEB (i.e. register read/write errors), open rce_control_gui, double-click the RCE in question, and press the "HardReset" button.

If the register errors persist, the transceiver in the RTM (or flange) probably needs reseating. This should be done by an RCE expert (Lee Scott, Alan Hahn, Mark Convery, or Matt Graham).

TPC Specific Data Taking Instructions

Copied in part from the DAQ users cheat sheet; some of this may overlap with the official DAQ instructions:

Startup DAQ

John Freeman, Nov-10-2015: before you do this, make sure that no one else is running the DAQ. To learn how to check, see the official DAQ instructions at https://cdcvs.fnal.gov/redmine/projects/lbne-daq/wiki/Running_DAQ_Interface, specifically the section "Getting the Status". Reading the full document is recommended, as it covers numerous situations you will encounter if you plan to run the DAQ frequently.

cd /data/lbnedaq/daqarea
source fireup
check_daq_applications.sh
kill_daq_applications.sh -d -t -c
launch_daq_applications.sh -t -c -r -d docs/config.txt

Simplified calibration data taking script:

After DAQ startup, use the following script to take a standard calibration run:

cd ~/bkirby
source doCalibRun.sh

Check the online monitor to verify that a run was recorded and that all 16 RCEs produced data. During cooldown, please run this script about every 30 minutes.
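The 30-minute cadence during cooldown could be automated with a small loop. The following is only a sketch: repeat_every is a hypothetical helper, not part of the DAQ software, and it assumes that doCalibRun.sh returns once the calibration run has finished. It does not replace checking the online monitor after each run.

```shell
# Run a command a fixed number of times, pausing between runs.
# Hypothetical helper: repeat_every <interval_seconds> <count> <command> [args...]
repeat_every() {
    interval=$1
    count=$2
    shift 2
    n=0
    while [ "$n" -lt "$count" ]; do
        "$@"
        n=$((n + 1))
        # Sleep only between runs, not after the last one
        if [ "$n" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, repeat_every 1800 4 bash -c 'cd ~/bkirby && source doCalibRun.sh' would take four calibration runs. The pause is counted from the end of one run to the start of the next, so the effective period is slightly longer than 30 minutes.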

Simplified data taking script:

After DAQ startup, use the following script to take data easily:

just_do_it.sh <time> <daq comps>

Example 1 - RCE00 readout only for 5 seconds:

just_do_it.sh 5 rce00

Example 2 - RCE00 + RCE02 in readout for 1 second:

just_do_it.sh 1 rce00 rce02

Advanced: Configure TPC readout manually

lbnecmd setconfig rces_and_ssps
lbnecmd setdaqcomps <rce#1> <rce#2> ...

Example 1 - RCE00 readout only:

lbnecmd setdaqcomps rce00

Example 2 - RCE00 + RCE02 in readout:

lbnecmd setdaqcomps rce00 rce02

Example 3 - RCE00-07 in readout:

lbnecmd setdaqcomps rce0{0..7}

Example 4 - All RCEs in readout:

lbnecmd setdaqcomps rce0{0..9} rce1{0..5}

Note: the RCEs are numbered rce00 to rce15 (16 in total).
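Brace patterns are easy to mistype: in bash, {0,7} expands to just the two values 0 and 7, while {0..7} expands to the whole range. As a hedged alternative, the sketch below builds the list of component names explicitly. rce_range is a hypothetical helper, not part of the DAQ software; it assumes only the two-digit rceNN naming described in the note above.

```shell
# Print RCE component names from <first> to <last>, space-separated.
# Hypothetical helper: rce_range <first> <last>
# Pass plain integers without leading zeros (e.g. 0 and 7, not 00 and 07).
rce_range() {
    first=$1
    last=$2
    names=""
    i=$first
    while [ "$i" -le "$last" ]; do
        # Zero-pad to two digits: 0 -> rce00, 15 -> rce15
        names="$names$(printf 'rce%02d' "$i") "
        i=$((i + 1))
    done
    # Drop the trailing space before printing
    printf '%s\n' "${names% }"
}
```

With this helper, lbnecmd setdaqcomps $(rce_range 0 7) selects RCE00 through RCE07, and $(rce_range 0 15) selects all 16 RCEs.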

Advanced: Take data manually

lbnecmd init daq
lbnecmd start daq
lbnecmd stop daq
lbnecmd terminate daq

New instructions to include the PTB in the DAQ (Jan. 05, 2016):

cd /data/lbnedaq/daqarea/lbnerc/
launch_daq_applications.sh -c -t -r -d docs/config_penn_giles.txt
lbnecmd setconfig rces_and_ssps_and_ptb
lbnecmd setdaqcomps penn01 #For just penn
lbnecmd setdaqcomps penn01 rce{..} ssp0{..} # For penn + RCEs + SSPs (fill in the ranges)
lbnecmd init daq
lbnecmd start daq

After run

To check the run number and the log and fcl files, go to: /data/lbnedaq/run_records/
To check whether the data files were created: ssh lbnedaq6 "ls -ltrh /storage/data | tail -10"

Useful commands

To check the DAQ status and the most recent log file:

check_daq_applications.sh

To show the start time and configuration of the 5 most recent runs:
show_recent_runs.sh 5

Reset RCEs

If you have trouble initializing the DAQ because some RCEs have problems (this usually happens when the previous run ran out of buffer on some RCEs), follow these instructions to reset the RCEs.

First, ssh to lbnedaq3 from lbne35t-gateway01:

ssh lbnedaq3

Check the OS:
rce_talk FNAL-1 guess_os
rce_talk FNAL-2 guess_os

If you get “UNKNOWN” instead of linux, then run:
reset_frozen_rces.py

This will reset only the RCEs that are in a bad state. Wait ~30 seconds for the RCEs to recover.
Remember to log off:
exit
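The OS check above can be scripted. The following is a minimal sketch: os_is_unknown is a hypothetical helper, and it assumes that rce_talk writes its result to standard output and that a frozen RCE shows up as the literal text UNKNOWN, as described above. The exact output format of rce_talk is not documented here, so verify it by hand first.

```shell
# Succeed (exit status 0) if the text on stdin contains "UNKNOWN".
# Hypothetical helper for filtering guess_os output.
os_is_unknown() {
    grep -q 'UNKNOWN'
}

# Possible use on lbnedaq3 (shown as comments, not executed here):
#   for crate in FNAL-1 FNAL-2; do
#       if rce_talk "$crate" guess_os | os_is_unknown; then
#           reset_frozen_rces.py
#           sleep 30    # give the RCEs time to recover
#           break       # the reset script already covers all bad RCEs
#       fi
#   done
```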