SlowMon Considerations When Changing Online Machines

If we change the machine on which Ganglia is running

The primary way that slowmon gets data from the DAQ is via Ganglia multicast. As long as multicast continues to work on the network and the Ganglia configuration is not changed, it won't matter which machine is sending. Slowmon listens for the packets on multicast group 239.2.11.7, port 8016. These values are set in /etc/ganglia/gmond.conf, so as long as /etc/ganglia/gmond.conf is unchanged between hosts, there's nothing to change.
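To check that multicast still reaches a given host after a machine change, a small listener can be handy. The following is only a sketch using the Python standard library, not the slowmon code itself; the function names are hypothetical, and the group and port are the values quoted above. It joins the group and reports who sent the first packet it sees, without attempting to decode the Ganglia payload.

```python
import socket

# Values quoted in the text above (set in /etc/ganglia/gmond.conf).
MCAST_GROUP = "239.2.11.7"
MCAST_PORT = 8016


def membership_request(group, interface="0.0.0.0"):
    """Build the 8-byte ip_mreq structure: group address + local interface."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def open_listener(group=MCAST_GROUP, port=MCAST_PORT, timeout=10.0):
    """Open a UDP socket joined to the multicast group (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Allow coexistence with other listeners on the same port on this host.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    sock.settimeout(timeout)
    return sock


def wait_for_packet(sock):
    """Return (sender_ip, packet_size) for the next packet, or None on timeout."""
    try:
        data, (sender_ip, _sender_port) = sock.recvfrom(65535)
    except socket.timeout:
        return None
    return sender_ip, len(data)
```

Calling open_listener() and then wait_for_packet(sock) on the slowmon host should return a sender address within a few seconds if multicast is working; a None result suggests the packets are not arriving.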

The slowmon also has a backup to the Ganglia multicast mechanism: it can read from the port on which the Ganglia master daemon delivers an XML-based representation of all the latest Ganglia metrics. This has been handy when the network has lost multicast ability due to some unexpected router change. Right now the code is hard-coded to use evb's IP address, so that address would need updating if the Ganglia master daemon moved to another machine. The code is in ~uboonesmc/slowmoncon/apps/GangliaReader/GangliaReader.py, and the address is set in a clearly named global variable on line 16. But this is only a backup: slowmon would still get the data even without this, as long as the Ganglia metrics are being sent and received in the normal way.
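For illustration, the XML fallback can be sketched as below. This is not the code in GangliaReader.py; the function names are hypothetical, the host and port are left as caller-supplied arguments, and it assumes the daemon sends one XML dump and then closes the connection, with metrics appearing as METRIC elements (NAME, VAL attributes) nested inside HOST elements.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_ganglia_xml(host, port, timeout=10.0):
    """Read the full XML dump from the daemon's XML port (hypothetical helper).

    Assumes the daemon writes the document and then closes the connection.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def parse_metrics(xml_bytes):
    """Return {(ganglia_host, metric_name): value_string} from an XML dump."""
    root = ET.fromstring(xml_bytes)
    metrics = {}
    # iter() walks the whole tree, so any wrapper elements above HOST
    # (for example CLUSTER) do not matter here.
    for host in root.iter("HOST"):
        for metric in host.iter("METRIC"):
            metrics[(host.get("NAME"), metric.get("NAME"))] = metric.get("VAL")
    return metrics
```

The host passed to fetch_ganglia_xml plays the role of the hard-coded address: it is the single value that would change if the daemon moved.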

If we change other machines and need to ensure that the Ganglia metrics are read from the correct machine

The DAQ constructs the names in Ganglia automatically from the host names. The correspondence between Ganglia names and slow control names is given in the file slowmoncon/apps/GangliaReader/ganglia_reader_config.csv. That's a data file that the software reads. The first column is the Ganglia metric name, the second is the Ganglia host, and the third is the EPICS variable name.
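As a rough sketch of how a three-column file like this can be read (not necessarily how GangliaReader.py does it), the snippet below builds a lookup from (Ganglia host, Ganglia metric) to EPICS variable name. The function name is hypothetical, and it assumes the file has no header row; rows with fewer than three columns are skipped.

```python
import csv


def load_mapping(lines):
    """Map (ganglia_host, ganglia_metric) -> EPICS variable name.

    `lines` is any iterable of text lines, such as an open file.
    Column order follows the description above: metric, host, EPICS name.
    """
    mapping = {}
    for row in csv.reader(lines):
        if len(row) < 3:
            continue  # blank or malformed line
        metric, host, epics_name = (field.strip() for field in row[:3])
        mapping[(host, metric)] = epics_name
    return mapping

# Typical use (path relative to the slowmoncon checkout):
#   with open("apps/GangliaReader/ganglia_reader_config.csv", newline="") as f:
#       mapping = load_mapping(f)
```

Keying on the host and metric together shows why column 2 matters: a metric arriving from a host that is not listed would simply find no EPICS variable to update.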

We think all of the PCStatus variables should be left alone regardless of what happens to the computers, since whatever they read or don't read, they still represent what is actually happening. Likewise we would probably want to leave the DAQ seb variables alone.

For the evb-related DAQ variables, if you run the event builder on a machine that Ganglia thinks is something other than ubdaq-prod-evb-priv, then we would have to change the expected host names in column 2 of the .csv file referenced above. We have agreed to NOT change the name of the EPICS variable, for two reasons: (1) it would completely confuse the archiver, alarm server, and displays, and (2) the way the string "evb" is used in the EPICS variable names can just as well refer to the event builder software as to a host name.

Thus, for example, if we started running the event builder on near1, we would change

Receiver-ReceivedDataRate_Aassemblerappevb,ubdaq-prod-evb-priv,uB_DAQStatus_DAQX_evb/Received_Data_Rate_assembler

to

Receiver-ReceivedDataRate_Aassemblerappevb,ubdaq-prod-near1-priv,uB_DAQStatus_DAQX_evb/Received_Data_Rate_assembler

Only the host name in column 2 changes; the third column is left as is.
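If many lines need the same change, the edit could be scripted along the following lines. This is a sketch under assumptions: the function name is hypothetical, and the epics_prefix filter (taken from the single example above) is a guess at how to pick out the DAQ variables while leaving the PCStatus rows untouched. Check the result by eye before restarting the Ganglia-to-EPICS process.

```python
import csv
import io


def retarget_host(csv_text, old_host, new_host, epics_prefix="uB_DAQStatus_"):
    """Replace old_host with new_host in column 2 of matching rows.

    Only rows whose EPICS name (column 3) starts with epics_prefix are
    touched; columns 1 and 3 are never modified.
    Returns (new_csv_text, number_of_rows_changed).
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    changed = 0
    for row in csv.reader(io.StringIO(csv_text)):
        if (len(row) >= 3
                and row[1].strip() == old_host
                and row[2].strip().startswith(epics_prefix)):
            row[1] = new_host
            changed += 1
        writer.writerow(row)
    return out.getvalue(), changed
```

For the example above, retarget_host(text, "ubdaq-prod-evb-priv", "ubdaq-prod-near1-priv") would rewrite the host on the assembler line and report how many rows it changed.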

Someone from slow controls could take care of editing the file. It's a simple .csv file, easily editable in any text editor, and the required change isn't hard. The slow controls expert has an advantage in that they can easily stop and restart the Ganglia-to-EPICS process until they get it right.