Initial daq cluster setup checklist » History » Version 6

Pengfei Ding, 01/13/2020 10:29 AM


Initial DAQ cluster setup checklist.

Objective: To reduce the number of service desk tickets during the initial setup of DAQ development / production clusters.

Networking

  1. define subnets for IPMI, fnal/public and data interfaces
  2. define host names for all network interfaces and make them consistent
    • mydaq-br01, mydaq-eb01, mydaq-ipmi-br01, mydaq-data-br01
    • the list of host names should be complete as if all hardware is available
    • put all host names into /etc/hosts and distribute it across all servers
  3. make a consistent IP address assignment across all subnets
    • use address blocks for the same server roles
    • make the last octet of the IP address the same across all NICs of the same host
  4. configure authentication
    • Kerberos for the public interface
    • publickey for the data interface
  5. create instructions for rebooting servers using IPMI
  6. enable jumbo frames (MTU 9000) on all interfaces and networking equipment by default
  7. configure multicast and verify that it is enabled and working on all networking equipment
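
Items 2 and 3 above can be sketched as a small generator for /etc/hosts entries. This is only an illustration under stated assumptions: the subnet prefixes, the per-role address blocks, and the helper name `hosts_lines` are hypothetical and invented for the example; only the naming pattern (mydaq-br01, mydaq-ipmi-br01, mydaq-data-br01) comes from the checklist.

```python
# Hypothetical subnet prefixes; replace with the subnets defined in item 1.
SUBNETS = {
    "": "192.168.1",       # fnal/public interface, e.g. mydaq-br01
    "ipmi-": "192.168.2",  # IPMI interface, e.g. mydaq-ipmi-br01
    "data-": "192.168.3",  # data interface, e.g. mydaq-data-br01
}

# Hypothetical address blocks: each server role starts at its own last octet.
ROLE_BLOCKS = {"br": 10, "eb": 50}


def hosts_lines(cluster, counts):
    """Return /etc/hosts lines for every interface of every planned host.

    counts maps a role (e.g. "br") to the number of servers planned for it,
    so the list is complete even if some hardware is not yet available.
    """
    lines = []
    for role, count in counts.items():
        for index in range(1, count + 1):
            # The same last octet is reused on every subnet for this host.
            octet = ROLE_BLOCKS[role] + index
            for tag, prefix in SUBNETS.items():
                name = f"{cluster}-{tag}{role}{index:02d}"
                lines.append(f"{prefix}.{octet}\t{name}")
    return lines


if __name__ == "__main__":
    print("\n".join(hosts_lines("mydaq", {"br": 2, "eb": 1})))
```

Generating the file from one table of roles and counts keeps the names and addresses consistent by construction, and the result can be distributed to all servers unchanged.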

Users

  1. define a shared user for
    • managing UPS products
    • running daq, dcs, databases
  2. add all members of the RSI group to /root/.k5login
  3. add all known daq users to the daq and dcs shared accounts
  4. shared user profiles are not expected to have any customizations
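
A .k5login file lists one Kerberos principal per line, so the access lists in items 2 and 3 can be produced from a plain list of user names. The following is a minimal sketch; the function name `k5login_text` and the realm `EXAMPLE.ORG` are placeholders chosen for the example, not values taken from this checklist.

```python
def k5login_text(usernames, realm="EXAMPLE.ORG"):
    """Return .k5login content: one Kerberos principal per line.

    The realm default is a placeholder; pass the site's real realm.
    Duplicate user names are collapsed and the output is sorted so that
    regenerated files are easy to compare.
    """
    principals = sorted({f"{name}@{realm}" for name in usernames})
    return "".join(principal + "\n" for principal in principals)


if __name__ == "__main__":
    # Hypothetical user names, for illustration only.
    print(k5login_text(["alice", "bob"]), end="")
```

The same text can be written to /root/.k5login and to the .k5login of each shared account, which keeps the lists identical across servers.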

Storage areas

  1. set up a reliable NFS server for /home, /daq/products, /daq/database, /daq/log, /daq/tmp, ..., /data, /scratch, /daq/backup
  2. reserve adequate disk space for each area
  3. create a designated scratch area for doing builds on a local NVMe drive, preferably on the fastest server
    • a fast NVMe drive, such as a Samsung 970 Pro or faster, is preferred
  4. set up a nightly backup for /home and a weekly backup for the /daq/backup area
  5. monitor the performance of the NFS server
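
Whether each area still has adequate space (item 2) can be checked with a short script. This is a sketch under assumptions: the function name `low_space_areas` and the threshold are hypothetical, and it only reports free space as seen from the host it runs on; it does not measure NFS performance.

```python
import shutil


def low_space_areas(min_free_fraction, areas, usage=shutil.disk_usage):
    """Return the areas whose free space is below min_free_fraction of total.

    usage is injectable for testing; by default it is shutil.disk_usage,
    which returns a named tuple with total, used and free sizes in bytes.
    """
    low = []
    for path in areas:
        stats = usage(path)
        # Skip pseudo file systems that report a total size of zero.
        if stats.total and stats.free / stats.total < min_free_fraction:
            low.append(path)
    return low
```

For example, `low_space_areas(0.10, ["/home", "/daq/log", "/scratch"])` would list the areas with less than 10% free space, assuming those paths are mounted on the host.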

Software

  1. any base software such as the OS and productivity RPMs should be identical on all servers
  2. the default set of installed software packages should not impede development/testing work; e.g. emacs, vim, mc, tmux, perf, iperf, strace, dstat, ... and VNC/MATE should be installed by default
  3. implement system monitoring using Ganglia or similar software
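
One way to verify item 1 is to compare the installed package lists of all servers. The sketch below assumes the lists have already been collected per host (for instance from the output of `rpm -qa`); the function name `package_differences` and the sample host and package names are hypothetical.

```python
def package_differences(packages_by_host):
    """Map each host to the packages it lacks relative to all hosts combined.

    packages_by_host maps a host name to an iterable of package strings.
    Hosts that already have every package seen anywhere are omitted, so an
    empty result means the package sets are identical.
    """
    everything = set().union(*packages_by_host.values())
    missing = {}
    for host, packages in packages_by_host.items():
        lacking = everything - set(packages)
        if lacking:
            missing[host] = sorted(lacking)
    return missing


if __name__ == "__main__":
    # Made-up inventory, for illustration only.
    inventory = {
        "mydaq-br01": ["vim", "tmux", "strace"],
        "mydaq-eb01": ["vim", "strace"],
    }
    print(package_differences(inventory))
```

Comparing full name-version strings rather than bare package names would also catch version drift between servers.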

System Services

  1. Optional: DNS, Kerberos, NIS, Supervisord, InfluxDB, Prometheus.
  2. Ganglia, Graphite.