Pengfei Ding, 01/13/2020 10:28 AM
Initial DAQ cluster setup checklist
Objective: To reduce the number of service desk tickets during the initial setup of DAQ development / production clusters.
Networking
- define subnets for IPMI, fnal/public and data interfaces
- define host names for all network interfaces and make them consistent
- mydaq-br01, mydaq-eb01, mydaq-ipmi-br01, mydaq-data-br01
- the list of host names should be complete as if all hardware is available
- put all host names into /etc/hosts and distribute it across all servers
- make a consistent IP address assignment across all subnets
- use address blocks for the same server roles
- make the last octet of the IP address the same across all NICs of the same host
- configure authentication
- Kerberos for the public interface
- publickey for the data interface
- create instructions for rebooting servers using IPMI
- enable jumbo frames (MTU 9000) on all interfaces and networking equipment by default
- configure multicasting and verify that it is enabled and working on all networking equipment
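The naming and addressing rules above can be sketched as a small generator for /etc/hosts. This is only an illustration: the subnets, the per-role address blocks and the helper names (`SUBNETS`, `ROLE_BLOCKS`, `host_name`, `hosts_entries`) are assumptions, not values from an actual cluster.

```python
import ipaddress

# Hypothetical subnets; replace with the real IPMI, public and data networks.
SUBNETS = {
    "ipmi": ipaddress.ip_network("10.0.1.0/24"),
    "public": ipaddress.ip_network("10.0.2.0/24"),
    "data": ipaddress.ip_network("10.0.3.0/24"),
}

# Hypothetical address blocks per server role (offset of the block's first host).
ROLE_BLOCKS = {"br": 10, "eb": 50}


def host_name(cluster, network, role, index):
    """Build names like mydaq-br01 (public), mydaq-ipmi-br01, mydaq-data-br01."""
    prefix = cluster if network == "public" else f"{cluster}-{network}"
    return f"{prefix}-{role}{index:02d}"


def hosts_entries(cluster, role_counts):
    """Return /etc/hosts lines; all NICs of one host share the same last octet."""
    lines = []
    for role, count in role_counts.items():
        for index in range(1, count + 1):
            last_octet = ROLE_BLOCKS[role] + index
            for network, subnet in SUBNETS.items():
                address = subnet.network_address + last_octet
                name = host_name(cluster, network, role, index)
                lines.append(f"{address}\t{name}")
    return lines


if __name__ == "__main__":
    # List every planned host, even if the hardware is not available yet.
    print("\n".join(hosts_entries("mydaq", {"br": 2, "eb": 1})))
```

Generating the file from one table of roles and counts keeps the names and addresses consistent, and the same output can be distributed to all servers.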
Users
- define a shared user for
- managing UPS products
- running daq, dcs, databases
- add all people from the RSI group to /root/.k5login
- add all known daq users to the daq and dcs shared accounts
- shared user profiles are not expected to have any customizations
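A .k5login file lists one Kerberos principal per line, so the files for root and for the shared accounts can be produced from a plain list of user names. The sketch below assumes a placeholder realm (`EXAMPLE.ORG`) and hypothetical helper names; substitute the site's realm and the real member lists.

```python
# Assumed Kerberos realm; replace with the site's actual realm.
REALM = "EXAMPLE.ORG"


def k5login_lines(usernames, realm=REALM):
    """Return sorted, de-duplicated principals, skipping blank entries."""
    names = {name.strip() for name in usernames if name.strip()}
    return sorted(f"{name}@{realm}" for name in names)


def render_k5login(usernames, realm=REALM):
    """Return the file content: one principal per line, newline-terminated."""
    return "\n".join(k5login_lines(usernames, realm)) + "\n"


if __name__ == "__main__":
    # Hypothetical user names, for illustration only.
    print(render_k5login(["alice", "bob", "alice"]), end="")
```

The same function can be reused for the daq and dcs shared accounts by passing the list of known DAQ users.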
Storage areas
- set up a reliable NFS server for /home, /daq/products, /daq/database, /daq/log, /daq/tmp, ..., /data, /scratch, /daq/backup
- reserve adequate disk space for each area
- create a designated scratch area for doing builds on a local NVMe drive, preferably on the fastest server
- a fast NVMe drive, such as a Samsung 970 Pro or faster, is preferred
- set up a nightly backup for /home and a weekly backup for /daq/backup areas
- monitor the performance of the NFS server
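As a rough aid for the "adequate disk space" item, the following sketch reports storage areas whose free space has dropped below a chosen fraction. The threshold, the function name `low_space_areas` and the selection of paths are assumptions; it does not measure NFS performance.

```python
import os
import shutil


def low_space_areas(areas, min_free_fraction):
    """Return {path: free_fraction} for areas below the given free-space threshold."""
    report = {}
    for path in areas:
        usage = shutil.disk_usage(path)
        if usage.total == 0:
            continue  # skip pseudo file systems that report no capacity
        free_fraction = usage.free / usage.total
        if free_fraction < min_free_fraction:
            report[path] = free_fraction
    return report


if __name__ == "__main__":
    # Areas taken from the checklist; only those present on this host are checked.
    candidates = ["/home", "/daq/products", "/daq/database", "/daq/log",
                  "/daq/tmp", "/data", "/scratch", "/daq/backup"]
    present = [path for path in candidates if os.path.isdir(path)]
    for path, fraction in low_space_areas(present, 0.10).items():
        print(f"{path}: only {fraction:.1%} free")
```

Run from cron or a monitoring agent, a check like this can flag a filling area before users notice it.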
Software
- any base software such as the OS and productivity RPMs should be identical on all servers
- the default set of installed software packages should not impede development / testing work: include common tools, e.g. emacs, vim, mc, tmux, perf, iperf, strace, dstat, ...; VNC/MATE should be installed by default
- implement system monitoring using ganglia or similar software
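To check that the base software is identical on all servers, the package lists of the hosts can be compared. The sketch below assumes the lists were collected beforehand, for example with `rpm -qa --qf '%{NAME}\n'` on each server; the function name and the sample data are hypothetical.

```python
def package_differences(packages_by_host):
    """Map each host to the packages it lacks relative to the union of all hosts."""
    all_packages = set().union(*packages_by_host.values())
    differences = {}
    for host, packages in packages_by_host.items():
        missing = all_packages - set(packages)
        if missing:
            differences[host] = sorted(missing)
    return differences


if __name__ == "__main__":
    # Hypothetical package lists, for illustration only.
    sample = {
        "mydaq-br01": ["emacs", "vim", "tmux", "strace"],
        "mydaq-eb01": ["emacs", "vim"],
    }
    for host, missing in package_differences(sample).items():
        print(f"{host} is missing: {', '.join(missing)}")
```

An empty result means every host has the same set of package names; comparing versions as well would require including them in the collected lists.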
System Services
- DNS, Kerberos, NIS?
- Supervisord?
- Ganglia, Graphite, InfluxDB, Prometheus, MongoDB?