SISPI Database Guide¶
The SISPI database uses PostgreSQL as the underlying database server. Instructions on how to install and initially configure the database can be found in DES DocDB 4029. The notes below describe the SISPI-specific configuration of the database. They assume you are familiar with performing admin tasks on a PostgreSQL database.
Table of Contents¶
- Database Configuration
- Database Replication
- Database Web Viewers
- Database Monitoring
- Database Backups
Database Configuration¶
The database software is installed as an eups product in the SISPI products tree, /usr/remote/software/products. We are currently using version 9.0.3. The master database is running on server2; the database directory on server2 is /raid2/database/sispi_db. The mirror is running on server3; the database directory on server3 is /raid3/database/sispi_db. The database runs as user postgres. It is started at boot time by a System V style start script, /etc/init.d/postgresql. The database directories are owned by the postgres account.
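The start script supports the usual System V verbs, so the server can be checked or restarted by hand if needed. A minimal sketch (run as root on the machine hosting the database; normally this is handled at boot):

# Check whether the database server is running
sudo /etc/init.d/postgresql status

# Start or stop it by hand if needed
sudo /etc/init.d/postgresql start
sudo /etc/init.d/postgresql stop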
The following configuration options in postgresql.conf are set differently from the defaults:
# CONNECTIONS AND AUTHENTICATION
listen_addresses = '*'
max_connections = 400

# ERROR REPORTING AND LOGGING
logging_collector = on
log_directory = 'pg_log'
log_filename = 'postgresql-%Y-%m.log'
log_truncate_on_rotation = on
log_rotation_age = 43200

# CLIENT CONNECTION DEFAULTS
search_path = 'constants,telemetry,alarms,users,exposure,bss,obstac,"$user",public'
timezone = 'UTC'
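A quick way to confirm the running server has picked up these values is to query it directly. This is a sketch, assuming the dbadmin role used elsewhere in this guide and the port 5442 that appears in the replication configuration below:

# Ask the running server which values are in effect
psql -p 5442 -U dbadmin -d postgres -c "SHOW max_connections;"
psql -p 5442 -U dbadmin -d postgres -c "SHOW search_path;"
psql -p 5442 -U dbadmin -d postgres -c "SHOW timezone;"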
In pg_hba.conf:
# All SISPI machines in subnet 139.229.14 (addresses < 128)
host    all    all    139.229.14.0/25    md5
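Edits to pg_hba.conf (and to the reloadable postgresql.conf settings above, such as the logging options) can be picked up without a restart; note that listen_addresses and max_connections do require a full restart. A sketch, run as the postgres user on server2:

# Reload configuration files without restarting the server
pg_ctl reload -D /raid2/database/sispi_db

# or, from any client connection:
psql -p 5442 -U dbadmin -d postgres -c "SELECT pg_reload_conf();"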
Database Replication¶
We are using PostgreSQL 9 streaming replication to make a standby copy of the database. We are also streaming a copy to des10 at Fermilab. Here are the notes on setting up the replication and doing the failover. The primary is running on server2 and the standby on server3. Note that the configuration of each database must be the same, e.g., if the primary has max_connections = 400 then the standby must as well.
CTIO primary and standby database configuration:
- In postgresql.conf on the CTIO primary and standby:
wal_level = hot_standby
max_wal_senders = 3
wal_keep_segments = 64
- In postgresql.conf on the FNAL standby:
wal_level = hot_standby
max_wal_senders = 3
wal_keep_segments = 64
hot_standby = on
The wal_level = hot_standby setting allows the FNAL database mirror to support read-only queries. If wal_keep_segments is large enough then we do not need to set up separate disk archiving of WAL segments; the current value is good for about 1.5 weeks. If the standby gets further out of sync than that, it must be re-synced as described below. Note that we have added the hot_standby = on option, which enables read-only queries on the FNAL database; the standby on server3 is not configured to do this.
- In pg_hba.conf on the primary:
# Replication for the standby server3.ctio.noao.edu
host    replication    dbadmin    139.229.14.36/32    md5
# Replication for the standby server at FNAL
host    replication    dbadmin    131.225.58.29/32    md5
# Access from des10.fnal.gov
host    decam_dev      dbadmin    131.225.58.29/32    md5
- To start up a new standby, or to re-sync an existing one, you need to make a base backup by copying the primary database data directory to the standby. The rsync below will only work between server2 and server3 at CTIO.
psql -c "SELECT pg_start_backup('label', true)" -U dbadmin postgres rsync -a ${PGDATA}/ standby:/srv/pgsql/standby/ --exclude postmaster.pid psql -c "SELECT pg_stop_backup()" -U dbadmin postgres
If the FNAL mirror gets out of sync, then instead of rsync you will need to make a tar file of the /raid2/database/sispi_db directory; a sketch is shown below. You should exclude the files pg_hba.conf, postgresql.conf and postmaster.pid and the pg_log directory when you make the tar file. Instructions on copying this tarfile to FNAL will be forthcoming (due to the large size we cannot use scp).
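A minimal sketch of building such a tar file on server2, wrapped in the same pg_start_backup()/pg_stop_backup() calls as the rsync above (the output filename and compression are illustrative, not a fixed convention):

# Run between pg_start_backup() and pg_stop_backup(), as in the rsync case
cd /raid2/database
tar -czf /tmp/sispi_db_base.tar.gz \
    --exclude=sispi_db/pg_hba.conf \
    --exclude=sispi_db/postgresql.conf \
    --exclude=sispi_db/postmaster.pid \
    --exclude=sispi_db/pg_log \
    sispi_db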
- Create a recovery.conf file in the standby data directory on server3 (/raid3/database/sispi_db):
# This parameter must be set to on.
standby_mode = 'on'
# Specifies a connection string which is used for the standby server to connect
# with the primary.
primary_conninfo = 'host=server2.ctio.noao.edu port=5442 user=dbadmin'
# Specifies a trigger file whose presence should cause streaming replication to
# end (i.e., failover).
trigger_file = '/raid3/database/failover_trigger'
- Start the standby server.
- You should see a WAL receiver process on the standby and a WAL sender on the primary.
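For example, a quick check from a shell on each machine (a sketch):

# On the primary (server2) you should see a WAL sender process
ps -ef | grep "wal sender"

# On the standby (server3) you should see a WAL receiver process
ps -ef | grep "wal receiver"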
- Create the trigger file in the specified path by running the failover.sh script in /usr/remote/user/postgres. The standby will detect this file and switch out of replication mode. It will rename the recovery.conf file to recovery.done in the data directory.
- When the original machine that contained the primary is restored, you can make it the new standby by running the recover_standby.sh script in /usr/remote/user/postgres. Note that changes will need to be made to the SISPI configuration for it to see the new database. The FNAL mirror would also have to be switched, and the pg_hba.conf file would need to be edited. This needs some more thought.
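A sketch of confirming that the failover actually completed (the port and paths follow the configuration above):

# recovery.conf should have been renamed on the promoted standby
ls -l /raid3/database/sispi_db/recovery.done

# The promoted server should no longer report that it is in recovery
psql -h server3.ctio.noao.edu -p 5442 -U dbadmin -d postgres \
    -c "SELECT pg_is_in_recovery();"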
Database Web Viewers¶
Configuring Apache
The web space is in /usr/remote/software/www. The Apache server runs on system1. The following configuration options are set in httpd.conf:
# This needs to be on a local disk
LockFile /data_local/web_logs/accept.lock

# Pick up our version of mod_python and wsgi
LoadModule python_module /usr/remote/software/products/mod_python-3.3.1/lib/mod_python.so
LoadModule wsgi_module /usr/remote/software/products/mod_wsgi-3.3/lib/mod_wsgi.so
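To confirm Apache has picked up both modules, something like the following can be run on system1 (a sketch, assuming Apache 2.2 command-line tools):

# List loaded modules and look for mod_python and mod_wsgi
httpd -M 2>&1 | grep -E "python|wsgi"

# Check the configuration syntax before restarting
apachectl configtest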
The modules directory is a link to /etc/httpd/modules. The server should be started/stopped/restarted by running /etc/httpd start/stop/restart in ~codemanager/bin. It should be started as codemanager. This should normally be taken care of when the machine boots.
The *.conf files for TV, ConstantsDB, QE and AV are in the conf.d directory. These applications all use wsgi. The ECL application is also in the conf.d directory but still uses mod_python.
Notes on installing mod_wsgi at CTIO
Installing a mod_wsgi Web Viewer
In python.conf:
<Directory ""> AddHandler mod_python .py PythonHandler handler PythonOption session_directory /data_local/web_logs/tmp/mp_sessions PythonOption SessionStoragePath /data_local/web_logs/tmp/mp_sessions PythonDebug on </Directory>
Install robots.txt in html:
User-agent: *
Disallow: /
The application configurations are in html/cfg. There is one file for the primary database server and one for the standby. There is a link to the current version:
constants.cfg -> constants.cfg_server2
constants.cfg_server2
constants.cfg_server3
des_alarms.cfg -> des_alarms.cfg_server2
des_alarms.cfg_server2
des_alarms.cfg_server3
des_tv.cfg -> des_tv.cfg_server2
des_tv.cfg_server2
des_tv.cfg_server3
In the event of the database failing over to the standby on server3, you will need to run the script switch_web_tools.sh in ~codemanager/bin (you will need to be user codemanager) to switch the links to the server3 version.
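A minimal sketch of what the switch amounts to (the actual switch_web_tools.sh script may do more than this):

# Run as codemanager; repoint the cfg links at the server3 versions
cd /usr/remote/software/www/html/cfg
ln -sf constants.cfg_server3 constants.cfg
ln -sf des_alarms.cfg_server3 des_alarms.cfg
ln -sf des_tv.cfg_server3 des_tv.cfg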
The applications are installed in /usr/remote/software/dbwebtools. The support applications ChartDirector, jinja2 and dbiapp_pylib are in the webpylib subdirectory.
Versions:
ConstantsDB      v1_5b.wsgi
TV               2.0a.wsgi
AV               1.0.wsgi
dbiapi_pylib     v1_4a
wsgi_py          v1_3
ChartDirector    5.0.2 (64 bit version)
Install ChartDirector-5.0.2/lib into ChartDirector. It was necessary to add a Python __init__.py file containing:
from pychartdir import *
Each product directory has a current link pointing at the version in use. In the www/html directory there is a link to this current version:
AV -> /usr/remote/software/dbwebtools/des_alarms/current/product
ConstantsDB -> /usr/remote/software/dbwebtools/ConstantsDB/current
TV -> /usr/remote/software/dbwebtools/tv/current/product
Database Monitoring¶
Database monitoring is handled by an hourly cron job, together with a mail alias for notifications:

#
# Every hour, check if database is running. If not send mail to postgres-status alias.
1 * * * * /usr/remote/user/postgres/bin/db_status.sh

alias postgres-status buckley@fnal.gov elliotta@mps.ohio-state.edu
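The actual db_status.sh lives in /usr/remote/user/postgres/bin; the following is only a sketch of the kind of check such a script performs, not the real script:

#!/bin/sh
# Sketch only: try a trivial query and mail the alias on failure.
# Assumes password-less access for dbadmin (e.g. via ~/.pgpass).
if ! psql -p 5442 -U dbadmin -d postgres -c "SELECT 1;" > /dev/null 2>&1; then
    echo "SISPI database is not responding on $(hostname)" | \
        mail -s "SISPI database check failed" postgres-status
fi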
Database Backups¶
The SISPI database is backed up on a weekly basis. The eups package pgTools contains the script used for the backup as well as a set of utilities that are used by the script. The backup currently runs via cron every Sunday morning at 11:02 am local CTIO time (see the crontab entry below). On completion it sends email to the alias defined above.
#
# Run weekly database backups at 2 minutes past 11 am on Sunday
#
2 11 * * Sun source /usr/remote/software/products/eups/bin/setups.sh; setup pgTools; run_backup.sh

The backup has the following features:
- We use pg_dump with the -F c option to reduce the size of the dump files; this is the PostgreSQL custom format, so the dumps are binary and must be restored using pg_restore (see the example after this list).
- The backup directory is /raid2/database/db_backup/server2. Each backup has its own directory.
- The script dumps out the current size of the database tables.
- Large tables are backed up into individual dump files. The tables in this list currently are:
photo_diodes focal_plane_temperature_cntrl image_health facility_utilities ion_pump ln2_process environmental_data cage_temperatures telescope_data donut_fn1 donut_fn2 donut_fn3 donut_fn4 donut_fs1 donut_fs2 donut_fs3 donut_fs4
- The monsoon telemetry tables are put into another dump file, as are all the rest of the small telemetry tables.
- The other schemas are fairly small and are each put into their own dump file (obstac, constants, alarms, exposure).
- The desbrdev (Quick Reduce) database and the decamcrl (ECL) database each have their own dump file. It may be necessary to divide up the desbrdev database into individual dump files if some of the tables are expected to be large. I am still waiting for this information from the QR team.
- The user accounts and other administrative things are put into another dump file.
- A copy of the configuration files is made.
- Old dump files are cleaned up.
- The backup directory is rsynced to /raid3/database/db_backup on server3.
- The plan is to also copy the files to FNAL but this is not in place yet.
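Because the dumps use the custom format, they must be restored with pg_restore rather than psql. A sketch of working with one of the per-table dump files (the file and target database names are illustrative):

# Inspect the contents of a custom-format dump
pg_restore --list telescope_data.dump

# Restore it into a target database (here a hypothetical scratch database)
pg_restore -p 5442 -U dbadmin -d decam_dev_scratch telescope_data.dump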