Skip to content

Database backup and restore

Dave Lawrence edited this page Nov 21, 2022 · 2 revisions

Simple backup / restore

Backup

To backup, or move between versions:

  • A dump will work with a live DB but ideally stop all VariantGrid services to stop writes to the database
sudo su postgres
# Use the appropriate pg_dump for your database if you are upgrading versions
pg_dump snpdb | gzip > /mnt/database_dumps/vg-dump-$(date --iso).sql.gz

#Restore

sudo su postgres
# Create database as per 
echo "drop database snpdb;" | psql
psql < ${vg_dir}/dbscripts/pgsql_database_create.sql
zcat /mnt/database_dumps/181023_vg_dot_com_dump.gz | psql -d snpdb

Note if you're using macos zcat doesn't handle .gz file, instead you can use

gunzip -c database_dumps/181023_vg_dot_com_dump.gz | psql -d snpdb

WAL based backup strategy

Backups are based around a daily full backup created using pg_basebackup and transaction logs (in the form of WAL, or write ahead logs) that are created and archived throughout the day. This allows for a point in time recovery by allowing transactions to be rerun on top of a backup to the desired time.

Create a backup directory (writable by postgres)

mkdir -p /data/backup/pg_log_archive
chown postgres /data/backup/pg_log_archive

Major elements of the configuration of the postgresql.conf file (in /etc):

wal_level = replica # use 'archive' for Postgres < v.10
archive_mode = on # Enables the wal files to be archived

# instructs the wal archive process to copy wal files to /data/backup/pg_log_archive
archive_command = 'test ! -f /data/backup/pg_log_archive/%f && cp %p /data/backup/pg_log_archive/%f'

Backup script

mkdir -p /data/backup/scripts
cp ${VARIANTGRID_DIR}/scripts/backup/backup_pg.sh /data/backup/scripts/
cp ${VARIANTGRID_DIR}/scripts/backup/dump_pg.sh /data/backup/scripts/
# Appropriate for environment
cp ${VARIANTGRID_DIR}/scripts/backup/local_pg.env /data/backup/scripts

chown postgres -r /data/backup/scripts

Backup script: ${VARIANTGRIDDIR}/scripts/backup_pg.sh

  • The script has an environment file (local_pg.env) which sets the location of the backup files, the archived wal files and the IP address of the server to be backed up (local so 127.0.0.1)
  • The first step taken by the script is to copy the archived wal files into a timestamped folder and archive them as tar.gz.
  • The backup script then runs a full backup to the folder data/backup/db_file_backup in a timestamped folder in which the files are created as tar files.

Schedule Task

Use the postgres user to run the backups. In order to supply the password to the pg_basebackup command, a .pgpass file needs to be created in the postgres user home directory:

sudo su postgres
touch ~/.pgpass
# file should be set to user read/write only - remove other/group rights
chmod 600 ~/.pgpass
# use text editor of choice and add the following line
#ServerIP:PORT:Database:User:Password
127.0.0.1:5432:*:postgres:passw0rd$

Finally, enter the script and environment file into the scheduled tasks

For help on crontab codes, see: https://crontab.guru

sudo su postgres
# Add onto end of existing postgres crontab
crontab -l;cat ${VARIANTGRID_DIR}/scripts/backup/crontab_command.txt | crontab -

Restore Process

# base db files
tar xvf /data/backup/db_file_backup/[timestamp]/base.tar -C /var/lib/postgresql/10/main
# wal files
tar xvf /data/backup/db_file_backup/[timestamp]/pg_wal.tar -C /var/lib/postgresql/10/main

This restores the database to the time that the main backup was taken

To restore wal files to the last archive point:

create a recovery.conf file in the main folder of the postgresql applications

nano /data/postgresql/recovery.conf
# in this file place the following line:
restore_command = ‘cp /data/backup/pg_log_archive/%f %p’

Starting up postgresql will then automatically restore the database to the last archive point. The application will apply each write ahead log in order until all are applied. If any are missing since the last checkpoint then the restore will fail.

To restore to a point in time you need to find the correct archive of wal files. As these are collected and backed up at the same time as the next backup, they are in a tar.gz file with the next timestamp on it.

To restore the archived wal files:

tar xvf /data/backup/pg_log_archive/[timestamp+1]/wal_backup.tar

create a recovery.conf file in the main folder of the postgresql applications

nano /data/postgresql/recovery.conf
# in this file place the following lines:
restore_command = ‘cp /data/backup/pg_log_archive/[timestamp+1]/%f %p’
recovery_target_time = ‘2018-08-06 17:15:00 ACST’

Start the Database application and it will apply the write ahead logs up until the requested time, or until the logs end. Again if there are gaps in the logs then the restore process will fail.

Clone this wiki locally