feat: Support all postgres version + use server postgres version in client tools #2

Open: wants to merge 15 commits into base: main
8 changes: 6 additions & 2 deletions mysql/docker-compose.test.yml
Original file line number Diff line number Diff line change
@@ -1,10 +1,11 @@
services:
db:
image: mysql:8
image: mariadb:10.5
platform: linux/amd64
command: --default-authentication-plugin=mysql_native_password
environment:
MYSQL_ROOT_PASSWORD: password
volumes:
- db_data:/var/lib/mysql
s3:
image: localstack/localstack
platform: linux/amd64
@@ -25,3 +26,6 @@ services:
AWS_ENDPOINT_URL: http://s3:4566
AWS_ACCESS_KEY_ID: foo
AWS_SECRET_ACCESS_KEY: bar

volumes:
db_data:
36 changes: 31 additions & 5 deletions postgres/Dockerfile
@@ -1,9 +1,35 @@
FROM --platform=linux/amd64 python:3.12-alpine3.19
FROM debian:bookworm-slim

ENV AWS_CONFIG_FILE=/.aws_config
RUN set -ex && \
apk add --no-cache postgresql16-client bash && \
pip install --no-cache-dir awscli && \
aws configure set default.s3.multipart_chunksize 200MB
ENV PATH="/opt/postgresql/bin:/opt/awscli/bin:$PATH"

# Add the PostgreSQL Apt repository
RUN apt-get update && apt-get install -y wget gnupg && \
echo "deb http://apt.postgresql.org/pub/repos/apt bookworm-pgdg main" > /etc/apt/sources.list.d/pgdg.list && \
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add -

# Install PostgreSQL client tools for versions 11–17 and other dependencies
RUN apt-get update && \
apt-get install -y --no-install-recommends \
curl \
postgresql-client-11 \
postgresql-client-12 \
postgresql-client-13 \
postgresql-client-14 \
postgresql-client-15 \
postgresql-client-16 \
postgresql-client-17 \
python3-venv python3-pip && \
mkdir -p /opt/postgresql && \
for version in 11 12 13 14 15 16 17; do \
ln -s /usr/lib/postgresql/$version/bin/pg_dump /usr/bin/pg_dump-$version && \
ln -s /usr/lib/postgresql/$version/bin/pg_restore /usr/bin/pg_restore-$version; \
done && \
# Use a virtual environment to install Python packages
python3 -m venv /opt/awscli && \
Member commented:

No need for the virtualenv here. You can install it system-wide. There may be an apt-installable version.

Author commented:

I think I ran into an issue when I tried to install it system-wide, but I might be wrong since it has been a while. I'll give it another try.

/opt/awscli/bin/pip install --no-cache-dir awscli && \
ln -s /opt/awscli/bin/aws /usr/local/bin/aws && \
apt-get clean && rm -rf /var/lib/apt/lists/*

COPY ./bin/ /bin/
ENTRYPOINT ["/bin/entrypoint.sh"]
23 changes: 16 additions & 7 deletions postgres/bin/dump-to-s3.sh
@@ -1,19 +1,28 @@
#!/bin/bash
# Usage: dump-to-s3.sh <s3://...dump> [dbname]
# Expects a DATABASE_URL environment variable that is the DB to dump
# Optionally, a dbname can be supplied as the second argument
# to override the name from the DATABASE_URL
# Expects a DATABASE_URL environment variable and SERVER_VERSION exported by the entrypoint.
# Optionally, a dbname can be supplied as the second argument to override the name from DATABASE_URL.

set -euf -o pipefail

cleanup() { rv=$?; if [ -f /tmp/db.dump ]; then shred -u /tmp/db.dump; fi; exit $rv; }
trap cleanup EXIT

NAME=${2:-$NAME}
CONNECT_DB_URL="postgres://$USER@$HOST:$PORT/$NAME"
# Extract database name from DATABASE_URL if not provided as an argument
Member commented:

Why was this change needed?

Author commented:

I thought that having one variable instead of several would be easier. Plus, we extract the database name directly from the URL if it's not passed as an argument.

Member commented:

I'm not following. Where are we using several variables?

Author commented:

@ipmb: The idea behind this change was to simplify things by combining multiple variables (like USER, HOST, PORT, etc.) into one CONNECT_DB_URL. This makes the code easier to manage and avoids dealing with too many individual variables.

The part where we extract the database name from DATABASE_URL is just a fallback in case it's not passed explicitly, so it ensures everything still works without needing extra inputs.

If something's unclear or there’s a better way to handle this, happy to discuss!
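
The fallback described in this thread could also be written with plain parameter expansion instead of `sed`. The following is only a sketch under stated assumptions: `db_name_from_url` and `replace_db_name` are hypothetical helper names that do not appear in the PR, and the URL is an example value. Like the `${DATABASE_URL%/*}/$DBNAME` line in the diff, the second helper drops any query string.

```shell
# Hypothetical helpers for illustration; not part of the PR.
db_name_from_url() {
    # Strip any ?query suffix, then keep the text after the last slash.
    url_without_query=${1%%\?*}
    printf '%s\n' "${url_without_query##*/}"
}

replace_db_name() {
    # Swap the database name in URL $1 for $2; query parameters are dropped.
    url_without_query=${1%%\?*}
    printf '%s\n' "${url_without_query%/*}/$2"
}

# Example URL (an assumption, not taken from the PR).
url="postgres://app@db.example.com:5432/appdb?sslmode=require"
db_name_from_url "$url"          # prints appdb
replace_db_name "$url" "other"   # prints postgres://app@db.example.com:5432/other
```

If connection options such as `sslmode` must survive the rewrite, they would need to be re-appended explicitly, since both this sketch and the diff discard them.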

DBNAME=${2:-$(echo "$DATABASE_URL" | sed -E 's|.*/([^/?]+).*|\1|')}
CONNECT_DB_URL="${DATABASE_URL%/*}/$DBNAME"

# Use SERVER_VERSION set by the entrypoint
PG_DUMP="pg_dump-$SERVER_VERSION"
Member commented:

Can you make this work if $SERVER_VERSION is not supplied? It can default to the latest version.
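
One possible way to address this request, shown as a sketch rather than the PR's actual fix: expand `SERVER_VERSION` with a default so the script does not abort under `set -u` when the variable is missing. `LATEST_PG_VERSION` and `versioned_tool` are hypothetical names introduced here; 17 is used because it is the newest client installed in the Dockerfile above.

```shell
# Hypothetical names for illustration; not part of the PR.
LATEST_PG_VERSION=17

versioned_tool() {
    # $1: tool base name (e.g. pg_dump); $2: server major version, possibly empty.
    printf '%s-%s\n' "$1" "${2:-$LATEST_PG_VERSION}"
}

# "${SERVER_VERSION:-}" keeps `set -u` from failing when the variable is unset.
PG_DUMP=$(versioned_tool pg_dump "${SERVER_VERSION:-}")
echo "Using $PG_DUMP"
```

With this shape the script works both when the entrypoint exported `SERVER_VERSION` and when it is run without it.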


if ! command -v "$PG_DUMP" &>/dev/null; then
echo "WARNING: pg_dump for version $SERVER_VERSION is not installed. Defaulting to pg_dump-17."
PG_DUMP="pg_dump-17"
fi

echo "Dumping $CONNECT_DB_URL to $1..."
echo "Dumping $CONNECT_DB_URL to $1 using $PG_DUMP..."
set -x
pg_dump --no-privileges --no-owner --format=custom "$CONNECT_DB_URL" --file=/tmp/db.dump
"$PG_DUMP" --no-privileges --no-owner --format=custom "$CONNECT_DB_URL" --file=/tmp/db.dump
aws s3 cp --acl=private --no-progress /tmp/db.dump "$1"
{ set +x; } 2>/dev/null
echo "Done!"
33 changes: 33 additions & 0 deletions postgres/bin/entrypoint.sh
@@ -4,15 +4,48 @@ set -euf -o pipefail

export PGSSLMODE=require

wait_for_db() {
local retries=30
local sleep_time=2

echo "Waiting for PostgreSQL server to be ready..."
until psql "$DATABASE_URL" -c '\q' 2>/dev/null || [ "$retries" -eq 0 ]; do
echo "PostgreSQL is unavailable - sleeping ($((retries--)) retries left)..."
Member commented:

remove " sleeping " please

sleep "$sleep_time"
done

if [ "$retries" -eq 0 ]; then
echo "ERROR: PostgreSQL server did not become ready in time."
Member commented:

ERROR: PostgreSQL server did not respond.

exit 1
fi

echo "PostgreSQL server is ready!"
Member commented:

Remove this one please

}
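
The three review comments on `wait_for_db` could be folded into a loop like the one below. This is a sketch under assumptions, not the author's revision: `wait_for` is a hypothetical generic helper that takes the probe command as arguments, and the messages follow the reviewer's wording. It also returns success whenever the probe succeeds, including on the final attempt, which the `until ... || [ "$retries" -eq 0 ]` form in the diff may report as a failure.

```shell
# Hypothetical helper for illustration; not part of the PR.
wait_for() {
    # $1: max attempts, $2: seconds between attempts, rest: probe command.
    attempts=$1
    delay=$2
    shift 2
    while [ "$attempts" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempts=$((attempts - 1))
        if [ "$attempts" -gt 0 ]; then
            echo "PostgreSQL is unavailable ($attempts retries left)..."
            sleep "$delay"
        fi
    done
    echo "ERROR: PostgreSQL server did not respond."
    return 1
}

# Possible call site in the entrypoint (assumes DATABASE_URL is set):
#   wait_for 30 2 psql "$DATABASE_URL" -c '\q' || exit 1
```

Passing the probe as arguments keeps the retry logic testable without a running database.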

if [ -z "${DATABASE_URL:-""}" ]; then
echo "WARNING: DATABASE_URL not found in environment."
else

# Extract connection details from DATABASE_URL
# shellcheck disable=SC2046
export $(parse_database_url.py | xargs)

# Setup PGSERVICE so `psql` just does the right thing
/bin/echo -e "[$NAME]\nhost=$HOST\nport=$PORT\ndbname=$NAME\nuser=$USER" > ~/.pg_service.conf
export PGSERVICE="$NAME"

# Wait for PostgreSQL to be ready
wait_for_db

# Detect PostgreSQL server version
SERVER_VERSION=$(psql "$DATABASE_URL" -tAc "SHOW server_version;" | cut -d '.' -f 1 || true)
if [ -z "$SERVER_VERSION" ]; then
echo "WARNING: Unable to detect PostgreSQL server version. Defaulting to latest (17)."
SERVER_VERSION="17"
else
echo "Detected PostgreSQL server version: $SERVER_VERSION"
fi
export SERVER_VERSION
fi

exec "$@"
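
The version detection step in entrypoint.sh above takes the first dot-separated field of `SHOW server_version`. A slightly more defensive variant, sketched here as an assumption rather than as the PR's code, keeps only the leading digits, so a value with a non-numeric suffix (for example a hypothetical `17beta1`) still maps to a plain major version. `major_version` is a hypothetical helper name, and the default of 17 mirrors the fallback in the diff.

```shell
# Hypothetical helper for illustration; not part of the PR.
major_version() {
    # Drop everything from the first non-digit onward: "14.5 (Debian ...)" -> 14.
    version=${1%%[!0-9]*}
    # Fall back to 17 when no leading digits were found.
    printf '%s\n' "${version:-17}"
}

major_version "14.5 (Debian 14.5-1.pgdg110+1)"   # prints 14
major_version ""                                  # prints 17
```

In the entrypoint this could wrap the `psql -tAc "SHOW server_version;"` output in place of the `cut` call.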
18 changes: 15 additions & 3 deletions postgres/bin/load-from-s3.sh
@@ -11,12 +11,24 @@ S3_PATH=$1
echo "Downloading $S3_PATH ..."
aws s3 cp --no-progress "$S3_PATH" /tmp/db.dump

echo "Dropping $NAME..."
# Ensure SERVER_VERSION is set by the entrypoint
if [ -z "${SERVER_VERSION:-}" ]; then
echo "Warning: SERVER_VERSION not detected. Defaulting to latest (17)."
SERVER_VERSION="17"
fi

PG_RESTORE="pg_restore-$SERVER_VERSION"

if ! command -v "$PG_RESTORE" &>/dev/null; then
echo "Warning: pg_restore for version $SERVER_VERSION is not installed. Defaulting to pg_restore-17."
PG_RESTORE="pg_restore-17"
fi

echo "Dropping all objects owned by \"$USER\" in the database..."
psql --echo-all -c "DROP OWNED BY \"$USER\" CASCADE;"

echo "Loading dump from S3..."
echo "Loading dump from S3 using $PG_RESTORE..."
set -x
pg_restore --jobs="${PG_RESTORE_JOBS:-2}" --no-owner --no-privileges --dbname="$NAME" /tmp/db.dump
"$PG_RESTORE" --jobs="${PG_RESTORE_JOBS:-2}" --no-owner --no-privileges --dbname="$NAME" /tmp/db.dump
{ set +x; } 2>/dev/null
echo "Done!"
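
Both dump-to-s3.sh and load-from-s3.sh repeat the same "use the versioned binary if installed, otherwise fall back to 17" check. As a sketch only, that check could live in one shared function. `resolve_tool` is a hypothetical name that is not in the PR, and the warning goes to stderr here so the function's stdout carries only the tool name.

```shell
# Hypothetical helper for illustration; not part of the PR.
resolve_tool() {
    # $1: base name, $2: wanted major version, $3: fallback major version.
    if command -v "$1-$2" >/dev/null 2>&1; then
        printf '%s\n' "$1-$2"
    else
        echo "Warning: $1 for version $2 is not installed. Defaulting to $1-$3." >&2
        printf '%s\n' "$1-$3"
    fi
}

# Possible call site; "${SERVER_VERSION:-17}" also covers an unset variable.
PG_RESTORE=$(resolve_tool pg_restore "${SERVER_VERSION:-17}" 17)
echo "Loading dump from S3 using $PG_RESTORE..."
```

The function does not verify that the fallback binary itself exists; a real script would likely fail later with a "command not found" error in that case.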
2 changes: 2 additions & 0 deletions postgres/docker-compose.test.yml
@@ -2,6 +2,8 @@ services:
db:
image: postgres:14
command: -c ssl=on -c ssl_cert_file=/etc/ssl/certs/ssl-cert-snakeoil.pem -c ssl_key_file=/etc/ssl/private/ssl-cert-snakeoil.key
volumes:
- ./init.sql:/docker-entrypoint-initdb.d/init.sql
environment:
POSTGRES_PASSWORD: password
s3:
2 changes: 2 additions & 0 deletions postgres/init.sql
@@ -0,0 +1,2 @@
CREATE ROLE test WITH LOGIN PASSWORD 'password';
CREATE DATABASE test OWNER test;