
Backup & Disaster Recovery

Contract Lucidity processes and stores high-value legal documents. A robust backup strategy is essential to meet your organisation's RPO (Recovery Point Objective) and RTO (Recovery Time Objective) requirements.

What to Back Up

| Component | Priority | Contains | Backup Method |
|---|---|---|---|
| PostgreSQL database | Critical | All document metadata, analysis results, clause data, embeddings, user accounts, AI provider config, playbook entries | pg_dump or continuous WAL archiving |
| Document storage (/data/storage) | Critical | Original uploaded files (PDF, DOCX, etc.) | File-level backup or volume snapshot |
| .env configuration | Critical | Database credentials, JWT secret, CORS settings | Version control or secrets manager |
| Redis (cl-redisdata) | Low | Celery task queue, result cache | Not critical -- rebuilds automatically. Active tasks will be lost and need reprocessing |
| Container images | Low | Application code | Rebuilt from source code with docker compose build |
danger

The .env file contains your JWT_SECRET_KEY. If this is lost, all existing user sessions and refresh tokens become invalid. Users will need to log in again. If POSTGRES_PASSWORD is lost, you cannot connect to the database.

PostgreSQL Backup

Option 1: pg_dump (Simple, Scheduled)

Best for small to mid-size deployments with nightly backup windows.

#!/bin/bash
# backup-db.sh -- Run nightly via cron
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/opt/backups/cl-postgres"
RETENTION_DAYS=30

mkdir -p "$BACKUP_DIR"

# Dump the entire database (compressed)
# Dump the entire database (compressed)
docker exec cl-postgres pg_dump \
  -U cl_user \
  -d contract_lucidity \
  --format=custom \
  --compress=9 \
  > "$BACKUP_DIR/cl_backup_${TIMESTAMP}.dump"

# Verify the backup is not empty
FILESIZE=$(stat -c%s "$BACKUP_DIR/cl_backup_${TIMESTAMP}.dump" 2>/dev/null || echo 0)
if [ "$FILESIZE" -lt 1000 ]; then
  echo "ERROR: Backup file suspiciously small ($FILESIZE bytes)" >&2
  exit 1
fi

# Remove backups older than retention period
find "$BACKUP_DIR" -name "cl_backup_*.dump" -mtime +$RETENTION_DAYS -delete

echo "Backup complete: cl_backup_${TIMESTAMP}.dump ($FILESIZE bytes)"

Schedule with cron:

# Run at 2:00 AM daily
0 2 * * * /opt/scripts/backup-db.sh >> /var/log/cl-backup.log 2>&1
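On systemd hosts, a timer unit is an alternative to cron with one practical advantage: `Persistent=true` runs a missed backup at the next boot if the machine was off at 2:00 AM. A sketch, assuming the same script path as the cron example (the `cl-backup` unit names are illustrative):

```ini
# /etc/systemd/system/cl-backup.service
[Unit]
Description=Contract Lucidity nightly database backup

[Service]
Type=oneshot
ExecStart=/opt/scripts/backup-db.sh

# /etc/systemd/system/cl-backup.timer
[Unit]
Description=Run cl-backup.service at 2:00 AM daily

[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true

[Install]
WantedBy=timers.target
```

Activate with `systemctl daemon-reload && systemctl enable --now cl-backup.timer`.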

Option 2: Continuous WAL Archiving (Point-in-Time Recovery)

Best for enterprise deployments requiring sub-hour RPO.

  1. Enable WAL archiving in PostgreSQL:

     # Add to your PostgreSQL configuration or docker-compose environment
     POSTGRES_EXTRA_ARGS: >
       -c wal_level=replica
       -c archive_mode=on
       -c archive_command='cp %p /backups/wal/%f'

  2. Take a base backup periodically:

     docker exec cl-postgres pg_basebackup \
       -U cl_user \
       -D /backups/base \
       --format=tar \
       --gzip \
       --checkpoint=fast
  3. WAL files are archived continuously, enabling point-in-time recovery to any moment.

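Recovering to a point in time means restoring the base backup into an empty data directory, then telling PostgreSQL where the archived WAL lives and when to stop replaying. A minimal sketch of the recovery settings for PostgreSQL 12+ (the WAL path matches the archive_command above; the target timestamp is illustrative):

```
# postgresql.conf additions on the restore host
restore_command = 'cp /backups/wal/%f %p'
recovery_target_time = '2026-03-19 01:55:00'
recovery_target_action = 'promote'
```

Create an empty `recovery.signal` file in the data directory before starting the server; PostgreSQL replays WAL up to the target time, then promotes to normal operation.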
pgvector Data

The backup includes all pgvector embedding data. Embeddings are stored as standard PostgreSQL columns and are fully captured by pg_dump and WAL archiving. No special handling is required.

Option 3: Cloud-Managed Database

If you deploy PostgreSQL as a managed service (AWS RDS, Azure Database for PostgreSQL, GCP Cloud SQL), automated backups are typically included:

| Cloud | Service | Automated Backups | Point-in-Time Recovery |
|---|---|---|---|
| AWS | RDS for PostgreSQL | Daily snapshots, up to 35-day retention | Yes, to any second |
| Azure | Azure Database for PostgreSQL | Daily snapshots, up to 35-day retention | Yes, to any second |
| GCP | Cloud SQL for PostgreSQL | Daily snapshots, up to 365-day retention | Yes, to any second |

Document Storage Backup

The cl-storage Docker volume (mounted at /data/storage inside containers) contains all original uploaded documents.

Option 1: Volume-Level Backup

#!/bin/bash
# backup-storage.sh
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/opt/backups/cl-storage"
mkdir -p "$BACKUP_DIR"

# Create a tarball of the Docker volume
docker run --rm \
  -v cl-storage:/source:ro \
  -v "$BACKUP_DIR":/backup \
  alpine tar czf "/backup/cl_storage_${TIMESTAMP}.tar.gz" -C /source .

echo "Storage backup complete: cl_storage_${TIMESTAMP}.tar.gz"

Option 2: Rsync to Remote Storage

# Sync to a remote backup server (incremental)
rsync -avz --delete \
  /var/lib/docker/volumes/contract-lucidity_cl-storage/_data/ \
  backup-server:/backups/cl-storage/

Option 3: Cloud Object Storage

For cloud deployments, sync documents to S3, Azure Blob, or GCS:

# AWS S3
aws s3 sync /data/storage/ s3://cl-backups/storage/ --storage-class STANDARD_IA

# Azure Blob
az storage blob upload-batch \
  --destination cl-backups \
  --source /data/storage/ \
  --account-name clbackupstorage

# Google Cloud Storage
gsutil -m rsync -r /data/storage/ gs://cl-backups/storage/

Configuration Backup

# Back up the .env file (contains secrets -- encrypt at rest!)
cp .env /opt/backups/cl-config/env_$(date +%Y%m%d_%H%M%S)

# Or better: use a secrets manager
# AWS: aws secretsmanager create-secret --name cl-env --secret-string file://.env
# Azure: az keyvault secret set --vault-name cl-vault --name cl-env --file .env
warning

Never commit .env to a Git repository. Use a secrets manager or encrypted backup for production credentials.
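If you must keep a file-based copy, encrypt it before it leaves the host. A minimal sketch using openssl, which is usually already installed (the script name and the passphrase-in-environment convention are illustrative; in production, fetch the passphrase from your secrets manager):

```shell
#!/bin/bash
# encrypt-env.sh -- make an encrypted copy of .env before archiving it
# Usage: BACKUP_PASSPHRASE=... ./encrypt-env.sh .env /opt/backups/cl-config
SRC="${1:-.env}"
DEST_DIR="${2:-/opt/backups/cl-config}"

if [ -f "$SRC" ]; then
  mkdir -p "$DEST_DIR"
  OUT="$DEST_DIR/$(basename "$SRC")_$(date +%Y%m%d_%H%M%S).enc"
  # AES-256-CBC with PBKDF2 key derivation; decrypt with the same flags plus -d
  openssl enc -aes-256-cbc -pbkdf2 -salt \
    -pass env:BACKUP_PASSPHRASE \
    -in "$SRC" -out "$OUT"
  echo "Encrypted copy written to $OUT"
else
  echo "No $SRC found; nothing to encrypt" >&2
fi
```

Decryption uses the same flags with `-d` swapped in: `openssl enc -d -aes-256-cbc -pbkdf2 -pass env:BACKUP_PASSPHRASE -in <file>.enc -out .env`.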

Recovery Procedures

Full Recovery from Backup

Step-by-Step: Restore PostgreSQL

# 1. Stop services that write to the database
docker compose stop cl-backend cl-worker

# 2. Drop and recreate the database
docker exec cl-postgres psql -U cl_user -c "DROP DATABASE IF EXISTS contract_lucidity;"
docker exec cl-postgres psql -U cl_user -c "CREATE DATABASE contract_lucidity;"

# 3. Ensure pgvector extension exists
docker exec cl-postgres psql -U cl_user -d contract_lucidity \
  -c "CREATE EXTENSION IF NOT EXISTS vector;"

# 4. Restore from backup
docker exec -i cl-postgres pg_restore \
  -U cl_user \
  -d contract_lucidity \
  --no-owner \
  --no-privileges \
  < /opt/backups/cl-postgres/cl_backup_20260319_020000.dump

# 5. Restart services (migrations will run automatically on backend startup)
docker compose up -d cl-backend cl-worker

# 6. Verify
curl -s https://contractlucidity.com/api/health
docker exec cl-postgres psql -U cl_user -d contract_lucidity \
  -c "SELECT count(*) FROM documents;"

Step-by-Step: Restore Document Storage

# 1. Stop services
docker compose stop cl-backend cl-worker

# 2. Clear existing volume and restore
docker run --rm \
  -v cl-storage:/target \
  -v /opt/backups/cl-storage:/backup:ro \
  alpine sh -c "rm -rf /target/* && tar xzf /backup/cl_storage_20260319_020000.tar.gz -C /target"

# 3. Restart services
docker compose up -d cl-backend cl-worker

RTO / RPO Guidelines

| Deployment Tier | RPO Target | RTO Target | Backup Strategy |
|---|---|---|---|
| Demo / POC | 24 hours | 4 hours | Daily pg_dump + manual storage backup |
| Small firm | 12 hours | 2 hours | Daily pg_dump + nightly storage sync |
| Mid-size | 1 hour | 1 hour | WAL archiving + hourly storage sync + warm standby |
| Am Law 200 | 15 minutes | 30 minutes | Managed DB with PITR + cloud storage replication + hot standby |
| Am Law 100 | < 5 minutes | < 15 minutes | Multi-region managed DB + real-time storage replication + failover automation |
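Whichever tier you target, it is worth alerting when the newest backup exceeds your RPO window. A minimal sketch (the script name is illustrative; the backup directory and 24-hour default match the nightly pg_dump example, so adjust MAX_AGE_SECONDS to your tier):

```shell
#!/bin/bash
# check-backup-age.sh -- warn if the newest backup is older than the RPO target
BACKUP_DIR="${1:-/opt/backups/cl-postgres}"
MAX_AGE_SECONDS="${2:-86400}"   # 24 hours, matching the Demo/POC tier

# Newest backup by modification time
NEWEST=$(ls -t "$BACKUP_DIR"/cl_backup_*.dump 2>/dev/null | head -1)
if [ -z "$NEWEST" ]; then
  echo "ERROR: no backups found in $BACKUP_DIR" >&2
else
  AGE=$(( $(date +%s) - $(stat -c %Y "$NEWEST") ))
  if [ "$AGE" -gt "$MAX_AGE_SECONDS" ]; then
    echo "ERROR: newest backup $NEWEST is ${AGE}s old (RPO limit ${MAX_AGE_SECONDS}s)" >&2
  else
    echo "OK: $NEWEST is ${AGE}s old, within the RPO window"
  fi
fi
```

Run it from cron or a monitoring agent and alert on the ERROR output.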

Backup Verification

Never Assume Backups Work

Schedule monthly backup restoration tests. An untested backup is not a backup.

Monthly Verification Checklist

  1. Restore the database backup to a test environment
  2. Verify document counts match production
  3. Verify a random sample of documents can be opened and viewed
  4. Confirm embeddings are present (SELECT count(*) FROM document_embeddings;)
  5. Test the full pipeline by uploading a new document
  6. Document the restoration time (this is your actual RTO)
  7. Record results in your DR runbook

Automated Verification Script

#!/bin/bash
# verify-backup.sh -- Run monthly
BACKUP_FILE=$(ls -t /opt/backups/cl-postgres/cl_backup_*.dump | head -1)

echo "Testing backup: $BACKUP_FILE"

# Restore to a test database
docker exec cl-postgres psql -U cl_user -c "DROP DATABASE IF EXISTS cl_backup_test;"
docker exec cl-postgres psql -U cl_user -c "CREATE DATABASE cl_backup_test;"
docker exec cl-postgres psql -U cl_user -d cl_backup_test \
  -c "CREATE EXTENSION IF NOT EXISTS vector;"

docker exec -i cl-postgres pg_restore \
  -U cl_user -d cl_backup_test --no-owner \
  < "$BACKUP_FILE"

# Verify key tables
for TABLE in documents document_metadata document_embeddings users; do
  COUNT=$(docker exec cl-postgres psql -U cl_user -d cl_backup_test -t \
    -c "SELECT count(*) FROM $TABLE;" 2>/dev/null | tr -d ' ')
  echo "  $TABLE: $COUNT rows"
done

# Cleanup
docker exec cl-postgres psql -U cl_user -c "DROP DATABASE cl_backup_test;"
echo "Verification complete."