Tuesday, August 12, 2008

Using pg_standby for high availability of Postgresql

PostgreSQL is a famous database server which can be compared with Oracle or any other commercial database servers. It is said that Postgresql doesn't care about replication because Postgresql wants to focus. But fortunately there is a simple opensource solution to this. Which is called pg_standby.

Pg Standby is used as a simple way to make PostgreSQL highly available. To make this short, I'll start the tutorial which I've been created from many scattered sites (it's really hard to find how to use pg_standby for me)

Condition: I used Ubuntu 8.04 (Hardy) 64-bit, PostgreSQL 8.2 (If you're using version 8.3 or more, just skip the following installation steps). Please make sure that both servers (or workstations) installed using the same Linux version (the same distro, release and architecture. e.g. Ubuntu 8.04 i386)

For now on, we'll do these steps in primary server.

1. Install required packages to create PostgreSQL binary for Debian
sudo apt-get install cdbs bison libperl-dev tk8.4-dev flex libreadline5-dev libssl-dev libpam0g-dev libxml2-dev libkrb5-dev libxslt1-dev python-dev bzip2 fakeroot.

2. Install required packages to install pg_standby
sudo apt-get install sysvconfig postgresql-contrib-8.2 gcc make libc6-dev libreadline5-dev zlib1g-dev libpgeasy-dev unzip checkinstall.

4. Download PostgreSql source from Debian repository
cd ~
apt-get source postgresql-8.2

3. Download pg_standby source files
Download both Makefile and pg_standby.c from http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/contrib/pg_standby/ and put them into postgresql-8.2-8.2.7/build-tree/postgresql-8.2.7/contrib/pg_standby

1. Create binary for PostgreSql
cd ~/postgresql-8.2-8.2.7
fakeroot debian/rules binary

2. Copy pg_standby files into PostgreSQL source,
so that we can see 2 files inside ~/postgresql-8.2-8.2.7/build-tree/postgresql-8.2.7/contrib/pg_standby:

- Makefile
- pg_standby.c

3. Compile and install the pg_standby
by using configured PostgreSQL

cd ~/postgresql-8.2-8.2.7/build-tree/postgresql-8.2.7/contrib/pg_standby
sudo make install

4. Make the pg_standby command accessible
Create a symbolic link so that the pg_standby command is recognized in the system or terminal

cd /usr/bin
sudo ln -s ../share/postgresql-common/pg_wrapper /usr/bin/pg_standby

*) Another way is to compile it using Debian's packaging way:
Edit postgresql-8.2-8.2.7/build-tree/postgresql-8.2.7/contrib/Makefile to insert pg_standby on WANTED_DIRS variable before doing fakeroot (I've never try this)

We need at least 2 servers. The first server acts as primary server which does logging. While the second one is acting as standby warm server ready to become primary server as soon as the primary server is down.

On primary server (server 1)
1. Create a remote directory for the archive
which is only readable and writable to user postgres. So that we'll have /psql-archive mounted from server2:/usr/local/psql-archive

sudo mkdir /psql-archive

Please googling for tutorial to mount nfs file system automatically. Or my be a bit here:

in server2 (
sudo apt-get install nfs-kernel-server
sudo mkdir /usr/local/psql-archive
sudo chown postgres.postgres /usr/local/psql-archive
sudo su -c "echo '/usr/local/psql-archive,sync,no_subtree_check)' >> /etc/exports"
sudo exportfs -ra

in server1 (
sudo apt-get install nfs-kernel-server
sudo mkdir /psql-archive
sudo su -c "echo ' /psql-archive nfs timeo=14,rsize=8192,wsize=8192,intr 0 0' >> /etc/fstab"
sudo mount -a

2. Activate logging
Add into /etc/postgresql/8.2/main/postgresql.conf:

archive_command = 'cp "%p" /psql-archive/"%f"'
archive_timeout = 120

reload the database server configuration or restart!
sudo /etc/init.d/postgresql force-reload

3. Make sure it is running
You should see /psql-archive is filled with archived WAL files. Some people could only see this after base backup is created

4. Create base backup
To make both PostgreSQL server is a clone of each other

4.1 Do this
echo "SELECT pg_start_backup('pertama');" |psql -U postgres

4.2 Backup cluster database (does not need any restart)
cd /var/lib/postgresql/8.2
sudo tar zcvf ~/base-main-backup.tar.gz main

4.3 End the backup
echo "SELECT pg_stop_backup();" |psql -U postgres

On standby server (server 2)
5. Stop the server
sudo /etc/init.d/postgresql stop

6. Copy the backup
from primary server into the standby server on the correct location. May be you need to backup repository first before replace with backup from primary server.

sudo mv /var/lib/postgresql/8.2/main ~/main.backup
sudo tar zxvf base-main-backup.tar.gz -C /var/lib/postgresql/8.2

7. Create recovery file
Create /etc/postgresql/8.2/main/recovery.conf which contains

restore_command = 'pg_standby -l -d -s 2 -k 50 -t /tmp/pgsql.trigger.5442 /usr/local/psql-archive %f %p %r 2>>standby.log'

and symlink it into cluster directory
sudo ln -s /etc/postgresql/8.2/main/recovery.conf /var/lib/postgresql/8.2/main

8. Empty pg_xlog directory
sudo rm -Rf /var/lib/postgresql/8.2/main/pg_xlog/*

9. Start the server back
sudo /etc/init.d/postgresql start

10. Monitor standby.log
sudo tail -f /var/lib/postgresql/8.2/main/standby.log

Testing the configuration
On primary server
1. Create new record
We'll test that this record will also exist on the standby server. Just create one or two tables with data on primary server

2. Shutdown db server
sudo /etc/init.d/postgresql stop

On standby server
3. Standby server, now is primary server
As you see in /var/lib/postgresql/8.2/main/standby.log, there is text stating 'store'. Also, the recovery.conf has changed into recovery.done which means standy server is now fully operating and ready to use.

nb: If it failed to recover which is caused by /tmp/pgsql.trigger.5442 is not triggered, touch it using postgres user

sudo su -
su postgres
touch /tmp/pgsql.trigger.5442

4. Make sure the record exists
You may login into the standby server as it already become a primary server and make sure that the last record exist

That's all

If the primary server fails and then immediately restarts, you must have a mechanism for informing it that it is no longer the primary. This is sometimes known as STONITH (Shoot the Other Node In The Head), which is necessary to avoid situations where both systems think they are the primary, which can lead to confusion and ultimately data loss.

So, switching from primary to standby server can be fast but requires some time to re-prepare the failover cluster. Regular switching from primary to standby is encouraged, since it allows regular downtime on each system for maintenance. This also acts as a test of the failover mechanism to ensure that it will really
work when you need it. Written administration procedures are advised.

No comments: