Postgres Replication Slot & pg_basebackup

On #postgresql IRC channel, someone was having problem bringing up standby database server from backups taken using pg_basebackup and recovery.conf was configured to use replication slot.

The reporter was following  below steps:

  1. Set max_replication_slots=5, wal_keep_segments=0 and restart primary database
  2. create replication slot :
  3. SELECT * FROM pg_create_physical_replication_slot('standby')
  4. Call pg_basebackup -D – -Ft | bzip2 > backup.tar.bz2
  5. copy backup.tar.bz2 to standby server
  6. start postgresql on standby

The standby server was throwing following error during startup ..

 requested WAL segment 000000010000000C000000AE has already been removed.

Question from Reporter:

Should I set wal_keep_segments to some non-zero value if I want to use replication slots because while reading docs I understand that I shouldn’t set wal_keep_segments if I use max_replication_slots?

Short answer :

As of now, pg_basebackup doesn’t have support for replication slot so it doesn’t have details on status of WAL files being created or removed.

While the feature is being worked on, you can use one of the following solutions to get around the problem for now.

Solution 1:

On primay database server, set wal_keep_segments to some sane value relative to database traffic volume and time to make & transfer backup to bring up standby server. This setting will keep # of WAL files around on the primary server and will not delete them so when you bring up standby server using replication slot, it will start applying WAL files and eventually connect to physical slot created.

Solution 2:

Create slot before taking backup and use the slot at the least once so primary db server will keep the WAL files around.

  1. On primary db server, create slot
  2. connect to slot using pg_receivexlog
  3. /usr/pgsql-9.4/bin/pg_receivexlog -h primary-db-server -p 5432 -U replication –S 'standby1' -n -v -D /var/lib/pgsql/9.4/archive
  4. You can kill the pg_recievexlog process
  5. Take backup using pg_basebackup
  6. copy the backup and startup  standby database
  7. The standby database should be able to connect to slot now!
  8. Don’t forget to drop the slot otherwise Primary DB server will keep the WALs indefinitely

I would prefer Solution #2 over Solution #1 because you don’t have to figure out optimal value for wal_keep_segments. Plus, you can use pg_receivexlog method  for permanent archive solution. If you want to get more details, I will be presenting tutorial on replication topic “Out of the Box replication in Postgres 9.4” at Postgres Conference in San Francisco on Nov 17-18th. I will be provide hands on tutorial to setup robust replication for production db servers without using any third party tools in Postgres 9.4

If you haven’t registered, this is the time to register because the conference schedule looks amazing!  You could still register with the discount code SeeMeSpeak which provides a 20% discount.

Let me know, if you have better solutions than I discussed in the post. I’m looking forward to see you at the conference.

 

pgbadger log analysis for Postgres RDS instances

Last year, I have presented on Postgres RDS in Amazon at couple of conferences but I haven’t got change to follow up on my experience working with RDS after them. If you aren’t  aware, I recently changed job & started working at WithMe,  a disruptive start-up that focuses on three primary solutions for commerce, including WithMe Business, Financial and Retail.

At WithMe, we are using a fair bit of Postgres RDS instances. After joining, I reviewed the setup and found out the pgabadger log analysis is not setup on the database servers. Even though the databases are still in development phase, it makes sense to have reports that will help to provide more insight to software developers on the slow queries. I had other projects (schema(s) design, database migrations,DW system) going on so i wasn’t able to spend time identify and tuning queries to assist software engineers to expedite their develpement so we decided to prioritize log analysis project.

At previous job, I have setup pgbadger reports on number of servers but most of the time I had to deal with real db servers or at the least had ssh access on the server.  On the RDS, you can only have psql access, no shell level access on the instance.  Additionally, You have to deal with AWS tools and access methods.

I followed following steps to setup pgbadger reports for Postgres RDS instances:

  1. Brought up pgadmin ec2 instance to be used as utility server.
  2. Install AWS tool to interact with API
  3. Make sure /root/.aws/aws_credential_file file in place

    [denishpatel@pgadmin1 ~]$ aws configure –profile oregon
    AWS Access Key ID [None]: XXX
    AWS Secret Access Key [None]: XXX
    Default region name [None]: us-west-2
    Default output format [None]: text

    [denishpatel@pgadmin1 ~]$ aws rds describe-db-instances –profile oregon –output table

  4. Postgres logging parameters are set in parameter groups :

    log_min_duration_statement=5

    log_line_prefix=’%t:%r:%u@%d:[%p]:’

  5. I have come up with quick shell script to run reports. The  script is located @ https://github.com/denishpatel/pgbadger_on_rds
  6. You can easily cron them. As the new servers will be added, I will be adding new cron entries for the server:

    # pgbadger reports

    0 7 * * * ~/run_pgbadger_rds.sh -i devel-postgres -r us-west-2 -c 1 >/dev/null 2>&1
    15 7 * * * ~/run_pgbadger_rds.sh -i prod-postgres-rds -r us-east-1 -c 1 >/dev/null 2>&1

  7. I have setup nginx web server on pgadmin to server to serve html reports so team doesn’t have to download html files to review them.

In case, you are tasked to setup pgbadger report on Postgres RDS instance, this post might help ! I haven’t spent a lot of time writing a script so script provides very bare minimal to run reports  but it can have a lot of improvements. Hope to see your contribution to improve the script 🙂

As always, I’m looking forward to see some comments from your experience working with Postgres RDS 🙂