May 22

Inserting JSON data into Postgres using JDBC driver

One of the clients of OmniTI requested help to provide sample application to insert JSON data into Postgres using Java JDBC driver . I’m not Java expert so it took a while for me to write a simple java code to insert data. TBH, I took help to write test application from one of our Java engineers at OmniTI. Now, test application is ready and next step is to make it work with JSON datatype ! After struggling a little to find out work around for string escaping in JAVA code, I stumbled upon data type issue! Here is the test application code to connect to my local Postgres installation and insert JSON data into sample table:

postgres=# \d sample
Table "public.sample"
Column | Type | Modifiers
--------+---------+-----------
id | integer |
data | json |
denishs-MacBook-Air-2:java denish$ java -cp $CLASSPATH PgJSONExample
-------- PostgreSQL JDBC Connection Testing ------------
PostgreSQL JDBC Driver Registered!
You made it, take control your database now!
Something exploded running the insert: ERROR: column "data" is of type json but expression is of type character varying
Hint: You will need to rewrite or cast the expression.
Position: 42

After some research , I found out that there is no standard JSON type on java side so adding support for json to postgres jdbc is not straight forward ! StackOverflow answer helped me for testing out the JSON datatype handling at psql level. As Craig mentioned in the answer that the correct way to solve this problem is to write a custom Java mapping type that uses the JDBC setObject method. This can be a tricky though.  A simpler workaround is to tell PostgreSQL to cast implicitly from text to json:
postgres=# create cast (text as json) without function as implicit;
CREATE CAST

The WITHOUT FUNCTION clause is used because text and json have the same on-disk and in-memory representation, they’re basically just aliases for the same data type. AS IMPLICIT tells PostgreSQL it can convert without being explicitly told to, allowing things like this to work:

postgres=# prepare test(text) as insert into sample (data) values ($1);
PREPARE
postgres=# execute test('{}');
INSERT 0 1
postgres=# select data from sample;
data
----
{}
(1 row)

Awesome ! That worked :-) Let’s try similar approach in Java application code.

denishs-MacBook-Air-2:java denish$ export CLASSPATH=/usr/share/postgresql/java/postgresql-9.2-1002.jdbc4.jar:
denishs-MacBook-Air-2:java denish$ javac -classpath $CLASSPATH PgJSONExample.java
denishs-MacBook-Air-2:java denish$ java -cp $CLASSPATH PgJSONExample
-------- PostgreSQL JDBC Connection Testing ------------
PostgreSQL JDBC Driver Registered!
You made it, take control your database now!
postgres=# select * from sample;
id | data
----+------------------------------------------------------------------------
1 | {"username":"denish","posts":10122,"emailaddress":"denish@omniti.com"}
(1 row)

Yay! It worked as well :-)

Next in my list to figure out installing PL/Java on Mac and/or Linux !! Let me know, if you have instructions for installation and test application using PL/Java.

Apr 20

When was the database created in Postgres cluster ?

Continuous Integration (CI)  using automated open source tools such as Jenkins and Hudson  is getting adoption rapidly. These tools help developers to gain confidence for creating more robust code rapidly by improving testing and QA process. The flexibility of these softwares add other challenges for the DBAs!

One of our client came across challenge to cleanup databases after X number of days from the Jenkins CI database because each run create seperate database and database names are not standard because they are provided by users. If they don’t cleanup old database, the cluster will have hundreds of databases at the end of the week. We tried to standardize database names but you can’t control users to make mistakes or input db names :-) On the other hand, Postgres’s system catalog view doesn’t provide database creation date.  How can I find out databases older than X days and drop them?

I came across this blog entry that answers my question but I was looking for easier way! I found easier way to get the database creation time with single query! Yay  :-) Following is the query that can be used to find the database creation time. The query should return correct created_date as long as you haven’t run pg_upgrade on the data directory.  I thought to share here so it will be useful for others!

SELECT datname, (pg_stat_file(‘base/’||oid||’/PG_VERSION’)).modification AS datcreated
FROM pg_database;

 

postgres=# SELECT datname, (pg_stat_file(‘base/’||oid||’/PG_VERSION’)).modification AS datcreated
postgres-# FROM pg_database;
datname | datcreated
————+————————
template1 | 2013-03-28 16:04:13-04
template0 | 2013-03-28 16:04:14-04
postgres | 2013-03-28 16:04:14-04
rangetypes | 2013-03-28 16:14:42-04
puppet | 2013-03-28 16:23:13-04
omniti | 2013-04-20 10:02:22-04
(6 rows)

 

Ideally,  pg_database system catalog view should include database_created timestamp ! Hopefully, that day will come sooner than later :-)

Feel free to comment, if you have any other ideas for getting this details or you see any corner cases with above query that I haven’t mentioned here  :-)

Dec 05

pg_repack in action!

Couple of years ago, I started compiling blog post on pg_reorg but that post never made it to published post because of my procrastination !! Though, I wasn’t disappointed because I got opportunity to talk about removing bloat from tables on databases at one of the PostgreSQL conference. Moreover, Depesz ,  colleague at OmniTI,  wrote a detailed post on how pg_reorg actually works :-)  So, you must be thinking , What’s up with this blog post ?

Let’s come to the point :-)

Last month, I received email in pg_reorg mailing list that first release of pg_repack beta1 was released ! So, What is pg_repack ? The first release of pg_repack is simply a fork of  pg_reorg. The author provided the reason of the fork is to revive the development of pg_reorg, which has been stagnated since the release of pg_reorg 1.1.7 in August 2011. That makes sense to me.  The first release doesn’t’ provide any new functionality but add specifically the missing features planned for pg_reorg 1.1.8 and fixes it’s known bugs. That’s good thing ! As I mentioned, it’s provide same functionality as pg_reorg but now you should follow more lively code of pg_repack instead of pg_reog on PGXN project page  :-)

At OmniTI, the engineers contribute new tools to community and use existing tools. We decided to give a swing at pg_repack because we have been using pg_reorg for last couple of years successfully on production systems and familiar with the code base. Now, pg_repack supports extension so it was pretty easy to install tool as extension on one of the PostgreSQL database system and ran it during low peak hours. The pg_repack run helped to trim down database size from 550GB to 400GB. Now a days,if no graph, it never happened :-) Here is the DB size graph for the reference. If you are curious, the graph was created using Circonus monitoring suite.

 

Thanks to pg_repack authors and contributors. If you haven’t join pg_reorg/pg_repack mailing lists, I would recommend you to join. Keep contributing and sharing the results to community!!

Happy Holidays!!

Oct 20

An easy way to reduce outage window for PostgreSQL Upgrade!

PostgreSQL 9.2 release provides lots of great features. Recently, one of the clients at OmniTI required upgrade of their couple of PostgreSQL production databases running on PostgreSQL version 9.0 to PostgreSQL 9.2. The client is running database servers on Amazon EC2 instances. For the failover purpose, they run 3 instances in the setup, one instance is master database and next two instances are slave of master database. Alike others, they were looking for zero outage solution for PostgreSQL upgrade but unfortunately there isn’t one exist now!

There are couple of options for the PostgreSQL upgrade:

  1. pg_dump/pg_restore entire database
  2. Use pg_upgrade for in-place upgrade
  3. Use 3rd party replication system i.e Slony/Bucardo

Evaluation of options:

Option #1 :

Whenever, the upgrade requirement with minimum outage come to us, we always check option #2 . pg_upgrade provides ability to upgrade database without dump/restore all the data. Unfortunately, pg_upgrade –check test failed because the databases are using ltree data type. One of the limitations of pg_upgrade is that it does not work if the ltree contrib module is installed in a database.

Meanwhile, the requirement came from client that they want to consolidate both environment into one server so eliminate cost and maintenance for running 6 instances for 2  production databases and similar number of instances for stage environments. This new requirement eliminated option of even think about using pg_upgrade.

Option #2 :

Next option is to optimize dump/restore process in such a way that the total outage window can be minimized. I started collecting stats about their database size and large tables in the database and dump/restore timing.

1st production database (X DB) :

  • Total database size : 22GB
  • Top 2 large total tables size: 12GB
  • Dump/restore duration  : 25 minutes ( dropping indices before restore and creating at the end , with 4 parallel workers  for restore)
  • Estimated outage required with application testing: 30 minutes

2nd production database  (Y DB)  :

  • Total database size size:  50GB
  • Top 10  large total tables size: 45GB
  • Dump/restore duration   : 1.5 hours (dropping indices before restore and creating at the end , with 4 parallel workers  for restore)
  • Estimated outage required with application testing : 1.45 hours

Both of the above estimations are providing accurate dump/restore timing  because  I ran dump on existing prod servers and restore on proposed new server.

Option # 3:

I could use Slony or Bucardo replication systems for replicating tables for upgrade purpose but even though Slonly/Bucardo systems are around for a while , they are very complicated to setup, manage and debug in-case of the problems. It might be only me but I did not want to introduce complex replication system for upgrade purpose!

Further looking tables/schema usage:

Now, I started looking into more into optimizing dump/restore options. Digging into more details for large tables using pg_stat_all_tables and schema details , I collected following facts:

  1. For X DB, top 2 largest tables are Insert only tables. yay!
  2. For Y DB, same top 2 tables are insert only but there are some more large tables with insert/update/delete
  3. No Foreign Keys on the tables for both databases.

So, I started looking into Simple table level replication options. Let me make it clear that table replication still works with FKs but its is important to know because I don’t have to worry about disabling foreign keys during replication at table level and enable later.

Mimeo

I didn’t have to look far because Keith Fiske,one of my colleagues, recently came up with Mimeo extension. Mimeo is very simple OmniTi’s home grown replication system for replicating databases at table level over dblink between two PostgreSQL databases. Usually, Mimeo helps us replicating production instance to DataWarehouse system but in this case I decided to give a try to use for replicating tables temporarily during upgrade process.

Why Mimeo?

Mimeo is extremely easy to setup and understand because all code resides as sql/pgsql functions. Mimeo provides very good documentation but I will give you overall idea. Mimeo is installed as extension. All the code related to replication resides under mimeo schema and it tracks all the replication functions using another extension called pg_jobmon to keep track of functions executions.  For now, Mimeo supports  following replications:

  • Inserters : Replicate Insert only tables.
  • Updaters : Replicate tables based on updated_tsz. You could place trigger on source table on production to keep the updated_tsz updated but most of the times your application is already taking care of. This replication method does not support DELETEs on tables.
  • DML: This method supports Insert/Update/Delete on the table but the table should have Primary Key to keep track of the changes. Mimeo places trigger on source table on production database to keep track of rows into mimeo.tablename_pgq tables . A pull request from destination (replicated) table to fetch these rows from queue table to grab latest data only for changed rows and apply them on destination table.
  • Snap : Grab entire table from source table and truncate destination table to refresh completely. it’s very useful method for small tables.

That was brief overview of the Mimeo. The tool is still under development and looking for more testers and contributors.

For now, Let’s get back to upgrade !

Upgrade X DB:

First production environment was easy because there are only 2 large tables and both are INSERT-only tables. After you have packages installed for dblink, jobmon and mimeo , you could install them into database as extension.

On new database server on PostgreSQL 9.2:

create schema dblink;
create schema jobmon;
create schema mimeo;
create extension dblink schema dblink ;
create extension pg_jobmon schema jobmon ;
create  extension mimeo schema mimeo;

After mimeo installation and setting up new production DB servers on PostgreSQL 9.2 with master-slave setup, I followed following steps to upgrade X DB:

  1. Freeze schema changes on X DB production database server running on PostgreSQL 9.0.
  2. pg_dump entire database schema dump and restored on PostgreSQL 9.2 database
  3. pg_dump two large tables : t1 & t2 and restored on PostgreSQL 9.2 database
  4. Setup Mimeo replication for t1 and t2 tables using refresh_updater method.
  5. pg_dump all but t1 & t2 tables data and pg_restore with 4 parallel processes (-j 4) on PostgreSQL 9.2 with  (~15 minutes). To expedite the restore process, I dropped indices on couple of large tables before the restore and put it back after the restore.
  6. Reset sequences for t1 and t2 on PostgreSQL 9.2
  7. Open up upgraded database for applications!!

Keep in mind that only step 5, 6 & 7 needs to be executed during outage period. The total outage for upgrade of this production database environment was ~ 15 minutes .

Upgrade Y DB:

Second production environment is using same PostgreSQL cluster on new database server but different database name. This database is larger than first one and have more tables with all kinds of transactions.

After analyzing table stats, I came up with group of tables:

relation | size | n_tup_ins | n_tup_upd | n_tup_del
—————————+———-+———–+———–+———–
<< Replicate tables based on incremental Primary Key, No trigger required . inserter replication >>

1. t1 | 15 GB | 22310924 | 0 | 0 <——– insert only
2. t2 | 789 MB | 3176894 | 0 | 0 <——– insert only
3. t3  | 13 MB | 7379 | 0 | 0 <——– insert only

<<  Insert/Update/Delete on tables , DML replication>>

4. t4 | 4515 MB | 1233555 | 17966613 | 0 <–insert /update/delete
5. t5 | 2520 MB | 5004129 | 21599077 | 0 <–insert /update/delete
6. t6 | 1310 MB | 4041253 | 519 | 0 <–insert /update/delete
7. t7 | 1123 MB | 1020479 | 2050512 | 0 <–insert /update
8. t8 | 75 MB | 875275 | 19047 | 0 <–insert /update
9. t9 | 43 MB | 30509 | 8976539 | 0 <–insert /update
10. t10 | 22 MB | 12338 | 16201 | 0 <–insert /update
11. t11 | 12 MB | 11574 | 199 | 0 <–insert /update
12. t12 | 10168 kB | 12283 | 11192 | 0 <–insert /update

<< static tables >>

13.  t13 | 43 MB | 0 | 0 | 0 <– static tables, never changes
14. t14 | 39 MB | 0 | 0 | 0 <– static tables, never changes
15. t15 | 18 MB | 0 | 0 | 0 <– static tables, never changes
16. t16 | 16 MB | 0 | 0 | 0 <– static tables, never changes
17. t17 | 14 MB | 0 | 0 | 0 <– static tables, never changes

I followed same procedure to install mimeo on this database as described above but for DML replication you need to execute an extra step by creating mimeo schema on source (production database server running on PostgreSQL 9.0) with proper permissions for mimeo replication role. All _pgq tables and trigger functions on source tables reside under this mimeo schema on source database.

On source database server :

CREATE schema mimeo;
ALTER SCHEMA mimeo OWNER TO <mimeo_role>;
GRANT TRIGGER ON <source_table> TO <mimeo_role>;

After mimeo installation and creating up new production DB on same cluster of PostgreSQL 9.2 with master-slave setup, I followed following steps to upgrade Y DB:

  1. Freeze schema changes on Y DB production database server running on PostgreSQL 9.0.
  2. pg_dump entire database schema dump and restored on PostgreSQL 9.2 database
  3. Disable triggers on replicated table on destination database
  4. Setup replication trigger on DML group replicated tables using mimeo.dml_maker function by executing on PostgreSQL 9.2 database server.
  5. pg_dump 17 tables : t1 – t17  from source database and pg_restore on PostgreSQL 9.2 database
  6. Setup Mimeo replication for t1-t3 using refresh_updater and t4-12 tables using refresh_dml. I did not setup replication for static tables.
  7. pg_dump all but t1to t17 tables data and pg_restore with 4 parallel processes (-j 4) on PostgreSQL 9.2 with  (~5 minutes).
  8. Enable triggers and reset sequences for t1 to t17 on PostgreSQL 9.2
  9. Compare and Verify count and/or max(id) for t1 to t17 tables between PostgreSQL 9.0 database and upgraded PostgreSQL 9.2 database server
  10. Open up upgraded database for applications !!

As above, only steps 7-10  need to be executed during outage period. The total outage for upgrade of this production database environment was about 15 minutes.

In conclusion, Mimeo helped our client to upgrade their database servers with minimal outage.  Hopefully, it will help you on your next production database upgrade to reduce outage window.

Mar 01

monitor bucardo replication lag using circonus

         I have been using circonus for monitoring, trending and alerting for any database metrics for quite a long time now. The circonus interface makes the monitoring, trending and alerting setup painless and you can see graph flowing in minutes. Another good thing about Circonus is that you can monitor anything that you can query from database ! This week, the task at my hand was to find a way to monitor bucardo replication lag.  `bucardo_ctl status sync_name` provides very important information that you can rely for trending and alerting purposes.

$ bucardo_ctl status my_slave
Sync name: my_sync
Current state: WAIT:22s (PID = 19500)
Type: pushdelta
Source herd/database: slave_herd / master_herd
Target database: my_slave
Tables in sync: 318
Last good: 23s (time to run: 1m 21s)
Last good time: Feb 29, 2012 15:27:14 Target: my_slave
Ins/Upd/Del: 142 / 0 / 0
Last bad: 1h 45m 9s (time to run: 19m 57s)
Last bad time: Feb 29, 2012 13:42:29 Target: my_slave
Latest bad reason: MCP removing stale q entry
PID file: /var/run/bucardo/bucardo.ctl.sync.my_sync.pid
PID file created: Wed Feb 29 13:42:33 2012
Status: active
Limitdbs: 0
Priority: 0
Checktime: none
Overdue time: 00:00:00
Expired time: 00:00:00
Stayalive: yes Kidsalive: yes
Rebuild index: 0 Do_listen: no
Ping: yes Makedelta: no
Onetimecopy: 0

All the information provided by `bucardo_ctl status` command is important but most interesting thing to monitor is “Last good:”. Last good shows bucardo replication lag on slave server.

Trending in circonus:

Now, I have metric identified for the monitoring. Next step is to find the best way to put the metric into the monitoring tool. After some poking around the output and ways to monitor stuff, I decided to grab the SQL query from bucardo_ctl perl script and stick it into circonus monitoring. Most of the time spent for setting up this monitor was to grab right query from big perl script (bucardo_ctl) and mapping the metric required from the query. After that, here is the query that I plugged  into Circonus in no-time.

SELECT
‘bucardo_last_good’ , round(extract(epoch FROM now()-ended))
FROM
(SELECT * FROM bucardo.q WHERE sync = ‘my_sync’ AND cdate >= now() – interval ’3 days’
UNION ALL
SELECT * FROM freezer.master_q
WHERE sync = ‘my_sync’ AND cdate >= now() – interval ’3 days’) AS foo
WHERE ended is NOT NULL AND aborted IS NULL
ORDER BY ended DESC LIMIT 1;


Alerting in circonus:

bucardo_ctl status shows Last good status to “unknown” if replication is broken.

Name Type State PID Last_good Time I/U/D Last_bad Time
===========+=====+========+====+=========+=====+=====+========+=====
my_sync| P |WAIT:35s|7620| unknown | | |36s |1m58s

       In circonus, you could setup rules and relevant severity levels. The most important part is that if the query doesn’t return any row it should page(“unknown” condition) . Circonus provides rule for alerts in case of a metric is absent. Now, I am all set with alerts as well.

Yay! bucardo replication is under monitoring and trending without any hassle! Hopefully, this post will help you next time when you try to put bucardo replication lag under monitoring.

Jan 05

What is pg_extractor ?

In my recent blog post, I wrote about PostgreSQL DBA Handyman toolset. In the list of tools, getddl is one of them. If you are using getddl to get DDL schema and track the daily changes in SVN for production databases, you should consider moving that process to use pg_extractor instead. pg_extractor is the more advance and robust tool for extracting schema as well data using pg_dump. Keith Fiske, an author of the tool, described tool in detail in his blog post. Thanks to Keith for making the schema extraction tool more robust and taking it to next level !

Hopefully, it will help you to have more control over your database in smarter way!

Dec 06

Faster & Better VACUUM FULL

                In presentation, I discussed in detail about Bloat issue in PostgreSQL and methods to remove Bloat from the tables/indexes. Now a days, PostgreSQL9.0 is common and  the widely used version for the production use and it’s vital to remind about changes in most important bloat removal tool called “VACUUM FULL”. Until PostgreSQL 9.0, VACUUM FULL was tardy and DBA always stayed away from it and used CLUSTER instead. (Checkout presentation for difference between CLUSTER vs VACUUM FULL)
                The VACUUM FULL statement recovers free space from a table to reduce its size from bloated tables, mostly when VACUUM itself hasn’t been run frequently enough. Before PostgreSQL 9.0 , it was tardy and slow because of the way it was executed: records were read and moved one by one from their source block to a block closer to the beginning of the table. Once the end of the table was emptied, this empty part was removed. This method was very inefficient: moving records one by one creates a lot of random IO.  Additionally, during this reorganization, indexes had to be maintained, making everything even more costly, and fragmenting indexes. It was therefore advised to reindex a table just after a VACUUM FULL.
                Now, the VACUUM FULL statement, as of PostgreSQL 9.0, creates a new table from the current one, copying all the records sequentially. Once all records are copied, index are created back, and the old table is destroyed and replaced. This has the advantage of being much faster. VACUUM FULL still needs an EXCLUSIVE LOCK  during entire operation. The only drawback of this method compared to the old one, is that VACUUM FULL can use as much as two times the size of the table and indexes on disk, as it is creating a new versions of it.

Let’s compare run-time of VACUUM FULL on PostgreSQL 8.4 vs PostgreSQL 9.0

postgres=# create table vacuumtest(id int primary key);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index “vacuumtest_pkey” for table “vacuumtest”
CREATE TABLE
postgres=# insert into vacuumtest select generate_series(1,10000000);
INSERT 0 10000000
postgres=# delete from vacuumtest where id%4=0;
DELETE 2500000
postgres=# vacuum vacuumtest;
VACUUM

On 8.4:
postgres=# vacuum full vacuumtest ;
VACUUM
Time: 61418.197 ms
postgres=# reindex table vacuumtest;
REINDEX
Time: 12212.815 ms

On 9.0:
postgres=# vacuum full vacuumtest ;
VACUUM
Time: 32640.714 ms

                Above results show that VACCUM FULL on PostgreSQL 9.0 is way faster than previous versions. Moreover, VACUUM FULL has couple of advantages over CLUSTER :  it’s faster than CLUSTER because it doesn’t have to build new table using ORDER by clause & you can run VACUUM FULL on tables on which there isn’t any index.
                If you are running any bloat removal tool on the production database, i would recommend you to revisit vacuum parameters and tighten them up a little that makes regular vacuum to run more frequent so it will help to reduce frequency of running more intrusive bloat removal tools!!