<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Denish Patel &#187; postgresql</title>
	<atom:link href="http://www.pateldenish.com/category/postgresql/feed" rel="self" type="application/rss+xml" />
	<link>http://www.pateldenish.com</link>
	<description>my thoughts ....</description>
	<lastBuildDate>Wed, 22 May 2013 20:32:01 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>Inserting JSON data into Postgres using JDBC driver</title>
		<link>http://www.pateldenish.com/2013/05/inserting-json-data-into-postgres-using-jdbc-driver.html?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=inserting-json-data-into-postgres-using-jdbc-driver</link>
		<comments>http://www.pateldenish.com/2013/05/inserting-json-data-into-postgres-using-jdbc-driver.html#comments</comments>
		<pubDate>Wed, 22 May 2013 01:22:55 +0000</pubDate>
		<dc:creator>Denish</dc:creator>
				<category><![CDATA[postgresql]]></category>

		<guid isPermaLink="false">http://www.pateldenish.com/?p=368</guid>
		<description><![CDATA[EDIT: Marcus (the first commenter) helped me write much cleaner and more secure code. It doesn&#8217;t require the CAST and uses PGobject with JDBC&#8217;s setObject. You can download the updated code from the git repo. Thanks, Marcus! One of the clients of OmniTI requested &#8230; <a href="http://www.pateldenish.com/2013/05/inserting-json-data-into-postgres-using-jdbc-driver.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p><span style="color: #ff0000;"><strong>EDIT: </strong><span style="color: #003300;">Marcus (the first commenter) helped me write much cleaner and more secure code. It doesn&#8217;t require the CAST and uses PGobject with JDBC&#8217;s setObject. You can download the updated code from the <span style="color: #0000ff;"><a href="https://github.com/denishpatel/java/blob/master/PgJSONExample.java"><span style="color: #0000ff;">git repo</span></a></span>. Thanks, Marcus!</span></span></p>
<p><a href="http://omniti.com/does/data-management">One of the clients</a> of <a href="http://omniti.com">OmniTI</a> asked for a sample application that inserts <a href="http://wiki.postgresql.org/wiki/What's_new_in_PostgreSQL_9.2#JSON_datatype">JSON data into Postgres</a> using the Java <a href="http://jdbc.postgresql.org">JDBC driver</a>. I&#8217;m not a Java expert, so it took me a while to write even simple Java code to insert data; to be honest, one of our Java engineers at OmniTI helped me write the test application. With the test application ready, the next step was to make it work with the JSON data type. After struggling a little to find a workaround for string escaping in the Java code, I stumbled upon a data type issue. <a href="https://github.com/denishpatel/java/blob/master/PgJSONExample.java">Here is the test application code</a> that connects to my local Postgres installation and inserts JSON data into a sample table:<br />
<code><br />
postgres=# \d sample<br />
Table "public.sample"<br />
Column | Type | Modifiers<br />
--------+---------+-----------<br />
id | integer |<br />
data | json |<br />
denishs-MacBook-Air-2:java denish$ java -cp $CLASSPATH PgJSONExample<br />
-------- PostgreSQL JDBC Connection Testing ------------<br />
PostgreSQL JDBC Driver Registered!<br />
You made it, take control your database now!<br />
Something exploded running the insert: ERROR: column "data" is of type json but expression is of type character varying<br />
Hint: You will need to rewrite or cast the expression.<br />
Position: 42<br />
</code></p>
<p>After some research, I <a href="http://www.postgresql.org/message-id/4FBB1BCC.8040709@ringerc.id.au">found out</a> that there is no standard JSON type on the Java side, so adding json support to the Postgres JDBC driver is not straightforward. A StackOverflow <a href="http://stackoverflow.com/questions/15974474/mapping-postgresql-json-column-to-hibernate-value-type">answer</a> helped me test JSON data type handling at the psql level. As Craig mentions in that answer, the correct way to solve this problem is to write a custom Java mapping type that uses the JDBC setObject method. That can be tricky, though. A simpler workaround is to tell PostgreSQL to cast implicitly from text to json:<br />
<code> postgres=# create cast (text as json) without function as implicit;<br />
CREATE CAST<br />
</code></p>
<p>The WITHOUT FUNCTION clause is used because text and json have the same on-disk and in-memory representation; they&#8217;re basically just aliases for the same data type. AS IMPLICIT tells PostgreSQL it can convert without being explicitly told to, allowing things like this to work:<br />
<code><br />
postgres=# prepare test(text) as insert into sample (data) values ($1);<br />
PREPARE<br />
postgres=# execute test('{}');<br />
INSERT 0 1<br />
postgres=# select data from sample;<br />
data<br />
----<br />
{}<br />
(1 row)<br />
</code></p>
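<p>As an alternative to a global implicit cast, the statement itself can cast its parameter. The following is only a sketch against the sample table above; the statement name test_explicit is made up for illustration. It assumes the implicit cast has not been created (or has been dropped), because, as far as I understand, a WITHOUT FUNCTION cast never runs the json input function and therefore does not reject malformed JSON, while a plain text-to-json conversion parses the value:</p>
<p><code>
-- assumes the implicit cast above does not exist; if it does, remove it first:<br />
-- drop cast (text as json);<br />
-- cast the text parameter to json explicitly inside the statement<br />
prepare test_explicit(text) as insert into sample (data) values ($1::json);<br />
execute test_explicit('{"username":"denish"}');<br />
-- malformed input like this should now be rejected instead of stored<br />
execute test_explicit('{not json');<br />
</code></p>
<p>From JDBC, the same idea would presumably be written by putting the cast into the statement text (for example ?::json as the placeholder), which keeps the change local to one query instead of altering casting rules for the whole database.</p>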
<p>Awesome! That worked <img src='http://www.pateldenish.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  Let&#8217;s try a similar approach in the Java application code.</p>
<p><code> denishs-MacBook-Air-2:java denish$ export CLASSPATH=/usr/share/postgresql/java/postgresql-9.2-1002.jdbc4.jar:<br />
denishs-MacBook-Air-2:java denish$ javac -classpath $CLASSPATH PgJSONExample.java<br />
denishs-MacBook-Air-2:java denish$ java -cp $CLASSPATH PgJSONExample<br />
-------- PostgreSQL JDBC Connection Testing ------------<br />
PostgreSQL JDBC Driver Registered!<br />
You made it, take control your database now!<br />
postgres=# select * from sample;<br />
id | data<br />
----+------------------------------------------------------------------------<br />
1 | {"username":"denish","posts":10122,"emailaddress":"denish@omniti.com"}<br />
(1 row)</code></p>
<p>Yay! It worked as well <img src='http://www.pateldenish.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>Next on my list is figuring out how to install PL/Java on Mac and/or Linux. Let me know if you have instructions for installing PL/Java and a test application that uses it.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.pateldenish.com/2013/05/inserting-json-data-into-postgres-using-jdbc-driver.html/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Deploying PostgreSQL on Amazon EC2: A Case Study</title>
		<link>http://www.pateldenish.com/2013/05/deploying-postgresql-on-amazon-ec2-a-case-study.html?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=deploying-postgresql-on-amazon-ec2-a-case-study</link>
		<comments>http://www.pateldenish.com/2013/05/deploying-postgresql-on-amazon-ec2-a-case-study.html#comments</comments>
		<pubDate>Wed, 08 May 2013 01:28:30 +0000</pubDate>
		<dc:creator>Denish</dc:creator>
				<category><![CDATA[postgresql]]></category>

		<guid isPermaLink="false">http://www.pateldenish.com/?p=333</guid>
		<description><![CDATA[I got the opportunity to give a talk on &#8220;Deploying PostgreSQL on Amazon EC2: A Case Study&#8221; at PGDay NYC and LOPSA-East. Here is the slide deck: &#160; Deploying postgre sql on amazon ec2 from Denish Patel]]></description>
				<content:encoded><![CDATA[<p>I got the opportunity to give a talk on &#8220;Deploying PostgreSQL on Amazon EC2: A Case Study&#8221; at <a href="http://pgday.nycpug.org/talk/19">PGDay NYC</a> and <a href="http://lopsa-east.org/2013/talks/">LOPSA-East</a>. Here is the slide deck:</p>
<p><iframe style="border: 1px solid #CCC; border-width: 1px 1px 0; margin-bottom: 5px;" src="http://www.slideshare.net/slideshow/embed_code/20731464" height="486" width="597" allowfullscreen="" frameborder="0" marginwidth="0" marginheight="0" scrolling="no"></iframe></p>
<div style="margin-bottom: 5px;"><strong> <a title="Deploying postgre sql on amazon ec2 " href="http://www.slideshare.net/denishpatel/deploying-postgre-sql-on-amazon-ec2" target="_blank">Deploying postgre sql on amazon ec2 </a> </strong> from <strong><a href="http://www.slideshare.net/denishpatel" target="_blank">Denish Patel</a></strong></div>
]]></content:encoded>
			<wfw:commentRss>http://www.pateldenish.com/2013/05/deploying-postgresql-on-amazon-ec2-a-case-study.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>When was the database created in Postgres cluster ?</title>
		<link>http://www.pateldenish.com/2013/04/when-was-the-database-created-in-postgres-cluster.html?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=when-was-the-database-created-in-postgres-cluster</link>
		<comments>http://www.pateldenish.com/2013/04/when-was-the-database-created-in-postgres-cluster.html#comments</comments>
		<pubDate>Sat, 20 Apr 2013 14:55:24 +0000</pubDate>
		<dc:creator>Denish</dc:creator>
				<category><![CDATA[postgresql]]></category>

		<guid isPermaLink="false">http://www.pateldenish.com/?p=329</guid>
		<description><![CDATA[Continuous Integration (CI) using automated open source tools such as Jenkins and Hudson is rapidly gaining adoption. These tools help developers gain confidence in producing more robust code quickly by improving the testing and QA process. The flexibility of this software adds &#8230; <a href="http://www.pateldenish.com/2013/04/when-was-the-database-created-in-postgres-cluster.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>Continuous Integration (CI) using automated open source tools such as <a href="http://jenkins-ci.org">Jenkins</a> and <a href="http://hudson-ci.org">Hudson</a> is rapidly gaining adoption. These tools help developers gain confidence in producing more robust code quickly by <a href="http://omniti.com/seeds/seeds-our-experiences-with-chef-enabling-software-quality-assurance">improving the testing and QA process</a>. The flexibility of this software creates other challenges for DBAs!</p>
<p>One of our clients needed to clean up databases older than X days on the Jenkins CI database server, because each run creates a separate database and the database names are not standardized: they are provided by users. If old databases aren&#8217;t cleaned up, the cluster will have hundreds of databases by the end of the week. We tried to standardize database names, but you can&#8217;t keep users from making mistakes when they enter db names <img src='http://www.pateldenish.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  On the other hand, Postgres&#8217;s system catalogs don&#8217;t record the database creation date. How can I find databases older than X days and drop them?</p>
<p>I came across <a href="http://raghavt.blogspot.com/2011/09/how-to-get-database-creation-time-in.html">this blog entry</a> that answers my question, but I was looking for an easier way. I <a href="http://www.postgresql.org/message-id/CAFcNs+qMGbLmeUOnjmbna_K7=UP817BPw9QxhbCTGNScPKVoeA@mail.gmail.com">found</a> an easier way to get the database creation time with a single query! Yay  :-) The following query can be used to find the database creation time. It should return the correct creation date as long as you haven&#8217;t run pg_upgrade on the data directory. I thought I&#8217;d share it here so it will be useful for others!</p>
<blockquote><p>SELECT datname, (pg_stat_file('base/'||oid||'/PG_VERSION')).modification AS datcreated<br />
FROM pg_database;</p></blockquote>
<blockquote><p>postgres=# SELECT datname, (pg_stat_file('base/'||oid||'/PG_VERSION')).modification AS datcreated<br />
postgres-# FROM pg_database;<br />
datname | datcreated<br />
------------+------------------------<br />
template1 | 2013-03-28 16:04:13-04<br />
template0 | 2013-03-28 16:04:14-04<br />
postgres | 2013-03-28 16:04:14-04<br />
rangetypes | 2013-03-28 16:14:42-04<br />
puppet | 2013-03-28 16:23:13-04<br />
omniti | 2013-04-20 10:02:22-04<br />
(6 rows)</p></blockquote>
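<p>Building on the query above, here is a sketch of how the cleanup itself could be driven. It only generates DROP DATABASE statements for review rather than running them. The 7-day cutoff and the exclusion of the postgres database are assumptions for illustration, and the base/ path assumes every database lives in the default tablespace:</p>
<p><code>
-- list drop commands for non-template databases created more than 7 days ago<br />
SELECT 'DROP DATABASE ' || quote_ident(datname) || ';' AS drop_cmd<br />
FROM pg_database<br />
WHERE NOT datistemplate<br />
AND datname &lt;&gt; 'postgres'<br />
AND (pg_stat_file('base/'||oid||'/PG_VERSION')).modification &lt; now() - interval '7 days';<br />
</code></p>
<p>Generating the statements first, instead of dropping directly, leaves room to check the list before anything is removed.</p>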
<p>Ideally, the pg_database system catalog should include a database creation timestamp! Hopefully, that day will come sooner rather than later <img src='http://www.pateldenish.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>Feel free to comment if you have any other ideas for getting these details, or if you see any corner cases with the above query that I haven&#8217;t mentioned here  :-)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.pateldenish.com/2013/04/when-was-the-database-created-in-postgres-cluster.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>pg_repack in action!</title>
		<link>http://www.pateldenish.com/2012/12/pg_repack-in-action.html?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=pg_repack-in-action</link>
		<comments>http://www.pateldenish.com/2012/12/pg_repack-in-action.html#comments</comments>
		<pubDate>Wed, 05 Dec 2012 21:16:39 +0000</pubDate>
		<dc:creator>Denish</dc:creator>
				<category><![CDATA[postgresql]]></category>

		<guid isPermaLink="false">http://www.pateldenish.com/?p=314</guid>
		<description><![CDATA[A couple of years ago, I started writing a blog post on pg_reorg, but it never got published because of my procrastination! I wasn&#8217;t disappointed, though, because I got the opportunity to talk about removing bloat from tables on &#8230; <a href="http://www.pateldenish.com/2012/12/pg_repack-in-action.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>A couple of years ago, I started writing a blog post on <a href="http://reorg.projects.pgfoundry.org/pg_reorg.html">pg_reorg</a>, but it never got published because of my procrastination! I wasn&#8217;t disappointed, though, because I got the opportunity to <a href="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=p90xyourdatabase-13010796913746-phpapp02&amp;stripped_title=p90-x-your-database">talk</a> about removing bloat from tables at one of the PostgreSQL conferences. Moreover, Depesz, a colleague at <a href="http://omniti.com">OmniTI</a>, wrote a <a href="http://www.depesz.com/2011/07/06/bloat-happens/">detailed post</a> on how pg_reorg actually works <img src='http://www.pateldenish.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  So you must be thinking: what&#8217;s up with this blog post?</p>
<p>Let&#8217;s come to the point <img src='http://www.pateldenish.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>Last month, I received an email on the pg_reorg mailing list announcing the first release of pg_repack, beta1. So, what is pg_repack? The first release of pg_repack is simply a fork of pg_reorg. According to the author, the reason for the fork is to revive the development of pg_reorg, which has stagnated since the release of pg_reorg 1.1.7 in August 2011. That makes sense to me. The first release doesn&#8217;t provide any new functionality beyond the missing features planned for pg_reorg 1.1.8, and it fixes pg_reorg&#8217;s known bugs. That&#8217;s a good thing! As I mentioned, it provides the same functionality as pg_reorg, but from now on you should follow the more active pg_repack code on the <a href="http://pgxn.org/dist/pg_repack/">PGXN project page</a> instead of pg_reorg <img src='http://www.pateldenish.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>At <a href="http://omniti.com">OmniTI</a>, the engineers <a href="https://labs.omniti.com">contribute</a> new tools to the community and use existing tools. We decided to take a swing at pg_repack because we have been using pg_reorg successfully on production systems for the last couple of years and are familiar with the code base. pg_repack can now be installed as an extension, so it was pretty easy to install it on one of our PostgreSQL database systems and run it during off-peak hours. The pg_repack run helped trim the database size from 550GB to 400GB. Nowadays, if there&#8217;s no graph, it never happened <img src='http://www.pateldenish.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  Here is the DB size graph for reference. If you are curious, the graph was created using the <a href="https://circonus.com">Circonus monitoring suite</a>.</p>
<p><a href="http://www.pateldenish.com/wp-content/uploads/2012/12/Screen-Shot-2012-12-05-at-12.10.02-PM1.png"><img class="alignleft size-large wp-image-316" title="DB Size graph" src="http://www.pateldenish.com/wp-content/uploads/2012/12/Screen-Shot-2012-12-05-at-12.10.02-PM1-1024x441.png" alt="" width="688" height="296" /></a></p>
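<p>For anyone who wants to reproduce the before/after comparison without a monitoring suite, a minimal sketch follows. It assumes the pg_repack package is already installed on the server; the repack itself is then run with the pg_repack command-line client, which is not shown here:</p>
<p><code>
-- install the extension in the target database<br />
CREATE EXTENSION pg_repack;<br />
-- check the database size; run once before and once after the pg_repack run<br />
SELECT pg_size_pretty(pg_database_size(current_database())) AS db_size;<br />
</code></p>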
<p>Thanks to the pg_repack authors and contributors. If you haven&#8217;t joined the pg_reorg/pg_repack mailing lists, I recommend that you do. Keep contributing and sharing your results with the community!!</p>
<p>Happy Holidays!!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.pateldenish.com/2012/12/pg_repack-in-action.html/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>An easy way to reduce outage window for PostgreSQL Upgrade!</title>
		<link>http://www.pateldenish.com/2012/10/an-easy-way-to-reduce-outage-window-for-postgresql-upgrade.html?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=an-easy-way-to-reduce-outage-window-for-postgresql-upgrade</link>
		<comments>http://www.pateldenish.com/2012/10/an-easy-way-to-reduce-outage-window-for-postgresql-upgrade.html#comments</comments>
		<pubDate>Sat, 20 Oct 2012 18:51:59 +0000</pubDate>
		<dc:creator>Denish Patel</dc:creator>
				<category><![CDATA[postgresql]]></category>

		<guid isPermaLink="false">http://www.pateldenish.com/?p=292</guid>
		<description><![CDATA[The PostgreSQL 9.2 release provides lots of great features. Recently, one of the clients at OmniTI needed to upgrade a couple of their production databases from PostgreSQL 9.0 to PostgreSQL 9.2. The client runs its database servers on Amazon EC2 &#8230; <a href="http://www.pateldenish.com/2012/10/an-easy-way-to-reduce-outage-window-for-postgresql-upgrade.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>The <a href="http://www.postgresql.org/docs/9.2/static/release-9-2-1.html">PostgreSQL 9.2</a> release provides lots of great features. Recently, one of the <a href="http://omniti.com/does">clients</a> at <a href="http://omniti.com/does/data-management">OmniTI</a> needed to upgrade a couple of their production databases from PostgreSQL 9.0 to PostgreSQL 9.2. The client runs its database servers on <a href="http://aws.amazon.com/ec2/instance-types/">Amazon EC2 instances</a>. For failover purposes, each setup has 3 instances: one is the master database and the other two are slaves of the master. Like others, they were looking for a zero-outage solution for the PostgreSQL upgrade, but unfortunately one doesn&#8217;t exist yet!</p>
<p>There are a few options for the PostgreSQL upgrade:</p>
<ol>
<li><a href="http://www.postgresql.org/docs/9.2/static/app-pgdump.html">pg_dump</a>/<a href="http://www.postgresql.org/docs/9.2/static/app-pgrestore.html">pg_restore</a> entire database</li>
<li>Use <a href="http://www.postgresql.org/docs/9.2/static/pgupgrade.html">pg_upgrade</a> for in-place upgrade</li>
<li>Use a third-party replication system, e.g. Slony or Bucardo</li>
</ol>
<p><strong>Evaluation of options:</strong></p>
<p><strong>Option #2 (pg_upgrade):</strong></p>
<p>Whenever an upgrade requirement with minimum outage comes to us, we always check option #2 first. pg_upgrade can upgrade a database without dumping and restoring all the data. Unfortunately, the pg_upgrade --check test failed because the databases use the ltree data type. One of the limitations of pg_upgrade is that it does not work if the ltree contrib module is installed in a database.</p>
<p>Meanwhile, the client added a requirement: they wanted to consolidate both environments onto one server to eliminate the cost and maintenance of running 6 instances for 2 production databases, plus a similar number of instances for the stage environments. This new requirement ruled out even thinking about pg_upgrade.</p>
<p><strong>Option #1 (pg_dump/pg_restore):</strong></p>
<p>The next option is to optimize the dump/restore process so that the total outage window is minimized. I started collecting stats about database size, the large tables in each database, and dump/restore timing.</p>
<p>1st production database (X DB) :</p>
<ul>
<li>Total database size: 22GB</li>
<li>Total size of the top 2 largest tables: 12GB</li>
<li>Dump/restore duration: 25 minutes (dropping indexes before the restore and recreating them at the end, with 4 parallel workers for the restore)</li>
<li>Estimated outage required, including application testing: 30 minutes</li>
</ul>
<p>2nd production database (Y DB) :</p>
<ul>
<li>Total database size: 50GB</li>
<li>Total size of the top 10 largest tables: 45GB</li>
<li>Dump/restore duration: 1.5 hours (dropping indexes before the restore and recreating them at the end, with 4 parallel workers for the restore)</li>
<li>Estimated outage required, including application testing: 1 hour 45 minutes</li>
</ul>
<p>Both of the above estimates reflect actual dump/restore timings, because I ran the dump on the existing prod servers and the restore on the proposed new server.</p>
<p><strong>Option # 3:</strong></p>
<p>I could use the Slony or Bucardo replication systems to replicate tables for the upgrade, but even though Slony and Bucardo have been around for a while, they are complicated to set up, manage, and debug when problems occur. It might be only me, but I did not want to introduce a complex replication system just for an upgrade!</p>
<p><strong>Looking further into table/schema usage:</strong></p>
<p>Next, I looked further into optimizing the dump/restore. Digging into the details of the large tables using pg_stat_all_tables and the schema definitions, I collected the following facts:</p>
<ol>
<li>For X DB, the top 2 largest tables are insert-only. Yay!</li>
<li>For Y DB, the top 2 tables are also insert-only, but there are more large tables with insert/update/delete activity.</li>
<li>There are no foreign keys on the tables in either database.</li>
</ol>
<p>So I started looking into simple table-level replication options. To be clear, table replication still works with FKs; their absence matters only because I don&#8217;t have to worry about disabling foreign keys during table-level replication and re-enabling them later.</p>
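<p>The facts above can be gathered with queries along these lines. This is a sketch rather than the exact queries I ran: the schema filter and the LIMIT are assumptions, and the counters in pg_stat_all_tables only cover activity since the statistics were last reset:</p>
<p><code>
-- largest tables together with their insert/update/delete counters<br />
SELECT schemaname || '.' || relname AS relation,<br />
pg_size_pretty(pg_total_relation_size(relid)) AS size,<br />
n_tup_ins, n_tup_upd, n_tup_del<br />
FROM pg_stat_all_tables<br />
WHERE schemaname NOT IN ('pg_catalog', 'information_schema', 'pg_toast')<br />
ORDER BY pg_total_relation_size(relid) DESC<br />
LIMIT 20;<br />
-- foreign keys, if any, and the tables they are defined on<br />
SELECT conrelid::regclass AS table_name, conname<br />
FROM pg_constraint<br />
WHERE contype = 'f';<br />
</code></p>
<p>A table whose n_tup_upd and n_tup_del stay at 0 is a candidate for insert-only replication; anything else needs a method that tracks changes.</p>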
<p><strong>Mimeo</strong></p>
<p>I didn&#8217;t have to look far, because <a href="http://keithf4.com">Keith Fiske</a>, one of my colleagues, recently came up with the <a href="https://github.com/keithf4/mimeo">Mimeo</a> extension. Mimeo is OmniTI&#8217;s simple, home-grown replication system for replicating tables over dblink between two PostgreSQL databases. Usually, Mimeo helps us replicate a production instance to a data warehouse system, but in this case I decided to try it for replicating tables temporarily during the upgrade process.</p>
<p><strong>Why Mimeo?</strong></p>
<p>Mimeo is extremely easy to set up and understand because all of its code resides in SQL and PL/pgSQL functions. Mimeo provides very good <a href="https://github.com/keithf4/mimeo/blob/master/doc/mimeo.md">documentation</a>, but I will give you the overall idea. Mimeo is installed as an extension. All the replication code resides under the mimeo schema, and it uses another extension called <a href="https://github.com/omniti-labs/pg_jobmon">pg_jobmon</a> to keep track of function executions. For now, Mimeo supports the following replication methods:</p>
<ul>
<li>Inserters: Replicate insert-only tables.</li>
<li>Updaters: Replicate tables based on an updated_tsz column. You could place a trigger on the source table in production to keep updated_tsz current, but most of the time your application already takes care of it. This replication method does not support DELETEs on tables.</li>
<li>DML: This method supports INSERT/UPDATE/DELETE on the table, but the table must have a primary key to keep track of the changes. Mimeo places a trigger on the source table in the production database that records changed rows in mimeo.tablename_pgq queue tables. A pull request from the destination (replicated) side fetches these rows from the queue table, grabs the latest data only for the changed rows, and applies it to the destination table.</li>
<li>Snap: Grab the entire source table and truncate the destination table to refresh it completely. It&#8217;s a very useful method for small tables.</li>
</ul>
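<p>For the updater method, the trigger that keeps updated_tsz current could look like the sketch below. It is not part of Mimeo; the function, trigger, and table names (set_updated_tsz, t_set_updated_tsz, some_table) are hypothetical, and the table is assumed to already have an updated_tsz timestamp column:</p>
<p><code>
-- stamp every inserted or updated row with the current time<br />
CREATE OR REPLACE FUNCTION set_updated_tsz() RETURNS trigger AS $$<br />
BEGIN<br />
NEW.updated_tsz := now();<br />
RETURN NEW;<br />
END;<br />
$$ LANGUAGE plpgsql;<br />
-- attach it to the source table<br />
CREATE TRIGGER t_set_updated_tsz<br />
BEFORE INSERT OR UPDATE ON some_table<br />
FOR EACH ROW EXECUTE PROCEDURE set_updated_tsz();<br />
</code></p>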
<p>That was a brief overview of Mimeo. The tool is still under development and is looking for more testers and contributors.</p>
<p>For now, let&#8217;s get back to the upgrade!</p>
<p><strong>Upgrade X DB:</strong></p>
<p>The first production environment was easy because there are only 2 large tables and both are INSERT-only. Once you have the packages for dblink, pg_jobmon, and mimeo installed, you can install them into the database as extensions.</p>
<p>On the new database server running PostgreSQL 9.2:</p>
<p>create schema dblink;<br />
create schema jobmon;<br />
create schema mimeo;<br />
create extension dblink schema dblink ;<br />
create extension pg_jobmon schema jobmon ;<br />
create  extension mimeo schema mimeo;</p>
<p>After installing mimeo and setting up the new production DB servers on PostgreSQL 9.2 with a master-slave setup, I followed these steps to upgrade X DB:</p>
<ol>
<li>Freeze schema changes on the X DB production database server running PostgreSQL 9.0.</li>
<li>Take a pg_dump of the entire database schema and restore it on the PostgreSQL 9.2 database.</li>
<li>pg_dump the two large tables, t1 &amp; t2, and restore them on the PostgreSQL 9.2 database.</li>
<li>Set up Mimeo replication for the t1 and t2 tables using the refresh_updater method.</li>
<li>pg_dump the data of all tables except t1 &amp; t2 and pg_restore it with 4 parallel processes (-j 4) on PostgreSQL 9.2 (~15 minutes). To expedite the restore, I dropped the indexes on a couple of large tables before the restore and recreated them afterwards.</li>
<li>Reset sequences for t1 and t2 on PostgreSQL 9.2.</li>
<li>Open up the upgraded database for applications!!</li>
</ol>
<p>Keep in mind that only steps 5, 6 &amp; 7 need to be executed during the outage period. The total outage for the upgrade of this production database environment was ~15 minutes.</p>
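<p>The sequence reset in step 6 can be sketched as follows, assuming each table has a serial column named id (the column name is an assumption; adjust it to the real key column):</p>
<p><code>
-- move each sequence to the highest id that was replicated<br />
SELECT setval(pg_get_serial_sequence('t1', 'id'), max(id)) FROM t1;<br />
SELECT setval(pg_get_serial_sequence('t2', 'id'), max(id)) FROM t2;<br />
</code></p>
<p>This matters because the replicated rows arrive with their ids already set, so the sequences on the new server would otherwise still be at their restored values and new inserts could collide.</p>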
<p><strong>Upgrade Y DB:</strong></p>
<p>The second production environment uses the same PostgreSQL cluster on the new database server, but a different database name. This database is larger than the first one and has more tables with all kinds of transactions.</p>
<p>After analyzing table stats, I came up with these groups of tables:</p>
<p>relation | size | n_tup_ins | n_tup_upd | n_tup_del<br />
----------+--------+-----------+-----------+-----------<br />
&lt;&lt; Replicate tables based on incremental primary key, no trigger required: inserter replication &gt;&gt;</p>
<p>1. t1 | 15 GB | 22310924 | 0 | 0 &lt;-- insert only<br />
2. t2 | 789 MB | 3176894 | 0 | 0 &lt;-- insert only<br />
3. t3 | 13 MB | 7379 | 0 | 0 &lt;-- insert only</p>
<p>&lt;&lt; Insert/update/delete on tables: DML replication &gt;&gt;</p>
<p>4. t4 | 4515 MB | 1233555 | 17966613 | 0 &lt;-- insert/update/delete<br />
5. t5 | 2520 MB | 5004129 | 21599077 | 0 &lt;-- insert/update/delete<br />
6. t6 | 1310 MB | 4041253 | 519 | 0 &lt;-- insert/update/delete<br />
7. t7 | 1123 MB | 1020479 | 2050512 | 0 &lt;-- insert/update<br />
8. t8 | 75 MB | 875275 | 19047 | 0 &lt;-- insert/update<br />
9. t9 | 43 MB | 30509 | 8976539 | 0 &lt;-- insert/update<br />
10. t10 | 22 MB | 12338 | 16201 | 0 &lt;-- insert/update<br />
11. t11 | 12 MB | 11574 | 199 | 0 &lt;-- insert/update<br />
12. t12 | 10168 kB | 12283 | 11192 | 0 &lt;-- insert/update</p>
<p>&lt;&lt; Static tables &gt;&gt;</p>
<p>13. t13 | 43 MB | 0 | 0 | 0 &lt;-- static table, never changes<br />
14. t14 | 39 MB | 0 | 0 | 0 &lt;-- static table, never changes<br />
15. t15 | 18 MB | 0 | 0 | 0 &lt;-- static table, never changes<br />
16. t16 | 16 MB | 0 | 0 | 0 &lt;-- static table, never changes<br />
17. t17 | 14 MB | 0 | 0 | 0 &lt;-- static table, never changes</p>
<p>I followed the same procedure as above to install mimeo on this database, but DML replication needs an extra step: creating a mimeo schema on the source (the production database server running PostgreSQL 9.0) with proper permissions for the mimeo replication role. All _pgq tables and trigger functions for the source tables reside under this mimeo schema on the source database.</p>
<p>On source database server :</p>
<p>CREATE schema mimeo;<br />
ALTER SCHEMA mimeo OWNER TO &lt;mimeo_role&gt;;<br />
GRANT TRIGGER ON &lt;source_table&gt; TO &lt;mimeo_role&gt;;</p>
<p>After installing mimeo and creating the new production DB on the same PostgreSQL 9.2 cluster with its master-slave setup, I followed these steps to upgrade Y DB:</p>
<ol>
<li>Freeze schema changes on the Y DB production database server running PostgreSQL 9.0.</li>
<li>Take a pg_dump of the entire database schema and restore it on the PostgreSQL 9.2 database.</li>
<li>Disable triggers on the replicated tables on the destination database.</li>
<li>Set up replication triggers on the DML group of replicated tables using the mimeo.dml_maker function, executed on the PostgreSQL 9.2 database server.</li>
<li>pg_dump the 17 tables, t1 to t17, from the source database and pg_restore them on the PostgreSQL 9.2 database.</li>
<li>Set up Mimeo replication for t1-t3 using refresh_updater and for t4-t12 using refresh_dml. I did not set up replication for the static tables.</li>
<li>pg_dump the data of all tables except t1 to t17 and pg_restore it with 4 parallel processes (-j 4) on PostgreSQL 9.2 (~5 minutes).</li>
<li>Enable triggers and reset sequences for t1 to t17 on PostgreSQL 9.2.</li>
<li>Compare and verify count and/or max(id) for the t1 to t17 tables between the PostgreSQL 9.0 database and the upgraded PostgreSQL 9.2 database server.</li>
<li>Open up the upgraded database for applications!!</li>
</ol>
<p>As above, only steps 7-10 need to be executed during the outage period. The total outage for the upgrade of this production database environment was about 15 minutes.</p>
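<p>Steps 3, 8, and 9 map onto plain SQL. The sketch below uses t4 as the example table and assumes an id key column, as in the max(id) check mentioned in step 9; repeat it per table:</p>
<p><code>
-- step 3, on the destination: disable user triggers on a replicated table<br />
ALTER TABLE t4 DISABLE TRIGGER USER;<br />
-- step 8, on the destination: enable them again before opening up<br />
ALTER TABLE t4 ENABLE TRIGGER USER;<br />
-- step 9, run on both the 9.0 and the 9.2 server and compare the results<br />
SELECT count(*) AS row_count, max(id) AS max_id FROM t4;<br />
</code></p>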
<p>In conclusion, Mimeo helped our client upgrade their database servers with minimal outage. Hopefully, it will help you reduce the outage window on your next production database upgrade.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.pateldenish.com/2012/10/an-easy-way-to-reduce-outage-window-for-postgresql-upgrade.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>2 Elephants in the Room!!</title>
		<link>http://www.pateldenish.com/2012/08/2-elephants-in-the-room.html?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=2-elephants-in-the-room</link>
		<comments>http://www.pateldenish.com/2012/08/2-elephants-in-the-room.html#comments</comments>
		<pubDate>Wed, 22 Aug 2012 14:08:58 +0000</pubDate>
		<dc:creator>Denish Patel</dc:creator>
				<category><![CDATA[postgresql]]></category>

		<guid isPermaLink="false">http://www.pateldenish.com/?p=269</guid>
		<description><![CDATA[Yesterday, I gave a lightning talk at Hadoop DC, the Hadoop User Group (HUG), in Columbia, MD. It was a pleasure to talk about PostgreSQL and Hadoop to an enthusiastic crowd. Please check out the slides!]]></description>
				<content:encoded><![CDATA[<p style="text-align: left;">Yesterday, I gave a lightning talk at <a href="http://www.meetup.com/Hadoop-DC/">Hadoop DC- Hadoop User Group (HUG)</a> in Columbia,MD. It was pleasure to talk about PostgreSQL and Hadoop to enthusiastic crowd. Please check out the slides!</p>
<div style="width: 425px; text-align: left;"><object style="margin: 0px;" width="425" height="355" classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowScriptAccess" value="always" /><param name="src" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=twoelephantsintheroom-13456438366247-phpapp01-120822090028-phpapp01&amp;stripped_title=two-elephants-inthe-room" /><param name="allowscriptaccess" value="always" /><param name="allowfullscreen" value="true" /><embed style="margin: 0px;" width="425" height="355" type="application/x-shockwave-flash" src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=twoelephantsintheroom-13456438366247-phpapp01-120822090028-phpapp01&amp;stripped_title=two-elephants-inthe-room" allowFullScreen="true" allowScriptAccess="always" allowscriptaccess="always" allowfullscreen="true" /></object></div>
]]></content:encoded>
			<wfw:commentRss>http://www.pateldenish.com/2012/08/2-elephants-in-the-room.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Deploying Maximum HA Architecture with PostgreSQL</title>
		<link>http://www.pateldenish.com/2012/04/deploying-maximum-ha-architecture-with-postgresql.html?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=deploying-maximum-ha-architecture-with-postgresql</link>
		<comments>http://www.pateldenish.com/2012/04/deploying-maximum-ha-architecture-with-postgresql.html#comments</comments>
		<pubDate>Mon, 02 Apr 2012 22:57:00 +0000</pubDate>
		<dc:creator>Denish Patel</dc:creator>
				<category><![CDATA[postgresql]]></category>

		<guid isPermaLink="false">http://www.pateldenish.com/2012/04/deploying-maximum-ha-architecture-with-postgresql.html</guid>
		<description><![CDATA[Today, I gave a talk on &#8220;Deploying Maximum HA Architecture with PostgreSQL&#8221; at PG Day New York. You can check out the slides here!]]></description>
				<content:encoded><![CDATA[<p>Today, I gave a talk on &#8220;Deploying Maximum HA Architecture with PostgreSQL&#8221; at <a href="http://pgday.nycpug.org/schedule/">PG Day New York</a>. You can check out the slides here!</p>
<div style="width: 425px; text-align: left;"><object style="margin: 0px;" width="425" height="355" classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowScriptAccess" value="always" /><param name="src" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=deployingmaximumhaarchitecturewithpostgresql-13334016362821-phpapp02-120402163920-phpapp02&amp;stripped_title=deploying-maximum-ha-architecture-with-postgresql-12261838" /><param name="allowscriptaccess" value="always" /><param name="allowfullscreen" value="true" /><embed style="margin: 0px;" width="425" height="355" type="application/x-shockwave-flash" src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=deployingmaximumhaarchitecturewithpostgresql-13334016362821-phpapp02-120402163920-phpapp02&amp;stripped_title=deploying-maximum-ha-architecture-with-postgresql-12261838" allowFullScreen="true" allowScriptAccess="always" allowscriptaccess="always" allowfullscreen="true" /></object></div>
]]></content:encoded>
			<wfw:commentRss>http://www.pateldenish.com/2012/04/deploying-maximum-ha-architecture-with-postgresql.html/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>monitor bucardo replication lag using circonus</title>
		<link>http://www.pateldenish.com/2012/03/monitor-bucardo-replication-lag-using-circonus.html?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=monitor-bucardo-replication-lag-using-circonus</link>
		<comments>http://www.pateldenish.com/2012/03/monitor-bucardo-replication-lag-using-circonus.html#comments</comments>
		<pubDate>Thu, 01 Mar 2012 17:06:00 +0000</pubDate>
		<dc:creator>Denish Patel</dc:creator>
				<category><![CDATA[postgresql]]></category>

		<guid isPermaLink="false">http://denishjpatel.wordpress.com/2012/03/01/monitor-bucardo-replication-lag-using-circonus</guid>
		<description><![CDATA[         I have been using circonus for monitoring, trending and alerting for any database metrics for quite a long time now. The circonus interface makes the monitoring, trending and alerting setup painless and you can see graph &#8230; <a href="http://www.pateldenish.com/2012/03/monitor-bucardo-replication-lag-using-circonus.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<div dir="ltr" style="text-align:left;"><p>I have been using <a href="https://circonus.com/">Circonus</a> for monitoring, trending and alerting on database metrics for quite a long time now. The Circonus interface makes setting up monitoring, trending and alerting painless, and you can see graphs flowing in minutes. Another good thing about Circonus is that you can monitor anything that you can query from a database! This week, the task at hand was to find a way to monitor Bucardo replication lag. `<a href="http://bucardo.org/wiki/Bucardo_ctl">bucardo_ctl status </a>sync_name` provides very important information that you can rely on for trending and alerting purposes.</p>
<p>$ bucardo_ctl status my_slave<br />Sync name:            my_sync<br />Current state:        WAIT:22s (PID = 19500)<br />Type:                 pushdelta<br />Source herd/database: slave_herd / master_herd<br />Target database:      my_slave<br />Tables in sync:       318<br />Last good:            23s (time to run: 1m 21s)<br />Last good time:       Feb 29, 2012 15:27:14  Target: my_slave<br />Ins/Upd/Del:          142 / 0 / 0<br />Last bad:             1h 45m 9s (time to run: 19m 57s)<br />Last bad time:        Feb 29, 2012 13:42:29  Target: my_slave<br />Latest bad reason: MCP removing stale q entry<br />PID file:             /var/run/bucardo/bucardo.ctl.sync.my_sync.pid<br />PID file created:     Wed Feb 29 13:42:33 2012<br />Status:               active<br />Limitdbs:             0<br />Priority:             0<br />Checktime:            none<br />Overdue time:         00:00:00<br />Expired time:         00:00:00<br />Stayalive:            yes      Kidsalive: yes<br />Rebuild index:        0        Do_listen: no <br />Ping:                 yes      Makedelta: no <br />Onetimecopy:          0</p>
<p>All of the information provided by the `bucardo_ctl status` command is important, but the most interesting field to monitor is &#8220;Last good:&#8221;. It shows the time elapsed since the last successful sync, that is, the Bucardo replication lag on the slave server.</p>
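<p>To trend a value like this it has to be a number. As a small illustration, a &#8220;Last good:&#8221; string from the sample output above could be converted to seconds along these lines. The function name is hypothetical, and the &#8220;d&#8221; (days) suffix is an assumption, since it does not appear in the sample output.</p>

```python
import re

# Seconds per unit suffix. "h", "m" and "s" appear in the sample output;
# the "d" (days) suffix is an assumption.
UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def duration_to_seconds(text):
    """Convert a value such as '1h 45m 9s' to seconds.

    Anything after an opening parenthesis ("(time to run: ...)") is ignored.
    Returns None when no duration is found, e.g. for "unknown".
    """
    head = text.split("(")[0]
    parts = re.findall(r"(\d+)\s*([dhms])", head)
    if not parts:
        return None
    return sum(int(number) * UNIT_SECONDS[unit] for number, unit in parts)


if __name__ == "__main__":
    print(duration_to_seconds("23s (time to run: 1m 21s)"))  # prints 23
    print(duration_to_seconds("unknown"))  # prints None
```

<p>Returning None for &#8220;unknown&#8221; keeps a broken sync distinguishable from a lag of zero seconds.</p>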
<p><b>Trending in circonus</b>:</p>
<p>Now that I have identified the metric to monitor, the next step is to find the best way to get it into the monitoring tool. After poking around the output and the available ways to monitor things, I decided to grab the SQL query from the bucardo_ctl Perl script and plug it into Circonus. Most of the setup time went into finding the right query in the large Perl script (bucardo_ctl) and mapping the required metric from it. Here is the query that I plugged into Circonus:</p>
<p> SELECT <br />'bucardo_last_good' , round(extract(epoch FROM now()-ended)) <br />FROM<br />(SELECT * FROM bucardo.q WHERE sync = 'my_sync' AND cdate &gt;= now() - interval '3 days'<br />UNION ALL<br />SELECT * FROM freezer.master_q <br />WHERE sync = 'my_sync' AND cdate &gt;= now() - interval '3 days') AS foo<br />WHERE ended IS NOT NULL AND aborted IS NULL<br />ORDER BY ended DESC LIMIT 1;</p>
<div class="separator" style="clear:both;text-align:center;"><a href="http://denishjpatel.files.wordpress.com/2012/03/screen2bshot2b2012-03-012bat2b3-23-542bpm.png" style="clear:left;float:left;margin-bottom:1em;margin-right:1em;"><img border="0" height="344" src="http://denishjpatel.files.wordpress.com/2012/03/screen2bshot2b2012-03-012bat2b3-23-542bpm.png?w=300" width="640" /></a></div>
<p><b>Alerting in circonus:</b></p>
<p>bucardo_ctl status reports Last good as &#8220;unknown&#8221; if replication is broken.</p>
<p>Name     Type  State PID     Last_good Time  I/U/D Last_bad Time <br />===========+=====+========+====+=========+=====+=====+========+=====<br />my_sync| P   |WAIT:35s|7620| unknown  |     |     |36s     |1m58s</p>
<p>In Circonus, you can set up rules with relevant severity levels. The most important part is that the check should page if the query doesn&#8217;t return any row (the &#8220;unknown&#8221; condition). Circonus provides a rule for alerting when a metric is absent. Now, I am all set with alerts as well.</p>
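<p>The alerting logic described above can be sketched in a few lines. This is not how Circonus rules are configured; it is only an illustration of the decision being made. The threshold values, level names and function name are all hypothetical.</p>

```python
from bisect import bisect_right

# Hypothetical thresholds in seconds; tune them to your own sync schedule.
WARN_AFTER = 300
PAGE_AFTER = 900
LEVELS = ["ok", "warn", "page"]


def classify_lag(rows):
    """Map the monitoring query's result rows to an alert level.

    rows is a list of (metric_name, lag_seconds) tuples, as returned by
    the query above, which yields at most one row.
    """
    if not rows:
        # No row: no finished, non-aborted sync was found in the window.
        # This is the "unknown" condition, so it should page.
        return "page"
    _name, lag_seconds = rows[0]
    # bisect_right counts how many thresholds the lag has reached.
    return LEVELS[bisect_right([WARN_AFTER, PAGE_AFTER], lag_seconds)]


if __name__ == "__main__":
    print(classify_lag([("bucardo_last_good", 23)]))  # prints ok
    print(classify_lag([]))  # prints page
```

<p>Treating an absent row as the most severe case is the key design choice: a missing metric means replication may be broken, not that the lag is zero.</p>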
<p>Yay! Bucardo replication is now under monitoring and trending without any hassle! Hopefully, this post will help you the next time you need to put Bucardo replication lag under monitoring.</p></div>
]]></content:encoded>
			<wfw:commentRss>http://www.pateldenish.com/2012/03/monitor-bucardo-replication-lag-using-circonus.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What is pg_extractor ?</title>
		<link>http://www.pateldenish.com/2012/01/what-is-pg_extractor.html?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=what-is-pg_extractor</link>
		<comments>http://www.pateldenish.com/2012/01/what-is-pg_extractor.html#comments</comments>
		<pubDate>Thu, 05 Jan 2012 23:03:00 +0000</pubDate>
		<dc:creator>Denish Patel</dc:creator>
				<category><![CDATA[postgresql]]></category>

		<guid isPermaLink="false">http://denishjpatel.wordpress.com/2012/01/05/what-is-pg_extractor</guid>
		<description><![CDATA[In my recent blog post, I wrote about PostgreSQL DBA Handyman toolset. In the list of tools, getddl is one of them. If you are using getddl to get DDL schema and track the daily changes in SVN for production &#8230; <a href="http://www.pateldenish.com/2012/01/what-is-pg_extractor.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>In a recent blog post, I wrote about the <a href="http://denishjpatel.blogspot.com/2011/11/postgresql-handyman-toolset.html">PostgreSQL DBA Handyman toolset</a>. getddl is one of the tools on that list. If you are using <a href="https://labs.omniti.com/pgtreats/trunk/getddl/README">getddl</a> to extract the DDL schema and track daily changes in SVN for production databases, you should consider moving that process to pg_extractor instead. <a href="https://github.com/omniti-labs/pg_extractor">pg_extractor</a> is a more advanced and robust tool for extracting schema as well as data using pg_dump. <a href="http://omniti.com/is/keith-fiske">Keith Fiske</a>, an author of the tool, describes it in detail in <a href="http://keithf4.com/pg_extractor">his blog post</a>. Thanks to Keith for making the schema extraction tool more robust and taking it to the next level!</p>
<p>Hopefully, it will help you have more control over your database in a smarter way!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.pateldenish.com/2012/01/what-is-pg_extractor.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Faster &amp; Better VACUUM FULL</title>
		<link>http://www.pateldenish.com/2011/12/faster-better-vacuum-full.html?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=faster-better-vacuum-full</link>
		<comments>http://www.pateldenish.com/2011/12/faster-better-vacuum-full.html#comments</comments>
		<pubDate>Tue, 06 Dec 2011 17:02:00 +0000</pubDate>
		<dc:creator>Denish Patel</dc:creator>
				<category><![CDATA[postgresql]]></category>

		<guid isPermaLink="false">http://denishjpatel.wordpress.com/2011/12/06/faster-better-vacuum-full</guid>
		<description><![CDATA[In my presentation, I discussed the bloat issue in PostgreSQL in detail, along with methods to remove bloat from tables and indexes. Nowadays, PostgreSQL 9.0 is a common and widely used version in production, and it&#8217;s vital to remind &#8230; <a href="http://www.pateldenish.com/2011/12/faster-better-vacuum-full.html">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<div dir="ltr" style="text-align:left;"><p>In my <a href="http://denishjpatel.blogspot.com/2011/03/p90x-your-database-presentation-slides.html">presentation</a>, I discussed the bloat issue in PostgreSQL in detail, along with methods to remove bloat from tables and indexes. Nowadays, PostgreSQL 9.0 is a common and widely used version in production, so it&#8217;s worth a reminder about the changes to the most important bloat removal tool, &#8220;VACUUM FULL&#8221;. Before PostgreSQL 9.0, VACUUM FULL was slow, so DBAs stayed away from it and used CLUSTER instead. (Check out the <a href="http://denishjpatel.blogspot.com/2011/03/p90x-your-database-presentation-slides.html">presentation</a> for the differences between CLUSTER and VACUUM FULL.)</p>
<p>The VACUUM FULL statement recovers free space from a bloated table to reduce its size, mostly when VACUUM itself hasn&#8217;t been run frequently enough. Before PostgreSQL 9.0, it was slow because of the way it was executed: records were read and moved one by one from their source block to a block closer to the beginning of the table. Once the end of the table was emptied, this empty part was removed. This method was very inefficient: moving records one by one creates a lot of random I/O. Additionally, indexes had to be maintained during this reorganization, which made everything even more costly and fragmented the indexes. It was therefore advised to reindex a table just after a VACUUM FULL.</p>
<p>As of PostgreSQL 9.0, the VACUUM FULL statement creates a new table from the current one, copying all the records sequentially. Once all records are copied, the indexes are recreated, and the old table is destroyed and replaced. This has the advantage of being much faster. VACUUM FULL still needs an <b>ACCESS EXCLUSIVE lock</b> during the entire operation. The only drawback of this method compared to the old one is that VACUUM FULL can use as much as two times the size of the table and indexes on disk, as it is creating new versions of them.</p>
<p>Let&#8217;s compare the run time of VACUUM FULL on PostgreSQL 8.4 and PostgreSQL 9.0:</p>
<p>postgres=# create table vacuumtest(id int primary key);<br />NOTICE:&nbsp; CREATE TABLE / PRIMARY KEY will create implicit index "vacuumtest_pkey" for table "vacuumtest"<br />CREATE TABLE<br />postgres=# insert into vacuumtest select generate_series(1,10000000);<br />INSERT 0 10000000<br />postgres=# delete from vacuumtest where id%4=0;<br />DELETE 2500000<br />postgres=# vacuum vacuumtest;<br />VACUUM</p>
<p>On 8.4:<br /> postgres=# vacuum full vacuumtest ;<br />VACUUM<br />Time: 61418.197 ms<br />postgres=# reindex table vacuumtest;<br />REINDEX<br />Time: 12212.815 ms</p>
<p>On 9.0:<br /> postgres=# vacuum full vacuumtest ;<br />VACUUM<br />Time: 32640.714 ms</p>
<p>The results above show that VACUUM FULL on PostgreSQL 9.0 is considerably faster than on previous versions. Moreover, VACUUM FULL has a couple of advantages over CLUSTER: it is faster than CLUSTER because it doesn&#8217;t have to order the rows while building the new table, and you can run VACUUM FULL on tables that don&#8217;t have any index.</p>
<p>If you are running any bloat removal tool on a production database, I would recommend revisiting your vacuum parameters and tightening them up a little so that regular vacuum runs more frequently. That reduces how often you need to run the more intrusive <a href="http://denishjpatel.blogspot.com/2011/03/p90x-your-database-presentation-slides.html">bloat removal tools</a>!</p>
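<p>To put a number on the timings from the psql sessions above, here is the arithmetic as a short script. It assumes that the REINDEX time belongs to the 8.4 total, because a reindex was advised after VACUUM FULL on that version. This is a single run on one test table, so treat the ratio as indicative only.</p>

```python
# Timings in milliseconds, copied from the psql sessions above.
VACUUM_FULL_84_MS = 61418.197
REINDEX_84_MS = 12212.815
VACUUM_FULL_90_MS = 32640.714

# On 8.4 the REINDEX is counted as part of the cost (an assumption,
# based on the advice to reindex right after VACUUM FULL).
total_84_ms = VACUUM_FULL_84_MS + REINDEX_84_MS
speedup = total_84_ms / VACUUM_FULL_90_MS

print(f"8.4 total: {total_84_ms / 1000:.1f} s")  # prints 8.4 total: 73.6 s
print(f"9.0 total: {VACUUM_FULL_90_MS / 1000:.1f} s")  # prints 9.0 total: 32.6 s
print(f"speedup:   {speedup:.2f}x")  # prints speedup:   2.26x
```

<p>In this test, 9.0 finished in a bit less than half the time that 8.4 needed for VACUUM FULL plus REINDEX.</p>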
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.pateldenish.com/2011/12/faster-better-vacuum-full.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
