“Shrink” MS SQL Server transaction log file

          A Microsoft SQL Server transaction log file growing “too large” is a common problem. We need to shrink the transaction log file manually as a one-time operation; after that, by setting up a proper recovery policy and/or keeping the mirror server available at all times, we can avoid unbounded growth of the transaction log in the future.
          Transaction log file growth depends on which of the following recovery models is used for the database:
 1. Simple
 2. Full
 3. Bulk-Logged

          If your database is in the Simple recovery model, the transaction log is truncated of inactive transactions after the checkpoint process occurs.  The checkpoint process writes all modified data pages from memory to disk.  When the checkpoint is performed, the inactive portion of the transaction log is marked as reusable.
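          For illustration, a checkpoint can also be forced manually; a one-line sketch (the effect on log reuse assumes the Simple recovery model):

                 -- write all dirty pages to disk and, under SIMPLE recovery,
                 -- mark the inactive portion of the transaction log as reusable
                 CHECKPOINT;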
          SQL Server 2005 databases are set to the Full recovery model by default.  With the Full or Bulk-Logged recovery model, inactive transactions remain in the transaction log file until after a checkpoint is processed and a transaction log backup is made.  It’s important to note that a full backup does not remove inactive transactions from the transaction log, so if the database recovery model is set to Full or Bulk-Logged, it is absolutely IMPORTANT that we take transaction log backups to go along with full backups.  The transaction log backup truncates the inactive portion of the transaction log and allows that space to be reused by future transactions.  Note that truncation does not shrink an already bloated log file; that is the operation we have to do manually one time, and if we set up a proper procedure for the future, we can avoid unnecessary growth of the log file.  Other factors affecting transaction log growth are:

  1. The mirror (replica) server falls behind the principal (master) server
  2. Long-running transactions

To solve #1:
          Database mirroring uses transaction logs for replication, in that it requires transactions to remain in the log until the records have been written to disk on the mirror server. If the mirror server falls behind the principal (master) server, the amount of active log space will grow. In this case, you might need to stop database mirroring, take a log backup that truncates the log, apply that log backup to the mirror database, and restart mirroring, as sketched below.
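          A hedged sketch of that sequence follows; the database name, backup path, and mirroring endpoints are all assumptions for illustration:

                -- on the principal: break the mirroring session
                ALTER DATABASE PRODUCTION_DB SET PARTNER OFF;
                -- take a log backup, which truncates the inactive portion of the log
                BACKUP LOG PRODUCTION_DB TO DISK = 'D:\backups\PRODUCTION_DB_log.trn';
                -- on the mirror: apply the log backup, leaving the database in a restoring state
                RESTORE LOG PRODUCTION_DB FROM DISK = 'D:\backups\PRODUCTION_DB_log.trn' WITH NORECOVERY;
                -- re-establish mirroring: first on the mirror, then on the principal
                ALTER DATABASE PRODUCTION_DB SET PARTNER = 'TCP://principal.example.com:5022';  -- run on the mirror
                ALTER DATABASE PRODUCTION_DB SET PARTNER = 'TCP://mirror.example.com:5022';     -- run on the principal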

To solve #2 (it’s a non-issue for most OLTP servers):
          If there are open transactions, DBCC OPENTRAN will provide the session_id (SPID) of the connection that has the transaction open.  You can pass this session_id to sp_who2 to determine which user has the connection open.
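          For example (the SPID below is hypothetical):

                -- report the oldest active transaction, if any, in the database
                DBCC OPENTRAN ('PRODUCTION_DB');
                -- suppose it reports SPID 52; see who owns that connection
                EXEC sp_who2 52;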

              The following queries can be used to determine the recovery model and what is currently preventing transaction log truncation:

SELECT name, recovery_model_desc  FROM sys.databases;

master SIMPLE
tempdb SIMPLE
model FULL
msdb SIMPLE
ReportServer SIMPLE
ReportServerTempDB SIMPLE
PRODUCTION_DB FULL

SELECT name, log_reuse_wait_desc FROM sys.databases;

master ACTIVE_TRANSACTION
tempdb ACTIVE_TRANSACTION
model NOTHING
msdb NOTHING
ReportServer NOTHING
ReportServerTempDB NOTHING
PRODUCTION_DB LOG_BACKUP

          The results of the above queries reveal that the PRODUCTION_DB database is set to the FULL recovery model and that its transaction log will not be truncated until a log backup occurs (LOG_BACKUP).  Make sure that you take a full backup followed by a log backup; that makes the transaction log file’s inactive blocks available for re-use and prevents the file from growing further.  Keep in mind that the backups alone will not shrink the transaction log file.
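          As a minimal sketch, the backup sequence might look like the following; the disk paths are assumptions and should point to your actual backup location:

                -- full backup first, then a log backup that truncates the inactive log
                BACKUP DATABASE PRODUCTION_DB TO DISK = 'D:\backups\PRODUCTION_DB_full.bak';
                BACKUP LOG PRODUCTION_DB TO DISK = 'D:\backups\PRODUCTION_DB_log.trn';

          To shrink the transaction log file, you can follow the steps described in the official docs, OR the following is a simple example of identifying and shrinking the log file after taking the log backup: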

                SELECT name FROM sys.database_files WHERE type_desc = 'LOG';

         Once I have the log file name, I can use the DBCC SHRINKFILE command to shrink the file. In the following command, I try to shrink my log file down to 1 GB (the target size argument is in MB):
                 DBCC SHRINKFILE ('PRODUCTION_DB_log.log', 1000);
       Run the shrink command a couple of times during off-peak hours. Also, please make sure that your databases are NOT set to auto-shrink; databases with auto-shrink enabled can run into performance problems.
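          To verify the result and check the auto-shrink setting, something like the following should work (PRODUCTION_DB is our example database):

                -- report log file size and percent used for every database
                DBCC SQLPERF(LOGSPACE);
                -- is_auto_shrink_on = 1 means auto-shrink is enabled
                SELECT name, is_auto_shrink_on FROM sys.databases;
                -- disable it if necessary
                ALTER DATABASE PRODUCTION_DB SET AUTO_SHRINK OFF;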

          Hopefully this will help the next time you come across the transaction log file “too large” issue on an MS SQL Server installation.

Faster & Better VACUUM FULL

                In my presentation, I discussed the bloat issue in PostgreSQL in detail, along with methods to remove bloat from tables/indexes. Nowadays, PostgreSQL 9.0 is the widely used version for production, and it’s vital to point out the changes in the most important bloat removal tool, “VACUUM FULL”. Until PostgreSQL 9.0, VACUUM FULL was tardy, and DBAs always stayed away from it and used CLUSTER instead. (Check out the presentation for the differences between CLUSTER and VACUUM FULL.)
                The VACUUM FULL statement recovers free space from a bloated table to reduce its size, mostly when VACUUM itself hasn’t been run frequently enough. Before PostgreSQL 9.0, it was tardy and slow because of the way it was executed: records were read and moved one by one from their source block to a block closer to the beginning of the table. Once the end of the table was emptied, that empty part was removed. This method was very inefficient: moving records one by one creates a lot of random I/O. Additionally, during this reorganization, indexes had to be maintained, making everything even more costly and fragmenting the indexes. It was therefore advised to reindex a table just after a VACUUM FULL.
                As of PostgreSQL 9.0, the VACUUM FULL statement creates a new table from the current one, copying all the records sequentially. Once all records are copied, the indexes are rebuilt, and the old table is destroyed and replaced. This has the advantage of being much faster, but VACUUM FULL still needs an exclusive lock on the table for the entire operation. The only drawback of this method compared to the old one is that VACUUM FULL can use as much as twice the size of the table and its indexes on disk, as it creates a new version of the table.
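                Since the new implementation rewrites the table, it is worth checking how much disk headroom you need beforehand. A minimal sketch ('mytable' is a placeholder; pg_total_relation_size includes the indexes):

SELECT pg_size_pretty(pg_total_relation_size('mytable'));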

Let’s compare run-time of VACUUM FULL on PostgreSQL 8.4 vs PostgreSQL 9.0

postgres=# create table vacuumtest(id int primary key);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "vacuumtest_pkey" for table "vacuumtest"
CREATE TABLE
postgres=# insert into vacuumtest select generate_series(1,10000000);
INSERT 0 10000000
postgres=# delete from vacuumtest where id%4=0;
DELETE 2500000
postgres=# vacuum vacuumtest;
VACUUM

On 8.4:
postgres=# vacuum full vacuumtest ;
VACUUM
Time: 61418.197 ms
postgres=# reindex table vacuumtest;
REINDEX
Time: 12212.815 ms

On 9.0:
postgres=# vacuum full vacuumtest ;
VACUUM
Time: 32640.714 ms

                The above results show that VACUUM FULL on PostgreSQL 9.0 is way faster than on previous versions (note that on 8.4 a separate REINDEX was also needed afterwards). Moreover, VACUUM FULL has a couple of advantages over CLUSTER: it’s faster than CLUSTER because it doesn’t have to rewrite the rows in index order, and you can run VACUUM FULL on tables that don’t have any index, as illustrated below.
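                A quick illustration of the no-index point, on a scratch database (the exact error wording may vary across versions):

postgres=# create table noindex_test(id int);
CREATE TABLE
postgres=# cluster noindex_test;
ERROR:  there is no previously clustered index for table "noindex_test"
postgres=# vacuum full noindex_test;
VACUUM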
                If you are running any bloat removal tool on a production database, I would recommend revisiting your vacuum parameters and tightening them up a little, so that regular vacuums run more frequently; this will help reduce how often you need to run the more intrusive bloat removal tools!
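                For example, per-table autovacuum settings (available since PostgreSQL 8.4) can be tightened for tables that bloat quickly; the threshold below is purely an illustrative starting point:

-- vacuum after ~5% of the table has changed instead of the default 20%
ALTER TABLE vacuumtest SET (autovacuum_vacuum_scale_factor = 0.05);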