PostgreSQL: Do a VACUUM FULL without exclusive locks!

So, a strange title today. What’s an exclusive lock, what’s a vacuum, why can it be full and what has all this to do with PostgreSQL you might ask yourself.

How PostgreSQL deletes data

In very short: If you delete a row or dataset (also called tuple in the PostgreSQL world) in a table, this is marked as ‘please delete me from disk later whenever you have time but please before the disk is actually full’. PostgreSQL runs a vacuum in the background from time to time. This deletes old tuples from disk and frees up diskspace. This is called autovacuum.

Performance problems with autovacuum

Scanning the data on disk for tuples that are marked to-be-deleted costs IO. People using the database and doing lots of updates/inserts also require IO. If autovacuum is too agressive, the other database operations will be slow. If autovacuum is disabled/ or too slow your disk will fill up. That’s the moment you have to run a VACUUM FULL. This is a manual command that locks the tables with an exclusive lock (nobody else can write) and cleans up the tuples. Also the manual vacuum can often delete a few tuples that the autovacuum cannot.

Using pg_repack

Now comes pg_repack into play! This is a PostgreSQL extension that can cleanup dead tuples like VACUUM FULL, but without an exclusive lock! pg_repack is currently not distributed as a package from the PostgreSQL people, so you’ve to compile it yourself. Afterwards load it into the database and it’s ready to be used! Their documentation is quite good so I won’t copy and paste their installation docs: reorg.github.io/pg_repack/

This entry was posted in General, Linux. Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *

Time limit is exhausted. Please reload CAPTCHA.