Monday, 31 January 2011

Performance changes in 9.1

The upcoming PostgreSQL 9.1 release is packed full of awesome new features, but let's not forget the performance improvements which will also arrive with it. Every version of PostgreSQL brings about performance improvements in various areas, and 9.1 is no different. Here's a brief overview of some, but not all, of the changes affecting performance:

Checkpoints now with less spikes

When checkpoints occur on a database, they can result in an I/O spike. 9.1 has the option to spread the checkpoint fsync calls out so that the system doesn't grind to a halt when flushing. In addition to this, there's been a change which will remove duplicate requests to the background writer, meaning even less I/O, so checkpoints are now lighter and smoother than ever. These changes haven't yet made it in, but they're in-progress.

Integer to text casts quicker

The conversion of a signed integer to text now happens in hand-coded internal code as it's faster than the previous standard function being used. This means any queries where many rows have columns being cast from an integer to text will now spend less time converting the data, resulting in faster query times.

More efficient commits

The setting commit_siblings currently requires a scan of all active commits to see whether there are enough transactions committed before enforcing the commit delay. This will now stop checking once it's counted enough active commits, or not count at all if commit_siblings is set to 0. This improvement will be most noticeable in high transaction environments.

pg_upgrade even faster

pg_upgrade (a tool to quickly upgrade between versions of PostgreSQL without dump/reload) is already fast, but for databases with a large number of tables, it will be noticeably quicker in 9.1 as relations are joined rather than looped through with a nested loop join. Previously it would have to keep checking a list of new tables against each old table, but now the lists will be in sync, so they can be checked in parallel.

Unlogged tables

In most cases, you probably need to ensure the data you place into a table is never lost, so everything in that table is written to the Write Ahead Log. But what if the table stored information which wasn't that important, such as data which expires fairly quickly, or just a place to put data that's stored elsewhere which you merely wish to create a quick report on? This will be possible with unlogged tables which are much faster to insert to, update and delete from, as there's no extra step to write to the WAL files. The data will still be in the database if you cleanly shut down, but if the database crashes or has an unexpected shutdown, the table will be empty. It will also be possible to specify that unlogged tables be skipped in backups, meaning database backups will be faster too if unlogged tables are present in the database.