Monday, 7 November 2011

Writable Common Table Expressions

There have previously been various posts on Writable Common Table Expressions (wCTEs) demonstrating either performance gains or simple examples of how they work, but recently, while giving some training, I wanted to come up with some new material showing when you might actually wish to use one. Note that writable common table expressions are only possible in PostgreSQL, and only from version 9.1 onwards. But since 9.1 has been out for almost 2 months now, you can go and use this feature straight away!

So in case you don't know what a writable common table expression is: in short, it's the same as a normal common table expression, except that you can also use INSERT, UPDATE and DELETE statements in it. And if you don't know what a normal common table expression is, it effectively lets you make the results of one query available to other queries in the same statement.
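As a point of reference, a plain (read-only) common table expression looks like this — the sales table here is a made-up example, not one used later in this post:

```sql
-- dept_totals exists only for the duration of this one statement.
WITH dept_totals AS (
    SELECT department, SUM(amount) AS total
    FROM sales
    GROUP BY department
)
SELECT department, total
FROM dept_totals
WHERE total > 100000;
```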

What follows is a rather basic case, but the data is kept simple so as to make it easier to follow the examples.

First, we create a company payroll table:

CREATE TABLE company_payroll (
    department text PRIMARY KEY,
    staff_count int NOT NULL,
    staff_cost numeric(16,2) NOT NULL
);

Which we'll populate with some data:

INSERT INTO company_payroll VALUES
('development', 45, 2839281.33),
('marketing', 8, 412424.45),
('management', 4, 839432.53);

Then we want an employees table:

CREATE TABLE employees (
    id serial PRIMARY KEY,
    department text REFERENCES company_payroll (department),
    salary numeric(16,2),
    consecutive_days_late int NOT NULL DEFAULT 0
);

Then stick some values in it:

INSERT INTO employees (department, salary, consecutive_days_late) VALUES
('development', 12834.23, 0),
('development', 37853.55, 6),
('marketing', 24488.23, 2),
('development', 91824.29, 14),
('marketing', 24829.23, 16),
('marketing', 24488.23, 2),
('management', 100293.11, 14),
('management', 40287.11, 18),
('development', 72843.22, 11);

It doesn't matter that the employees don't add up to the details provided in the company payroll. Just pretend there are many other rows, but no-one else has been late.

So we now have 2 tables looking like this:

test=# select * from company_payroll;
 department  | staff_count | staff_cost
-------------+-------------+------------
 development |          45 | 2839281.33
 marketing   |           8 |  412424.45
 management  |           4 |  839432.53
(3 rows)

test=# select * from employees;
 id | department  |  salary   | consecutive_days_late
----+-------------+-----------+-----------------------
  1 | development |  12834.23 |                     0
  2 | development |  37853.55 |                     6
  3 | marketing   |  24488.23 |                     2
  4 | development |  91824.29 |                    14
  5 | marketing   |  24829.23 |                    16
  6 | marketing   |  24488.23 |                     2
  7 | management  | 100293.11 |                    14
  8 | management  |  40287.11 |                    18
  9 | development |  72843.22 |                    11
(9 rows)

Okay, so we have 9 employees, but quite a few of them keep turning up late for work. Management have decided to put their foot down and fire anyone who's been late for more than 10 days in a row (and probably break a few employment laws in the process). We want to delete these employees from the employees table, then update the company payroll to reflect these changes. Then finally we want to see some before and after figures to compare the new figures against the old.

So let's present a writable common table expression to solve the problem, then we'll break it down.

WITH fired_employees AS (
    DELETE FROM employees
    WHERE consecutive_days_late > 10
    RETURNING *),
summarise_fired_staff AS (
    SELECT department, COUNT(*) AS fired_count, SUM(salary) AS salary_sum
    FROM fired_employees
    GROUP BY department),
updated_financials AS (
    UPDATE company_payroll
    SET staff_count = staff_count - fired_count,
        staff_cost = staff_cost - salary_sum
    FROM summarise_fired_staff
    WHERE company_payroll.department = summarise_fired_staff.department
    RETURNING company_payroll.*)
SELECT cp.department, cp.staff_count AS old_staff_count,
       uf.staff_count AS new_staff_count,
       cp.staff_cost AS old_staff_cost,
       uf.staff_cost AS new_staff_cost
FROM company_payroll cp
INNER JOIN updated_financials uf ON uf.department = cp.department;

So let's take this one step at a time. The first thing we're doing is deleting all employees from the employees table who have been late for more than 10 consecutive days:

DELETE FROM employees
WHERE consecutive_days_late > 10
RETURNING *

Okay, so that's simple enough. But what we've done here is put it in a WITH clause and labelled the result as fired_employees. This can now be treated like a temporary table, but one which only exists for the duration of the query. Note that we're also using the RETURNING clause of the DELETE statement to have it return all rows that were deleted by the statement. It's the output of this which is now represented by the fired_employees table expression name. We can now use this for other purposes, and we will. So fired_employees now contains:

 id | department  |  salary   | consecutive_days_late
----+-------------+-----------+-----------------------
  4 | development |  91824.29 |                    14
  5 | marketing   |  24829.23 |                    16
  7 | management  | 100293.11 |                    14
  8 | management  |  40287.11 |                    18
  9 | development |  72843.22 |                    11
(5 rows)

Next we summarise how many people we've fired from each department, and how much they cost in total per department:

SELECT department, COUNT(*) AS fired_count, SUM(salary) AS salary_sum
FROM fired_employees
GROUP BY department

Again, very straightforward. Note that we're referring to the fired_employees expression which we can treat like a regular table, and the output of this has been named summarise_fired_staff. This contains:

 department  | fired_count | salary_sum
-------------+-------------+------------
 development |           2 |  164667.51
 management  |           2 |  140580.22
 marketing   |           1 |   24829.23
(3 rows)

As you can see, we fired 2 developers worth 164667.51 in whatever currency you want to pretend this represents. We've also fired 2 from management and 1 from marketing.

Now we want to update the figures we maintain in the company payroll table with this information:

UPDATE company_payroll
SET staff_count = staff_count - fired_count,
staff_cost = staff_cost - salary_sum
FROM summarise_fired_staff
WHERE company_payroll.department = summarise_fired_staff.department
RETURNING company_payroll.*

Here we're joining company_payroll to the summarise_fired_staff expression based on the department name. The update decrements staff_count by the fired_count returned in the summarise_fired_staff result, and likewise decrements staff_cost by salary_sum. We then want to return the updated rows from this action, so we use "RETURNING company_payroll.*", and we've labelled the output of this part of the query updated_financials. As you can see, it's a bit like piping commands together in bash: the output of each step is passed on to the next. And now updated_financials contains:

 department  | staff_count | staff_cost
-------------+-------------+------------
 development |          43 | 2674613.82
 management  |           2 |  698852.31
 marketing   |           7 |  387595.22
(3 rows)

So finally we want to see a side-by-side comparison of before and after figures:

SELECT cp.department, cp.staff_count AS old_staff_count,
       uf.staff_count AS new_staff_count,
       cp.staff_cost AS old_staff_cost,
       uf.staff_cost AS new_staff_cost
FROM company_payroll cp
INNER JOIN updated_financials uf ON uf.department = cp.department

So we've joined the still yet-to-be-actually-updated company_payroll table with the updated version (which we named updated_financials) based on department name. We then select the department, followed by the original staff_count from the company_payroll table (labelling it old_staff_count for clarity), and then updated_financials' own staff_count column (labelled new_staff_count). We do the same for the staff cost.

So running all of this in one go returns the following:

 department  | old_staff_count | new_staff_count | old_staff_cost | new_staff_cost
-------------+-----------------+-----------------+----------------+----------------
 development |              45 |              43 |     2839281.33 |     2674613.82
 management  |               4 |               2 |      839432.53 |      698852.31
 marketing   |               8 |               7 |      412424.45 |      387595.22
(3 rows)

If you're curious to know what the explain plan for this query is, it's as follows:

Hash Join  (cost=38.33..39.40 rows=3 width=80) (actual time=0.256..0.258 rows=3 loops=1)
  Hash Cond: (cp.department = uf.department)
  CTE fired_employees
    ->  Delete on employees  (cost=0.00..21.75 rows=313 width=6) (actual time=0.059..0.066 rows=4 loops=1)
          ->  Seq Scan on employees  (cost=0.00..21.75 rows=313 width=6) (actual time=0.022..0.023 rows=4 loops=1)
                Filter: (consecutive_days_late > 10)
                Rows Removed by Filter: 12
  CTE summarise_fired_staff
    ->  HashAggregate  (cost=8.61..10.61 rows=200 width=48) (actual time=0.107..0.108 rows=3 loops=1)
          ->  CTE Scan on fired_employees  (cost=0.00..6.26 rows=313 width=48) (actual time=0.062..0.071 rows=4 loops=1)
  CTE updated_financials
    ->  Update on company_payroll  (cost=1.07..5.88 rows=3 width=198) (actual time=0.184..0.198 rows=3 loops=1)
          ->  Hash Join  (cost=1.07..5.88 rows=3 width=198) (actual time=0.159..0.167 rows=3 loops=1)
                Hash Cond: (summarise_fired_staff.department = company_payroll.department)
                ->  CTE Scan on summarise_fired_staff  (cost=0.00..4.00 rows=200 width=168) (actual time=0.127..0.131 rows=3 loops=1)
                ->  Hash  (cost=1.03..1.03 rows=3 width=62) (actual time=0.008..0.008 rows=3 loops=1)
                      Buckets: 1024  Batches: 1  Memory Usage: 1kB
                      ->  Seq Scan on company_payroll  (cost=0.00..1.03 rows=3 width=62) (actual time=0.002..0.003 rows=3 loops=1)
  ->  Seq Scan on company_payroll cp  (cost=0.00..1.03 rows=3 width=56) (actual time=0.029..0.030 rows=3 loops=1)
  ->  Hash  (cost=0.06..0.06 rows=3 width=56) (actual time=0.205..0.205 rows=3 loops=1)
        Buckets: 1024  Batches: 1  Memory Usage: 1kB
        ->  CTE Scan on updated_financials uf  (cost=0.00..0.06 rows=3 width=56) (actual time=0.188..0.203 rows=3 loops=1)
Total runtime: 0.514 ms
(23 rows)

You could also easily add delta (or diff) columns to show how much each figure has changed by; since you have both sets of columns to reference, it's a straightforward calculation, and you can label those columns accordingly.
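For example, the final SELECT of the big query could be extended like this (the WITH clauses stay exactly the same; only the outer query changes):

```sql
SELECT cp.department,
       cp.staff_count AS old_staff_count,
       uf.staff_count AS new_staff_count,
       cp.staff_count - uf.staff_count AS staff_count_delta,
       cp.staff_cost AS old_staff_cost,
       uf.staff_cost AS new_staff_cost,
       cp.staff_cost - uf.staff_cost AS staff_cost_delta
FROM company_payroll cp
INNER JOIN updated_financials uf ON uf.department = cp.department;
```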

So what initially appears as a complicated behemoth of a statement is actually fairly simple once you break it down. And while the example shown isn't necessarily realistic, it should hopefully demonstrate its potential applications.

And you can try out this example right away if you're running PostgreSQL 9.1. Just run the CREATE TABLE and INSERT statements above, then run the big writable common table expression query and see the results for yourself. Note that because the wCTE query actually makes changes, running it more than once will have no further effect, since the employees being fired were already removed the first time it was run.
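If you want to run the demo again from scratch, you can empty the tables first and then re-run the INSERT statements above:

```sql
-- Truncate both tables in one statement (they must go together
-- because of the foreign key) and reset the employees id sequence.
TRUNCATE company_payroll, employees RESTART IDENTITY;
```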

You can read other blog entries on wCTEs at these links:

Thursday, 1 September 2011

The first PostgreSQL Conference Europe talks!

The first round of PostgreSQL Conference Europe talks have now been selected and already it's shaping up to be a great conference if these are any to go by. You can see the list so far at

If we take a brief look at a few of the talks, you'll see that you really don't want to miss this year's line-up.

Firstly we have a talk by PostgreSQL veteran and core team member Bruce Momjian who will be "Explaining the Postgres Query Optimizer". This will enlighten attendees as to how PostgreSQL makes its decisions in response to queries which is important to understand in order to know how best to ask PostgreSQL what you want. Bruce gave a very popular MVCC talk last year, and no doubt this one will be one to watch.

Also, Michael Meskes's talk entitled "Mission impossible? Can I replace my most important databases with PostgreSQL?" won't be a brief "yes", but will elaborate on Michael's experience with over 12 years of using open source in business, and how to make migrations work. He'll also provide some real life examples of the benefits which have been realised over the years.

And you'll also get the chance to attend the "Synchronous Replication and Durability Tuning" talk by Simon Riggs, who developed the synchronous replication functionality, and Greg Smith, who wrote the book on PostgreSQL performance. No really, he did. In this talk Simon and Greg will go through how synchronous replication works in PostgreSQL 9.1, what considerations should be taken into account, and typical performance users of this feature can expect. Then they will cover how to go about improving performance for this particular set up.

We don't just have talks though, there are also training sessions available, and in addition to the ones previously announced we now also have "Streaming replication, hot standby and synchronous replication" by Redpill Linpro's own Magnus Hagander. There's also "Hands-On with PostGIS" by Vincent Picavet, a GIS specialist who's providing training on behalf of Dalibo. Go check out the complete list at

These talks shown so far are just the start. At the time of writing we have confirmed about 15 talks (probably more by the time you read this), but there will be many, many more, so keep your eye on the conference website and subscribe to the conference Twitter feed for the latest updates. And don't forget to book your training sessions in advance, as there are a limited number of places for each class.

Tuesday, 26 July 2011

Could Clang displace GCC among PostgreSQL developers? Part I: Intro and compile times

Peter Geoghegan (@sternocera) has just joined us on Planet PostgreSQL but unfortunately his post (which looks like he's put quite a bit of work into it) became too old to list on the main site's front page before his account was approved, so here's me waving my arms to give you a second chance to find it. You can read it here.

Tuesday, 19 April 2011

The lies of No2AV

So I went to the No2AV website, and clicked on the link entitled "Why vote no" to see what reasons they've cooked up. Let's take a look:

AV is costly

They argue, as they keep arguing, that AV will cost up to an additional £250 million: "Local councils would have to waste money on costly electronic vote counting machines and expensive voter education campaigns."

What about the extra £800 million they'll also have to spend on red carpets, gold-trimmed ballot papers and free meringue and strawberries served to voters at the booths? What about those? Yes, of course, there's no such cost, because they won't be used. No-one said they would need to be used. They harp on about how Australia is the only other country to use AV to vote in the lower house, but they conveniently ignore the fact that Australia doesn't even use electronic voting machines.

"With ordinary families facing tough times can we really afford to spend a quarter of a billion pounds of taxpayers' money bringing in a new voting system?"

And why are they facing tough times in the first place, eh? Because of the government that was voted in under the First Past The Post system.

"Schools and hospitals, or the Alternative Vote – that's the choice in this referendum."

You will be taking money away from schools and hospitals to fund the new electoral system? Why not the military? Why not higher taxes for the rich? And also bear in mind that this won't be paid for all in a single year.

AV is complex and unfair

There's no denying that it's more complex than FPTP, but are you seriously saying that people are too dumb to say "I want this party to win, but if not then this one, if not that one, then this one... and I don't want anyone else to win, so won't put anything." People fill out questionnaires every day with questions like that.

"The winner should be the candidate that comes first, but under AV the candidate who comes second or third can actually be elected."

That's not true for a start. Under AV the winner is the candidate who comes first with a majority of the vote, rather than simply the one with more votes than anyone else, even if that's less than 50%. The No2AV campaign use a race analogy in their campaign leaflets and broadcasts, saying that the race horse or runner that comes third wins under AV. But the analogy doesn't apply. Who are the horses supposed to represent for a start? For the analogy to work, there would have to be a finite amount of speed divided up between the horses, with the one taking the largest average share declared the winner. That's not how horse racing works. So one horse is left right at the back, slowly limping along and never making it to the finish line, yet turns up to the race every time? And the others reach the finish line but win nothing? Yes, this analogy is terrible.

At least under AV, if your candidate gets the fewest votes, your vote can be transferred to your next favourite, so it still counts. Under FPTP many people feel their votes are wasted because of the endless tactical voting between 2 parties, so people tend to vote based on who they *don't* want to win. Now that's not fair. At least under AV, people who vote for minority parties still get a say.

I saw the broadcast of the No2AV campaign, and it showed a woman teaching a class of people how it works. They seriously can't explain it any clearer? That's as clear as it gets? I've seen many examples of very simple and clear explanations, and I guess it's because of the No2AV's inability to understand it that they are against it. I'm sure I could make FPTP sound incomprehensible if I wanted.

" That’s why it is used by just three countries in the world – Fiji, Australia and Papua New Guinea."

Yes, these backward, savage and unfair countries are prime examples of where democracy has failed. These countries use this system entirely because they want the second or third person to win.

"We can't afford to let the politicians off the hook by introducing a loser's charter."

Politicians are always off the hook under the current system. It favours the same old people who have safe seats, don't really need to do much, and very very rarely gives newer candidates a chance.

AV is a politician's fix

"AV leads to more hung parliaments, backroom deals and broken promises like the Lib Dem tuition fees U-turn."

So rather than have parties compromise, there has to be one supreme ruler who makes all the decisions for everyone, even if the majority of the voters didn't vote for them. People are much less represented under FPTP. Using the tuition fees U-turn as an example is trying to make it sound like it's entirely the Lib Dem's fault. Of course, the Conservatives wouldn't have broken such a promise... because they would have never made it in the first place. They'd cut it without a second thought... and they did.

"Instead of the voters choosing the government, politicians would hold power."

It's amazing how they state something without any explanation. This statement is exactly the opposite of the case. At the moment, the politicians hold the power because you have so little to choose from, and people are under-represented in government. It's the same old party politics we've been locked in for goodness knows how long.

"Under AV, the only vote that really counts is Nick Clegg's."

There's no attempt to explain what they mean by this, or how they came to this conclusion.

"We can't afford to let the politicians decide who runs our country."

Agreed. What's your point? What's this got to do with AV? You may as well say, "We can't let the terrorists win.", "We mustn't have children's right to education removed.", "We can't afford to increase food prices ten-fold." How is this relevant to AV? Again, no explanation.

"NOtoAV is a campaign that has support from right across the country."

The BNP can make the same claim. What's your point?

So the only points they make are either left unexplained, or, where they are explained, don't make sense when applied to AV.

When the Conservatives were running for government last year, their slogan was "Vote For Change". They seem to have changed their tune.

So if I were to run a Yes2AV campaign like the No2AV campaign run theirs, I'd say the following:

- FPTP is used by all terrorists around the world.

- To continue to use FPTP will cost billions of pounds of taxpayers' money

- The people you don't want to get in will be the only ones who get in

- Many children die under FPTP

- A vote against AV is a vote for Hitler

See, no explanation needed. Facts and explanations are clearly complicated and confusing. Just use vague statements and outrageous claims and you'll be fine.

Monday, 7 February 2011

PostgreSQL @ FOSDEM 2011

I returned from Brussels with Dave Page yesterday, where I attended FOSDEM. It was my first time at this event and I had a great time. Loads of organisations were represented there, including Mozilla (promoting Firefox), LibreOffice, OpenOffice, Debian, OpenSUSE, Ubuntu, Fedora, FreeBSD, Perl, FSFE, Mandriva, Gnome, KDE, CentOS, CAcert, and many others.

And of course, PostgreSQL, where I was helping out, selling various items and talking to people about Postgres itself. All the plush blue elephants were sold (and there were quite a lot), all the stress balls sold, many t-shirts and backpacks sold, and all the pins and pens were given away too. In fact the last pin was taken by a friendly Monty Widenius (creator of MySQL and now MariaDB), who put it on then and there. Perl, who were our next-door neighbours, bought one of the plush PostgreSQL elephants and placed it atop their massive Perl camel, which we declared made it pl/Perl. And one person even bought a batch of 100 stress balls, which were placed in a PostgreSQL backpack for him to carry off. We also introduced a promotion whereby anyone who could prove they were an Oracle employee got a stress ball completely free of charge. Sadly no-one took us up on that offer.

Unfortunately I didn't make it to many talks. I only attended three: Heikki Linnakangas from EnterpriseDB talking about creating custom data types and operators in PostgreSQL, Tatsuo Ishii from SRA OSS Inc introducing pgPool II version 3, and Damien Clochard of Dalibo explaining PGXN. I had hoped to make it to at least one of the LibreOffice talks and to talk to their guys at the booth, but didn't get round to it, and there was also a talk from the folk at Sirius about open source software and whether the UK government is backing it, which I had hoped to see but was too late for. But the whole thing was a complete geek-fest, and I'd definitely want to go again.

Wednesday, 2 February 2011

FOSDEM, here we come!

FOSDEM is running this weekend in Brussels, and there are a plethora of talks to attend. Naturally I'll be very interested in the PostgreSQL-related talks, and they are as follows:

Sunday - 10am

PL/Parrot - David Fetter will be giving his talk about this new procedural language addition. I don't actually know much about this, which is why I'm especially interested.

Asynchronous Notifications for Fun and Profit - Marc Balmer's talk on asynchronous notifications in PostgreSQL will cover message brokering in distributed environments. This is, annoyingly, at the same time as David's talk.

Sunday - 11am

PostgreSQL extension's (sic) development - Dimitri Fontaine will be sharing his expertise on extension development, hopefully covering the subject of what he's been hard at work on: extension management within PostgreSQL itself.

Sunday - 12pm

Writing a user-defined type - The scarily clever Heikki Linnakangas will explain how you can create your own data type, create operators to work with it, and methods to apply indexing to them.

Sunday - 2pm

Introduction to pgpool-II version 3 - Tatsuo Ishii will be coming all the way from Kanagawa in Japan to talk about the all new version of pgPool II, and how it takes advantage of the new built-in streaming replication feature introduced in PostgreSQL 9.0.

Sunday - 3pm

Get ready for the PostgreSQL Extension Network - Damien Clochard will talk about the exciting new PGXN project (PostgreSQL Extension Network - kind of like Perl's CPAN) which David Wheeler has been developing.

Sunday - 4pm

Using MVCC for Clustered Databases - Markus Wanner's talk will be on utilising MVCC (Multi-Version Concurrency Control) in a cluster, and how Postgres-R (a multi-master replication extension to PostgreSQL) uses it to good effect.

Besides PostgreSQL-related talks, other ones I'm interested in are:
  • Sirius: Is the UK Government backing Free Software?
  • flashrom: Run your BIOS/EFI/firmware updates under any free OS
  • Happily hacking LibreOffice
  • The Document Foundation - four months in the making
... and no doubt I'll find others that'll pique my interest too. This will be my first time at FOSDEM, so I'm really looking forward to it.

Monday, 31 January 2011

Performance changes in 9.1

The upcoming PostgreSQL 9.1 release is packed full of awesome new features, but let's not forget the performance improvements which will also arrive with it. Every version of PostgreSQL brings about performance improvements in various areas, and 9.1 is no different. Here's a brief overview of some, but not all, of the changes affecting performance:

Checkpoints now with fewer spikes

When checkpoints occur on a database, they can result in an I/O spike. 9.1 has the option to spread the checkpoint fsync calls out so that the system doesn't grind to a halt when flushing. In addition to this, there's been a change which will remove duplicate requests to the background writer, meaning even less I/O, so checkpoints are now lighter and smoother than ever. These changes haven't yet made it in, but they're in progress.

Integer to text casts quicker

The conversion of a signed integer to text now happens in hand-coded internal code as it's faster than the previous standard function being used. This means any queries where many rows have columns being cast from an integer to text will now spend less time converting the data, resulting in faster query times.
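For reference, both of these forms perform the integer-to-text conversion in question:

```sql
SELECT 12345::text;          -- PostgreSQL's cast syntax
SELECT CAST(12345 AS text);  -- SQL-standard syntax
```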

More efficient commits

The setting commit_siblings currently requires a scan of all active commits to see whether there are enough transactions committed before enforcing the commit delay. This will now stop checking once it's counted enough active commits, or not count at all if commit_siblings is set to 0. This improvement will be most noticeable in high transaction environments.
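These are the two settings involved, both set in postgresql.conf; the values below are purely illustrative, not recommendations:

```
commit_delay = 10      # microseconds to wait before flushing WAL to disk
commit_siblings = 5    # minimum number of concurrent open transactions
                       # required before commit_delay is applied
```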

pg_upgrade even faster

pg_upgrade (a tool to quickly upgrade between versions of PostgreSQL without dump/reload) is already fast, but for databases with a large number of tables, it will be noticeably quicker in 9.1 as relations are joined rather than looped through with a nested loop join. Previously it would have to keep checking a list of new tables against each old table, but now the lists will be in sync, so they can be checked in parallel.

Unlogged tables

In most cases, you probably need to ensure the data you place into a table is never lost, so everything in that table is written to the Write-Ahead Log. But what if the table stores information which isn't that important, such as data which expires fairly quickly, or is just somewhere to put data that's stored elsewhere which you merely wish to create a quick report on? This will be possible with unlogged tables, which are much faster to insert into, update and delete from, as there's no extra step of writing to the WAL files. The data will still be in the database after a clean shutdown, but if the database crashes or shuts down unexpectedly, the table will be empty. It will also be possible to specify that unlogged tables be skipped in backups, meaning database backups will be faster too if unlogged tables are present in the database.
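Creating one just takes an extra keyword; the session_cache table below is a made-up example:

```sql
-- Contents are not WAL-logged; the table is truncated after a crash
-- or unclean shutdown, so only put reproducible data in it.
CREATE UNLOGGED TABLE session_cache (
    session_id text PRIMARY KEY,
    payload    text,
    expires_at timestamptz
);
```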