Thom's memory leak
<h3>jsquery vs SQL/JSON (2019-07-25)</h3>
SQL/JSON is coming to PostgreSQL 12 and provides a native way to query JSON data (specifically, JSON data stored as the JSONB data type). It also introduces the jsonpath data type, which is used for SQL/JSON query expressions. I'll not be going into its usage in detail, or covering performance characteristics (at least not in this post), but I will compare the syntax and functionality with jsquery. Note that this may change prior to the final release.<br />
<br />
<a href="https://github.com/postgrespro/jsquery" target="_blank">jsquery</a> was introduced as a non-core extension for PostgreSQL 9.4 and higher by Teodor Sigaev, Alexander Korotkov and Oleg Bartunov. Like jsonpath, it uses its own data type, jsquery.<br />
<br />
There are several functions to which we can pass jsonpath expressions, two of which have shorthand operators (albeit without additional parameter control, as the operators exist primarily for indexing purposes):<br />
<table>
<thead>
<tr>
<th>Function</th>
<th>Operator</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>jsonb_path_exists</code></td>
<td><code>@?</code></td>
<td>Returns true if the path matches something, false if not, or null if the expression results in an operation on a missing value.</td>
</tr>
<tr>
<td><code>jsonb_path_match</code></td>
<td><code>@@</code></td>
<td>This does the same thing as jsonb_path_exists, but only the first item of the result is tested.</td>
</tr>
<tr>
<td><code>jsonb_path_query</code></td>
<td>None</td>
<td>Returns the JSON data resulting from the jsonpath expression.</td>
</tr>
<tr>
<td><code>jsonb_path_query_array</code></td>
<td>None</td>
<td>Same as jsonb_path_query, but puts the result in a JSON array.</td>
</tr>
<tr>
<td><code>jsonb_path_query_first</code></td>
<td>None</td>
<td>Same as jsonb_path_query, but only selects the first value.</td>
</tr>
</tbody></table>
<br />
These functions suppress errors caused by a missing array element or object field, an unexpected JSON type, or numeric errors.<br />
<br />
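To give a flavour of the two operator shorthands first (the values here are made up purely for illustration):<br />
<pre>-- Does any element of "a" match the filter? (shorthand for jsonb_path_exists)
SELECT '{"a": [1, 2, 3]}'::jsonb @? '$.a[*] ? (@ > 2)';   -- true

-- Does the predicate itself evaluate to true? (shorthand for jsonb_path_match)
SELECT '{"a": [1, 2, 3]}'::jsonb @@ '$.a[*] > 2';          -- true
</pre>
<br />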
Here are some examples of how jsquery and SQL/JSON differ in practice. (Note that jsquery usage requires installation of the jsquery extension):<br />
<br />
We will select all elements from an array which are equal to 1. Note that, here, jsquery returns an array containing 1, whereas jsonpath returns a scalar value of 1.<br />
<br />
<b>jsquery</b><br />
<pre>SELECT '[1,2,3]'::jsonb ~~ '#. ?($ = 1).$'::jsquery;</pre>
<br />
<b>jsonpath</b><br />
<pre>SELECT jsonb_path_query('[1,2,3]'::jsonb, '$[*] ? (@ == 1)');</pre>
<br />
<br />
Now we'll check that all elements are greater than 1 and less than 5.<br />
<br />
<b>jsquery</b><br />
<pre>SELECT '[2,3,4]' @@ '#: ($ > 1 and $ < 5)'::jsquery;</pre>
<br />
<b>jsonpath</b><br />
<pre>SELECT '[2,3,4]' @? '$[*] ? (@ > 1 && @ < 5)';</pre>
<br />
<br />
And here we have some jsonb data as follows:<br />
<pre>CREATE TABLE books (data jsonb);
INSERT INTO books (data) VALUES ('[{"author": "Charles Dickens", "book": "A Tale of Two Cities"},
{"author": "William Shakespeare", "book": "Hamlet"}]');
</pre>
<br />
We want books by William Shakespeare.<br />
<br />
<b>jsquery</b><br />
<pre>SELECT data ~~ '#. ? ($.author = "William Shakespeare")' FROM books;</pre>
<br />
<b>jsonpath</b><br />
<pre>SELECT jsonb_path_query(data,'$[*] ? (@.author == "William Shakespeare")') FROM books;</pre>
<br />
<br />
You can see that, while they share some characteristics, they are not the same. It's also possible to control whether it uses lax or strict rules which determine whether or not to throw an error if referencing a non-existing object member or a structural issue. "lax" suppresses such errors, "strict" doesn't. These are placed at the beginning of the jsonpath expression.<br />
<br />
So we end up with the following syntax for jsonpath:<br />
<br />
<pre>[lax|strict] &lt;path expression&gt; ? &lt;filter expression&gt;</pre>
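<br />
To see the difference, consider referencing a key that doesn't exist (the values here are made up; the exact error text may vary):<br />
<pre>-- lax (the default): the missing key is silently skipped
SELECT jsonb_path_query('{"a": 1}'::jsonb, 'lax $.b');     -- no rows

-- strict: the same reference raises an error
SELECT jsonb_path_query('{"a": 1}'::jsonb, 'strict $.b');
-- ERROR:  JSON object does not contain key "b"
</pre>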
<br />
I've put together some comparisons between jsquery and jsonpath expressions.<br />
<br />
<table>
<caption>Key</caption>
<tbody>
<tr>
<td class="achievable">-</td>
<td>Achievable using other jsonpath operators.</td>
</tr>
<tr>
<td class="sqlequivalent">*</td>
<td>No jsonpath equivalent, but usage available at the SQL level.</td>
</tr>
<tr>
<td class="unsupported">x</td>
<td>No equivalent.</td>
</tr>
</tbody>
</table>
<br />
<table>
<caption>Comparison of variables</caption>
<thead>
<tr>
<th>jsquery</th>
<th>jsonpath</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>$</code></td>
<td><code>$</code></td>
<td>The whole document</td>
</tr>
<tr>
<td><code>.</code></td>
<td><code>.</code></td>
<td>Accessor</td>
</tr>
<tr>
<td><code>*</code></td>
<td><code>*</code></td>
<td>All values at the current level</td>
</tr>
<tr>
<td class="unsupported">x</td>
<td><code>**</code></td>
<td>All values at all levels</td>
</tr>
<tr>
<td><code>#N</code></td>
<td><code>$[N]</code></td>
<td>Nth element of an array, starting at 0</td>
</tr>
<tr>
<td class="unsupported">x</td>
<td><code>$[start to end]</code></td>
<td>Slice of an array</td>
</tr>
<tr>
<td><code>#</code></td>
<td class="achievable">-</td>
<td>All array elements</td>
</tr>
<tr>
<td><code>%</code></td>
<td class="achievable">-</td>
<td>All object keys</td>
</tr>
</tbody>
</table>
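<br />
A few of these in action, including the jsonpath workarounds for the "achievable" entries (again, throwaway values):<br />
<pre>-- All array elements (the equivalent of jsquery's "#"):
SELECT jsonb_path_query('[1,2,3]'::jsonb, '$[*]');

-- Slice of an array (no jsquery equivalent):
SELECT jsonb_path_query('[1,2,3,4,5]'::jsonb, '$[1 to 3]');

-- Object keys (the equivalent of jsquery's "%"), via the keyvalue() method:
SELECT jsonb_path_query('{"a": 1, "b": 2}'::jsonb, '$.keyvalue().key');
</pre>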
<br />
<br />
<table>
<caption>Comparison of methods</caption>
<thead>
<tr>
<th>jsquery</th>
<th>jsonpath</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td class="unsupported">x</td>
<td><code>+</code> (unary)</td>
<td>Plus operation on a sequence</td>
</tr>
<tr>
<td class="unsupported">x</td>
<td><code>-</code> (unary)</td>
<td>Minus operation on a sequence</td>
</tr>
<tr>
<td class="unsupported">x</td>
<td><code>+</code> (binary)</td>
<td>Addition</td>
</tr>
<tr>
<td class="unsupported">x</td>
<td><code>-</code> (binary)</td>
<td>Subtraction</td>
</tr>
<tr>
<td class="unsupported">x</td>
<td><code>*</code></td>
<td>Multiplication</td>
</tr>
<tr>
<td class="unsupported">x</td>
<td><code>/</code></td>
<td>Division</td>
</tr>
<tr>
<td><code>IS &lt;type&gt;</code></td>
<td><code>type()</code></td>
<td>Checks the type (jsquery) or returns the type name (jsonpath)</td>
</tr>
<tr>
<td><code>@#</code></td>
<td><code>size()</code></td>
<td>Size (length) of an array</td>
</tr>
<tr>
<td class="unsupported">x</td>
<td><code>double()</code></td>
<td>Numeric value from string</td>
</tr>
<tr>
<td class="unsupported">x</td>
<td><code>ceiling()</code></td>
<td>Nearest integer greater than or equal to value</td>
</tr>
<tr>
<td class="unsupported">x</td>
<td><code>floor()</code></td>
<td>Nearest integer less than or equal to value</td>
</tr>
<tr>
<td class="unsupported">x</td>
<td><code>abs()</code></td>
<td>Absolute value of number</td>
</tr>
<tr>
<td class="unsupported">x</td>
<td><code>keyvalue()</code></td>
<td>Object represented as sequence of key, value and id fields</td>
</tr>
</tbody>
</table>
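<br />
Some quick examples of the jsonpath methods (values made up; note that methods can be chained):<br />
<pre>SELECT jsonb_path_query('[1,2,3]'::jsonb, '$.size()');             -- 3
SELECT jsonb_path_query('"1.5"'::jsonb, '$.double()');             -- 1.5
SELECT jsonb_path_query('-1.7'::jsonb, '$.floor().abs()');         -- 2
SELECT jsonb_path_query('[1, "a", true]'::jsonb, '$[*].type()');   -- "number", "string", "boolean"
</pre>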
<br />
<br />
<table>
<caption>Comparison of operators</caption>
<thead>
<tr>
<th>jsquery</th>
<th>jsonpath</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>=</code></td>
<td><code>==</code></td>
<td>Equality</td>
</tr>
<tr>
<td><code>&lt;</code></td>
<td><code>&lt;</code></td>
<td>Less than</td>
</tr>
<tr>
<td><code>&lt;=</code></td>
<td><code>&lt;=</code></td>
<td>Less than or equal to</td>
</tr>
<tr>
<td><code>></code></td>
<td><code>></code></td>
<td>Greater than</td>
</tr>
<tr>
<td><code>>=</code></td>
<td><code>>=</code></td>
<td>Greater than or equal to</td>
</tr>
<tr>
<td><code>@></code></td>
<td class="sqlequivalent">*</td>
<td>Contains</td>
</tr>
<tr>
<td><code>&lt;@</code></td>
<td class="sqlequivalent">*</td>
<td>Contained by</td>
</tr>
<tr><td><code>IN</code></td>
<td class="achievable">-</td>
<td>Search within a list of scalar values</td>
</tr>
<tr>
<td><code>&&</code></td>
<td class="sqlequivalent">*</td>
<td>Overlap</td>
</tr>
<tr>
<td><code>AND</code></td>
<td><code>&&</code></td>
<td>Boolean AND</td>
</tr>
<tr>
<td><code>OR</code></td>
<td><code>||</code></td>
<td>Boolean OR</td>
</tr>
<tr>
<td><code>NOT</code></td>
<td><code>!</code></td>
<td>Boolean NOT</td>
</tr>
<tr>
<td><code>=*</code></td>
<td><code>exists</code></td>
<td>Expression contains 1 or more items</td>
</tr>
<tr>
<td class="unsupported">x</td>
<td><code>starts with</code></td>
<td>Value begins with specified value</td>
</tr>
<tr>
<td class="unsupported">x</td>
<td><code>like_regex</code></td>
<td>Test string against regex pattern</td>
</tr>
</tbody>
</table>
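<br />
The two string-matching operators, which have no jsquery counterpart, look like this (example values made up):<br />
<pre>SELECT jsonb_path_query('["apple", "avocado", "banana"]'::jsonb,
                        '$[*] ? (@ starts with "a")');     -- "apple", "avocado"

SELECT jsonb_path_query('["apple", "avocado", "banana"]'::jsonb,
                        '$[*] ? (@ like_regex "an")');     -- "banana"
</pre>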
<br />
<br />
<table>
<caption>Special values</caption>
<thead>
<tr>
<th>jsquery</th>
<th>jsonpath</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>true</code></td><td><code>true</code></td>
</tr>
<tr>
<td><code>false</code></td><td><code>false</code></td>
</tr>
<tr>
<td><code>null</code></td><td><code>null</code></td>
</tr>
<tr>
<td class="unsupported">x</td><td><code>is unknown</code></td>
</tr>
</tbody>
</table>
<br />
Note that, unlike in SQL expressions, you can use the equality operator with "null", whereas in SQL you would usually have to write IS NULL. This is because JSON null is not directly equivalent to SQL NULL.<br />
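<br />
For example:<br />
<pre>SELECT jsonb_path_query('{"a": null}'::jsonb, '$.a ? (@ == null)');   -- returns null
</pre>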
<br />
<h3>
jsquery features not present in SQL/JSON</h3>
jsquery supports index hints, but these were only necessary because the optimiser has no knowledge of the contents of jsquery strings, or of statistics related to individual values. So this feature is effectively redundant when it comes to SQL/JSON.<br />
<br />
<h3>
Limitations</h3>
Only text, numeric and boolean types are supported at present. Datetime support is still a work in progress, and is intended to be added to JSONB path queries in future.<br />
<br />
For more information on jsonpath and SQL/JSON, see the PostgreSQL documentation:<br />
<br />
<a href="https://www.postgresql.org/docs/devel/functions-json.html#FUNCTIONS-SQLJSON-PATH" target="_blank">SQL/JSON Path Expressions</a><br />
<a href="https://www.postgresql.org/docs/devel/datatype-json.html#DATATYPE-JSONPATH" target="_blank">jsonpath Type</a><br />
<br />
<h3>PostgreSQL 9.6 - Part 1.1 - Horizontal Scalability revisited (2016-03-18)</h3>
In my <a href="http://thombrown.blogspot.co.uk/2016/02/postgresql-96-part-1-horizontal-scalability.html">previous blog post</a> about horizontal scalability changes in PostgreSQL 9.6, I covered pushing sorts and joins to the remote server, but Robert Haas (EnterpriseDB) has just <a target="_blank" href="http://www.postgresql.org/message-id/E1agzA4-00074P-QW@gemulon.postgresql.org">committed a change</a> that allows DML (that's UPDATEs and DELETEs) to be pushed to the remote server too. This is thanks to the work of Etsuro Fujita (NTT), with advice and testing from Rushabh Lathia (EnterpriseDB), Shigeru Hanada, Robert Haas (EnterpriseDB), Kyotaro Horiguchi (NTT), Albe Laurenz and myself (EnterpriseDB). So this calls for an appendix to my original post!
<h5>DML pushdown</h5>
The problem previously was that an update or a delete meant fetching batches of rows from the remote server, then sending an individual UPDATE or DELETE command back to the remote server for each row.<br/><br/>
So if we ran this on the local server (where remote.big_table is a remote table which resides on the remote server):
<pre>
UPDATE remote.big_table SET content = content || '.';
</pre>
The remote server would receive the following request from the local server:
<pre>
START TRANSACTION ISOLATION LEVEL REPEATABLE READ;
DECLARE c1 CURSOR FOR
SELECT id, content, ctid FROM public.big_table FOR UPDATE;
FETCH 100 FROM c1;
UPDATE public.big_table SET content = $2 WHERE ctid = $1
</pre>
So it starts a transaction, opens a cursor to iterate over the whole table, then fetches a batch of 100 rows. It then prepares an UPDATE statement and executes it once for every row the local server gets back. So if you have 1 million rows, there will be 1 million UPDATE statements.<br/><br/>
This is pretty inefficient, as not only does the local server have to keep fetching batches, and issuing individual UPDATE statements for each row, but the remote server has to keep sending off batches, and execute every one of the UPDATE statements it receives. There's also the potential effect on network traffic this would have.<br/><br/>
This will no longer be the case in 9.6. Instead, the local server will ask the remote server to do the whole UPDATE on its side:
<pre>
START TRANSACTION ISOLATION LEVEL REPEATABLE READ;
UPDATE public.big_table SET content = (content || '.'::text);
</pre>
Very straightforward activity here. The local server told the remote server to handle the whole update itself, and the remote server just does it. Obviously, there will be cases where individual UPDATEs still need to be sent, such as when joining the remote table to a local one, but this is an enormous win over the old behaviour in simple cases like this. Note that, at the moment, if an update targets a remote table and also joins to another remote table, it will push the join down, but only to fetch batches of rows to then issue individual UPDATEs for. So those cases will still behave much like the old behaviour.<br/><br/>
Also, the examples I have given are just UPDATE statements, but all of this applies to DELETE too. So again, rather than fetching a batch of tuples from the remote server and issuing an individual DELETE statement for each one, the local server will send the whole statement across. <br/><br/>
There is a downside to this new commit, which I haven't yet checked whether there's a plan to rectify. Previously, if you ran an UPDATE or DELETE statement, then cancelled it on the local side, the remote side would cancel and roll back: instead of continuing to send individual UPDATE or DELETE statements, the local server would issue an abort. But with the current changes, the abort doesn't reach the remote server, or at least not in time.
<h3>PostgreSQL 9.6 - Part 2 - Monitoring (2016-03-17)</h3>
This is part 2 in a 4-part blog series on improvements coming in PostgreSQL 9.6.
<ul>
<li>Part 1 - Horizontal Scalability
<ul><li>Join Pushdown</li><li>Sort Pushdown</li><li>"Walk and chew gum at the same time"</li></ul>
</li>
<li><b>Part 2 - Monitoring</b>
<ul><li><a href="#vacuum_progress_monitoring">VACUUM progress monitoring</a></li>
<li><a href="#wait_monitoring">Wait monitoring</a></li></ul>
</li>
<li>Part 3 - Parallelism</li>
<li>Part 4 - Vertical Scalability</li>
</ul>
<h5 id="vacuum_progress_monitoring">VACUUM progress monitoring</h5>
When you run VACUUM on a large table, do you know how much it's done after 5 minutes? Do you know how much more there is to go? The answer is no: you can't know, at least not accurately or easily. In PostgreSQL 9.6, you'll be able to monitor just this (although not VACUUM FULL), thanks to the work of Amit Langote (NTT) and Robert Haas (EnterpriseDB), which was built upon the work done by Rahila Syed (EnterpriseDB) and Vinayak Pokale (NTT). We now have a system view called pg_stat_progress_vacuum. This is, hopefully, the first of several progress-monitoring views to come. Here's an example of what it shows you (shown in extended output format so that it fits on this blog page):<br/>
<pre>
-[ RECORD 1 ]------+--------------
pid | 13612
datid | 16384
datname | test
relid | 16385
phase | scanning heap
heap_blks_total | 335141
heap_blks_scanned | 186055
heap_blks_vacuumed | 0
index_vacuum_count | 0
max_dead_tuples | 44739242
num_dead_tuples | 3849285
</pre>
The PostgreSQL docs explain what each of these columns contain: <a href="http://www.postgresql.org/docs/devel/static/progress-reporting.html#PG-STAT-PROGRESS-VACUUM">http://www.postgresql.org/docs/devel/static/progress-reporting.html#PG-STAT-PROGRESS-VACUUM</a><br/><br/>
In the example above, we can see we have 335141 heap blocks in the table being scanned. Each block is 8kB in size, so this represents a 2618MB table. Next, we see that 186055 blocks have been scanned, so 1454MB. That means we have approximately 1164MB left to scan, and we're 55.5% into the scan phase of the VACUUM. You can see that we haven't actually VACUUMed any of the table yet (heap_blks_vacuumed), and the current phase reports "scanning heap". There are 7 phases for VACUUMs that will be visible in this view:<br/>
<ul>
<li>initializing</li>
<li>scanning heap</li>
<li>vacuuming indexes</li>
<li>vacuuming heap</li>
<li>cleaning up indexes</li>
<li>truncating heap</li>
<li>performing final cleanup</li>
</ul><br/>
These are all described in more detail in the documentation: <a href="http://www.postgresql.org/docs/devel/static/progress-reporting.html#VACUUM-PHASES">http://www.postgresql.org/docs/devel/static/progress-reporting.html#VACUUM-PHASES</a><br/><br/>
What we have here is a very useful tool in monitoring VACUUM progress, which was previously not possible. DBAs used to have to wait until a VACUUM finished. Note that this new feature doesn't give us a time estimate of how much VACUUM time remains. This is because a VACUUM still may need to wait, or it may skip pages that it knows don't need VACUUMing, so a time estimate would be unreliable. You could craft a query that links pg_stat_progress_vacuum with pg_stat_activity to work out how long it's taken so far, and based on how many blocks it got through in that time, how long it would take to complete given the remaining blocks, but that would still only apply to an individual phase, and wouldn't give you a trustworthy ETA.
<br/><br/>
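As a rough sketch of that idea (untested, and only meaningful while the "scanning heap" phase is running), such a query might look like:<br/>
<pre>SELECT a.pid,
       p.phase,
       round(100.0 * p.heap_blks_scanned / p.heap_blks_total, 1) AS pct_scanned,
       now() - a.query_start AS running_for
FROM pg_stat_progress_vacuum p
JOIN pg_stat_activity a USING (pid);
</pre>
<br/>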
I did say that this was the first progress monitoring view, and that more may come in future. These may include progress on REINDEX, ANALYZE, VACUUM FULL/CLUSTER and ALTER TABLE (where it results in a table rewrite).
<h5 id="wait_monitoring">Wait monitoring</h5>
DBAs have relied on pg_stat_activity to provide information about what's currently happening on the database. This returns information about each connection in the cluster, including the queries being run, which users are connected to which database, when the connection was made, whether it's active or idle, and various other bits of information.<br/><br/>
But now it's been updated to include 2 new columns: wait_event_type and wait_event. This is thanks to the work of Amit Kapila (EnterpriseDB) and Ildus Kurbangaliev (PostgresPro). These provide introspection on a level not previously possible, and return information about what that connection is currently waiting on (if anything). This can be a huge help in debugging performance issues, or working out what's holding a query up. For example, if there's a hold-up waiting to write to the Write-Ahead Log, we'll see wait_event_type set to 'LWLockNamed' and wait_event set to 'WALWriteLock'.<br/><br/>
Here's a real example:<br/>
<pre>
-[ RECORD 81 ]---+---------------------------------------------------------------------------------------------------------------
datid | 16422
datname | pgbench
pid | 5972
usesysid | 10
usename | thom
application_name | pgbench
client_addr |
client_hostname |
client_port | -1
backend_start | 2016-03-15 19:37:47.544535+00
xact_start | 2016-03-15 19:38:06.277451+00
query_start | 2016-03-15 19:38:06.281068+00
state_change | 2016-03-15 19:38:06.28107+00
<span class="highlight"><b>wait_event_type | Lock</b></span>
<span class="highlight"><b>wait_event | tuple</b></span>
state | active
backend_xid | 176332
backend_xmin | 175895
query | UPDATE pgbench_branches SET bbalance = bbalance + 4430 WHERE bid = 77;
</pre>
As you can see, here we have wait_event_type showing 'Lock' and wait_event showing 'tuple'. So this is waiting on a tuple lock at this moment in time, which means the query won't progress until it no longer needs to wait.<br/><br/>
Here's another example:<br/>
<pre>
-[ RECORD 47 ]---+------------------------------
datid | 16384
datname | pgbench
pid | 6052
usesysid | 10
usename | thom
application_name | pgbench
client_addr |
client_hostname |
client_port | -1
backend_start | 2016-03-17 14:43:37.424887+00
xact_start | 2016-03-17 14:43:37.426224+00
query_start | 2016-03-17 14:43:37.428431+00
state_change | 2016-03-17 14:43:37.428432+00
<span class="highlight"><b>wait_event_type | LWLockNamed</b></span>
<span class="highlight"><b>wait_event | WALWriteLock</b></span>
state | active
backend_xid | 8460
backend_xmin |
query | END;
</pre>
So, given the information shown, we can determine that it's waiting on a Write-Ahead Log write lock. A full list of the wait event types and wait events is available in the documentation: <a href="http://www.postgresql.org/docs/devel/static/monitoring-stats.html#WAIT-EVENT-TABLE">http://www.postgresql.org/docs/devel/static/monitoring-stats.html#WAIT-EVENT-TABLE</a><br/><br/>
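If you want a quick overview of everything that's currently waiting, something along these lines works:<br/>
<pre>SELECT pid, datname, wait_event_type, wait_event, state, query
FROM pg_stat_activity
WHERE wait_event IS NOT NULL;
</pre>
<br/>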
This should be a huge help for cases where queries or commands get stuck waiting for something, where previously we wouldn't have known what it was waiting for.
<h3>PostgreSQL 9.6 - Part 1 - Horizontal Scalability (2016-02-19)</h3>
PostgreSQL 9.6 is shaping up to be an impressive release, so I thought I'd try to summarise some of the big features and improvements that it will bring. This certainly won't be exhaustive, but it will touch upon areas I've looked at closely.<br /><br/>
I'm looking at breaking this down into the following areas:
<br />
<ul>
<li>Part 1 - Horizontal Scalability</li>
<li>Part 2 - Monitoring</li>
<li>Part 3 - Parallelism</li>
<li>Part 4 - Vertical Scalability</li>
</ul>
So without further ado, here's Part 1 - Horizontal Scalability<br /><br />
I guess I should start with the question some might be asking: "What is 'Horizontal Scalability'?" This is just a fancy way of describing the ability to distribute workload among several clusters/servers/locations etc. So rather than putting all the burden on one server, you can delegate at least some of the work to other servers to speed things up.<br /><br />
For 9.6, I'll specifically be covering foreign tables (i.e. tables on remote servers which are being queried as if they were local tables). The changes are currently only usable for postgres_fdw (the core extension that allows connections to tables in other PostgreSQL databases), but other foreign data wrappers will be able to make similar changes in future. All the following features were the result of lots of hard work by my colleague at EnterpriseDB, Ashutosh Bapat, with help from Robert Haas.
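For reference, a postgres_fdw setup along the lines assumed in these examples might look like this (the server name, connection details and credentials here are, of course, made up):
<pre>CREATE EXTENSION postgres_fdw;

CREATE SERVER remote_server
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'remote.example.com', dbname 'contactsdb');

CREATE USER MAPPING FOR CURRENT_USER
    SERVER remote_server
    OPTIONS (user 'thom', password 'secret');

-- Pull in definitions for all tables in the remote "public" schema
CREATE SCHEMA remote;
IMPORT FOREIGN SCHEMA public FROM SERVER remote_server INTO remote;
</pre>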
<h5>Join pushdown</h5>
At the moment (pre-9.6), any queries you run which involve foreign tables require that the data from those tables be fetched in its entirety prior to any sorting or joining, and then anything you want to do with it (join, sort etc.) has to be done locally on the returned data set. As you can imagine, this is probably not what you want, particularly if you had a query like:<br />
<pre>SELECT *
FROM remote.contacts c
INNER JOIN remote.countries n ON c.country = n.country
WHERE n.continent = 'Australasia';</pre>
(Note: In these examples, the tables in the schema named "remote" are foreign tables (tables which reside on a remote server), so here, remote.contacts and remote.countries are both foreign tables. It just helps us distinguish between foreign tables and local ones.)<br/><br/>
This would fetch the whole of the foreign table named "contacts" (once the remote server has read all of its rows from disk and pushed them across the network), along with whichever rows of the table named "countries" match the WHERE clause that targets it, and then the local server would join both of the results locally.<br/><br/>
To illustrate this behaviour, we'd probably end up running these 2 queries on the remote server:<br/>
<pre>SELECT country, continent
FROM countries
WHERE continent = 'Australasia';</pre>
<pre>SELECT id, first_name, last_name, age, country
FROM contacts;</pre>
(Note: The "remote." prefix isn't shown here, because these are examples of the queries we'd actually be running on the remote server.)<br/><br/>
Now, we did get *some* pushdown here, namely the bit saying "WHERE continent = 'Australasia'". That's called a qual pushdown, and it means we didn't need to read in all of the remote "countries" table and then filter locally. Not a big deal in this case as we know we're dealing with a small table, but you can imagine how useful that is if you're fetching from a huge table that's indexed on the remote server. Without qual pushdown, the local server would have to fetch entire tables from the remote server when it only wants a single row.<br/><br/>
But that only gets us so far. We still don't want to fetch both of these data sets separately. Wouldn't it be great if we could just tell the remote server to do the join for us? It could then take advantage of any indexes it has available to optimise the join, and we wouldn't need to transfer so much data between the remote server and the local server.<br/><br/>
This is where PostgreSQL 9.6 can use its new join pushdown powers to do just that. In the above example, we would end up sending pretty much the same query we're running on the local server to the remote server, telling it to perform the join itself and give us the results. This means fewer disk reads on the remote side, less network utilisation between the remote and local servers, less work for the local server (as the remote side has already done it all, and faster than the local one could have done it), and ultimately a much shorter query time. For a join to be considered for pushdown, the foreign tables involved have to reside on the same remote server and use the same user mapping.<br/><br/>
But, I hear you ask, what if we're joining a mixture of local tables and foreign tables? What then? Surely PostgreSQL will break down in tears not knowing what to do? As it happens, PostgreSQL has a diploma in working out how to construct joins between just the foreign tables, and sending those off to the remote server separately. (The usual terms and conditions apply: they need to reside on the same remote server and use the same user mapping.)<br/><br/>
So imagine we have a local table called "contacts_notes", which contains an "id" column, a "contactid" column (which should match remote.contacts.id) and a "note" column, where we'd keep notes about the customer. We want to join its "contactid" column to the "id" column of the remote.contacts table, using the following query, which we will run locally:<br/>
<pre>SELECT *
FROM remote.contacts c
INNER JOIN remote.countries n ON c.country = n.country
INNER JOIN contacts_notes o ON c.id = o.contactid
WHERE continent = 'Australasia';
</pre>
Prior to 9.6, it would do as mentioned before, fetch the filtered result from countries, then the whole of contacts, join them locally, and then join that result with the local contacts_notes table. 9.6 will now be able to look at the query and say "Okay, so I can see that 2 of these tables being joined are both on the same remote server, so I'll rewrite the join to only join those 2 tables, send that off to the remote server, and join the result with the local table when it comes back."<br/><br/>
So here's an example of the query plan for pre-9.6:
<pre> QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=49.30..35665.45 rows=16 width=186) (actual time=12533.153..13449.441 rows=187226 loops=1)
Output: c.id, c.first_name, c.last_name, c.age, c.country, n.country, n.continent, o.id, o.contactid, o.note
Hash Cond: (o.contactid = c.id)
-> Seq Scan on public.contacts_notes o (cost=0.00..27282.66 rows=1666666 width=18) (actual time=0.015..218.781 rows=1666666 loops=1)
Output: o.id, o.contactid, o.note
-> Hash (cost=49.10..49.10 rows=16 width=168) (actual time=12533.068..12533.068 rows=563143 loops=1)
Output: c.id, c.first_name, c.last_name, c.age, c.country, n.country, n.continent
Buckets: 65536 (originally 1024) Batches: 16 (originally 1) Memory Usage: 3675kB
-> Hash Join (cost=23.74..49.10 rows=16 width=168) (actual time=1.191..12235.555 rows=563143 loops=1)
Output: c.id, c.first_name, c.last_name, c.age, c.country, n.country, n.continent
Hash Cond: (c.country = n.country)
-> <span class="highlight"><b>Foreign Scan on remote.contacts c</b> (cost=1.00..23.80 rows=640 width=104) (actual time=0.702..11184.456 rows=5000000 loops=1)</span>
Output: c.id, c.first_name, c.last_name, c.age, c.country
<span class="highlight"><b>Remote SQL</b>: SELECT id, first_name, last_name, age, country FROM public.contacts</span>
-> Hash (cost=22.68..22.68 rows=5 width=64) (actual time=0.472..0.472 rows=28 loops=1)
Output: n.country, n.continent
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> <span class="highlight"><b>Foreign Scan on remote.countries n</b> (cost=1.00..22.68 rows=5 width=64) (actual time=0.454..0.455 rows=28 loops=1)</span>
Output: n.country, n.continent
<span class="highlight"><b>Remote SQL</b>: SELECT country, continent FROM public.countries WHERE ((continent = 'Australasia'::text))</span>
Planning time: 0.393 ms
Execution time: 13458.285 ms
(22 rows)
</pre>
And now with 9.6:
<pre> QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=154.38..35770.54 rows=16 width=186) (actual time=3095.418..4016.063 rows=187226 loops=1)
Output: c.id, c.first_name, c.last_name, c.age, c.country, n.country, n.continent, o.id, o.contactid, o.note
Hash Cond: (o.contactid = c.id)
-> Seq Scan on public.contacts_notes o (cost=0.00..27282.66 rows=1666666 width=18) (actual time=0.023..221.201 rows=1666666 loops=1)
Output: o.id, o.contactid, o.note
-> Hash (cost=154.19..154.19 rows=16 width=168) (actual time=3095.318..3095.318 rows=563143 loops=1)
Output: c.id, c.first_name, c.last_name, c.age, c.country, n.country, n.continent
Buckets: 65536 (originally 1024) Batches: 16 (originally 1) Memory Usage: 3675kB
-> <span class="highlight"><b>Foreign Scan</b> (cost=100.00..154.19 rows=16 width=168) (actual time=1.962..2932.641 rows=563143 loops=1)</span>
Output: c.id, c.first_name, c.last_name, c.age, c.country, n.country, n.continent
<span class="highlight">Relations: (remote.contacts c) <b>INNER JOIN</b> (remote.countries n)</span>
<span class="highlight"><b>Remote SQL</b>: SELECT r1.id, r1.first_name, r1.last_name, r1.age, r1.country, r2.country, r2.continent FROM (public.contacts r1 <b>INNER JOIN</b> public.countries r2 ON (TRUE)) WHERE ((r1.country = r2.country)) AND ((r2.continent = 'Australasia'::text))</span>
Planning time: 0.400 ms
Execution time: 4024.687 ms
(14 rows)
</pre>
This isn't the whole story though, as it's not always possible to push a join down. Consider if we had written the query like:<br/>
<pre>SELECT *
FROM remote.contacts c
INNER JOIN contacts_notes o ON c.id = o.contactid
WHERE EXISTS (SELECT 1 FROM remote.countries n WHERE n.country = c.country AND n.continent = 'Australasia');
</pre>
In other words, a semi-join. This can't currently be pushed down. Neither can an anti-join, which is the same thing, except we use NOT EXISTS instead. No doubt those will be pushed down in future, but not in 9.6.
<h5>Sort pushdown</h5>
Now this is all well and good, but what if we are instead sorting foreign table data rather than joining it. Prior to 9.6, we'd again have to fetch all the rows back from the remote server, then sort the data locally before being able to return the final data set.<br/><br/>
PostgreSQL 9.6 will tell the remote server to do the sorting itself, which can be a huge win, because the remote server can have indexes on sorted columns. To demonstrate this, we'll use the same tables mentioned in our previous example (except I should also mention that we have an index on contacts.country).<br/><br/>
Let's run the following query:<br/>
<pre>SELECT c.first_name, c.last_name, c.country, n.continent
FROM remote.contacts c
JOIN remote.countries n ON n.country = c.country
ORDER BY c.country;</pre>
Now let's look at the query plan with EXPLAIN (ANALYSE, VERBOSE), so that we can see what it did:<br/>
<pre> QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
Merge Join (cost=200.00..324.85 rows=3171 width=128) (actual time=14239.133..25005.195 rows=5000000 loops=1)
Output: c.first_name, c.last_name, c.country, n.continent
Merge Cond: (c.country = n.country)
-> <span class="highlight"><b>Foreign Scan</b> on remote.contacts c (cost=100.00..133.82 rows=682 width=96) (actual time=14238.741..22807.627 rows=5000000 loops=1)</span>
Output: c.id, c.first_name, c.last_name, c.age, c.country
<span class="highlight"><b>Remote SQL</b>: SELECT first_name, last_name, country FROM public.contacts <b>ORDER BY country ASC</b></span>
-> Materialize (cost=100.00..144.09 rows=930 width=64) (actual time=0.386..331.147 rows=4979830 loops=1)
Output: n.country, n.continent
-> <span class="highlight"><b>Foreign Scan</b> on remote.countries n (cost=100.00..141.76 rows=930 width=64) (actual time=0.384..0.712 rows=249 loops=1)</span>
Output: n.country, n.continent
<span class="highlight"><b>Remote SQL</b>: SELECT country, continent FROM public.countries <b>ORDER BY country ASC</b></span>
Planning time: 0.319 ms
Execution time: 25247.358 ms
(13 rows)</pre>
As you can see, the query plan shows that the ORDER BY was pushed down for both foreign scans.<br/><br/>
We can also get an ORDER BY added to the remote queries when performing merge joins, without explicitly specifying ORDER BY in our own query; PostgreSQL will add it if it deems it optimal. Note that you can give PostgreSQL more information about the foreign tables by telling it to fetch planner information from the remote server prior to planning its own query. This can be done by adding the use_remote_estimate option to the server definition:
<pre>ALTER SERVER remote OPTIONS (ADD use_remote_estimate 'true');</pre>
Alternatively, you could set it on the individual foreign tables:
<pre>ALTER FOREIGN TABLE remote.contacts OPTIONS (ADD use_remote_estimate 'true');
ALTER FOREIGN TABLE remote.countries OPTIONS (ADD use_remote_estimate 'true');
</pre>
This isn't required, and it adds a little planning overhead, but it can be a big benefit for particularly complex queries.
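If the extra planning round-trips turn out not to pay off, the same generic-options syntax can be used to change or remove the setting again (a small sketch, reusing the server and table names from above):

```sql
-- Switch remote estimates off for the whole server definition...
ALTER SERVER remote OPTIONS (SET use_remote_estimate 'false');

-- ...or drop a per-table setting, so the table reverts to
-- whatever the server definition specifies.
ALTER FOREIGN TABLE remote.contacts OPTIONS (DROP use_remote_estimate);
```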
<h5>"Walk and chew gum at the same time"</h5>
At the time of writing this, a query can have either its joins pushed down or its sorts pushed down, but not both at once. Robert Haas described this to me as "We can walk, and we can chew gum; we just can't do them at the same time at the moment."<br/><br/>
But Ashutosh Bapat has come to the rescue, and submitted a patch to do just that. To demonstrate the effect of this change, we'll take the same example query we used previously:
<pre>SELECT c.first_name, c.last_name, c.country, n.continent
FROM remote.contacts c
JOIN remote.countries n ON n.country = c.country
ORDER BY c.country;</pre>
This could then generate the following plan:<br/>
<pre> QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
<span class="highlight"><b>Foreign Scan</b> (cost=852221.63..1039720.04 rows=5000057 width=32) (actual time=11699.639..25104.462 rows=5000000 loops=1)</span>
Output: c.first_name, c.last_name, c.country, n.continent
Relations: (remote.contacts c) INNER JOIN (remote.countries n)
<span class="highlight"><b>Remote SQL</b>: SELECT r1.first_name, r1.last_name, r1.country, r2.continent FROM (public.contacts r1 <b>INNER JOIN</b> public.countries r2 ON (TRUE)) WHERE ((r1.country = r2.country)) <b>ORDER BY r2.country ASC</b></span>
Planning time: 4.220 ms
Execution time: 25326.628 ms
(6 rows)</pre>
As you can see, all the joining and sorting is done on the remote server side.<br/><br/>
Join + sort pushdown can also be of use in cases where, say, 2 foreign tables need to be joined to a local table, and you'd want the joined results returned pre-sorted to optimise the join to the local table. We should see that in 9.6 too.
Here's a preview of what that would look like, using a query that joins 2 foreign tables and then joins the result to a local table:<br/>
<pre>SELECT c.country, n.continent
FROM remote.contacts c
JOIN remote.countries n ON n.country = c.country
JOIN contacts_notes o on c.id = o.contactid;</pre>
This gives us the following query plan:
<pre> QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
Merge Join (cost=836225.89..1057925.20 rows=5000063 width=64)
Output: c.country, n.continent
Merge Cond: (o.contactid = c.id)
-> Index Only Scan using contacts_notes_contactid_idx on public.contacts_notes o (cost=0.43..30030.72 rows=1666666 width=4)
Output: o.contactid
-> Materialize (cost=836225.46..961227.03 rows=5000063 width=23)
Output: c.country, c.id, n.continent
-> <span class="highlight"><b>Foreign Scan</b> (cost=836225.46..948726.87 rows=5000063 width=23)</span>
Output: c.country, c.id, n.continent
<span class="highlight">Relations: (remote.contacts c) <b>INNER JOIN</b> (remote.countries n)</span>
<span class="highlight"><b>Remote SQL</b>: SELECT r1.country, r1.id, r2.continent FROM (public.contacts r1 <b>INNER JOIN</b> public.countries r2 ON (TRUE)) WHERE ((r1.country = r2.country)) <b>ORDER BY r1.id ASC</b></span>
(11 rows)
</pre>
You can see that the remote query has pushed down the join we explicitly requested, but has also optimised the query further with a sort pushdown, even though our query didn't ask for any ordering.<br /><br />
There will be a lot more pushdown capability coming in future, including, but not limited to, aggregate and limit pushdown.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3748796162921493670.post-62541192191330771902014-12-09T12:53:00.000+00:002014-12-09T12:53:04.472+00:00This is a blog testThis is just a blog test. Please ignore.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3748796162921493670.post-16753288449902168652014-03-02T10:32:00.000+00:002014-03-06T19:50:38.451+00:00Students, we want you involved in Google Summer of Code<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhFg1mWaN9p1WzK_3Fe7iGtM2Wxksw-_DGGUGydPBaABcW-h7vnYDMQ2DpupStITzadLbV9KwviUNNrNspMQxDnYXNj1FFUVE6FwZ90pbqKm-1RKGP01zyj4BSPwqv9b6cHfEgi9uTbyf_N/s1600/GSOC+2014+logo.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhFg1mWaN9p1WzK_3Fe7iGtM2Wxksw-_DGGUGydPBaABcW-h7vnYDMQ2DpupStITzadLbV9KwviUNNrNspMQxDnYXNj1FFUVE6FwZ90pbqKm-1RKGP01zyj4BSPwqv9b6cHfEgi9uTbyf_N/s1600/GSOC+2014+logo.png" /></a></div>
<p>The PostgreSQL Project is yet again participating in Google Summer of Code for 2014. Google will be funding students who take part at $5,500 USD per student (see <a href="http://www.google-melange.com/gsoc/document/show/gsoc_program/google/gsoc2014/help_page">GSoC FAQ</a> for more details). We would like to hear from students who would be willing to work on projects to add new or enhance existing features. You won't be going it alone, we'll assign experienced community developers to mentor you throughout your project.</p>
<p>Have a look at the <a href="https://wiki.postgresql.org/wiki/Todo">TODO</a> list on the wiki for inspiration, or the <a href="https://wiki.postgresql.org/wiki/GSoC_2014">list of project ideas</a> submitted so far by the community, although these are by no means what projects are limited to. Whether you've spotted something you think you could work on, or have a new idea to propose, submit it to the pgsql-students mailing list for discussion (see the <a href="http://www.postgresql.org/list/">mailing list page</a> to subscribe) but do it soon as there's less than 3 weeks to register final proposals.</p>
<p>Also, if you know of any students who may be interested in participating, please send them our way.</p>
<p>
More information can be found on the <a href="http://www.postgresql.org/developer/summerofcode/">PostgreSQL GSoC page</a>
</p>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3748796162921493670.post-54809041107957986332014-02-04T15:26:00.002+00:002014-02-04T15:33:29.019+00:00Why won't you take my money?<i>(I've only just realised that this has been sitting on my account as a draft entry for a long time, so I'm publishing it despite some details perhaps being slightly out of date)</i><br /><br />
Why do companies make it so hard to hand my money over to them? It's like they're running away from me and throwing obstacles in my way to stop me reaching them.<br /><br />
Perhaps I should elucidate a little. What I'm talking about is buying music and films. Ideally I would see something I like, and buy it. However, whilst I can technically do this, it's not practical in most cases. Here's an example:
<br /><br />
LoveFilm like to make a big deal out of being able to stream films to your PC. "Great", I thought, "I have a PC, and I want to watch films, so I'll just plug it into my TV and give it a go." Just one problem; even though it says you can watch it on your PC, it took a bit more research to discover that what they actually meant was "A PC that can install and use Microsoft Silverlight." Now that's not what they say on their streaming page. They seem to omit that key detail. You have to dig around in the help section for the real requirements. Now I don't have Windows on any of my PCs. I either have Debian or Ubuntu, and Silverlight is not available on these platforms. You may be thinking "Ah, but Linux has the Moonlight project which is the Linux equivalent of Silverlight." That it is, but completely unusable in this case because Microsoft won't license their PlayReady DRM system to the Moonlight project, so that option's dead. Also my work laptop is a Macbook Pro, but this is for work, not for installing entertainment software on, so that's not an option. Basically DRM is the real reason I can't watch it. Because they don't want me to copy the film that I paid to watch, they insist I must have software installed that prevents me from doing so. "It's a shame Netflix isn't available in the UK then." Well no, not really. They have exactly the same barriers... Silverlight + DRM.<br /><br />
Okay, so let's just buy the film outright rather than streaming it using a rental service. Where can I do that? Well there don't seem to be many places that let you do that in the UK. I found something called BlinkBox, but apparently they use Windows Media DRM to "protect" the films, which again, can't be played on Linux. There is apparently a limited set of films which can be "streamed" to Linux through Adobe Flash, but that's about it.<br /><br />
So if you don't have Windows or OSX, and don't have Silverlight in most cases, you're stuck. And even if you do meet those requirements, you're not free to transfer them to other devices. Essentially, they won't take your money.<br />
Now when it comes to music, things are a bit brighter. One of the biggest online shops, Amazon, allows you to buy single tracks or entire albums in MP3 format. Nearly every variation of computer and multimedia device can play MP3s. However, apparently Amazon now "watermark" many tracks with identifying information, so if they are copied by various other people, you can be identified as the origin of the copy. If you happen to have a large-capacity music device that gets lost or stolen, you'd better hope that those who end up with your device don't decide to upload all your tracks. You'd be liable and accused of enabling piracy. In any case, why would someone choose MP3 over buying the CD? The CD is higher quality, can be ripped without any watermarking, and encoded into any format you wish, including convenient lossless ones such as FLAC. Well surely they're more expensive to buy? Let's compare buying Avenged Sevenfold's self-titled album in MP3 format and CD format. On Amazon.co.uk, at the time of writing this, the MP3 album is £4.99. Bargain! Not bad at all. So surely the CD will be more expensive? Well no, it's not. In fact it's £4.49. What's the incentive to buy the lower-quality MP3 version with possible watermarking, over the cheaper and higher-quality CD version, with free delivery, cover art, rippable into any format you wish without loss in quality, and you still have the CD as backup if you lose it? I guess the only advantage is convenience, if you can't wait 24 hours for it to arrive.<br /><br />
I like my audio to be in FLAC format. It doesn't take up huge amounts of space, the sound is identical to the original, and plays on my media box, my music player (iAudio 7), and my laptop. Unfortunately attempting to find a place that provides a large library of music in this format has been a fruitless endeavour. There was one place that met this demand. In fact not only could you choose FLAC format, but many other formats too, such as MP3, OGG, AAC, WAV, WMA, Musepack, Monkey's Audio and OptimFROG. You could even choose the bitrate! Isn't that great?! You get your music in the format of your choosing rather than just the one format, and you didn't need any special software to download the tracks either, unlike Amazon's. So why am I referring to this place in the past-tense? It's because they were sued by the RIAA on behalf of EMI, Sony BMG, Universal Music Group and Warner Music Group and as a result had pressure from the Russian government. They were claiming $1.65 trillion in damages from 11 million songs. That's $150,000 per track. Let's leave the absolute insanity of the claimed damages because it just goes without saying. AllofMP3 eventually closed, even though there never did appear to be any real case against them. Russian law allowed them to license the music in the way they did, apparently. But even if they were completely illegal, what they did was give customers what they wanted. They made a healthy profit, and were very popular. So surely other companies would want to do likewise? No. It appears not. 
Rather than have you conveniently buy a track without watermarking or DRM in a format of your choice and with almost no overhead cost for them, they'd rather have you buy the physical product which needed to be manufactured individually, have printed artwork, put into a protective case, shipped to a distributor, kept in stock in a warehouse, eventually shipped to a seller, kept in their warehouse, then have staff paid to package it up and pay a delivery company to deliver it to you in person... for less.<br /><br />
In fact this is worse when it comes to buying eBooks on Amazon's site. Not only are they DRM-protected, but they tend to cost more than the physical printed book. The convenience they provide is that clicking on the Buy It Now button on the book's page will allow it to almost instantly appear on your device. But note how you have to get the book from Amazon, and no-one else. You can't go to another eBook shop and buy a book in an alternative format, such as the highly-popular ePub format. And the books we do buy from Amazon can never be put on a non-Amazon device. So again, why would we pay extra for the additional limitations? It's this kind of total control that makes Apple's "ecosystem" so awful and exploitative.<br /><br />
Now there is an alternative... you can get the film, book, music track you want, and have it almost instantly. You can play it or read it on any device you like, and there's no DRM or secret identifying information in the files. The DVD or Bluray discs won't have endless unskippable warnings about how you shouldn't copy films, or loads of trailers for films you don't ever want to see. What is this wonderful source that gives people exactly what they're so willing to pay for? Well you've probably guessed it... piracy. Yes, you don't even have to pay for it. These companies see piracy as a terrible plague that's destroying their world, yet for some reason they refuse to compete with it. I've got money to give them, but instead they say I'm not allowed to watch their films because I don't have the necessary handcuffs.<br /><br />
I previously bought the complete Sherlock Holmes DVD box set, and at the beginning of *each disc* was an unskippable anti-piracy video with Matrix-style music, giving warnings about how downloading films illegally is extremely bad and makes you a criminal. Not exactly setting me up to be immersed in 19th century London. Ironically, the only people who are punished by seeing these warnings are those who obtained the episodes by legal means, since those who ripped them and made them available for download remove the intrusive and annoying 30-second atmosphere-destroying clip. I had paid for these legitimately, but because of that, I'm being warned that I mustn't download illegal copies.<br /><br />
I'll end on an experience I had with a computer game, which I have posted elsewhere before:<br /><br />
Years ago I bought Sid Meier's Pirates! for Windows (back when I was using Windows). I was really looking forward to it because I had previously played Pirates Gold! on the Amiga CD32 and loved it. I wanted this game so much that I had pre-ordered it a couple months in advance for the more expensive special edition (the same game but with various extra things like a map poster, extra media etc.). When it finally arrived, I put the disc in the drive and then proceeded to run the installer. Then some kind of pre-installation checker was running (something like SecuROM... it could have even actually been SecuROM). It failed the check. I was certain I didn't have a dodgy copy as it was bought from a reputable source referenced by the company that made the game and had all the official packaging and goodies. So I tried it again... failed. I rebooted... failed. I checked that I met their system requirement. I easily exceeded every requirement. Because it was so new, searches for this problem yielded no results. It wasn't until a week later (without being able to install it) that others were reporting the same problem. Many people couldn't install the game they had bought through legitimate means for their legitimate copy. Eventually it was revealed that many drives couldn't read the disks, and these were a wide range of drive models. Atari mentioned that it was related to the speed at which the drive spun the disc, so there was a tool suggested that would artificially slow down the drive. This still didn't work for me. It turns out that the copy-protection system they used had intentional errors on the disc that the software was checking for. Theoretically, copying the disc would either automatically correct the errors, or the copy would fail to complete. Many drives clearly weren't compatible with this error-checking process.<br /><br />
... I eventually downloaded a pirated copy (yes, a pirated copy of Pirates!) and it installed no problem. No DRM, no checks, no installation issues. In fact it was better than the version of the game I got even after installation as it no longer needed the disc in the drive. Plus I could keep the pirated installer backed up... but not my legitimate copy.<br /><br />
The lesson: DRM, anti-piracy systems and intrusive unskippable warnings don't affect pirates, only paying customers. These industries wage a constant war on piracy, but they're the ones who encourage it by punishing those who don't pirate.Unknownnoreply@blogger.com1tag:blogger.com,1999:blog-3748796162921493670.post-79023962754900132562013-06-04T19:07:00.002+01:002013-06-04T21:33:23.451+01:001,001 chained streaming-only replication instances<p>On my laptop I've managed to create 1,001 local chained (one-to-one) streaming-only (meaning no archive directory) asynchronous replication instances. The output of the status of the list is here: <a href="https://gist.github.com/darkixion/5694200">https://gist.github.com/darkixion/5694200</a></p>
<p>I also tested the promotion of the 1st standby to see if it would cope with propagation to the final 1,000th standby, and it worked flawlessly. This didn't work on my copy of Linux Mint without some adjustments to the kernel semaphore values, and it does take a while for all the standbys in the chain to reach full recovery. However, promotion propagation is very fast.</p>
<p>Try it for yourself (if you have enough RAM that is). You may find it quicker to use my <a href="http://darkixion.github.io/pg_rep_test/">pg_rep_test tool</a>. Just don't do this manually... it'll take far too long.</p>
<p>Thanks to Heikki for putting in the changes that made this archiveless cascading replication possible. :)</p>
<h3>Update: some figures</h3>
<p>So looking at the logs, it's clear why it takes so long for all 1,000 standbys to come online: each standby tries to connect to its replication host every 5 seconds, so the delay between a host coming online and its standby coming online is up to 5 seconds. Across the whole chain, this potentially amounts to 5,000 seconds (about 83 mins) to ensure they're all online and receiving a streaming replication connection. A test of this shows it taking 46 minutes 25 seconds.</p>
<p>And as requested by Jonathan Katz (<a href="https://twitter.com/jkatz05">@jkatz05</a>) I can tell you that the amount of time it takes for the promotion of the 1st standby to cause the 1,000th standby to switch to the new timeline (at least on my laptop with an SSD) is 1 minute 46 seconds, so a rate of 9.266 promoted instances per second. And as for actual data changes (in the case of my test, the creation of a table), it took about 6 seconds to reach the 1,000th standby. Re-tested with an insert of a row, and it's about the same again.</p>Unknownnoreply@blogger.com4tag:blogger.com,1999:blog-3748796162921493670.post-10513648847727479592013-05-27T23:32:00.001+01:002013-05-28T00:21:19.278+01:00Accepted Google Summer of Code projects 2013This year's accepted Google Summer of Code projects have been <a href="http://www.google-melange.com/gsoc/projects/list/google/gsoc2013">published</a>. Among them are 3 accepted proposals for PostgreSQL which not only will be fine addition to PostgreSQL's set of features, but also helps introduce new students to our community for hopefully long-term contributions.<br /><br />
The projects that will be worked on are:<br /><br />
<h3>
Cube extension improvement (Stas Kelvich)</h3>
Indexes on the cube data type (in the "cube" extension) tend to be on the large side, meaning maintaining and accessing these indexes is expensive. The improvements this project aims to implement are in reducing the cost of indexes on cube data by using r-tree structures. In addition to this, PostgreSQL's relatively new K-Nearest Neighbour framework would serve to allow the creation of ordering operators for retrieving sorted data directly from the index, and ordering operators for kNN with different spatial norms.<br /><br />
<h3>
UPDATE ... RETURNING OLD (Karol Trzcionka)</h3>
PostgreSQL can perform UPDATE statements and return the new row by using the RETURNING clause and referencing the columns you want. This project would introduce NEW and OLD aliases to provide the ability to reference not just the new row but also the old. This would allow for a before/after comparison of data.<br /><br />
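Based on the project description, the proposed syntax would presumably look something like this (a hypothetical example with made-up table and column names; this is not valid in any released PostgreSQL at the time of writing):

```sql
-- Hypothetical use of the proposed OLD/NEW aliases: return the
-- balance both before and after the update in a single statement.
UPDATE accounts
   SET balance = balance - 100.00
 WHERE id = 42
RETURNING OLD.balance AS balance_before, NEW.balance AS balance_after;
```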
<h3>
Efficient KNN search through high-dimensional indexing with iDistance (Mike Schuh)</h3>
This will introduce a new indexing algorithm that utilises a high-dimensional space leading to more efficient K-nearest neighbour searches. Such indexes have an advantage over b-tree and r-tree indexes, which degrade with even a modest increase in dimensions, whereas the iDistance algorithm has been demonstrated to remain well-performing and efficient.<br /><br />
Of course our students won't be left to work in isolation; they will also receive guidance from established community members specifically assigned to mentor them. We welcome Stas, Karol and Mike to the community, and hope not only that they are successful in their projects, but that they continue to contribute beyond this year's Google Summer of Code. Also thanks to Alexander Korotkov, David Fetter and Stephen Frost who will mentor this year's students. It's worth noting that Alexander was actually a GSoC student last year whose work on indexing on ranges <a href="http://www.postgresql.org/message-id/E1T1yXn-0002we-KU@gemulon.postgresql.org">made it into the upcoming 9.3 release</a>.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3748796162921493670.post-62734037972865535082012-04-02T21:02:00.020+01:002012-04-04T09:44:46.314+01:00Deprecated features in PostgreSQL - Past to present
Of course our students won't be left to work in isolation; they will also receive guidance from established community members specifically assigned to mentor them. We welcome Stas, Karol and Mike to the community, and hope not only that they are successful in their projects, but that they continue to contribute beyond this year's Google Summer of Code. Also thanks to Alexander Korotkov, David Fetter and Stephen Frost who will mentor this year's students. It's worth noting that Alexander was actually a GSoC student last year whose work on indexing on ranges <a href="http://www.postgresql.org/message-id/E1T1yXn-0002we-KU@gemulon.postgresql.org">made it into the upcoming 9.3 release</a>.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3748796162921493670.post-62734037972865535082012-04-02T21:02:00.020+01:002012-04-04T09:44:46.314+01:00Deprecated features in PostgreSQL - Past to present<style type="text/css"> /*<![CDATA[*/ h4 { margin-top: 30px; } h5 { font-size: 1.2em; color: darkblue; margin: 20px 0 10px; } table { border-spacing: 0; border-collapse: collpase; } table, th, td { border: 1px solid gray; } th { background-color: #333333; color: #8CE1F5; } td.version { text-align: center; } /*]]>*/<br /></style><br /><p> If you have been using PostgreSQL for a long time, or you’re relatively new but have been following old instructions about how to use it, it’s possible that you’re using features that have been deprecated. The reason features disappear tend to be because they have been superseded by better features which cover the same functionality. It’s important to try to avoid using features which are destined to disappear if there’s a newer alternative. Also when planning an upgrade, it’s useful to know if a feature you’re using will suddenly break in the new version. Some of these features still continue to work, but have since been removed from documentation because they’re to be removed in a future release. 
Others have just been removed completely.</p><p>We’ll start off with the absurdly old and work our way to the present:</p><h4>Version 6.2</h4><h5>timetravel contrib module</h5><p> <strong>Does it still work?:</strong> No. This is ooooold and was last supported back in PostgreSQL 6.1.</p><p> <strong>What’s wrong with it?:</strong> This really dragged performance down and took up a huge amount of storage space. But the concept was pretty cool... being able to query data as it was at another time.</p><p> <strong>What to use instead:</strong> You can use triggers to implement a similar mechanism.<br /></p><h4>Version 6.4</h4><h5>char2/char4/char8/char16 data types</h5><p> <strong>Does it still work?:</strong> No. These were removed way back in PostgreSQL 6.4. In fact I shouldn’t bother mentioning these, but you never know... someone *could* still be using them somewhere.</p><p> <strong>What’s wrong with it?:</strong> Not in the SQL standard and they’re no faster than using the ubiquitous char(n).</p><p> <strong>What to use instead:</strong> char(n)<br /></p><h4>Version 7.0</h4><h5>abstime data type</h5><p> <strong>Does it still work?:</strong> This will still work, but it’s no longer documented as of PostgreSQL 7.0 and only intended to be used internally. Despite its name, it supports both date and time.</p><p> <strong>What’s wrong with it?:</strong> The range this data type provides is limited: ‘1901-12-14’ to ‘2038-01-19’. It also only has a resolution down to the second. Its behaviour is unfortunately like that of MySQL’s, in that if you insert an invalid value, it won’t fire an error. Instead you’ll just see ‘invalid’ as the value when you go to query it.</p><p> <strong>What to use instead:</strong> Since abstime supports timezone, the better alternative is using timestamp with time zone (timestamptz). 
It takes up more space (8 bytes instead of 4), but it has a far wider range: ‘4713 BC’ to ‘294276 AD’ and supports microsecond resolution.<br /></p><h5>reltime data type</h5><p> <strong>Does it still work?:</strong> Yes, still works, but again, no longer documented as of PostgreSQL 7.0 and for internal use only.</p><p> <strong>What’s wrong with it?:</strong> This stores a date/time offset but only +/- 68 years. Again, this doesn’t error with values higher than this limit. Does it put ‘invalid’ in the column like abstime? No. Instead the value wraps around, so entering +70 years would result in a value of around -66 years. Not what you want. It also has a resolution down to the second.</p><p> <strong>What to use instead:</strong> The SQL standard equivalent of this kind of data type is interval, which PostgreSQL has. This does take up more space (12 bytes), but its range is absolutely huge: ‘-178000000 years’ to ‘+178000000 years’. This also has microsecond resolution. Interval can also handle relative time units; for example, adding a month to 15th February using interval will give you 15th March, but reltime has a fixed notion of a month being 30 days, so will give you 17th March (on a non-leap year). A year is also considered to be 360 days.<br /></p><h5>timespan data type</h5><p> <strong>Does it still work?:</strong> No. This was deprecated back in PostgreSQL 7.0, and totally removed in PostgreSQL 7.3. If for any reason you’re using this, you’re *definitely* overdue an upgrade, and have been for many years.</p><p> <strong>What’s wrong with it?:</strong> It’s not in the SQL standard and was really just an alias for interval.</p><p> <strong>What to use instead:</strong> Just use interval.</p><h5>psql/pg_dump’s -u option</h5><p> <strong>Does it still work?:</strong> No, this was deprecated as far back as PostgreSQL 7.0 and removed in 8.3.
You should definitely not be using this.</p><p> <strong>What’s wrong with it?:</strong> This option forced psql and pg_dump to prompt for the username and password before connecting to the database. Since prompting for a username is always optional, but prompting for a password may or may not be required (depending on authentication method), it didn’t make sense to glue both of these together.</p><p> <strong>What to use instead:</strong> It has been replaced by the -U option to specify the username, and the -W option to prompt for the password.<br /></p><h4>Version 7.1</h4><h5>getpgusername() function</h5><p> <strong>Does it still work?:</strong> Yes, but it’s effectively deprecated as of PostgreSQL 7.1!</p><p> <strong>What’s wrong with it?:</strong> It’s no longer documented, and could be removed in a future release since it’s obsolete.</p><p> <strong>What to use instead:</strong> Call current_user instead, since getpgusername() is now just an alias for that.<br /></p><h4>Version 8.1</h4><h5>autovacuum contrib module</h5><p> <strong>Does it still work?:</strong> No, as it was moved into core since PostgreSQL 8.1.</p><p> <strong>What’s wrong with it?:</strong> Nothing. Quite the opposite. It was considered so essential that it became part of the main codebase.</p><p> <strong>What to use instead:</strong> Nothing to worry about. 
Since it’s now in core, you get it out of the box without having to explicitly include it.<br /></p><h4>Version 8.2</h4><h5>mSQL-interface and tips contrib modules</h5><p> <strong>Does it still work?:</strong> No, these were completely removed in PostgreSQL 8.2.</p><p> <strong>What’s wrong with it?:</strong> These were considered abandoned and unmaintained.</p><p> <strong>What to use instead:</strong> Nothing.<br /></p><h5>adddepend, dbase, dbmirror, fulltextindex, mac, ora2pg and userlock<br />contrib modules</h5><p> <strong>Does it still work?:</strong> No, again, these were completely removed in PostgreSQL 8.2.</p><p> <strong>What’s wrong with it?:</strong> Most of these were moved to pgFoundry to be maintained separately.</p><p> <strong>What to use instead:</strong> These still exist on pgFoundry if you really want them (except for fulltextindex which has disappeared, and ora2pg which is on its own website), although they’re all now unmaintained (apart from ora2pg).<br /></p><h4>Version 8.3</h4><h5>automatic casting to text</h5><p> <strong>Does it still work?:</strong> As of PostgreSQL 8.3, non-text data types are no longer implicitly cast to text. This is considered to be one of the major hurdles for some people migrating from earlier versions and the biggest cause of incompatibility.</p><p> <strong>What’s wrong with it?:</strong> Anyone who knows PostgreSQL well will know that it doesn’t like to throw any weirdness or odd behaviour your way. There are cases where implicit casting to text causes undesired results. For example: current_date < 2012-04-02 would result in both sides being automatically cast to text types, even though the date on the right-hand side would first be considered an integer (2012 minus 4 minus 2).</p><p> <strong>What to use instead:</strong> It’s always good practice to be explicit about data types when specifying literals.
This will avoid any unusual behaviour.<br /></p><h5>tsearch2 contrib module</h5><p> <strong>Does it still work?:</strong> Yes, but it is deprecated as of PostgreSQL 8.3.</p><p> <strong>What’s wrong with it?:</strong> It has been superseded by equivalent functionality in core, with a few functional differences. It’s still kept around for backwards-compatibility.</p><p> <strong>What to use instead:</strong> Use the newer core functionality. There’s information on the tsearch2 contrib module page in the documentation on how to convert to the new functionality.<br /></p><h5>xml2 contrib module</h5><p> <strong>Does it still work?:</strong> Yes, but it is deprecated as of PostgreSQL 8.3.</p><p> <strong>What’s wrong with it?:</strong> Nothing really, and it’s still around for backwards-compatibility, but there is newer XML functionality in core based on the SQL/XML standard.</p><p> <strong>What to use instead:</strong> Use the built-in XML features (xml data type, xml functions, xml parameters).<br /></p><h4>Version 8.4</h4><h5>pg_dump/pg_dumpall’s -d and -D options</h5><p> <strong>Does it still work?:</strong> No, these were removed in PostgreSQL 8.4.</p><p> <strong>What’s wrong with it?:</strong> Such options were often mistaken for a database name parameter, but in fact they caused database dumps to output using insert statements rather than copy statements. 
This is significantly slower to restore, and cannot be adjusted after the fact.</p><p> <strong>What to use instead:</strong> If someone really did want to use these options intentionally, the long options --inserts and --column-inserts should be used instead.<br /></p><h4>Version 9.1</h4><h5>createlang/droplang client applications</h5><p> <strong>Does it still work?:</strong> Only up until PostgreSQL 9.1.</p><p> <strong>What’s wrong with it?:</strong> Languages are now treated like extensions as of PostgreSQL 9.1.</p><p> <strong>What to use instead:</strong> Execute CREATE EXTENSION <language name> instead.<br /></p><h5>CREATE/DROP LANGUAGE</h5><p> <strong>Does it still work?:</strong> Yes, but it’s no longer intended to be used by users, only extensions.</p><p> <strong>What’s wrong with it?:</strong> Languages are now considered to be extensions as of PostgreSQL 9.1.</p><p> <strong>What to use instead:</strong> You can install a new language as an extension with CREATE EXTENSION <language name>. And to remove it, use the DROP equivalent. If you’ve upgraded your cluster to 9.1 or above from a previous version, you will still have the language installed but not as an extension. You can, however, convert it to an extension by using: CREATE EXTENSION <language name> FROM unpackaged. You can then remove it later with DROP EXTENSION.<br /></p><h4>Version 9.2</h4><h5>=> operator</h5><p> <strong>Does it still work?:</strong> Only up to PostgreSQL 9.1, but as of PostgreSQL 9.2, no, at least as far as hstore is concerned. It was actually deprecated in 9.0, and its use has emitted warnings ever since. This is most notably used in the hstore extension. You can still create this operator, but it will return a warning when you do so. 
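To give a concrete flavour of the hstore change (a sketch, assuming the hstore extension is installed):</p><code style="white-space:pre-wrap;">-- old form, using the deprecated operator<br />SELECT 'a'=>'1';<br /><br />-- replacement form<br />SELECT hstore('a', '1');</code><p>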
Be prepared for this to be completely disallowed in a future release.</p><p> <strong>What’s wrong with it?:</strong> “=>” is reserved in the SQL standard for named function parameters, so it needs to be available for such functionality.</p><p> <strong>What to use instead:</strong> If you’ve been using this in hstore, then it will require changing text=>text to hstore(text,text). If you’ve been using it as a custom operator, you should change it to something else as at some point it will be prohibited.</p><h5>Literal language name case-sensitivity</h5><p> <strong>Does it still work?:</strong> If you're on 9.1 or below, yes, but as of 9.2 you won't get away with this anymore.</p><p> <strong>What’s wrong with it?:</strong> Language names should be treated like an identifier rather than a literal, and in general, literals are case-sensitive. There was special code that case-folded the language name so that the letter casing didn't matter, but this change is the first step in removing string literals as language names altogether.</p><p> <strong>What to use instead:</strong> Just don't use single quotes around language names when writing your functions at all, rather than just lower-casing them. 
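For example, here's a trivial PL/pgSQL function written the safe way (a sketch; add_one is a made-up name):</p><code style="white-space:pre-wrap;">-- LANGUAGE 'PLpgSQL' (a quoted, mixed-case literal) would now fail;<br />-- the unquoted form works everywhere<br />CREATE FUNCTION add_one(i int) RETURNS int AS<br />$$ BEGIN RETURN i + 1; END; $$<br />LANGUAGE plpgsql;</code><p>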
Either use no quotes (as they're not needed for any core language) or use double-quotes as you would with any other identifier.</p><p>There are also lots of configuration parameters that have been removed, and here they are:</p><table style="border-spacing: 0; border-collapse: collapse;"><thead><tr><th>Parameter</th><th class="version">Removed in</th><th>Why?</th></tr></thead><tbody><tr><td>australian_timezones</td><td class="version">8.2</td><td>Better generalised timezone configuration</td></tr><tr><td>preload_libraries</td><td class="version">8.2</td><td>Renamed to shared_preload_libraries</td></tr><tr><td>bgwriter_all_percent</td><td class="version">8.3</td><td>No longer necessary</td></tr><tr><td>bgwriter_all_maxpages</td><td class="version">8.3</td><td>No longer necessary</td></tr><tr><td>bgwriter_lru_percent</td><td class="version">8.3</td><td>No longer necessary</td></tr><tr><td>redirect_stderr</td><td class="version">8.3</td><td>Renamed to logging_collector</td></tr><tr><td>stats_block_level</td><td class="version">8.3</td><td>Now covered by track_counts</td></tr><tr><td>stats_command_string</td><td class="version">8.3</td><td>Renamed to track_activities</td></tr><tr><td>stats_reset_on_server_start</td><td class="version">8.3</td><td>pg_stat_reset() can be used instead</td></tr><tr><td>stats_row_level</td><td class="version">8.3</td><td>Now covered by track_counts</td></tr><tr><td>stats_start_collector</td><td class="version">8.3</td><td>Now always enabled</td></tr><tr><td>explain_pretty_print</td><td class="version">8.4</td><td>No longer needed</td></tr><tr><td>max_fsm_pages</td><td class="version">8.4</td><td>No longer needed as per-relation free space maps deal with this.</td></tr><tr><td>max_fsm_relations</td><td class="version">8.4</td><td>No longer needed as per-relation free space maps deal with this.</td></tr><tr><td>add_missing_from</td><td class="version">9.0</td><td>Always defaulted to ‘off’ so now permanently 
off.</td></tr><tr><td>regex_flavor</td><td class="version">9.0</td><td>Always defaulted to ‘advanced’ so now permanently set to this.</td></tr><tr><td>custom_variable_classes</td><td class="version">9.2</td><td>Considered better to remove it as it only causes more maintenance with minimal benefit.</td></tr><tr><td>silent_mode</td><td class="version">9.2</td><td>Not necessary as can be achieved with pg_ctl -l or NOHUP.</td></tr><tr><td>wal_sender_delay</td><td class="version">9.2</td><td>New latch infrastructure has now made this setting redundant.</td></tr></tbody></table>Unknownnoreply@blogger.com12tag:blogger.com,1999:blog-3748796162921493670.post-34321490180040219292012-01-23T18:00:00.006+00:002012-01-23T23:44:47.556+00:00Password restrictionsI've been changing a lot of my passwords lately so that none are alike. In fact they're so ridiculously random and lengthy that I wouldn't stand a chance remembering them. I've a means of obtaining these passwords though in a method known to me using encryption which requires 2 very different types of key, so remembering them isn't necessary.<div><br /></div><div>However, after going around trying to change my passwords, I've noticed that many places impose restrictions upon how long a password may be, and what characters you're allowed to use.</div><div><br /></div><div>For example, with PayPal, they absolutely require you to have at least 8 characters, but for some reason, it must not be any more than 20 characters. Why limit it at 20? Surely they hash the password anyway?... don't they? eBay have an identical restriction.</div><div><br /></div><div>And then came my bank (a Dutch bank which shall remain nameless). They also said that it must be at least 8 characters, but no more than 12. It must also contain at least one lower-case letter, one upper-case letter, one number and one non-alphabetic character. But then it also may not use a speech mark ("), equals (=), tilde (~), less-than (<) or greater-than (>). 
So this not only suggests they're not hashing the password, but there's also probably some risk posed by those additional characters, meaning they may not be sanitising their data. Normally if that were the case I'd expect characters like a back-tick (`), semi-colon (;) or single-quote (') to be prohibited, but for some reason those are allowed in this case. *shrug*</div><div><br /></div><div>Google seems to be better at this, but still imposes an arbitrary limit. In this case they require a minimum of 8 characters, and allow up to 100 characters. Why 100? Does it matter? Thankfully they allow any character you like, including spaces and all types of punctuation you can think of. But in addition to this, I have 2-factor authentication enabled, so I require a 6-digit pin generated by my phone, so the risk of someone logging into my account is minimised. Although I guess this is all moot when it comes to Google's application-specific passwords. This is necessary if you use 2-factor authentication when you wish applications to access your account (such as an Email client), because those applications can't ask you for the 6-digit pin. So Google generates a password specifically for that application which can be revoked whenever you like. Here's an example of what its generated passwords look like:</div><div><br /></div><div>qihi jnmd irjb ytis</div><div><br /></div><div>They always show up as just lower-case letters, and the spaces don't matter. So really, if you have application-specific passwords enabled, the security on your Google Account is only as strong as those passwords, and obviously the more you have the lower the security. Still, it's extremely unlikely anyone could get into your account using brute-force trying to find a code, but it's a shame Google doesn't allow you to enter an application-specific password of your choice.</div><div><br /></div><div>I also recently set up an account with Good Old Games (gog.com). The limit there was 32 characters. 
I tried using special characters in the password but found that it caused the form to show the password field as invalid, but wouldn't say why. I figured out they only allow upper-case and lower-case letters and numbers.</div><div><br /></div><div>Amazon fare better than the rest so far. Any characters you want, up to 128 characters.</div><div><br /></div><div>The best I've seen is SpiderOak. I gave it a totally random mix of every type of character and a password 1024 characters long. It was fine with this. I don't know how many more characters it will take, but surely that should be enough.</div><div><br /></div><div>So SpiderOak is the most secure out of this list. SpiderOak also can't recover your password because they don't store it. Apparently it's only used to decrypt your user data, which they can't access. In fact they boast a zero-knowledge policy, meaning they don't know your password or anything that you're storing, not even including file names.</div><div><br /></div><div>On the opposite end is the Dutch bank who shall remain nameless. You'd expect a bank to have higher requirements than most, but it seems not. 8-12 characters? And some characters excluded? What are they playing at? They don't even offer 2-factor authentication. It's just username and password (no, not even account details).</div><div><br /></div><div>So, anyone know why these companies place such restrictions on passwords? Does it present a denial-of-service avenue? Do they not hash it in case they want to provide the password to, say, the NSA? 
Any good reason?</div>Unknownnoreply@blogger.com6tag:blogger.com,1999:blog-3748796162921493670.post-81843364837795190002012-01-07T16:25:00.007+00:002012-01-07T23:45:47.179+00:00With a Little Help by Cory Doctorow<div>Look what arrived today (apologies for the poor quality pictures you're about to see).</div><div><br /></div><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9AbLIq7i39X-X-eKpJR2lxhTqQJtEhO3uu1YNAOTxuW0PTts-SInDiKkjUtRflyEyEF_StiB9KfxeDTeeYNwelrUK9S9BxEMuk1pO2l5H4u7F6avO9ETAgcVrPaP8aLNHrDe5G0i_QdRI/s1600/2012-01-07+16.07.10.jpg"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 300px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9AbLIq7i39X-X-eKpJR2lxhTqQJtEhO3uu1YNAOTxuW0PTts-SInDiKkjUtRflyEyEF_StiB9KfxeDTeeYNwelrUK9S9BxEMuk1pO2l5H4u7F6avO9ETAgcVrPaP8aLNHrDe5G0i_QdRI/s400/2012-01-07+16.07.10.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5694928146803608562" /></a><br />Yes, something wrapped in a coffee burlap bag.<br /><br /><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj7k0w7xNXhDxrRQolR97pJLZ3oIOEwkDJPoMFTmfwwMbimIJ2_ljt_hFFqrmBc37RtfdsKBHYUgAZmYwYSBp5l0Im3A9LjXy7Q0RB7XLR3AymYKTuSxzP250XzxFD0L9fKt-_heWydbW9c/s1600/2012-01-07+16.07.32.jpg"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 300px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj7k0w7xNXhDxrRQolR97pJLZ3oIOEwkDJPoMFTmfwwMbimIJ2_ljt_hFFqrmBc37RtfdsKBHYUgAZmYwYSBp5l0Im3A9LjXy7Q0RB7XLR3AymYKTuSxzP250XzxFD0L9fKt-_heWydbW9c/s400/2012-01-07+16.07.32.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5694928150920118450" /></a><br />Which contains a rectangular hexahedron wrapped in rice paper sealed with stickers.<br /><br /><a 
href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjH0_gxRVNX93Oat-FsLTXEaUfyRRmdiCBsv-S6m2D3gh9icif5vPrdYto1OacYO6VWBOVh3R-W6OR67-3QRQlxOzHDOZrScOP4sKZcBsYjIV9FMwrfXOmBm_4G1ZhdlA6-9frR7Nfx5nAP/s1600/2012-01-07+16.08.40.jpg"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 300px; height: 400px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjH0_gxRVNX93Oat-FsLTXEaUfyRRmdiCBsv-S6m2D3gh9icif5vPrdYto1OacYO6VWBOVh3R-W6OR67-3QRQlxOzHDOZrScOP4sKZcBsYjIV9FMwrfXOmBm_4G1ZhdlA6-9frR7Nfx5nAP/s400/2012-01-07+16.08.40.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5694928153896438530" /></a><br />It's a book! A quality-looking hardback book emblazoned with a shiny faceless figure donning a cape and glasses, and there's a strip of bright orange extending from the spine of the book. There's also a 4GB SDHC memory card stuck to the front.<br /><br /><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvzbd55d2hJGcVmAJ7AOkBqLY5lHeHTYb3eyKRTtxOkOJ_kLNl6UGJq5dM1k1CDc9NWj321yJNCrO31t3c9xpYZPd_-NKJJL0WOKrOE3ZNoVobKbasRXnBF5MkrnST8zvc3hsupVN6eITE/s1600/2012-01-07+16.08.56.jpg"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 300px; height: 400px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvzbd55d2hJGcVmAJ7AOkBqLY5lHeHTYb3eyKRTtxOkOJ_kLNl6UGJq5dM1k1CDc9NWj321yJNCrO31t3c9xpYZPd_-NKJJL0WOKrOE3ZNoVobKbasRXnBF5MkrnST8zvc3hsupVN6eITE/s400/2012-01-07+16.08.56.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5694928163830711330" /></a><br />As you can see it's the special edition of With a Little Help by Cory Doctorow. This book is number 76 of 250. 
Let's have a look inside...<br /><br /><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_4Wcz7wn0F_HjD6BGOxhBpLvYuFddZtaylDkew-3XqY_6GC7cr9GYt8OnRUsbtw2f7Zij-toOfwbkIvUkeZR_cLg6A3cm5SZ-30_m42v2dzgmnD-aJlXBkb2YJkkY1Fr0TYRGJqlTUkcN/s1600/2012-01-07+16.10.21.jpg"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 300px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_4Wcz7wn0F_HjD6BGOxhBpLvYuFddZtaylDkew-3XqY_6GC7cr9GYt8OnRUsbtw2f7Zij-toOfwbkIvUkeZR_cLg6A3cm5SZ-30_m42v2dzgmnD-aJlXBkb2YJkkY1Fr0TYRGJqlTUkcN/s400/2012-01-07+16.10.21.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5694928166389683298" /></a><br />We appear to have a letter addressed to "Bear" from Melissa Frain from Tor (no, not the Tor project, Tor the sci-fi folk). Bear? Anyway, it's a request for the review of a manuscript. And at the back we have...<br /><br /><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgwqfGrxeVP3CXNLHD5-WnRaOTHmSZW0kxKKJ3t_RIAAZ5eNl34Didiy5thDysMULhVKBp3R629qb3rkSnPRvgC9LSlStqsX2XnYO2JQcCR8cOHmfvC1Wn_Q2sATgwOyDSqnX_DRHj8HsBZ/s1600/2012-01-07+16.10.37.jpg"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 300px; height: 400px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgwqfGrxeVP3CXNLHD5-WnRaOTHmSZW0kxKKJ3t_RIAAZ5eNl34Didiy5thDysMULhVKBp3R629qb3rkSnPRvgC9LSlStqsX2XnYO2JQcCR8cOHmfvC1Wn_Q2sATgwOyDSqnX_DRHj8HsBZ/s400/2012-01-07+16.10.37.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5694928746793558482" /></a><br />A sketch of what appears to be either a woman with a birthday cake stuck to her head, or a fairytale princess or some such variety of character.<div><br /></div><div>And lastly, we have...<br /><br /><a 
href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjAmIaHnPf_4tNrYLXKz915WHwhsTcnEx5DPeBdMQ8JmB3sqeVuv4OWNXDFY_AVwsJHCVejDUu6gQNa679t3ONRpG63VrrXgjpIBnDjnv2erFL92hJBxb3BAHLHkxvLYM1p6UhBRrEUDWQ2/s1600/2012-01-07+16.11.23.jpg"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 300px; height: 400px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjAmIaHnPf_4tNrYLXKz915WHwhsTcnEx5DPeBdMQ8JmB3sqeVuv4OWNXDFY_AVwsJHCVejDUu6gQNa679t3ONRpG63VrrXgjpIBnDjnv2erFL92hJBxb3BAHLHkxvLYM1p6UhBRrEUDWQ2/s400/2012-01-07+16.11.23.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5694928749070491874" /></a><br />A personal inscription from Doctorow himself. Naturally his message alludes to the fact that he would be nothing without my help, and he's eternally grateful. (or something slightly less insane)</div><div><br /></div><div>If you don't already know, Cory Doctorow (a Torontonian resident in the UK), amongst many other things, is a science fiction writer. This book is available for free in <a href="http://craphound.com/walh/audiobook/download-audiobook">audio book format</a>, <a href="http://craphound.com/walh/audiobook/buy-cd">on CD</a> (MP3 or Ogg Vorbis) for a nominal fee, in <a href="http://craphound.com/walh/e-book/browse-all-versions">several ebook formats</a> (with a suggested donation), in paperback through on-demand publishing, or this gorgeous, <a href="http://craphound.com/walh/paper-books/buy-hardcover">unique limited edition hand-bound book</a>. There's even the option to offer to <a href="http://craphound.com/walh/contribute">give a book</a>/books to institutions such as schools, prisons, hospitals etc.<br /><br />So if you can get it for free, why the hell would you fork out money for it? Well for a start, it's directly remunerating the author for the hard work they put in to make the publication you've enjoyed. 
That support means they'll go on to write more in future, and also affords them luxuries such as paying their energy bill and buying food to stave off death. But then the hardback book costs a fair amount. What are you getting for your money? Well, by its very nature, a limited edition item is... limited. And this one is particularly limited. The book I'm looking at is one of only 250. A personal inscription from the author isn't something you'll get with the vast majority of books (and he even spelled my name correctly). And then there's the end-papers. No-one else will have the same book as the one I have here as it has unique pieces of Cory's paper hoard that he didn't want to just throw away, so he has immortalised them forever at both ends of the book. I'm intrigued as to what this sketch is about though.<br /><br />But not only that, the SD card on the front contains all the audio book files (Ogg Vorbis and MP3), every format of the ebook, and all the covers. Incidentally, the SDHC card is merely stuck into a recess in the book with some Blu-Tack, so can be easily removed and replaced.</div><div><br /></div><div>As for the actual book itself, it says on <a href="http://craphound.com/walh/paper-books/buy-hardcover">Cory's site</a> that these are hand-bound at the Wyvern Bindery in Clerkenwell, London, and the printing of the book is done by a family-run company called Oldacres in Hatton Garden who have been around since 1897.</div><div><br /></div><div>The book costs the same (including P&P) regardless of where you live, except that being in the UK gave me the benefit of getting it next day.<br /><br />Now that the book is in my hands, I'm very pleased with my purchase and would recommend it to anyone who appreciates both good sci-fi, and beautiful, high-quality, unique, well-made books. 
Plus the money goes directly back to the writer instead of being mostly eaten away through a complicated chain of publishers, distributors and third-party sellers.</div><div><br /></div><div>One last thing: Cory added a "feature" you don't get with most books; the opportunity to submit typo corrections with the advantage being that the typo will be fixed with a credit to yourself in the next printing of the book.</div><div><br /></div><div>If you would like your very own copy, <a href="http://craphound.com/walh/">go get it</a>.</div>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3748796162921493670.post-27494409209172911552011-11-07T13:55:00.013+00:002011-11-10T10:58:51.530+00:00Writable Common Table Expressions<p>There have previously been various posts on Writable Common Table Expressions (wCTEs) which demonstrate either performance gains or simple examples of how they work, but recently when giving some training, I wanted to come up with some new material to show folk that demonstrates an example of when you might wish to use it. Note that writable common table expressions are only possible in PostgreSQL, and only from version 9.1 onwards. But because 9.1 has been out for almost 2 months now, that means you can go and use this feature straight away!</p><p>So in case you don't know what a writable common table expression is, in summary it's the same as a normal common table expression, except that you can also use UPDATEs, DELETEs and INSERTs with it. 
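In skeleton form, the only difference is what appears inside the WITH clause (a sketch with a made-up table name):</p><code style="white-space:pre-wrap;">-- an ordinary, read-only CTE<br />WITH t AS (SELECT * FROM some_table)<br />SELECT * FROM t;<br /><br />-- a writable CTE: the WITH part modifies data, and RETURNING feeds the outer query<br />WITH t AS (DELETE FROM some_table RETURNING *)<br />SELECT count(*) FROM t;</code><p>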
If you don't know what a normal common table expression is, it effectively lets you make the results of queries available to other queries in the same statement.</p><p>What follows is a rather basic case, but the data is kept simple so as to make it easier to follow the examples.</p><p>First, we create a company payroll table:</p><code style="white-space:pre-wrap;">CREATE TABLE company_payroll<br />(<br /> department text PRIMARY KEY,<br /> staff_count int NOT NULL,<br /> staff_cost numeric(16,2) NOT NULL<br />);</code><br /><p>Which we'll populate with some data:</p><code>INSERT INTO company_payroll VALUES<br />('development', 45, 2839281.33),<br />('marketing', 8, 412424.45),<br />('management', 4, 839432.53);</code><p>Then we want an employees table:</p><code style="white-space:pre-wrap;">CREATE TABLE employees<br />(<br /> id serial PRIMARY KEY,<br /> department text REFERENCES company_payroll (department),<br /> salary numeric(16,2),<br /> consecutive_days_late int NOT NULL DEFAULT 0<br />);</code><p>Then stick some values in it:</p><code>INSERT INTO employees (department, salary, consecutive_days_late) VALUES<br />('development', 12834.23, 0),<br />('development', 37853.55, 6),<br />('marketing', 24488.23, 2),<br />('development', 91824.29, 14),<br />('marketing', 24829.23, 16),<br />('marketing', 24488.23, 2),<br />('management', 100293.11, 14),<br />('management', 40287.11, 18),<br />('development', 72843.22, 11);</code><p>It doesn't matter that the employees don't add up to the details provided in the company payroll. 
Just pretend there are many other rows, but no-one else has been late.</p><p>So we now have 2 tables looking like this:</p><code style="white-space:pre-wrap;">test=# select * from company_payroll;<br /> department | staff_count | staff_cost <br />-------------+-------------+------------<br /> development | 45 | 2839281.33<br /> marketing | 8 | 412424.45<br /> management | 4 | 839432.53<br />(3 rows)<br /><br />test=# select * from employees;<br /> id | department | salary | consecutive_days_late <br />----+-------------+-----------+-----------------------<br /> 1 | development | 12834.23 | 0<br /> 2 | development | 37853.55 | 6<br /> 3 | marketing | 24488.23 | 2<br /> 4 | development | 91824.29 | 14<br /> 5 | marketing | 24829.23 | 16<br /> 6 | marketing | 24488.23 | 2<br /> 7 | management | 100293.11 | 14<br /> 8 | management | 40287.11 | 18<br /> 9 | development | 72843.22 | 11<br />(9 rows)</code><p>Okay, so we have 9 employees, but quite a few of them keep turning up late for work. Management have decided to put their foot down and fire anyone who's been late for more than 10 days in a row (and probably break a few employment laws in the process). We want to delete these employees from the employees table, then update the company payroll to reflect these changes. 
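For contrast, without wCTEs this would have needed several statements wrapped in a transaction, along these lines (a sketch):</p><code style="white-space:pre-wrap;">BEGIN;<br /><br />-- summarise who we're about to fire<br />CREATE TEMP TABLE fired AS<br />SELECT department, COUNT(*) AS fired_count, SUM(salary) AS salary_sum<br />FROM employees<br />WHERE consecutive_days_late > 10<br />GROUP BY department;<br /><br />DELETE FROM employees WHERE consecutive_days_late > 10;<br /><br />UPDATE company_payroll cp<br />SET staff_count = cp.staff_count - f.fired_count,<br /> staff_cost = cp.staff_cost - f.salary_sum<br />FROM fired f<br />WHERE cp.department = f.department;<br /><br />COMMIT;</code><p>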
Then finally we want to see some before and after figures to compare the new figures against the old.</p><p>So let's present a writable common table expression to solve the problem, then we'll break it down.</p><code style="white-space:pre-wrap;">WITH <span style="font-weight: bold;">fired_employees</span> AS (<br /> DELETE FROM employees<br /> WHERE consecutive_days_late > 10<br /> RETURNING *),<br /><span style="font-weight: bold;">summarise_fired_staff</span> AS (<br /> SELECT department, COUNT(*) AS fired_count, SUM(salary) AS salary_sum<br /> FROM fired_employees<br /> GROUP BY department),<br /><span style="font-weight: bold;">updated_financials</span> AS (<br /> UPDATE company_payroll<br /> SET staff_count = staff_count - fired_count,<br /> staff_cost = staff_cost - salary_sum<br /> FROM summarise_fired_staff<br /> WHERE company_payroll.department = summarise_fired_staff.department<br /> RETURNING company_payroll.*<br />)<br />SELECT<br /> cp.department, cp.staff_count AS old_staff_count,<br /> uf.staff_count AS new_staff_count,<br /> cp.staff_cost AS old_staff_cost,<br /> uf.staff_cost AS new_staff_cost<br />FROM company_payroll cp<br />INNER JOIN updated_financials uf ON uf.department = cp.department;</code><p>So let's take this one step at a time. The first thing we're doing is deleting all employees from the employees table where they've been late for more than 10 consecutive days:</p><code style="white-space:pre-wrap;"> DELETE FROM employees<br /> WHERE consecutive_days_late > 10<br /> RETURNING *</code><p>Okay, so that's simple enough. But what we've done here is put it in a WITH clause and labelled the result as <code><span style="font-weight: bold;">fired_employees</span></code>. This can now be treated like a temporary table, but one which only exists for the duration of the query. Note that we're also using the RETURNING clause of the DELETE statement to have it return all rows that were deleted by the statement. 
It's the output of this which is now represented by the <code><span style="font-weight: bold;">fired_employees</span></code> table expression name. We can now use this for other purposes, and we will. So <code><span style="font-weight: bold;">fired_employees</span></code> now contains:</p><code style="white-space:pre-wrap;"> id | department | salary | consecutive_days_late <br />----+-------------+-----------+-----------------------<br /> 4 | development | 91824.29 | 14<br /> 5 | marketing | 24829.23 | 16<br /> 7 | management | 100293.11 | 14<br /> 8 | management | 40287.11 | 18<br /> 9 | development | 72843.22 | 11<br />(5 rows)</code><p>Next we summarise how many people we've fired from each department, and how much they cost in total per department:</p><code style="white-space:pre-wrap;"> SELECT department, COUNT(*) AS fired_count, SUM(salary) AS salary_sum<br /> FROM fired_employees<br /> GROUP BY department</code><p>Again, very straightforward. Note that we're referring to the <code><span style="font-weight: bold;">fired_employees</span></code> expression which we can treat like a regular table, and the output of this has been named <code><span style="font-weight: bold;">summarise_fired_staff</span></code>. This contains:</p><code style="white-space:pre-wrap;"> department | fired_count | salary_sum <br />-------------+-------------+------------<br /> development | 2 | 164667.51<br /> management | 2 | 140580.22<br /> marketing | 1 | 24829.23<br />(3 rows)</code><p>As you can see, we fired 2 developers worth 164667.51 in whatever currency you want to pretend this represents. 
We've also fired 2 from management and 1 from marketing.</p><p>Now we want to update the figures we maintain in the company payroll table with this information:</p><code style="white-space:pre-wrap;"> UPDATE company_payroll<br /> SET staff_count = staff_count - fired_count,<br /> staff_cost = staff_cost - salary_sum<br /> FROM summarise_fired_staff<br /> WHERE company_payroll.department = summarise_fired_staff.department<br /> RETURNING company_payroll.*</code><p>Here we're joining the company_payroll to the <code><span style="font-weight: bold;">summarise_fired_staff</span></code> expression based on the department name. The update involves decrementing the staff_count by the fired_count returned in the summarise_fired_staff result, and the same for the staff_cost, which is decremented by the salary_sum. Now we want to return the updated rows from this action, so we use "RETURNING company_payroll.*". And the output of this part of the query we've labelled <code><span style="font-weight: bold;">updated_financials</span></code>. As you can see, it's similar to piping bash commands together: the output of each step gets passed on to the next. 
And now <code><span style="font-weight: bold;">updated_financials</span></code> contains:</p><code style="white-space:pre-wrap;"> department | staff_count | staff_cost <br />-------------+-------------+------------<br /> development | 43 | 2674613.82<br /> management | 2 | 698852.31<br /> marketing | 7 | 387595.22<br />(3 rows)<br /></code><p>So finally we want to see a side-by-side comparison of before and after figures:</p><code style="white-space:pre-wrap;">SELECT<br /> cp.department, cp.staff_count AS old_staff_count,<br /> uf.staff_count AS new_staff_count,<br /> cp.staff_cost AS old_staff_cost,<br /> uf.staff_cost AS new_staff_cost<br />FROM company_payroll cp<br />INNER JOIN updated_financials uf ON uf.department = cp.department</code><p>So we've joined the still yet-to-be-actually-updated company_payroll table with the updated version (which we named <code><span style="font-weight: bold;">updated_financials</span></code>) based on department name. We then select the department, followed by the original staff_count from the company_payroll table (labelling it old_staff_count for clarity), and then the <code><span style="font-weight: bold;">updated_financials</span></code> own staff_count column (labelled new_staff_count). 
We also do the same for the staff cost.</p><p>So running all of this in one go returns the following:</p><code style="white-space:pre-wrap;"> department | old_staff_count | new_staff_count | old_staff_cost | new_staff_cost <br />-------------+-----------------+-----------------+----------------+----------------<br /> development | 45 | 43 | 2839281.33 | 2674613.82<br /> management | 4 | 2 | 839432.53 | 698852.31<br /> marketing | 8 | 7 | 412424.45 | 387595.22<br />(3 rows)</code><p>If you're curious to know what the explain plan for this query is, it's as follows:</p><code style="white-space:pre-wrap; font-size: 9px;"> QUERY PLAN <br />---------------------------------------------------------------------------------------------------------------------------------------<br /> Hash Join (cost=38.33..39.40 rows=3 width=80) (actual time=0.256..0.258 rows=3 loops=1)<br /> Hash Cond: (cp.department = uf.department)<br /> CTE fired_employees<br /> -> Delete on employees (cost=0.00..21.75 rows=313 width=6) (actual time=0.059..0.066 rows=4 loops=1)<br /> -> Seq Scan on employees (cost=0.00..21.75 rows=313 width=6) (actual time=0.022..0.023 rows=4 loops=1)<br /> Filter: (consecutive_days_late > 10)<br /> Rows Removed by Filter: 12<br /> CTE summarise_fired_staff<br /> -> HashAggregate (cost=8.61..10.61 rows=200 width=48) (actual time=0.107..0.108 rows=3 loops=1)<br /> -> CTE Scan on fired_employees (cost=0.00..6.26 rows=313 width=48) (actual time=0.062..0.071 rows=4 loops=1)<br /> CTE updated_financials<br /> -> Update on company_payroll (cost=1.07..5.88 rows=3 width=198) (actual time=0.184..0.198 rows=3 loops=1)<br /> -> Hash Join (cost=1.07..5.88 rows=3 width=198) (actual time=0.159..0.167 rows=3 loops=1)<br /> Hash Cond: (summarise_fired_staff.department = company_payroll.department)<br /> -> CTE Scan on summarise_fired_staff (cost=0.00..4.00 rows=200 width=168) (actual time=0.127..0.131 rows=3 loops=1)<br /> -> Hash (cost=1.03..1.03 rows=3 width=62) (actual 
time=0.008..0.008 rows=3 loops=1)<br /> Buckets: 1024 Batches: 1 Memory Usage: 1kB<br /> -> Seq Scan on company_payroll (cost=0.00..1.03 rows=3 width=62) (actual time=0.002..0.003 rows=3 loops=1)<br /> -> Seq Scan on company_payroll cp (cost=0.00..1.03 rows=3 width=56) (actual time=0.029..0.030 rows=3 loops=1)<br /> -> Hash (cost=0.06..0.06 rows=3 width=56) (actual time=0.205..0.205 rows=3 loops=1)<br /> Buckets: 1024 Batches: 1 Memory Usage: 1kB<br /> -> CTE Scan on updated_financials uf (cost=0.00..0.06 rows=3 width=56) (actual time=0.188..0.203 rows=3 loops=1)<br /> Total runtime: 0.514 ms<br />(23 rows)<br /></code><p>You could also easily add in delta (or diff) columns to show how much each figure has changed, because you have 2 sets of columns to reference for a straightforward calculation; just label those columns accordingly.</p><p>So what initially appears as a complicated behemoth of a statement is actually fairly simple once you break it down. And while the example shown isn't necessarily realistic, it should hopefully demonstrate its potential applications.</p><p>And you can try out this example right away if you're running PostgreSQL 9.1. Just run the CREATE TABLE and INSERT statements above, then run the big writable common table expression query and see the results for yourself. 
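<p>The delta (or diff) columns mentioned above are easy to sketch. The following is a hedged illustration using Python's sqlite3 module rather than PostgreSQL itself: SQLite has no data-modifying CTEs, so the before/after figures from the example output are loaded into two plain tables purely so the comparison join with delta columns can be run end to end (the table contents and the <code>staff_count_delta</code>/<code>staff_cost_delta</code> names are just for this demo).</p>

```python
import sqlite3

# Load the before/after figures from the example output into plain tables,
# then run the same comparison join with two extra delta columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE company_payroll (department TEXT, staff_count INT, staff_cost REAL);
CREATE TABLE updated_financials (department TEXT, staff_count INT, staff_cost REAL);
INSERT INTO company_payroll VALUES
  ('development', 45, 2839281.33),
  ('management',   4,  839432.53),
  ('marketing',    8,  412424.45);
INSERT INTO updated_financials VALUES
  ('development', 43, 2674613.82),
  ('management',   2,  698852.31),
  ('marketing',    7,  387595.22);
""")

# Same join as the side-by-side query, plus two delta columns.
rows = conn.execute("""
    SELECT cp.department,
           cp.staff_count - uf.staff_count         AS staff_count_delta,
           ROUND(cp.staff_cost - uf.staff_cost, 2) AS staff_cost_delta
    FROM company_payroll cp
    INNER JOIN updated_financials uf ON uf.department = cp.department
    ORDER BY cp.department
""").fetchall()

for row in rows:
    print(row)
```

<p>In the real wCTE query you would compute these deltas directly from <code>cp</code> and <code>uf</code> in the same SELECT list as the old/new columns.</p>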
Note that because the wCTE query actually makes changes, running it more than once will do nothing since the employees that were fired were removed by the query the first time it was run.</p><p>You can read other blog entries on wCTEs at these links:</p><ul><li><a href="http://akretschmer.blogspot.com/2009/11/writeable-cte-short-performance-test.html">Andreas Kretschmer - writeable CTE - a short performance test</a></li><li><a href="http://www.depesz.com/index.php/2011/03/16/waiting-for-9-1-writable-cte/">Depesz - Waiting for 9.1 - Writable CTE</a></li><li><a href="http://xzilla.net/blog/2011/Mar/Upserting-via-Writeable-CTE.html">Robert Treat - Upserting via Writeable CTE</a></li><li><a href="http://vibhorkumar.wordpress.com/2011/10/26/upsertmerge-using-writable-cte-in-postgresql-9-1/">Vibhor Kumar - UPSERT/MERGE using Writable CTE in PostgreSQL 9.1</a></li></ul>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3748796162921493670.post-77308279604623493702011-09-01T14:45:00.000+01:002011-09-01T14:45:00.484+01:00The first PostgreSQL Conference Europe talks!The first round of PostgreSQL Conference Europe talks have now been selected and already it's shaping up to be a great conference if these are any to go by. You can see the list so far at <a href="http://www.postgresql.eu/events/sessions/pgconfeu2011/">http://www.postgresql.eu/events/sessions/pgconfeu2011/</a>.<div>
<br /></div><div>If we take a brief look at a few of the talks, you'll see that you really don't want to miss this year's line-up.</div><div>
<br /></div><div>Firstly we have a talk by PostgreSQL veteran and core team member Bruce Momjian who will be "Explaining the Postgres Query Optimizer". This will enlighten attendees as to how PostgreSQL decides how to execute queries, which is important to understand in order to know how best to ask PostgreSQL for what you want. Bruce gave a very popular MVCC talk last year, and no doubt this one will be one to watch.</div><div>
<br /></div><div>Also, Michael Meskes's talk entitled "Mission impossible? Can I replace my most important databases with PostgreSQL?" won't be a brief "yes", but will elaborate on Michael's experience with over 12 years of using open source in business, and how to make migrations work. He'll also provide some real life examples of the benefits which have been realised over the years.</div><div>
<br /></div><div>And you'll also get the chance to attend the "Synchronous Replication and Durability Tuning" talk by Simon Riggs, who developed the synchronous replication functionality, and Greg Smith, who wrote the book on PostgreSQL performance. No really, <a href="http://www.packtpub.com/postgresql-90-high-performance/book">he did</a>. In this talk Simon and Greg will go through how synchronous replication works in PostgreSQL 9.1, what considerations should be taken into account, and what performance users of this feature can typically expect. Then they will cover how to go about improving performance for this particular setup.</div><div>
<br /></div><div>We don't just have talks though, there are also training sessions available, and in addition to the ones previously announced we now also have "Streaming replication, hot standby and synchronous replication" by Redpill Linpro's own Magnus Hagander. There's also "Hands-On with PostGIS" by Vincent Picavet, a GIS specialist who's providing training on behalf of Dalibo. Go check out the complete list at <a href="http://2011.pgconf.eu/training/">http://2011.pgconf.eu/training/</a>.</div><div>
<br /></div><div>The talks shown so far are just the start. At the time of writing this we have confirmed about 15 talks (probably more as you read this), but there will be many many more, so keep your eye on the <a href="http://2011.pgconf.eu/">conference website</a> and subscribe to the conference <a href="https://twitter.com/#!/pgconfeu">Twitter feed</a> for the latest updates. And don't forget to make your booking for training sessions in advance as there are a limited number of allocations for each class.</div>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3748796162921493670.post-15223151790320452742011-07-26T14:07:00.004+01:002011-07-26T14:10:53.601+01:00Could Clang displace GCC among PostgreSQL developers? Part I: Intro and compile timesPeter Geoghegan (@sternocera) has just joined us on Planet PostgreSQL but unfortunately his post (which looks like he's put quite a bit of work into it) became too old to list on the main site's front page before his account was approved, so here's me waving my arms to give you a second chance to find it. You can read it <a href="http://pgeoghegan.blogspot.com/2011/07/could-clang-displace-gcc-among.html">here</a>.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3748796162921493670.post-3783911109724034612011-04-19T12:01:00.006+01:002011-04-19T12:57:44.402+01:00The lies of No2AVSo I went to the No2AV website, and clicked on the link entitled "Why vote no" to see what reasons they've cooked up. Let's take a look:<div><br /></div><div><b>AV is costly</b></div><div><br /></div><div>They argue, as they keep arguing, that AV will cost up to an additional £250 million: "Local councils would have to waste money on costly electronic vote counting machines and expensive voter education campaigns."</div><div><br /></div><div>What about the extra £800 million they'll also have to spend on red carpets, gold-trimmed ballot papers and free meringue and strawberries served to voters at the booths? What about those? 
Yes, of course, there's no such cost, because they won't be used. No-one said they would need to be used. They harp on about how Australia is the only other country to use AV to vote in the lower house, but they conveniently ignore the fact that Australia doesn't even use electronic voting machines.</div><div><br /></div><div>"With ordinary families facing tough times can we really afford to spend a quarter of a billion pounds of taxpayers' money bringing in a new voting system?"</div><div><br /></div><div>And why are they facing tough times in the first place eh? Because of the governments we voted in under the First Past The Post system.</div><div><br /></div><div>"Schools and hospitals, or the Alternative Vote – that's the choice in this referendum."</div><div><br /></div><div>You will be taking money away from schools and hospitals to fund the new electoral system? Why not the military? Why not higher taxes for the rich? And also bear in mind that this won't be paid for all in a single year.</div><div><br /></div><div><br /></div><div><b>AV is complex and unfair</b></div><div><br /></div><div>There's no denying that it's more complex than FPTP, but are you seriously saying that people are too dumb to say "I want this party to win, but if not then this one, if not that one, then this one... and I don't want anyone else to win, so won't put anything." People fill out questionnaires every day with questions like that.</div><div><br /></div><div>"The winner should be the candidate that comes first, but under AV the candidate who comes second or third can actually be elected."</div><div><br /></div><div>That's not true for a start. Under AV the winner is the candidate who comes first by gaining a majority of the vote, rather than simply more votes than anyone else, which can be far less than 50%. The No2AV campaign use a race analogy in their campaign leaflets and broadcasts, saying that the race horse or runner that comes third wins under AV. But the analogy doesn't apply. 
Who are the horses representing for a start? Using the race analogy, there's a finite speed which gets divided up between horses. The horse which uses more average speed than the others wins. That's not how horse racing works. So one horse can be left right at the back slowly limping along and never making it to the finish line, yet they attend the race every time? And the others make it to the finish line but they won't win anything? Yes, this analogy is terrible.</div><div><br /></div><div>At least under AV, if your candidate gets the fewest votes, you can transfer your vote to your next favourite so it is still counted. Under FPTP many people feel their votes are wasted because of the endless tactical voting between 2 parties, so people tend to vote based on who they *don't* want to win. Now that's not fair. At least people who vote for minority parties can still get a say.</div><div><br /></div><div>I saw the broadcast of the No2AV campaign, and it showed a woman teaching a class of people how it works. They seriously can't explain it any clearer? That's as clear as it gets? I've seen many examples of very simple and clear explanations, and I guess it's because of the No2AV campaign's inability to understand it that they are against it. I'm sure I could make FPTP sound incomprehensible if I wanted.</div><div><br /></div><div>"That’s why it is used by just three countries in the world – Fiji, Australia and Papua New Guinea."</div><div><br /></div><div>Yes, these backward, savage and unfair countries are prime examples of where democracy has failed. These countries use this system entirely because they want the second or third person to win.</div><div><br /></div><div>"We can't afford to let the politicians off the hook by introducing a loser's charter."</div><div><br /></div><div>Politicians are always off the hook under the current system. 
It favours the same old people who have safe seats, don't really need to do much, and very very rarely gives newer candidates a chance.</div><div><br /></div><div><br /></div><div><b>AV is a politician's fix</b></div><div><br /></div><div>"AV leads to more hung parliaments, backroom deals and broken promises like the Lib Dem tuition fees U-turn."</div><div><br /></div><div>So rather than have parties compromise, there has to be one supreme ruler who makes all the decisions for everyone, even if the majority of the voters didn't vote for them. People are much less represented under FPTP. Using the tuition fees U-turn as an example is trying to make it sound like it's entirely the Lib Dems' fault. Of course, the Conservatives wouldn't have broken such a promise... because they would have never made it in the first place. They'd cut it without a second thought... and they did.</div><div><br /></div><div>"Instead of the voters choosing the government, politicians would hold power."</div><div><br /></div><div>It's amazing how they state something without any explanation. This statement is exactly the opposite of the case. At the moment, the politicians hold the power because you have so little to choose from, and people are under-represented in government. It's the same old party politics we've been locked in for goodness knows how long.</div><div><br /></div><div>"Under AV, the only vote that really counts is Nick Clegg's."</div><div><br /></div><div>There's no attempt to explain what they mean by this, or how they came to this conclusion.</div><div><br /></div><div>"We can't afford to let the politicians decide who runs our country."</div><div><br /></div><div>Agreed. What's your point? What's this got to do with AV? You may as well say, "We can't let the terrorists win.", "We mustn't have children's right to education removed.", "We can't afford to increase food prices ten-fold." How is this relevant to AV? 
Again, no explanation.</div><div><br /></div><div>"NOtoAV is a campaign that has support from right across the country."</div><div><br /></div><div>The BNP can make the same claim. What's your point?</div><div><br /></div><div>So the only points they make, they either don't explain, or, if they do explain, don't make sense when applied to AV.</div><div><br /></div><div>When the Conservatives were running for government last year, their slogan was "Vote For Change". They seem to have changed their tune.</div><div><br /></div><div>So if I were to run a Yes2AV campaign like the No2AV campaign run theirs, I'd say the following:</div><div><br /></div><div>- FPTP is used by all terrorists around the world.</div><div><br /></div><div>- To continue to use FPTP will cost billions of pounds of taxpayers' money</div><div><br /></div><div>- The people you don't want to get in will be the only ones who get in</div><div><br /></div><div>- Many children die under FPTP</div><div><br /></div><div>- A vote against AV is a vote for Hitler</div><div><br /></div><div>See, no explanation needed. Facts and explanations are clearly complicated and confusing. Just use vague statements and outrageous claims and you'll be fine.</div>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3748796162921493670.post-14039965023225711082011-02-07T15:08:00.013+00:002011-02-07T16:13:50.091+00:00PostgreSQL @ FOSDEM 2011I returned from Brussels with Dave Page yesterday where I attended FOSDEM. It was my first time at this event and I had a great time. 
Loads of organisations were represented there including Mozilla (promoting Firefox), <a href="http://www.flickr.com/photos/dark_ixion/5424798041/">LibreOffice</a>, OpenOffice, Debian, <a href="http://www.flickr.com/photos/dark_ixion/5425392194/">OpenSUSE</a>, <a href="http://www.flickr.com/photos/dark_ixion/5424784523/">Ubuntu</a>, <a href="http://www.flickr.com/photos/dark_ixion/5424788273/">Fedora</a>, FreeBSD, Perl, FSFE, Mandriva, Gnome, KDE, CentOS, CAcert, and many others.<div><br /></div><div>And of course, PostgreSQL, where I was helping out, selling various items and talking to people about Postgres itself. All the plush blue elephants were sold (and there were quite a lot), all the stress balls sold, many t-shirts and backpacks sold, and all the pins and pens were given away too. In fact the last pin was taken by a friendly Monty Widenius (creator of MySQL and now MariaDB) which he put on there and then. Perl, who were our next-door neighbours, bought one of the plush PostgreSQL elephants and placed it <a href="http://www.flickr.com/photos/dark_ixion/5425401600/">atop their massive Perl camel</a>, which we then declared made it <a href="http://www.postgresql.org/docs/current/static/plperl.html">pl/Perl</a>. And one person even bought a batch of 100 stress balls which were placed in a PostgreSQL backpack for him to carry them off with. We also introduced a promotion which stated that if you could prove that you're an Oracle employee, you got a stress ball completely free of charge. Sadly no-one took us up on that offer.</div><div><br /></div><div>Unfortunately I didn't make it to many talks. I only attended three which were Heikki Linnakangas from EnterpriseDB talking about creating custom data types and operators in PostgreSQL, Tatsuo Ishii from SRA OSS Inc introducing <a href="http://pgpool.projects.postgresql.org/">pgPool II</a> version 3, and Damien Clochard of Dalibo explaining <a href="http://pgxn.org/">PGXN</a>. 
I had hoped to make it to at least one of the LibreOffice talks and get to talk to their guys at the booth, but didn't get round to it, and there was also a talk from folk at Sirius about open source software and whether the UK government is backing it, which I had hoped to see but was too late for. But the whole thing was a complete geek-fest, and I'd definitely want to go again.</div>Unknownnoreply@blogger.com1tag:blogger.com,1999:blog-3748796162921493670.post-7168838831188223712011-02-02T23:26:00.003+00:002011-02-03T00:09:18.442+00:00FOSDEM, here we come!<a href="http://fosdem.org">FOSDEM</a> is running this weekend in Brussels, and there are a plethora of talks to attend. Naturally I'll be very interested in the PostgreSQL-related talks, and they are as follows:<div><br /></div><div><br /></div><div><b>Sunday - 10am</b></div><div><br /></div><div><a href="http://fosdem.org/2011/schedule/event/pl_parrot">PL/Parrot</a> - David Fetter will be giving his talk about this new procedural language addition. I don't actually know much about this, which is why I'm especially interested.</div><div><br /></div><div><a href="http://fosdem.org/2011/schedule/event/async_notifies">Asynchronous Notifications for Fun and Profit</a> - Marc Balmer's talk on asynchronous notifications in PostgreSQL will cover message brokering in distributed environments. 
This is, annoyingly, at the same time as David's talk.</div><div><br /></div><div><br /></div><div><b>Sunday - 11am</b></div><div><br /></div><div><a href="http://fosdem.org/2011/schedule/event/pg_extension1">PostgreSQL extension's (sic) development</a> - Dimitri Fontaine will be sharing his expertise on extension development, hopefully covering the subject of what he's been hard at work on: extension management within PostgreSQL itself.</div><div><br /></div><div><br /></div><div><b>Sunday - 12pm</b></div><div><br /></div><div><a href="http://fosdem.org/2011/schedule/event/user_defined_type">Writing a user-defined type</a> - The scarily clever Heikki Linnakangas will explain how you can create your own data type, create operators to work with it, and methods to apply indexing to them.</div><div><br /></div><div><br /></div><div><b>Sunday - 2pm</b></div><div><br /></div><div><a href="http://fosdem.org/2011/schedule/event/pgpool_introduction">Introduction to pgpool-II version 3</a> - Tatsuo Ishii will be coming all the way from Kanagawa in Japan to talk about the all new version of pgPool II, and how it takes advantage of the new built-in streaming replication feature introduced in PostgreSQL 9.0.</div><div><br /></div><div><br /></div><div><b>Sunday - 3pm</b></div><div><br /></div><div><a href="http://fosdem.org/2011/schedule/event/pg_extension2">Get ready for the PostgreSQL Extension Network</a> - Damien Clochard will talk about the exciting new <a href="http://pgxn.org/">PGXN project</a> (PostgreSQL Extension Network - kind of like Perl's CPAN) which David Wheeler has been developing.</div><div><br /></div><div><br /></div><div><b>Sunday - 4pm</b></div><div><br /></div><div><a href="http://fosdem.org/2011/schedule/event/clustered_databases">Using MVCC for Clustered Databases</a> - Marcus Wanner's talk will be on utilising MVCC (Multi-Version Concurrency Control) in the cluster, and how <a href="http://www.postgres-r.org/">Postgres-R</a> (a multi-master 
replication extension to PostgreSQL) uses it to good effect.</div><div><br /></div><div>Besides PostgreSQL-related talks, other ones I'm interested in are:</div><div><ul><li>Sirius: Is the UK Government backing Free Software?</li><li>flashrom: Run your BIOS/EFI/firmware updates under any free OS</li><li>Happily hacking LibreOffice</li><li>The Document Foundation - four months in the making</li></ul><div>... and no doubt I'll find others that'll pique my interest too. This will be my first time at FOSDEM, so I'm really looking forward to it.</div></div><div><br /></div>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3748796162921493670.post-91334417624480785362011-01-31T10:21:00.015+00:002011-01-31T10:28:04.478+00:00Performance changes in 9.1The upcoming PostgreSQL 9.1 release is packed full of awesome new features, but let's not forget the performance improvements which will also arrive with it. Every version of PostgreSQL brings about performance improvements in various areas, and 9.1 is no different. Here's a brief overview of some, but not all, of the changes affecting performance:<br /><br /><div><br /><span style="font-weight:bold;">Checkpoints now with fewer spikes</span><br /><br />When checkpoints occur on a database, they can result in an I/O spike. 9.1 has the option to spread the checkpoint fsync calls out so that the system doesn't grind to a halt when flushing. In addition to this, there's been a change which will remove duplicate requests to the background writer, meaning even less I/O, so checkpoints are now lighter and smoother than ever. 
These changes haven't yet made it in, but they're in progress.<br /><br /><div><br /><b>Integer to text casts quicker</b><br /><br />The conversion of a signed integer to text now happens in hand-coded internal code as it's faster than the previous standard function being used. This means any queries where many rows have columns being cast from an integer to text will now spend less time converting the data, resulting in faster query times.<br /><br /><br /><b>More efficient commits</b><br /><br />The setting commit_siblings currently requires a scan of all active commits to see whether there are enough transactions committed before enforcing the commit delay. This will now stop checking once it's counted enough active commits, or not count at all if commit_siblings is set to 0. This improvement will be most noticeable in high-transaction environments.<br /><br /></div><div><br /><b>pg_upgrade even faster</b><br /><br />pg_upgrade (a tool to quickly upgrade between versions of PostgreSQL without dump/reload) is already fast, but for databases with a large number of tables, it will be noticeably quicker in 9.1 as relations are joined rather than looped through with a nested loop join. Previously it would have to keep checking a list of new tables against each old table, but now the lists will be in sync, so they can be checked in parallel.<br /><br /></div><div><br /><b>Unlogged tables</b><br /><br />In most cases, you probably need to ensure the data you place into a table is never lost, so everything in that table is written to the Write Ahead Log. But what if the table stored information which wasn't that important, such as data which expires fairly quickly, or just a place to put data that's stored elsewhere which you merely wish to create a quick report on? 
This will be possible with unlogged tables which are much faster to insert into, update and delete from, as there's no extra step to write to the WAL files. The data will still be in the database if you cleanly shut down, but if the database crashes or has an unexpected shutdown, the table will be empty. It will also be possible to specify that unlogged tables be skipped in backups, meaning database backups will be faster too if unlogged tables are present in the database.</div></div>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3748796162921493670.post-81419619152517133332010-11-16T18:30:00.010+00:002013-04-18T08:04:10.537+01:00count_if (expression)<p><br />Someone just posted a problem on a PostgreSQL mailing list. They had several boolean columns, and wanted to count how many values were true in each column, all in 1 query. There are several ways to do this, none of which is particularly elegant.<br /></p><p>Say we have the following table definition:<br /></p><pre>CREATE TABLE subscribers<br />(<br /> subscriber text PRIMARY KEY,<br /> news boolean DEFAULT false,<br /> sport boolean DEFAULT false,<br /> promotions boolean DEFAULT false,<br /> horrible_spam boolean DEFAULT false<br />);</pre><br /><pre>INSERT INTO subscribers (subscriber, news, sport, promotions, horrible_spam)<br />VALUES<br />('person_a', true, false, false, false),<br />('person_b', true, true, true, false),<br />('person_c', true, false, true, true),<br />('person_d', false, false, false, false),<br />('person_e', true, true, false, false),<br />('person_f', true, false, true, true),<br />('person_g', true, false, false, false);<br /></pre><p><br />Now we want a count of how many people subscribe to each category. 
We can use a case statement for each column:<br /></p><pre>SELECT sum(CASE news WHEN true THEN 1 END) AS "news count",<br /> sum(CASE sport WHEN true THEN 1 END) as "sport count",<br /> sum(CASE promotions WHEN true THEN 1 END) as "promotions count",<br /> sum(CASE horrible_spam WHEN true THEN 1 END) as "horrible_spam count"<br />FROM subscribers;<br /></pre><p>Which produces:</p><pre> news count | sport count | promotions count | horrible_spam count <br />------------+-------------+------------------+---------------------<br /> 6 | 2 | 3 | 2<br />(1 row)<br /></pre><p>This looks almost unreadable and is very cumbersome. There's another way however, although it's no better. We use 4 subqueries:</p><br /><pre>SELECT (SELECT count(news) FROM subscribers WHERE news = true) AS "news count",<br /> (SELECT count(sport) FROM subscribers WHERE sport = true) AS "sport count",<br /> (SELECT count(promotions) FROM subscribers WHERE promotions = true) AS "promotions count",<br /> (SELECT count(horrible_spam) FROM subscribers WHERE horrible_spam = true) AS "horrible_spam count";</pre><pre> news count | sport count | promotions count | horrible_spam count <br />------------+-------------+------------------+---------------------<br /> 6 | 2 | 3 | 2<br />(1 row)<br /></pre><p>Arghhh.. my eyes! Okay, we can try casting booleans to integers:</p><pre>SELECT count(news::int) AS "news count",<br /> count(sport::int) AS "sport count",<br /> count(promotions::int) AS "promotions count",<br /> count(horrible_spam::int) AS "horrible_spam count"<br />FROM subscribers;</pre><pre> news count | sport count | promotions count | horrible_spam count <br />------------+-------------+------------------+---------------------<br /> 7 | 7 | 7 | 7<br />(1 row)<br /></pre><p>Hmmm.. obviously not.</p><p>Surely there's a better way? Well no, there isn't... 
unless we create an aggregate function which will count based on an expression.</p><pre>CREATE OR REPLACE FUNCTION countif_add(current_count int, expression bool)<br />RETURNS int AS<br />$BODY$<br /> SELECT CASE expression<br /> WHEN true THEN<br /> current_count + 1<br /> ELSE<br /> current_count<br /> END;<br />$BODY$<br />LANGUAGE SQL IMMUTABLE;</pre><pre>CREATE AGGREGATE count_if (boolean)<br />(<br /> sfunc = countif_add,<br /> stype = int,<br /> initcond = 0<br />);</pre><p>Now we can use a far tidier query.</p><pre>SELECT count_if(news) AS "news count",<br /> count_if(sport) AS "sport count",<br /> count_if(promotions) AS "promotions count",<br /> count_if(horrible_spam) AS "horrible_spam count"<br />FROM subscribers;</pre><pre> news count | sport count | promotions count | horrible_spam count <br />------------+-------------+------------------+---------------------<br /> 6 | 2 | 3 | 2<br />(1 row)<br /></pre><p>But this function lets us do more than count true values in boolean columns. It's evaluating expressions. 
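<p>If you want to experiment with the transition-function idea without a PostgreSQL server to hand, here's a hedged sketch of the same aggregate built with Python's sqlite3 module. The class-based API stands in for CREATE AGGREGATE, <code>step()</code> plays the role of countif_add, and the cut-down subscribers table is purely illustrative (SQLite hands boolean expressions to the aggregate as 0/1 or NULL, so a truthiness check gives the same CASE logic).</p>

```python
import sqlite3

# A count_if-style custom aggregate: step() mirrors countif_add -- increment
# on true, leave the running count unchanged on false/NULL.
class CountIf:
    def __init__(self):
        self.count = 0

    def step(self, expression):
        if expression:          # true -> increment; false/NULL -> unchanged
            self.count += 1

    def finalize(self):
        return self.count

conn = sqlite3.connect(":memory:")
conn.create_aggregate("count_if", 1, CountIf)
conn.executescript("""
CREATE TABLE subscribers (subscriber TEXT PRIMARY KEY, news INT, sport INT);
INSERT INTO subscribers VALUES
  ('person_a', 1, 0),
  ('person_b', 1, 1),
  ('person_c', 0, 0);
""")
result = conn.execute(
    "SELECT count_if(news), count_if(sport) FROM subscribers"
).fetchone()
print(result)  # -> (2, 1)
```

<p>As with the SQL version, the aggregate is called once per row with the already-evaluated expression, so any boolean expression works as the argument.</p>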
So you can have queries like:</p><pre>SELECT count_if(my_column > 7) AS "things bigger than 7",<br /> count_if(this_column = that_column) AS "stuff that matches"<br />FROM my_table;</pre>Unknownnoreply@blogger.com4tag:blogger.com,1999:blog-3748796162921493670.post-78762321781737922402010-10-07T09:41:00.003+01:002010-10-07T09:45:51.977+01:00PGDay.EU 2010 Call for Papers ending soon!Dave Page posted this on the European PostgreSQL mailing list, but thought I'd also put this out there...<br /><br />This year's European PostgreSQL Conference is due to be held on the 6 -<br />8th December 2010 in Stuttgart, Germany (<a href="http://2010.pgday.eu/">http://2010.pgday.eu/</a>).<br />Following on from last year's extremely successful event in Paris, we<br />hope to make the show bigger and better than before, with:<br /><br />* A 4 star hotel venue (with spa facilities in the complex)<br /><br />* A special guest keynote speaker (to be announced Real Soon Now :-) )<br /><br />* Multiple talk tracks in English and German<br /><br />* Talks for PostgreSQL hackers and users, as well as decision makers<br /><br />* A day of training sessions<br /><br />Of course, to make that happen we need talk proposals from our<br />community members and users. 
We're looking for presentations aimed at<br />developers, users and decision makers on topics such as:<br /><br />- Developing applications for PostgreSQL<br />- Administering large scale PostgreSQL installations<br />- Case studies of PostgreSQL deployments<br />- PostgreSQL tools and utilities<br />- PostgreSQL hacking<br />- Community & user groups<br />- Tuning the server<br />- Migrating from other systems<br />- Scaling/replication<br />- Benchmarking & hardware<br />- PostgreSQL related products<br />- Deployment of PostgreSQL from a business perspective<br />- Applications and FOSS projects that use PostgreSQL<br /><br />So if you can make it to Germany in December, please take a look at<br />the call for papers page on the website<br />(<a href="http://2010.pgday.eu/callforpapers">http://2010.pgday.eu/callforpapers</a>), and send in your proposal before<br />the 11th October!<br /><br />Please forward this message to any colleagues or friends that you<br />believe may be interested in presenting at the conference.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-3748796162921493670.post-60809862736647024892010-09-22T09:00:00.003+01:002010-09-22T12:45:34.738+01:00PostgreSQL docs reloaded<div>If you regularly reference the online PostgreSQL docs, you may have noticed that they have just had a face-lift in time for the big 9.0 release. If not, <a href="http://www.postgresql.org/docs/9.0/static/functions-datetime.html">take a look</a>. This went through quite a few revisions with input from many folk in the Postgres community. As with all things, you won't please everyone, and this is no less true in the case of this redesign. But look at this as a pilot change, where we've come up with a first version with possible "bug-fixes" to follow. I've also provided similar changes for the <a href="http://docs.postgresql.fr/9.0/functions-datetime.html">French version</a>, although there's only been feedback from one person for that. 
Regardless, they appear to have gone with it anyway.</div><br /><div>I'm no designer, and neither are the majority of the people who contributed their recommendations, but the original reason I wanted to make changes was so that the examples which appear throughout the docs would be distinct from the main body text. Previously they all appeared to merge into one, which made it difficult to pick out the various elements on the page. This involved simple indentation, but I thought that while I was there, I'd get rid of the horrible default borders on the tables, and then show warnings in a red box, and notes in beige... etc., etc.</div><br /><div>So now I'm soliciting feedback. I firmly believe that PostgreSQL has some of the best documentation out there; clear, comprehensive and organised. So design changes mustn't be distracting, and should aid readability and help communicate. Is it confusing? Does it make you want to cry?.. with joy.. with pain?</div>Unknownnoreply@blogger.com17tag:blogger.com,1999:blog-3748796162921493670.post-13000286144261585492010-09-05T20:23:00.002+01:002010-09-05T20:39:56.979+01:00Why I refuse to use transporter technologyAs you are aware, transporter technology has allowed us to deliver objects across great distances almost instantaneously. Not only does this mean we don't have to wait for items we've ordered to take a long journey across land and sea, but it also saves us a huge amount of money in postage costs.<div><br /></div><div>However, while people have started to use this to travel across the globe, I can't bring myself to do it. The reason is simple: I will be killed off forever. Here's my rationale...</div><div><br /></div><div>When the Transmat sequence begins, it starts by converting your matter to energy, and during this process every molecule and electro-chemical reaction in your body is recorded with its exact position, electrical charge, temperature... everything copied down into the network's distributed memory. This data is then sent to the rematerialisation hub nearest to your destination. The hub then generates the required energy to reproduce you, and proceeds to convert your energy back into matter. Now in theory, this data could be duplicated and two copies of you could be generated. And here's where the disturbing truth comes into play: you're no longer yourself.</div><div><br /></div><div>Imagine you hacked a Transmat program to run through its routine, but instead of taking your matter and energy, it just copies them into memory. It then contacts a rematerialisation hub, say the one nearest you, to generate another you from that data. The "you" that has arrived will be completely the same as you, and he'll think "I'm a copy?". And you'll look at him knowing that's not you, as you're looking on from your original self. So that person *isn't* you. But in the normal Transmat process, you would have been converted to energy, and this other "you" will go about convinced he's still the original, having just arrived in a new location.</div><div><br /></div><div>So you can see, transporter technology kills people and then creates people, but does it in a way that gives the illusion of "transportation". It should really be called "destroying and cloning". I'll stick with the subterranean highways for now, at least until this promising portal technology becomes mainstream.</div>Unknownnoreply@blogger.com2tag:blogger.com,1999:blog-3748796162921493670.post-55151452508801292492010-08-06T22:00:00.003+01:002010-08-11T16:19:58.486+01:00GMail + Lab Features = perfect PostgreSQL Mailing list manager<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg6byeLPndb1HTxerlMrIOCJaPnSq2LhjCChKQ-qTq2i_pUqkcaHsGr8muhO9ubKTzOnoQuSHEI5UooSdjtP600DlR13fxFUfUZHz5SGEVDz5aD3N5Z8I1V_nIXUo8SePSznX7PbrindGYq/s1600/gmail_nested_labels.png"><img style="float: right; margin: 0pt 0pt 10px 10px; cursor: pointer; width: 88px; height: 400px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg6byeLPndb1HTxerlMrIOCJaPnSq2LhjCChKQ-qTq2i_pUqkcaHsGr8muhO9ubKTzOnoQuSHEI5UooSdjtP600DlR13fxFUfUZHz5SGEVDz5aD3N5Z8I1V_nIXUo8SePSznX7PbrindGYq/s400/gmail_nested_labels.png" alt="" id="BLOGGER_PHOTO_ID_5504172278478608114" border="0" /></a><br />I've been following many PostgreSQL mailing lists for some time now, and because so many emails arrive in my inbox every hour, they need organising. Until recently I've used a filter on my email to label everything to and from *@postgresql.org with the label "PostgreSQL", to mark it as read and to archive it (i.e. skip the inbox). That way I could just click on the PostgreSQL label and view all the emails. But I subscribe to a LOT of the groups, and they're all bundled into the same label.<div><br /><div>So GMail Lab features come to the rescue, namely "<b>Nested Labels</b>" and "<b>Hide Labels from Subjects</b>". This does require a fair amount of initial setup, but it only needs to be done once. Basically, Nested Labels allows for a tree-like structure of labels defined by /parent/child/grandchild/ etc. For example, I've set up a new filter for all emails to pgsql-general to be labelled PostgreSQL/General, which puts it in the General label beneath the PostgreSQL label (see the image to the right). In my case, I've also given them different shades depending on how hardcore the category is, with Novice as light blue, General as blue and Hackers as dark blue, but that's purely cosmetic.</div><div><br /></div><div>If you're at all familiar with GMail labels, you'll know that they appear next to the subject line of messages too, but with this new setup, that can become problematic. Not only am I still applying the original PostgreSQL label to all of them, but I'm applying at least one new one, and potentially several others if the message was also sent to other groups. This means it becomes difficult to read the subject lines.
And this is where "Hide Labels from Subjects" comes in, which prevents labels from taking up subject space by removing them from it completely.</div><div><br /></div><div>I should also mention another GMail Lab Feature I've enabled for these mailing lists: "<b>Default 'Reply to all'</b>", as whenever you're replying to someone on a mailing list, you should always copy in the entire list too, otherwise only the sender sees your message.</div><div><br /></div><div>And while not strictly related to mailing lists, another Lab Feature I find worryingly useful for any email is "<b>Undo Send</b>". By default, this gives you 10 seconds to change your mind about sending an email: it provides an "Undo" link upon sending a message, and if you click it within 10 seconds of sending, the message goes back into edit mode and isn't sent. However, if you navigate elsewhere in your inbox during this time, you'll lose that opportunity. I didn't think I'd use it much to start off with, but I'm surprised at how often I've looked at my email after sending and realised there was either a mistake or something I'd missed out.</div></div>Unknownnoreply@blogger.com0