Wednesday, March 12, 2014

Learning PostgreSql: different behavior with READ UNCOMMITTED/COMMITTED

Because of multi-valued concurrency control aka MVCC, there are many differences in queries' behavior with different isolation levels. Before running examples, let us set up test data.

Setting up test data

The following script sets up the data we shall be playing with:

DROP TABLE test;CREATE TABLE test(ID INT NOT NULL, CONSTRAINT PK_test PRIMARY KEY(ID),col2 INT NOT NULL, col3 FLOAT);INSERT INTO test(IDcol2col3)VALUES(1,0,0),(8,1,1); 

There are no dirty reads.

In one session, let us insert an uncommitted row:

BEGIN TRANSACTION;
INSERT INTO test(IDcol2col3)VALUES(3,1,1); 

In another session let us try to read uncommitted data:

BEGIN ISOLATION LEVEL READ UNCOMMITTED;SELECT FROM test;1;0;0
8
;1;1 

Clearly, we do not see the uncommitted row inserted by another session. This is exactly as documented:
"When you select the level Read Uncommitted you really get Read Committed, and phantom reads are not possible in the PostgreSQL implementation of Repeatable Read, so the actual isolation level might be stricter than what you select"

READ COMMITTED behaves like READ_COMMITTED_SNAPSHOT

At this point we still have an uncommitted row. Let up keep the inserting transaction open, commit the selecting one, and begin another selecting transaction:

BEGIN ISOLATION LEVEL READ COMMITTED;SELECT FROM test;1;0;0
8
;1;1  

This SELECT is working exactly as the previous one, exactly as documented. This behavior is different from ow SQL Server executes SELECT with READ COMMITTED, where the SELECT would be blocked. However, this is consistent with how SQL Server executes SELECTs under READ_COMMITTED_SNAPSHOT.
If we commit the insert and rerun the select without committing the selecting transaction, we shall see the new row:

1;0;0
8;1;1
3;1;1 

Again, this is consistent with how SQL Server executes SELECTs under READ_COMMITTED_SNAPSHOT.

Multi-statement functions and READ COMMITTED

When multi-statement functions run under READ COMMITTED isolation level, each statement can see the committed data as of the beginning of the statement, and not as of the beginning of function execution. This is consistent with how SQL Server executes SELECTs under READ_COMMITTED_SNAPSHOT.
The following function selects two snapshots of data ten seconds apart:

CREATE OR REPLACE FUNCTION two_snapshots()RETURNS TABLE(test_id intsnapshot_name varcharAS $BODY$BEGIN
    CREATE TEMP TABLE 
before_sleep AS
    SELECT 
ID from test;
    
PERFORM pg_sleep(10);
    
RETURN QUERY SELECT IDCAST('before sleep' AS VARCHARFROM before_sleep
        
UNION ALL
                 
SELECT ID'after sleep' FROM test;
    
DROP TABLE before_sleep;END;$BODY$
LANGUAGE plpgsql
; 

Let us open a window and cut and paste the following modification, but not run it yet:  

INSERT INTO test(IDcol2col3)VALUES(16,1,1);

In another session, let us invoke the function:

SELECT FROM two_snapshots(); 

We have ten seconds to run the insert - that should be more than enough. Once the function completes, it returns the following result, including the row inserted after the function has been invoked:
1;"before sleep"
8;"before sleep"
3;"before sleep"
1;"after sleep"
8;"after sleep"
3;"after sleep"
16;"after sleep" 
As we have seen, the data modified after the function has been invoked is still visible to commands in the function, as long as the data has been committed before the command started.
There is much more to discuss about isolation levels in PostgreSql - we shall continue later.

No comments:

Post a Comment