What is `FOR UPDATE SKIP LOCKED` and how it can impact your query plans
You can use `SELECT ... FOR UPDATE SKIP LOCKED` as a very easy way to parallelize task processing in your application with minimal overhead. But be careful of the impact this can have on your query plans!
What is SELECT ... FOR UPDATE SKIP LOCKED
Introduced in Postgres 9.5, the `SKIP LOCKED` clause can be tacked onto the end of a `SELECT ... FOR UPDATE` statement and will return only rows that are not locked by any other ongoing transaction.
The normal behaviour of a `SELECT ... FOR UPDATE` operation is to wait for locked rows to be released before returning, which makes it quite difficult to parallelize such `SELECT` statements on a single table without an appropriate mechanism for either tracking which rows are already locked, or targeting different rows via an additional attribute (e.g. partitioning by modulo).
The introduction of `SKIP LOCKED` enabled a very powerful way for users to parallelize their processing without having to write locking mechanisms in their application; they can instead rely on the robust locking mechanism provided by Postgres.
Table Setup
Given a simple table structure like:
-- uuid_generate_v4() requires the uuid-ossp extension:
-- CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE TABLE t_random AS
SELECT
    uuid_generate_v4() AS id,
    ROUND(RANDOM()*100) AS c1,
    ROUND(RANDOM()*1000) AS c2,
    ROUND(RANDOM()*100) AS c3
FROM GENERATE_SERIES(1,100000) s;
On this table, let's add a `status` column, and assume that each row in the table is a job that needs to be executed.
ALTER TABLE t_random ADD COLUMN status VARCHAR(255);
UPDATE t_random
SET status =
    CASE
        WHEN (c1 + c2 + c3) < 300 THEN 'PENDING'
        WHEN (c1 + c2 + c3) BETWEEN 300 AND 500 THEN 'INPROGRESS'
        ELSE 'COMPLETED'
    END;
-- the index used by the query plans shown further below
CREATE INDEX trand_status ON t_random (status, id);
Running two side-by-side transactions, you can now experiment with `SELECT ... FOR UPDATE` and `SKIP LOCKED`.
-- session 1
begin;
SELECT id FROM t_random WHERE status='PENDING' ORDER BY id LIMIT 100 FOR UPDATE SKIP LOCKED;
-- the above will return 100 rows.
-- session 2
begin;
SELECT id FROM t_random WHERE status='PENDING' ORDER BY id LIMIT 100 FOR UPDATE SKIP LOCKED;
-- this will return a different set of 100 rows, allowing you to process these
-- in parallel.
end; -- the 100 rows are unlocked and are now available to other sessions.
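To complete the job-processing loop, a worker would typically also mark the rows it claimed before committing, so other sessions do not pick them up again once the locks are released. A minimal sketch, continuing session 1 above (the `INPROGRESS` value comes from the setup; the exact update is illustrative):

```sql
-- still inside session 1's transaction, before end;
UPDATE t_random
SET status = 'INPROGRESS'
WHERE id IN (
    -- re-issues the locking SELECT; rows locked by our own
    -- transaction are not skipped, so this claims the same rows
    SELECT id FROM t_random
    WHERE status = 'PENDING'
    ORDER BY id
    LIMIT 100
    FOR UPDATE SKIP LOCKED
);
end;
```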
Row locking under the hood
So how does row locking actually work under the hood? When running a `SELECT ... FOR UPDATE` query, Postgres will find the targeted rows and then take a `LockTupleExclusive` row lock, which in turn updates the header of each tuple, stamping its `xmax` value with the locking transaction's ID. This value affects which transactions can modify or lock the row. This requires both a write to disk, which persists the tuple change, and a write of the lock into shared memory, where it becomes viewable via the `pg_locks` view.
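You can observe the locks held by the sessions above from a third session. A quick sketch (the `relation` filter assumes the `t_random` table from the setup):

```sql
-- run from a separate session while session 1 holds its locks
SELECT locktype, relation::regclass AS relation, mode, granted, pid
FROM pg_locks
WHERE relation = 't_random'::regclass;
```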
This immediately highlights that row locking is an operation that incurs I/O. It also shows that, to check for row locks, the database engine needs to read the tuple itself, which means we can no longer get Index Only Scan query plans, and performance suffers accordingly.
You can prepend `EXPLAIN ANALYZE` to the above queries to have a peek under the hood:
EXPLAIN ANALYZE SELECT id FROM t_random WHERE status = 'PENDING' ORDER BY id LIMIT 100;
                                                                  QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..22.91 rows=100 width=16) (actual time=4.240..4.642 rows=100 loops=1)
-> Index Only Scan using trand_status on t_random (cost=0.42..80932.71 rows=359853 width=16) (actual time=4.237..4.630 rows=100 loops=1)
Heap Fetches: 91
Planning Time: 0.399 ms
Execution Time: 4.687 ms
(5 rows)
EXPLAIN ANALYZE SELECT id FROM t_random WHERE status = 'PENDING' ORDER BY ID LIMIT 100 FOR UPDATE SKIP LOCKED;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..41.94 rows=100 width=22) (actual time=0.552..0.780 rows=100 loops=1)
-> LockRows (cost=0.42..149404.06 rows=359853 width=22) (actual time=0.551..0.769 rows=100 loops=1)
-> Index Scan using trand_status on t_random (cost=0.42..145805.53 rows=359853 width=22) (actual time=0.536..0.676 rows=100 loops=1)
Filter: ((status)::text = 'PENDING'::text)
Planning Time: 0.212 ms
Execution Time: 0.813 ms
(6 rows)
The difference in the query plan between a query without `FOR UPDATE` and one with it is the presence of the `LockRows` node, which forces Postgres to check the row tuple headers, meaning we are now doing two reads: one from the index and one from the table.
Recommendation
It is recommended to lock only a small number of records in one transaction, because row locking has both a disk I/O cost and a memory overhead, as the locks are stored in shared memory.
If a large number of rows in a table are locked, query performance will start to noticeably drop and trend towards sequential-scan (`SEQSCAN`) performance levels, due to the sheer number of locked rows that have to be skipped.
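Applied to the example above, this recommendation translates into claiming small batches, for instance by locking and updating in a single statement via a CTE (the batch size of 10 is an arbitrary illustration):

```sql
BEGIN;
WITH batch AS (
    SELECT id
    FROM t_random
    WHERE status = 'PENDING'
    ORDER BY id
    LIMIT 10                  -- keep the locked set small
    FOR UPDATE SKIP LOCKED
)
UPDATE t_random t
SET status = 'INPROGRESS'
FROM batch
WHERE t.id = batch.id
RETURNING t.id;               -- the ids claimed by this worker
COMMIT;
```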
TL;DR: be mindful that even when you have an index, the query plan can be changed underneath you and can result in a seq scan.