Postgres not using index with ORDER BY and LIMIT when LIMIT is above a certain value
I have been trying to debug an issue with Postgres where it decides not to use an index when LIMIT is above a specific value.
For example, I have a table of 150k rows, and when searching with LIMIT 286 the planner uses the index, while with any LIMIT above 286 it does not.
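For reference, here is a simplified sketch of the two tables and the indexes involved, reconstructed from the plans below (column types and constraints are approximations, not the real DDL):

-- simplified sketch only; real column types/constraints may differ
CREATE TABLE tempz.tempy (
    id         bigint PRIMARY KEY,
    int_col    integer,
    string_col text
);
CREATE INDEX tempy_string_col_idx ON tempz.tempy (string_col);

CREATE TABLE tempz.tempx (
    id       bigint PRIMARY KEY,
    id_tempy bigint REFERENCES tempz.tempy (id),
    name     text,
    type     text
    -- plus other columns not relevant here
);
CREATE INDEX tempz_tempx_name_type_id_idx ON tempz.tempx (name, type, id);
CREATE INDEX tempx_id_tempy_idx ON tempz.tempx (id_tempy);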
LIMIT 286 uses the index:
db=# explain (analyze, buffers) SELECT * FROM tempz.tempx AS r INNER JOIN tempz.tempy AS z ON (r.id_tempy=z.id) WHERE z.int_col=2000 AND z.string_col='temp_string' ORDER BY r.name ASC, r.type ASC, r.id ASC LIMIT 286;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.56..5024.12 rows=286 width=810) (actual time=0.030..0.992 rows=286 loops=1)
   Buffers: shared hit=921
   ->  Nested Loop  (cost=0.56..16968.23 rows=966 width=810) (actual time=0.030..0.977 rows=286 loops=1)
         Join Filter: (r.id_tempy = z.id)
         Rows Removed by Join Filter: 624
         Buffers: shared hit=921
         ->  Index Scan using tempz_tempx_name_type_id_idx on tempx r  (cost=0.42..14357.69 rows=173878 width=373) (actual time=0.016..0.742 rows=910 loops=1)
               Buffers: shared hit=919
         ->  Materialize  (cost=0.14..2.37 rows=1 width=409) (actual time=0.000..0.000 rows=1 loops=910)
               Buffers: shared hit=2
               ->  Index Scan using tempy_string_col_idx on tempy z  (cost=0.14..2.37 rows=1 width=409) (actual time=0.007..0.008 rows=1 loops=1)
                     Index Cond: (string_col = 'temp_string'::text)
                     Filter: (int_col = 2000)
                     Buffers: shared hit=2
Planning Time: 0.161 ms
Execution Time: 1.032 ms
(16 rows)
vs.
LIMIT 287 does a sort instead:
db=# explain (analyze, buffers) SELECT * FROM tempz.tempx AS r INNER JOIN tempz.tempy AS z ON (r.id_tempy=z.id) WHERE z.int_col=2000 AND z.string_col='temp_string' ORDER BY r.name ASC, r.type ASC, r.id ASC LIMIT 287;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=4976.86..4977.58 rows=287 width=810) (actual time=49.802..49.828 rows=287 loops=1)
   Buffers: shared hit=37154
   ->  Sort  (cost=4976.86..4979.27 rows=966 width=810) (actual time=49.801..49.813 rows=287 loops=1)
         Sort Key: r.name, r.type, r.id
         Sort Method: top-N heapsort  Memory: 506kB
         Buffers: shared hit=37154
         ->  Nested Loop  (cost=0.42..4932.59 rows=966 width=810) (actual time=0.020..27.973 rows=51914 loops=1)
               Buffers: shared hit=37154
               ->  Seq Scan on tempy z  (cost=0.00..12.70 rows=1 width=409) (actual time=0.006..0.008 rows=1 loops=1)
                     Filter: ((int_col = 2000) AND (string_col = 'temp_string'::text))
                     Rows Removed by Filter: 2
                     Buffers: shared hit=1
               ->  Index Scan using tempx_id_tempy_idx on tempx r  (cost=0.42..4340.30 rows=57959 width=373) (actual time=0.012..17.075 rows=51914 loops=1)
                     Index Cond: (id_tempy = z.id)
                     Buffers: shared hit=37153
Planning Time: 0.258 ms
Execution Time: 49.907 ms
(17 rows)
Update:
This is Postgres 11 and VACUUM ANALYZE is run daily. Also, I have already tried using a CTE to pull out the filter on tempy (roughly as sketched below, after the plan fragment), but the problem is specifically the sorting step:
   ->  Sort  (cost=4976.86..4979.27 rows=966 width=810) (actual time=49.801..49.813 rows=287 loops=1)
         Sort Key: r.name, r.type, r.id
         Sort Method: top-N heapsort  Memory: 506kB
         Buffers: shared hit=37154
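The CTE variant I tried looked roughly like this (simplified; in Postgres 11 a CTE is always an optimization fence, so it only isolates the tempy filter and the top-N sort over tempx remains):

-- roughly the CTE rewrite I tried; it only moves the tempy filter,
-- the ORDER BY / LIMIT over tempx still needs a sort
WITH y AS (
    SELECT id
    FROM tempz.tempy
    WHERE int_col = 2000
      AND string_col = 'temp_string'
)
SELECT r.*
FROM tempz.tempx AS r
INNER JOIN y ON r.id_tempy = y.id
ORDER BY r.name ASC, r.type ASC, r.id ASC
LIMIT 287;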
Update 2:
After running VACUUM ANALYZE, the database starts using the index again for some hours and then goes back to not using it.
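For anyone trying to correlate this with (auto)analyze activity, the last analyze timestamps can be read from the standard pg_stat_user_tables view, e.g.:

-- when ANALYZE / autoanalyze last ran on the two tables involved
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE schemaname = 'tempz'
  AND relname IN ('tempx', 'tempy');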