Postgres not using index with ORDER BY and LIMIT when LIMIT is above a certain value
I have been trying to debug an issue with Postgres where it decides not to use an index when LIMIT is above a specific value.
For example, I have a table of 150k rows, and when searching with LIMIT 286 the planner uses the index, while with any LIMIT above 286 it does not.
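For reference, here is a simplified sketch of the two tables and the indexes involved, reconstructed from the plans below (column types and constraints are approximations, not the real DDL):

-- simplified sketch only; real column types/constraints may differ
CREATE TABLE tempz.tempy (
    id         bigint PRIMARY KEY,
    int_col    integer,
    string_col text
);
CREATE INDEX tempy_string_col_idx ON tempz.tempy (string_col);

CREATE TABLE tempz.tempx (
    id       bigint PRIMARY KEY,
    id_tempy bigint REFERENCES tempz.tempy (id),
    name     text,
    type     text
    -- plus other columns not relevant here
);
CREATE INDEX tempz_tempx_name_type_id_idx ON tempz.tempx (name, type, id);
CREATE INDEX tempx_id_tempy_idx ON tempz.tempx (id_tempy);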
LIMIT 286 uses the index:
db=# explain (analyze, buffers) SELECT * FROM tempz.tempx AS r INNER JOIN tempz.tempy AS z ON (r.id_tempy=z.id) WHERE z.int_col=2000 AND z.string_col='temp_string' ORDER BY r.name ASC, r.type ASC, r.id ASC LIMIT 286;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.56..5024.12 rows=286 width=810) (actual time=0.030..0.992 rows=286 loops=1)
   Buffers: shared hit=921
   ->  Nested Loop  (cost=0.56..16968.23 rows=966 width=810) (actual time=0.030..0.977 rows=286 loops=1)
         Join Filter: (r.id_tempy = z.id)
         Rows Removed by Join Filter: 624
         Buffers: shared hit=921
         ->  Index Scan using tempz_tempx_name_type_id_idx on tempx r  (cost=0.42..14357.69 rows=173878 width=373) (actual time=0.016..0.742 rows=910 loops=1)
               Buffers: shared hit=919
         ->  Materialize  (cost=0.14..2.37 rows=1 width=409) (actual time=0.000..0.000 rows=1 loops=910)
               Buffers: shared hit=2
               ->  Index Scan using tempy_string_col_idx on tempy z  (cost=0.14..2.37 rows=1 width=409) (actual time=0.007..0.008 rows=1 loops=1)
                     Index Cond: (string_col = 'temp_string'::text)
                     Filter: (int_col = 2000)
                     Buffers: shared hit=2
Planning Time: 0.161 ms
Execution Time: 1.032 ms
(16 rows)
vs.
LIMIT 287 does a sort instead:
db=# explain (analyze, buffers) SELECT * FROM tempz.tempx AS r INNER JOIN tempz.tempy AS z ON (r.id_tempy=z.id) WHERE z.int_col=2000 AND z.string_col='temp_string' ORDER BY r.name ASC, r.type ASC, r.id ASC LIMIT 287;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=4976.86..4977.58 rows=287 width=810) (actual time=49.802..49.828 rows=287 loops=1)
   Buffers: shared hit=37154
   ->  Sort  (cost=4976.86..4979.27 rows=966 width=810) (actual time=49.801..49.813 rows=287 loops=1)
         Sort Key: r.name, r.type, r.id
         Sort Method: top-N heapsort  Memory: 506kB
         Buffers: shared hit=37154
         ->  Nested Loop  (cost=0.42..4932.59 rows=966 width=810) (actual time=0.020..27.973 rows=51914 loops=1)
               Buffers: shared hit=37154
               ->  Seq Scan on tempy z  (cost=0.00..12.70 rows=1 width=409) (actual time=0.006..0.008 rows=1 loops=1)
                     Filter: ((int_col = 2000) AND (string_col = 'temp_string'::text))
                     Rows Removed by Filter: 2
                     Buffers: shared hit=1
               ->  Index Scan using tempx_id_tempy_idx on tempx r  (cost=0.42..4340.30 rows=57959 width=373) (actual time=0.012..17.075 rows=51914 loops=1)
                     Index Cond: (id_tempy = z.id)
                     Buffers: shared hit=37153
Planning Time: 0.258 ms
Execution Time: 49.907 ms
(17 rows)
Update:
This is Postgres 11 and VACUUM ANALYZE is run daily. Also, I have already tried using a CTE to pull out the filter on tempy (roughly as sketched below, after the plan fragment), but the problem is specifically the sorting step:
   ->  Sort  (cost=4976.86..4979.27 rows=966 width=810) (actual time=49.801..49.813 rows=287 loops=1)
         Sort Key: r.name, r.type, r.id
         Sort Method: top-N heapsort  Memory: 506kB
         Buffers: shared hit=37154
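The CTE variant I tried looked roughly like this (simplified; in Postgres 11 a CTE is always an optimization fence, so it only isolates the tempy filter and the top-N sort over tempx remains):

-- roughly the CTE rewrite I tried; it only moves the tempy filter,
-- the ORDER BY / LIMIT over tempx still needs a sort
WITH y AS (
    SELECT id
    FROM tempz.tempy
    WHERE int_col = 2000
      AND string_col = 'temp_string'
)
SELECT r.*
FROM tempz.tempx AS r
INNER JOIN y ON r.id_tempy = y.id
ORDER BY r.name ASC, r.type ASC, r.id ASC
LIMIT 287;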
Update 2:
After running VACUUM ANALYZE, the database starts using the index again for some hours and then goes back to not using it.
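For anyone trying to correlate this with (auto)analyze activity, the last analyze timestamps can be read from the standard pg_stat_user_tables view, e.g.:

-- when ANALYZE / autoanalyze last ran on the two tables involved
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE schemaname = 'tempz'
  AND relname IN ('tempx', 'tempy');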