Eliminating full table scans for slow paginated SQL queries
Sem Postma, September 9, 2023
You often do not need to know the total page count. Here is an easy way to rewrite your SQL queries with the potential to drastically improve performance.
If your ORM is returning paginated responses it will often run an additional query calculating the total page count. You may also be doing this yourself if you are not using an ORM.
# normal query:
SELECT * FROM pages
WHERE view_count > 54321 AND updated_at > '2020-01-01' LIMIT 10 OFFSET 10
# query which retrieves total number of records:
SELECT COUNT(*) FROM pages
WHERE view_count > 54321 AND updated_at > '2020-01-01'
This is completely fine for small/medium size databases but for big databases, this won’t scale. Although databases can do a lot of optimizations, these queries can get very slow, especially without any indexes because the database will often do a full table scan. Most ORMs provide the option to disable these expensive queries. You won’t get the total amount of pages but that may not be an issue for your use case. Often your application will display several pages and a next button on the bottom of your page like this:
You should consider if it’s really necessary to show the last page (744). If you are always displaying a maximum of 10 pages you can very easily eliminate a full count like this:
# 10 records per page
SELECT COUNT(a.id) FROM (
SELECT id FROM pages
WHERE view_count > 521 AND updated_at > '2020-01-01'
LIMIT 100 OFFSET 10
) AS a
Now you are only ever counting a 100 rows max and assuming 10 record per page, you are only fetching the next 10 pages instead of the entire result set.