So far, we have returned the full content of our tables. However, as our database grows, this might prove not to be the best approach in terms of performance. A popular solution is to serve the data in chunks by presenting multiple pages or implementing infinite scrolling. In this article, we implement the back-end part of it using NestJS and PostgreSQL. We also compare various approaches to achieving it and point out their advantages and disadvantages.
You can find the code from this article in this repository.
Offset and limit
Let’s start by investigating this simple query:
```sql
SELECT id, title FROM posts
```
It returns all of the records from the posts table.
A significant thing to acknowledge is that the order of the above rows is not guaranteed. However, when implementing pagination, we depend on the order of rows to be predictable. Therefore, we should use the ORDER BY clause.
```sql
SELECT id, title FROM posts
ORDER BY id ASC
```
To start paginating our data, we need to limit the number of rows in our query. To do that, we need the LIMIT clause.
```sql
SELECT id, title FROM posts
ORDER BY id ASC
LIMIT 10
```
Thanks to the above, we now get the first ten items instead of all of them. This allows us to present the user with the first page of results.
To serve the second page of the data, we need to specify the starting point of our query. We can use the OFFSET keyword to specify how many rows we want to skip.
```sql
SELECT id, title FROM posts
ORDER BY id ASC
OFFSET 10
LIMIT 10
```
Above, we omit the first ten posts and get ten posts in the results. In our case, it gives us entities with ids from 11 to 20. This is where the order of our data plays a significant role. We can easily modify it by changing the ORDER BY clause, but keeping some order is important.
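Since users usually think in terms of page numbers rather than row offsets, the translation between the two is simple arithmetic. A minimal sketch (the `getOffset` helper is a hypothetical name, not part of the article's code):

```typescript
// Translate a one-based page number into an OFFSET value.
function getOffset(page: number, pageSize: number): number {
  return (page - 1) * pageSize;
}

// Page 1 starts at row 0, page 2 skips the first ten rows, and so on.
const secondPageOffset = getOffset(2, 10); // 10
```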
Counting the number of rows
It is a common approach to display the number of data pages to the user. For example, if we have fifty rows and display ten per page, we have five data pages.
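The above calculation can be sketched in TypeScript (the `getPagesCount` helper is a hypothetical name). Using `Math.ceil` ensures that a partially filled last page still counts as a page:

```typescript
// Compute how many pages we need to display all of the rows.
function getPagesCount(totalRowsCount: number, pageSize: number): number {
  return Math.ceil(totalRowsCount / pageSize);
}

const pagesCount = getPagesCount(50, 10); // 5 pages, ten rows each
```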
To do the above, we need to know the number of rows of data in our table. To do that, we can use the COUNT keyword.
```sql
SELECT COUNT(*) AS all_posts_count FROM posts
```
We can use the COUNT keyword while selecting data from some columns. When doing that, we need to specify the section of the data we are counting by partitioning it. For example, we can count the number of posts by a certain author.
```sql
SELECT author_id, COUNT(*) OVER (PARTITION BY author_id) AS author_posts_count FROM posts
```
To present the results in a readable way, let’s only display one row per author using the DISTINCT keyword.
```sql
SELECT DISTINCT author_id, COUNT(*) OVER (PARTITION BY author_id) AS author_posts_count FROM posts
```
Above, we can see that the author with id 3 wrote two posts, and the author with id 2 wrote forty posts.
In our case, we want to count the total number of posts. However, even though that’s the case, we still need to use the OVER clause.
```sql
SELECT id, title, COUNT(*) OVER() AS total_posts_count FROM posts
ORDER BY id ASC
OFFSET 10
LIMIT 10
```
Grouping and partitioning data with the OVER() clause is a good topic for a separate article.
The whole idea is to count the number of rows and fetch their details in the same transaction to keep the integrity of the data. When we run a single query, PostgreSQL wraps it in a transaction out of the box.
We can define a transaction separately if we want to count the posts in a separate SELECT statement.
```sql
BEGIN;
  SELECT id, title FROM posts
  ORDER BY id ASC
  OFFSET 10
  LIMIT 10;

  SELECT COUNT(*) AS total_posts_count FROM posts;
COMMIT;
```
If you want to know more about transactions, check out API with NestJS #76. Working with transactions using raw SQL queries
It is also important to notice that PostgreSQL returns the result of COUNT as a bigint. The maximum value of a regular integer is 2³¹ − 1 (2,147,483,647), and for a big integer, it is 2⁶³ − 1 (9,223,372,036,854,775,807).
Unfortunately, JavaScript does not know how to parse big integers to JSON out of the box.
```typescript
const data = {
  value: BigInt(10)
}

JSON.stringify(data);
// Uncaught TypeError: Do not know how to serialize a BigInt
```
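If we did need to send a BigInt to the client, a common workaround is to convert it to a string through a replacer function passed to JSON.stringify. This is only a sketch of the idea; casting in SQL, as shown next, avoids the problem entirely:

```typescript
const data = {
  value: BigInt(10),
};

// The replacer turns every BigInt into a string before serialization.
const json = JSON.stringify(data, (_key, value) =>
  typeof value === 'bigint' ? value.toString() : value,
);
// json === '{"value":"10"}'
```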
If we don’t expect our table to hold more than 2,147,483,647 elements, we can cast the result of COUNT(*) to a regular integer.
```sql
SELECT COUNT(*) OVER()::int AS total_posts_count FROM posts
```
Implementing offset pagination with NestJS
When implementing the offset pagination with NestJS, we expect the user to provide the offset and limit as query parameters. To handle that, we can create a designated class.
```typescript
import { IsNumber, Min, IsOptional } from 'class-validator';
import { Type } from 'class-transformer';

class PaginationParams {
  @IsOptional()
  @Type(() => Number)
  @IsNumber()
  @Min(0)
  offset?: number;

  @IsOptional()
  @Type(() => Number)
  @IsNumber()
  @Min(1)
  limit?: number;
}

export default PaginationParams;
```
We then use it in our controller.
```typescript
import {
  ClassSerializerInterceptor,
  Controller,
  Get,
  Query,
  UseInterceptors,
} from '@nestjs/common';
import { PostsService } from './posts.service';
import GetPostsByAuthorQuery from './getPostsByAuthorQuery';
import PaginationParams from '../utils/paginationParams';

@Controller('posts')
@UseInterceptors(ClassSerializerInterceptor)
export default class PostsController {
  constructor(private readonly postsService: PostsService) {}

  @Get()
  getPosts(
    @Query() { authorId }: GetPostsByAuthorQuery,
    @Query() { offset, limit }: PaginationParams,
  ) {
    return this.postsService.getPosts(authorId, offset, limit);
  }

  // ...
}
```
The last step is to implement the logic in our PostsRepository class.
```typescript
import { Injectable } from '@nestjs/common';
import DatabaseService from '../database/database.service';
import PostModel from './post.model';

@Injectable()
class PostsRepository {
  constructor(private readonly databaseService: DatabaseService) {}

  async get(offset = 0, limit: number | null = null) {
    const databaseResponse = await this.databaseService.runQuery(
      `
        SELECT id, title, COUNT(*) OVER()::int AS total_posts_count FROM posts
        ORDER BY id ASC
        OFFSET $1
        LIMIT $2
      `,
      [offset, limit],
    );
    const items = databaseResponse.rows.map(
      (databaseRow) => new PostModel(databaseRow),
    );
    const count = databaseResponse.rows[0]?.total_posts_count || 0;
    return {
      items,
      count,
    };
  }

  // ...
}

export default PostsRepository;
```
A significant thing above is that we provide default values for offset and limit:
- providing 0 for offset means that we don’t intend to skip any rows,
- by setting the limit to null, we state that we don’t want to limit the results.
Doing all of the above, we end up with fully functional offset pagination.
Disadvantages
The offset and limit approach to pagination is widely used. Unfortunately, it has some significant disadvantages.
The most important caveat is that the database needs to compute all of the rows skipped by the OFFSET keyword. This can take a toll on the performance:
- first, the database sorts all of the rows as specified in the ORDER BY clause,
- then, PostgreSQL drops the number of rows specified in the OFFSET.
Aside from the above issue, we can run into a problem with consistency:
- the first user fetches page number one with posts,
- the second user creates a new post that ends up on page number one,
- the first user fetches the second page.
Unfortunately, the above operations cause the first user to see the last element of the first page again on the second page. Besides that, the user missed the element added to the first page.
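We can reproduce this scenario with an in-memory list of post ids standing in for the table. This is a simplified simulation, not part of the article's code:

```typescript
// A simplified in-memory stand-in for the posts table, sorted by id.
let posts = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

const pageSize = 5;

// The first user fetches page one: OFFSET 0, LIMIT 5.
const firstPage = posts.slice(0, pageSize); // ids 1 to 5

// The second user creates a post that sorts onto page one.
posts = [0, ...posts];

// The first user fetches page two: OFFSET 5, LIMIT 5.
const secondPage = posts.slice(pageSize, pageSize * 2); // ids 5 to 9

// Post 5 shows up on both pages, while post 0 is never shown to the first user.
const duplicated = firstPage.filter((id) => secondPage.includes(id));
```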
Advantages
The offset approach is very common and straightforward to implement. It is also very easy to change the column we use for sorting, including multiple columns. It makes it an acceptable solution in many cases, especially if the offset is not expected to be big and the data inconsistencies are acceptable.
Keyset pagination
We can take another approach to pagination by filtering out the data we’ve already seen using the WHERE keyword instead of OFFSET. First, let’s run the following query:
```sql
SELECT id, title FROM posts
ORDER BY id ASC
LIMIT 10
```
In the results, we can see that the last post has an id of 10. We can now use this knowledge to request posts with ids greater than 10.
```sql
SELECT id, title FROM posts
WHERE id > 10
ORDER BY id ASC
LIMIT 10
```
To get the next page of results, we need to inspect the above results and notice that the id of the last row is 20. We can use that to modify our WHERE clause.
```sql
SELECT id, title FROM posts
WHERE id > 20
ORDER BY id ASC
LIMIT 10
```
Unfortunately, this exposes the most significant disadvantage of keyset pagination. To get a chunk of data, we need to know the id of the last element of the previous chunk. This makes traversing more than one page at a time impossible.
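In practice, the API can return the results together with the id of the last element, which the client passes back as a cursor for the next request. A minimal sketch of deriving that cursor (the `Post` interface and `getNextCursor` helper are hypothetical names, not part of the article's code):

```typescript
interface Post {
  id: number;
  title: string;
}

// Derive the cursor for the next page from the current chunk of results.
function getNextCursor(items: Post[]): number | null {
  return items.length > 0 ? items[items.length - 1].id : null;
}

const chunk: Post[] = [
  { id: 18, title: 'First post' },
  { id: 19, title: 'Second post' },
  { id: 20, title: 'Third post' },
];

const cursor = getNextCursor(chunk); // 20, used as "WHERE id > 20" in the next query
```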
To change the column by which we order our elements, we need to modify both ORDER BY and WHERE clauses.
Counting the number of rows
It is crucial to notice that using the WHERE clause affects the rows counted with COUNT(*). To deal with this issue, we need to count the rows separately. We can create an explicit transaction or use a Common Table Expression query using the WITH statement.
```sql
WITH selected_posts AS (
  SELECT id, title FROM posts
  WHERE id > 10
  ORDER BY id ASC
  LIMIT 10
),
total_posts_count_response AS (
  SELECT COUNT(*) AS total_posts_count FROM posts
)
SELECT * FROM selected_posts, total_posts_count_response
```
Implementing keyset pagination with NestJS
First, let’s modify our PaginationParams class to accept an additional query parameter.
```typescript
import { IsNumber, Min, IsOptional } from 'class-validator';
import { Type } from 'class-transformer';

class PaginationParams {
  @IsOptional()
  @Type(() => Number)
  @IsNumber()
  @Min(0)
  offset?: number;

  @IsOptional()
  @Type(() => Number)
  @IsNumber()
  @Min(1)
  limit?: number;

  @IsOptional()
  @Type(() => Number)
  @IsNumber()
  @Min(1)
  idsToSkip?: number;
}

export default PaginationParams;
```
We also need to modify our PostsRepository to handle the additional parameter.
```typescript
import { Injectable } from '@nestjs/common';
import DatabaseService from '../database/database.service';
import PostModel from './post.model';

@Injectable()
class PostsRepository {
  constructor(private readonly databaseService: DatabaseService) {}

  async get(offset = 0, limit: number | null = null, idsToSkip = 0) {
    const databaseResponse = await this.databaseService.runQuery(
      `
        WITH selected_posts AS (
          SELECT * FROM posts
          WHERE id > $3
          ORDER BY id ASC
          OFFSET $1
          LIMIT $2
        ),
        total_posts_count_response AS (
          SELECT COUNT(*)::int AS total_posts_count FROM posts
        )
        SELECT * FROM selected_posts, total_posts_count_response
      `,
      [offset, limit, idsToSkip],
    );
    const items = databaseResponse.rows.map(
      (databaseRow) => new PostModel(databaseRow),
    );
    const count = databaseResponse.rows[0]?.total_posts_count || 0;
    return {
      items,
      count,
    };
  }

  // ...
}

export default PostsRepository;
```
Disadvantages
The most apparent disadvantage of the keyset pagination is that the users need to know the id of the row they want to start with. However, we could overcome that by mixing the offset-based pagination with the keyset pagination.
Additionally, the column used in the WHERE clause should have an index for an additional performance boost. Fortunately, PostgreSQL creates an index for every primary key out of the box. Therefore, the keyset pagination should perform well when using ids.
Also, ordering the results by text fields might not be straightforward if we want to use natural sorting. If you want to know more, check out this answer on StackOverflow.
Advantages
The keyset pagination can be a significant performance improvement over the offset-based approach. It also solves the data inconsistency issue we can experience with offset pagination. The user adding or removing elements between fetching chunks of data does not cause elements to be duplicated or skipped.
Summary
In this article, we’ve gone through two different approaches to pagination with PostgreSQL. After pointing out their advantages and disadvantages, we can conclude that each can be a reasonable solution. The keyset pagination is more restrictive but can provide a performance boost. Fortunately, we can mix different ways of paginating the data, and combining the offset and keyset pagination can cover a wide variety of cases.