Learn NestJS - 188 In-Depth Articles

Marcin WanagoSeptember 2, 2024

JavaScriptNestJS

As our database grows, focusing more on its performance is important. SQL indexes act as guides within the database, helping to speed up data retrieval. In this article, we cover how to create indexes using PostgreSQL, Drizzle ORM, and NestJS and explain what indexes are.

Why we might need indexes

In the previous parts of this series, we’ve created the following database schema:

database-schema.ts

1import { serial, text, integer, pgTable } from 'drizzle-orm/pg-core';
2 
3export const users = pgTable('users', {
4  id: serial('id').primaryKey(),
5  email: text('email').unique(),
6  name: text('name'),
7  password: text('password'),
8});
9 
10export const articles = pgTable('articles', {
11  id: serial('id').primaryKey(),
12  title: text('title'),
13  content: text('content'),
14  authorId: integer('author_id').references(() => users.id),
15});
16 
17// ...
18 
19export const databaseSchema = {
20  articles,
21  users,
22};

There is a good chance that we might want to find all the articles written by a particular author.

articles.service.ts

1import { Injectable } from '@nestjs/common';
2import { DrizzleService } from '../database/drizzle.service';
3import { databaseSchema } from '../database/database-schema';
4import { eq } from 'drizzle-orm';
5 
6@Injectable()
7export class ArticlesService {
8  constructor(private readonly drizzleService: DrizzleService) {}
9 
10  getByAuthorId(authorId: number) {
11    return this.drizzleService.db
12      .select()
13      .from(databaseSchema.articles)
14      .where(eq(databaseSchema.articles.authorId, authorId));
15  }
16 
17  //
18}

When we use where(eq(databaseSchema.articles.authorId, authorId)), PostgreSQL needs to scan the whole articles table to find the appropriate rows. We can verify that with the EXPLAIN ANALYZE query.

1EXPLAIN ANALYZE
2SELECT * FROM articles
3WHERE author_id = 1

We can see that PostgreSQL performs a sequential scan, reading each row in the database to find the data that matches the filters. This approach can be slow, especially with large datasets. Let’s improve this by adding an index.

Introducing indexes

An SQL index works much like a book’s index, allowing the database to find information quickly. Let’s add an index on the author_id column to speed up our query.

database-schema.ts

1import { serial, text, integer, pgTable, index } from 'drizzle-orm/pg-core';
2 
3export const articles = pgTable(
4  'articles',
5  {
6    id: serial('id').primaryKey(),
7    title: text('title'),
8    content: text('content'),
9    authorId: integer('author_id').references(() => users.id),
10  },
11  (table) => ({
12    authorIdIndex: index('author_id_index').on(table.authorId),
13  }),
14);
15 
16// ...
17 
18export const databaseSchema = {
19  articles,
20  // ...
21};

Let’s take a look at our query now and see if there is a change

1EXPLAIN ANALYZE
2SELECT * FROM articles
3WHERE author_id = 1

We can see that now PostgreSQL does the index scan instead of a sequential scan that uses the index and speeds up the execution time a lot.

When we create an index, PostgreSQL begins maintaining a data structure organized around a certain database column. We can visualize this as key-value pairs.

PostgreSQL uses the B-tree data structure with each leaf pointing to a specific row.

With the index in place, PostgreSQL can quickly locate all articles written by a specific author because it has a structure sorted by the author’s ID. However, there are some downsides to using indexes.

Maintaining this additional data structure requires extra space in the database. Although indexes speed up data retrieval, maintaining an additional data structure introduces extra work for the database when inserting, updating, or deleting records. This added overhead means that we must carefully plan our indexes to avoid negatively impacting overall performance.

Unique indexes

In the previous parts of this series, we created a table to store each user where each person has a unique email.

database-schema.ts

1import { serial, text, integer, pgTable } from 'drizzle-orm/pg-core';
2 
3export const addresses = pgTable('addresses', {
4  id: serial('id').primaryKey(),
5  street: text('street'),
6  city: text('city'),
7  country: text('country'),
8});
9 
10export const users = pgTable('users', {
11  id: serial('id').primaryKey(),
12  email: text('email').unique(),
13  name: text('name'),
14  password: text('password'),
15  addressId: integer('address_id')
16    .unique()
17    .references(() => addresses.id),
18});
19 
20export const databaseSchema = {
21  addresses,
22  users,
23};

Since we marked the email column as unique, PostgreSQL prevents us from creating two users with the same email. The database automatically creates an index to improve performance when we mark a column as unique. Let’s verify that.

We also mark the address id as unique, because a particular address should belong to only one user. If you want to know more about how to define one-to-one relationships like that, check out API with NestJS #150. One-to-one relationships with the Drizzle ORM

1SELECT tablename, indexname
2FROM pg_indexes
3WHERE tablename='users';

Above, we can see that when we define a primary key, we also get a unique index.

Unique indexes not only help the database quickly search through existing users to determine if a particular email is unique but also improve the performance of various SELECT queries.

1EXPLAIN ANALYZE
2SELECT * FROM users
3WHERE email = 'john@smith.com'

Indexes with multiple columns

Our queries can include multiple filters. For example, we might want to find an article written by a specific user and includes a particular title.

1const articles = await this.drizzleService.db
2  .select()
3  .from(databaseSchema.articles)
4  .where(
5    and(
6      eq(databaseSchema.articles.authorId, 1),
7      eq(databaseSchema.articles.title, 'My first article'),
8    ),
9  );

Creating separate indexes on the author_id and title columns would speed up the above query. However, we can take it further and create a multi-column index.

database-schema.ts

1import { serial, text, integer, pgTable, index } from 'drizzle-orm/pg-core';
2 
3export const articles = pgTable(
4  'articles',
5  {
6    id: serial('id').primaryKey(),
7    title: text('title'),
8    content: text('content'),
9    authorId: integer('author_id').references(() => users.id),
10  },
11  (table) => ({
12    authorIdIndex: index('author_id_index').on(table.authorId),
13    authorIdTitleIndex: index('author_id_title_index').on(
14      table.authorId,
15      table.title,
16    ),
17  }),
18);
19 
20// ...
21 
22export const databaseSchema = {
23  articles,
24  // ...
25};

Types of indexes

So far, all of the indexes we created in this article use a B-tree structure. While it works well in most situations, there are other options available.

Generalized Inverted Indexes (GIN)

GIN indexes are particularly useful when dealing with complex data types like arrays or JSON. They can also be quite handy when implementing text search functionality.

database-schema.ts

1import { serial, text, pgTable, customType, index } from 'drizzle-orm/pg-core';
2import { sql, SQL } from 'drizzle-orm';
3 
4const tsvector = customType<{ data: unknown }>({
5  dataType() {
6    return 'tsvector';
7  },
8});
9 
10export const articles = pgTable(
11  'articles',
12  {
13    id: serial('id').primaryKey(),
14    title: text('title'),
15    content: text('content'),
16    textTsvector: tsvector('text_tsvector').generatedAlwaysAs(
17      (): SQL => sql`
18      setweight(to_tsvector('english', ${articles.title}), 'A') ||
19      setweight(to_tsvector('english', ${articles.content}), 'B')
20    `,
21    ),
22  },
23  (table) => {
24    return {
25      textTsvectorIndex: index('text_tsvector_index').using(
26        'gin',
27        table.textTsvector,
28      ),
29    };
30  },
31);
32 
33export const databaseSchema = {
34  articles,
35};

If you want to know more about implementing the full-text search functionality with the Drizzle ORM, take a look at API with NestJS #163. Full-text search with the Drizzle ORM and PostgreSQL

To make sure that the GIN index is available in our database, we may need to enable the pg_trgm and btree_gin extensions.

1CREATE EXTENSION pg_trgm;
2CREATE EXTENSION btree_gin;

Generalized Search Tree (GIST)

GIST indexes are useful for indexing geometric data and can also be used to implement the full-text search feature. In certain situations, they might be preferable over GIN indexes.

database-schema.ts

1import { serial, text, pgTable, customType, index } from 'drizzle-orm/pg-core';
2import { sql, SQL } from 'drizzle-orm';
3 
4const tsvector = customType<{ data: unknown }>({
5  dataType() {
6    return 'tsvector';
7  },
8});
9 
10export const articles = pgTable(
11  'articles',
12  {
13    id: serial('id').primaryKey(),
14    title: text('title'),
15    content: text('content'),
16    textTsvector: tsvector('text_tsvector').generatedAlwaysAs(
17      (): SQL => sql`
18      setweight(to_tsvector('english', ${articles.title}), 'A') ||
19      setweight(to_tsvector('english', ${articles.content}), 'B')
20    `,
21    ),
22  },
23  (table) => {
24    return {
25      textTsvectorIndex: index('text_tsvector_index').using(
26        'gist',
27        table.textTsvector,
28      ),
29    };
30  },
31);
32 
33export const databaseSchema = {
34  articles,
35};

We may need to enable the btree_gist extension in our database to use GIST indexes.

1CREATE EXTENSION btree_gist;

Hash indexes

Hash indexes in PostgreSQL are based on the hash table data structure and might work well in some use cases.

1export const articles = pgTable(
2  'articles',
3  {
4    id: serial('id').primaryKey(),
5    title: text('title'),
6    content: text('content'),
7    authorId: integer('author_id').references(() => users.id),
8  },
9  (table) => ({
10    titleIndex: index('title_index').using('hash', table.title),
11  }),
12);

Block Range Indexes (BRIN)

The Block Range Indexes (BRIN) can be useful for indexing very large datasets.

1export const articles = pgTable(
2  'articles',
3  {
4    id: serial('id').primaryKey(),
5    title: text('title'),
6    content: text('content'),
7    authorId: integer('author_id').references(() => users.id),
8  },
9  (table) => ({
10    titleIndex: index('title_index').using('brin', table.title),
11  }),
12);

Summary

In this article, we explored the fundamentals of indexes and implemented examples that enhanced the performance of our SELECT queries. We also discussed the potential drawbacks of misusing indexes. Additionally, we highlighted that the default index type in PostgreSQL is the B-tree index, which is highly effective for various queries and is the most commonly used. Additionally, we touched on other index types like GIN and hash indexes.

With this knowledge, we can now effectively manage indexes when working with the Drizzle ORM.

Improving the performance with indexes using Drizzle ORM

Why we might need indexes

Introducing indexes

Unique indexes

Indexes with multiple columns

Types of indexes

Generalized Inverted Indexes (GIN)

Generalized Search Tree (GIST)

Hash indexes

Block Range Indexes (BRIN)

Summary