Nest.js Tutorial

Generating statistics using aggregate functions in raw SQL

Marcin Wanago
JavaScriptNestJSSQL

So far, we’ve been mostly writing SQL queries that either store or retrieve the data from the database. Besides that, we can rely on PostgreSQL to process the data and get the computed results. By doing that, we can learn more about the rows in our tables. In this article, we look into how we can use aggregate functions to generate statistics about our data.

For the code from this article check out this repository.

The purpose of aggregate functions

The job of an aggregate function is to compute a single result from multiple input rows. One of the most popular aggregate functions is count(). When used with an asterisk, it measures the total number of rows in the table.

1SELECT count(*) FROM users

When we provide the count function with a column name, it counts the number of rows with a non-NULL value for that column.

1SELECT count(address_id) AS number_of_users_with_address FROM users

The count() function was handy in the previous article when we implemented pagination.

1import { Injectable } from '@nestjs/common';
2import DatabaseService from '../database/database.service';
3import PostModel from './post.model';
4 
5@Injectable()
6class PostsRepository {
7  constructor(private readonly databaseService: DatabaseService) {}
8 
9  async get(offset = 0, limit: number | null = null, idsToSkip = 0) {
10    const databaseResponse = await this.databaseService.runQuery(
11      `
12      WITH selected_posts AS (
13        SELECT * FROM posts
14        WHERE id > $3
15        ORDER BY id ASC
16        OFFSET $1
17        LIMIT $2
18      ),
19      total_posts_count_response AS (
20        SELECT COUNT(*)::int AS total_posts_count FROM posts
21      )
22      SELECT * FROM selected_posts, total_posts_count_response
23    `,
24      [offset, limit, idsToSkip],
25    );
26    const items = databaseResponse.rows.map(
27      (databaseRow) => new PostModel(databaseRow),
28    );
29    const count = databaseResponse.rows[0]?.total_posts_count || 0;
30    return {
31      items,
32      count,
33    };
34  }
35 
36  // ...
37}
38 
39export default PostsRepository;

Grouping data in the table

Aggregate functions work great when we perform them on groups of data

1SELECT author_id, count(*) FROM posts
2GROUP BY author_id

When we do the above, PostgreSQL divides the data into groups and runs the aggregate function on each group individually.

We could make our query even more helpful and order our results. By doing that, we can ensure the authors with the highest number of posts are at the top of the list.

1SELECT author_id, count(*) AS posts_count FROM posts
2GROUP BY author_id
3ORDER BY posts_count DESC

Let’s create a model that can hold the above data.

1export interface PostAuthorStatisticsModelData {
2  author_id: number;
3  posts_count: number;
4}
5class PostAuthorStatisticsModel {
6  authorId: number;
7  postsCount: number;
8  constructor(postAuthorStatisticsData: PostAuthorStatisticsModelData) {
9    this.authorId = postAuthorStatisticsData.author_id;
10    this.postsCount = postAuthorStatisticsData.posts_count;
11  }
12}
13 
14export default PostAuthorStatisticsModel;

Let’s create a separate statistics repository to prevent our PostsRepository class from getting too big.

1import {
2  Injectable,
3} from '@nestjs/common';
4import DatabaseService from '../database/database.service';
5import PostAuthorStatisticsModel from './postAuthorStatistics.model';
6 
7@Injectable()
8class PostsStatisticsRepository {
9  constructor(private readonly databaseService: DatabaseService) {}
10 
11  async getPostsAuthorStatistics() {
12    const databaseResponse = await this.databaseService.runQuery(
13      `
14      SELECT author_id, count(*)::int AS posts_count FROM posts
15      GROUP BY author_id
16      ORDER BY posts_count DESC
17    `,
18      [],
19    );
20    return databaseResponse.rows.map(
21      (databaseRow) => new PostAuthorStatisticsModel(databaseRow),
22    );
23  }
24}
25 
26export default PostsStatisticsRepository;
The count() function returns the value using the bigint data type. Because of that, we convert it to a regular integer. If you want to know more, check out the previous article.

We also need to point to our new repository in the PostsService class.

1import { Injectable } from '@nestjs/common';
2import PostsRepository from './posts.repository';
3import PostsStatisticsRepository from './postsStatistics.repository';
4 
5@Injectable()
6export class PostsService {
7  constructor(
8    private readonly postsRepository: PostsRepository,
9    private readonly postsStatisticsRepository: PostsStatisticsRepository,
10  ) {}
11 
12  getPostAuthorStatistics() {
13    return this.postsStatisticsRepository.getPostsAuthorStatistics();
14  }
15  
16  // ...
17}

The last step is to use it in the controller.

1import {
2  ClassSerializerInterceptor,
3  Controller,
4  Get,
5  Query,
6  UseInterceptors,
7} from '@nestjs/common';
8import { PostsService } from './posts.service';
9import GetPostsByAuthorQuery from './getPostsByAuthorQuery';
10import PaginationParams from '../utils/paginationParams';
11 
12@Controller('posts')
13@UseInterceptors(ClassSerializerInterceptor)
14export default class PostsController {
15  constructor(private readonly postsService: PostsService) {}
16 
17  @Get()
18  getPosts(
19    @Query() { authorId }: GetPostsByAuthorQuery,
20    @Query() { offset, limit, idsToSkip }: PaginationParams,
21  ) {
22    return this.postsService.getPosts(authorId, offset, limit, idsToSkip);
23  }
24 
25  @Get('statistics')
26  getStatistics() {
27    return this.postsService.getPostAuthorStatistics();
28  }
29 
30  // ...
31}

Other aggregate functions

There are more aggregate functions besides count(). Let’s go through them.

max and min

Using the max() function, we can find the largest value of the selected column. Respectively, the min() function returns the smallest value of the column.

Since we don’t have any numerical columns in our posts, let’s pair the above functions with length(). This way, we can get the longest and shortest posts of a particular author.

The length() function returns the length of a string.
1import { Injectable } from '@nestjs/common';
2import DatabaseService from '../database/database.service';
3import PostAuthorStatisticsModel from './postAuthorStatistics.model';
4 
5@Injectable()
6class PostsStatisticsRepository {
7  constructor(private readonly databaseService: DatabaseService) {}
8 
9  async getPostsAuthorStatistics() {
10    const databaseResponse = await this.databaseService.runQuery(
11      `
12      SELECT
13        author_id,
14        count(*)::int AS posts_count,
15        max(length(post_content)) AS longest_post_length,
16        min(length(post_content)) AS shortest_post_length
17      FROM posts
18      GROUP BY author_id
19      ORDER BY posts_count DESC
20    `,
21      [],
22    );
23    return databaseResponse.rows.map(
24      (databaseRow) => new PostAuthorStatisticsModel(databaseRow),
25    );
26  }
27}
28 
29export default PostsStatisticsRepository;
Running max() or min() on a text column returns a string based on the alphabetical order.

sum

With the sum() function, we can return a total sum of a particular column. Since it only works with numerical values, we also need the length() function.

Since the sum() function also returns the value in the bigint format, we transform it to a regular integer. We can do it because we don’t expect values bigger than 2³¹⁻¹.
1import { Injectable } from '@nestjs/common';
2import DatabaseService from '../database/database.service';
3import PostAuthorStatisticsModel from './postAuthorStatistics.model';
4 
5@Injectable()
6class PostsStatisticsRepository {
7  constructor(private readonly databaseService: DatabaseService) {}
8 
9  async getPostsAuthorStatistics() {
10    const databaseResponse = await this.databaseService.runQuery(
11      `
12      SELECT
13        author_id,
14        count(*)::int AS posts_count,
15        max(length(post_content)) AS longest_post_length,
16        min(length(post_content)) AS shortest_post_length,
17        sum(length(post_content))::int AS all_posts_content_sum
18      FROM posts
19      GROUP BY author_id
20      ORDER BY posts_count DESC
21    `,
22      [],
23    );
24    return databaseResponse.rows.map(
25      (databaseRow) => new PostAuthorStatisticsModel(databaseRow),
26    );
27  }
28}
29 
30export default PostsStatisticsRepository;

avg

The avg() function calculates the average of the values in a group. Let’s combine it with the length() function to calculate the average length of all posts of a particular author.

1import { Injectable } from '@nestjs/common';
2import DatabaseService from '../database/database.service';
3import PostAuthorStatisticsModel from './postAuthorStatistics.model';
4 
5@Injectable()
6class PostsStatisticsRepository {
7  constructor(private readonly databaseService: DatabaseService) {}
8 
9  async getPostsAuthorStatistics() {
10    const databaseResponse = await this.databaseService.runQuery(
11      `
12      SELECT
13        author_id,
14        count(*)::int AS posts_count,
15        max(length(post_content)) AS longest_post_length,
16        min(length(post_content)) AS shortest_post_length,
17        sum(length(post_content))::int AS all_posts_content_sum,
18        avg(length(post_content))::real AS average_post_content_length
19      FROM posts
20      GROUP BY author_id
21      ORDER BY posts_count DESC
22    `,
23      [],
24    );
25    return databaseResponse.rows.map(
26      (databaseRow) => new PostAuthorStatisticsModel(databaseRow),
27    );
28  }
29}
30 
31export default PostsStatisticsRepository;

A significant thing about the avg() function is that it returns the data in the numeric type. It can store many digits and is very useful when exactness is crucial. Parsing this data type to JSON converts it to a string by default. Since we don’t need many digits after the decimal, we convert it to the real data type.

Aggregating data from more than one table

So far, we’ve been grouping and aggregating data in one table. However, a typical case might be when we want to aggregate data using more than one table. In this case, we need to use the JOIN keyword.

1SELECT addresses.country, COUNT(*) AS number_of_users FROM users
2LEFT JOIN addresses ON users.address_id = addresses.id
3GROUP BY addresses.country
Above, we perform an outer join using the LEFT JOIN keyword. If you want to know more, check out API with NestJS #73. One-to-one relationships with raw SQL queries

Filtering using aggregate functions and grouping

So far, to filter the results from the database, we’ve been using the WHERE keyword.

1SELECT * FROM posts
2WHERE id = 1

When we want to filter using aggregate functions and grouping, we need to look at the execution order of SQL clauses.

  • FROM
  • WHERE
  • GROUP BY
  • HAVING
  • DISTINCT
  • SELECT
  • ORDER BY
  • LIMIT

Since PostgreSQL executes the WHERE clause before GROUP BY, we can’t use it with aggregate functions.

1SELECT author_id, count(*) as posts_count FROM posts
2WHERE count(*) > 100
3GROUP BY author_id
ERROR: aggregate functions are not allowed in WHERE LINE 2: WHERE count(*) > 100

Instead, we need to use the HAVING keyword.

1SELECT author_id, count(*) AS posts_count FROM posts
2GROUP BY author_id
3HAVING count(*) > 100

Using aliases

An important caveat is that we can’t use column aliases with the HAVING keyword. So, for example, the following code wouldn’t work:

1SELECT author_id, count(*) AS posts_count FROM posts
2GROUP BY author_id
3HAVING posts_count > 100

Instead, we need to use the count() function twice and rely on PostgreSQL to optimize it.

Summary

In this article, we’ve gone through how to use aggregate functions together with grouping. When doing so, we’ve implemented an endpoint that returns statistics about a particular table. We also wrote an example that uses an aggregate function and grouping when joining two tables. Finally, we’ve also learned how to filter our data using aggregate functions and why we can’t do that with the WHERE keyword.

There is still more to cover when writing raw SQL with NestJS, so stay tuned!