So far in this series, we’ve used PostgreSQL to store data structured in columns. This approach has many benefits, but sometimes we want more flexibility.
Chances are you already know JSON and use it in your web applications. There are databases like MongoDB that store JSON-like documents. With this approach, we can describe any data structures with ease. Using MongoDB has its pros and cons, and we might still want to use an SQL database instead. Some parts of our schema might be fluid and change frequently, though. Fortunately, PostgreSQL has tools to deal with it.
Using a json column with PostgreSQL
In theory, we can store JSON as a regular string. However, we would miss out on a ton of features that Postgres provides to help us work with JSON.
The first column type we want to look into is json. It stores the exact copy of the text we put in. When we use the data, Postgres has to reparse it on each execution.
The json type also preserves the order of keys, duplicates, and whitespace characters.
Creating tables with the json type
Let’s create a few tables and use the json type.
CREATE TABLE product_categories (
  id SERIAL PRIMARY KEY,
  name text
);

CREATE TABLE products (
  id SERIAL PRIMARY KEY,
  category_id INT NOT NULL,
  name TEXT NOT NULL,
  properties json,
  FOREIGN KEY(category_id) REFERENCES product_categories(id)
);

Above, we can see that our products table has regular columns alongside the json one. Even though some parts of our database might benefit from a flexible approach, other parts might still use a more traditional one.
Inserting JSON data
When inserting data, Postgres makes sure that it is formatted properly. If it is not, we can expect an error.
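Since the database rejects malformed JSON, an application might want to catch such input before it ever reaches the query. Below is a minimal sketch of that idea in TypeScript. The isValidJson helper is hypothetical and not part of Postgres or TypeORM, and JSON.parse only approximates the rules Postgres applies.

```typescript
// Hypothetical helper (an assumption for illustration, not a library function).
// Returns true when the text parses as JSON and false otherwise.
function isValidJson(text: string): boolean {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

const wellFormed = '{ "brand": "Audi" }';
// A trailing comma is not allowed in JSON.
const malformed = '{ "brand": "Audi", }';

console.log(isValidJson(wellFormed)); // true
console.log(isValidJson(malformed)); // false
```

A check like this lets the API respond with a clear validation error instead of surfacing a database exception.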
INSERT INTO product_categories (
  name
)
VALUES (
  'Books'
);

INSERT INTO products (
  category_id,
  name,
  properties
)
VALUES (
  1,
  'Introduction to Algorithms',
  '{ "authors": ["Thomas H. Cormen", "Charles E. Leiserson", "Ronald L. Rivest", "Clifford Stein"], "publicationYear": "1990" }'
);

Above, we’ve added our first product. Since it is a book, it might have properties such as authors and publicationYear. Without using JSON, we would have to add those as additional columns of the products table.
INSERT INTO product_categories (
  name
)
VALUES (
  'Cars'
);

INSERT INTO products (
  category_id,
  name,
  properties
)
VALUES (
  2,
  'A8',
  '{ "brand": "Audi", "engine": { "fuel": "petrol", "numberOfCylinders": 6 } }'
);

Thanks to the fact that our properties column is flexible, we can use it to manage any type of product.
If we had only books and cars, it might be a good idea to create separate books and cars tables. With tens or hundreds of product types, that would be quite a hassle, though.
Manipulating JSON data
Postgres has quite a few operators and functions built-in that handle JSON data. The most important is the -> operator that allows us to get object fields by key.
SELECT properties->'engine'->'fuel' as fuel FROM products;

We can also use the -> operator to access array elements.

SELECT properties->'authors'->0 as authors FROM products;

The jsonb column
There is a drawback of using the above operator with a json column. Unfortunately, Postgres has to parse the data on each execution.
With PostgreSQL, we can also use the jsonb column. When we put values in, the database parses our data into a binary format. While inserting might be a bit slower, querying gets significantly faster because the data does not have to be reparsed. The jsonb format does not preserve whitespace, duplicate keys, or the order of keys. If the input contains duplicate keys, only the last value is kept.
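The difference between keeping the exact text and keeping a parsed value can be illustrated outside the database. The sketch below is only an analogy in TypeScript and not how Postgres implements jsonb: parsing and re-serializing drops insignificant whitespace and keeps the last value of a duplicated key. The formatting of the output and the ordering of keys in a real jsonb column follow Postgres's own rules, which this example does not try to reproduce.

```typescript
// Input with extra whitespace and a duplicated "brand" key.
const input = '{ "brand": "Audi",    "brand": "BMW" }';

// A json column would hand back the input text unchanged.
// Parsing and re-serializing loosely resembles what a parsed format retains.
const normalized = JSON.stringify(JSON.parse(input));

console.log(input); // the original text, whitespace and duplicate included
console.log(normalized); // {"brand":"BMW"}
```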
CREATE TABLE products (
  id SERIAL PRIMARY KEY,
  category_id INT NOT NULL,
  name TEXT NOT NULL,
  properties jsonb,
  FOREIGN KEY(category_id) REFERENCES product_categories(id)
);

Doing that gives us all of the functionalities of the json type and more. Aside from the performance improvements when querying data, we also get more operators.
Another significant feature is creating indexes for our JSON data.
CREATE INDEX brand_index ON products ((properties->>'brand'));

Above, we use the ->> operator, which returns the value as text rather than as JSON. Thanks to that, the expression can be used for indexing.
If you want to know more about indexes with PostgreSQL, check out API with NestJS #14. Improving performance of our Postgres database with indexes
Using the jsonb type with TypeORM
The official Postgres documentation encourages the use of the jsonb format in most cases. With TypeORM, it is very straightforward to create a jsonb column.
import { Column, Entity, PrimaryGeneratedColumn, ManyToOne } from 'typeorm';
import ProductCategory from '../productCategories/productCategory.entity';
import { CarProperties } from './types/carProperties.interface';
import { BookProperties } from './types/bookProperties.interface';

@Entity()
class Product {
  @PrimaryGeneratedColumn()
  public id: number;

  @Column()
  public name: string;

  @ManyToOne(() => ProductCategory, (category: ProductCategory) => category.products)
  public category: ProductCategory;

  @Column({
    type: 'jsonb'
  })
  public properties: CarProperties | BookProperties;
}

export default Product;

Above, we create a union between CarProperties and BookProperties.
export interface CarProperties {
  brand: string;
  engine: {
    fuel: string;
    numberOfCylinders: number;
  }
}

export interface BookProperties {
  authors: string[];
  publicationYear: string;
}

Creating such an entity allows us to start inserting the data into our database. From the API perspective, properties do not differ much from other columns.
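When reading a product back, the code has to find out which member of the union it is dealing with. One possible way to do that is a type guard, sketched below. The interfaces are repeated so the example is self-contained. The isBookProperties and describeProduct functions are hypothetical, and the assumption that only books have an authors key is made up for this illustration.

```typescript
interface CarProperties {
  brand: string;
  engine: {
    fuel: string;
    numberOfCylinders: number;
  };
}

interface BookProperties {
  authors: string[];
  publicationYear: string;
}

// Hypothetical type guard: assumes that only books have an "authors" key.
function isBookProperties(
  properties: CarProperties | BookProperties,
): properties is BookProperties {
  return 'authors' in properties;
}

function describeProduct(properties: CarProperties | BookProperties): string {
  if (isBookProperties(properties)) {
    // Here TypeScript knows that properties is BookProperties.
    return `Book by ${properties.authors.join(', ')}`;
  }
  // Here TypeScript has narrowed properties to CarProperties.
  return `${properties.brand} with a ${properties.engine.fuel} engine`;
}

console.log(
  describeProduct({ brand: 'Audi', engine: { fuel: 'petrol', numberOfCylinders: 6 } }),
); // Audi with a petrol engine
```

The guard only checks for the presence of a key, so it does not replace validating the data at runtime.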
Although the database does not check if our properties match any of the above interfaces, it would be a good idea to validate them. One possible approach would be to save information about the expected fields in the category. When the user inserts a product, we would then check which fields a product of that category should contain.
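A rough sketch of this kind of check could look as follows. Everything in it is an assumption made for illustration: the CategoryFields shape, the set of supported field types, and the validateProperties function are hypothetical, and a real application might prefer an existing validation library instead.

```typescript
// Hypothetical description of the fields a category expects.
type FieldType = 'string' | 'number' | 'object' | 'array';

interface CategoryFields {
  [fieldName: string]: FieldType;
}

function hasOwn(target: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(target, key);
}

// Distinguishes arrays and null from plain objects, unlike typeof alone.
function typeOfValue(value: unknown): string {
  if (Array.isArray(value)) {
    return 'array';
  }
  if (value === null) {
    return 'null';
  }
  return typeof value;
}

// Returns a list of problems. An empty list means the properties match.
function validateProperties(
  properties: Record<string, unknown>,
  fields: CategoryFields,
): string[] {
  const errors: string[] = [];
  for (const fieldName of Object.keys(fields)) {
    if (!hasOwn(properties, fieldName)) {
      errors.push(`missing field "${fieldName}"`);
    } else if (typeOfValue(properties[fieldName]) !== fields[fieldName]) {
      errors.push(`field "${fieldName}" should be of type ${fields[fieldName]}`);
    }
  }
  for (const fieldName of Object.keys(properties)) {
    if (!hasOwn(fields, fieldName)) {
      errors.push(`unexpected field "${fieldName}"`);
    }
  }
  return errors;
}

// In a real application, this would be loaded together with the category.
const bookFields: CategoryFields = { authors: 'array', publicationYear: 'string' };

const validBook = validateProperties(
  { authors: ['Thomas H. Cormen'], publicationYear: '1990' },
  bookFields,
);
const carAsBook = validateProperties({ brand: 'Audi' }, bookFields);

console.log(validBook); // no problems reported
console.log(carAsBook); // two missing fields and one unexpected field
```

Storing the field definitions with the category keeps the schema flexible, because adding a new type of product does not require changing the code.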
Using more advanced queries with TypeORM
While TypeORM might not support all of the features that json and jsonb columns provide, we can work around it. Fortunately, we can use raw SQL queries with TypeORM.
async getAllBrands() {
  return this.productsRepository
    .query(`SELECT properties->'brand' as brand from product`);
}

To do the above, we need to know some of the JSON operators that Postgres supports.
If we need to use parameters in our query and we worry about SQL injection, we can create a parameterized query.
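To see why this matters, the sketch below contrasts string interpolation with a parameterized query without touching a database. The buildUnsafeQuery and buildParameterizedQuery helpers are hypothetical and exist only for illustration. They merely prepare the arguments that could be passed to a query method.

```typescript
// Unsafe: the input becomes a part of the SQL text itself.
function buildUnsafeQuery(productId: string): string {
  return `SELECT properties->'brand' as brand from product WHERE id = ${productId}`;
}

// Safer: the SQL text is constant, and the input travels separately as a parameter.
function buildParameterizedQuery(productId: string): [string, string[]] {
  return ["SELECT properties->'brand' as brand from product WHERE id = $1", [productId]];
}

// Input that a malicious user could send instead of a plain id.
const maliciousInput = '1 OR 1=1';

const unsafeSql = buildUnsafeQuery(maliciousInput);
const [sql, parameters] = buildParameterizedQuery(maliciousInput);

console.log(unsafeSql); // the condition now contains "OR 1=1"
console.log(sql); // the SQL text is unaffected by the input
console.log(parameters); // the input is kept as a separate value
```

With the parameterized version, the database receives the value separately from the SQL text, so the input cannot change the structure of the query.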
async getBrand(productId: number) {
  return this.productsRepository
    .query(`SELECT properties->'brand' as brand from product WHERE id = $1`, [productId]);
}

Summary
In this article, we’ve explored the idea of storing JSON within a PostgreSQL database. We’ve done that both through SQL queries and TypeORM. While it is a flexible solution, it is not always the right fit. It has some drawbacks, such as slower queries and higher disk usage. Knowing how it works will help us decide if it is a valid approach to the issue that we want to solve.