Toad World Blog

An Illustration of JSONB Capabilities in Postgres 9.5

Feb 28, 2017 12:36:08 PM by Marc Linster

Postgres introduced JSON and JSONB in versions 9.3 and 9.4, respectively, with JSONB (Binary Javascript Object Notation) being the canonical form of JSON, stored as binary objects with improved compression and more capabilities. Early releases of JSONB had a limited set of operators, but 9.5 introduced a complete set of JSONB capabilities, including operators for updating and deleting, and very good indexing capabilities.

In this blog, I will review some of those capabilities and show that in Postgres, JSONB is a first order data type that can be used easily and efficiently in standard SQL queries, indexes, and constraints.

To illustrate these capabilities, I have created a simple data model of customers, products, orders and order lines. This example primarily addresses the customer table.

jsonb=# \d customer
                 Table "example.customer"
     Column     |            Type             | Modifiers
----------------+-----------------------------+-----------
 id             | character varying(40)       | not null
 name           | jsonb                       |
 customer_since | timestamp without time zone |
 
Indexes:
    "customer_pkey" PRIMARY KEY, btree (id)
 
Check constraints:
    "validate_customer_name" CHECK ((name ->> 'first name'::text) IS NOT NULL AND (name ->> 'last name'::text) IS NOT NULL)
 
Referenced by:
    TABLE "customer_order" CONSTRAINT "customer_order_customer_id_fkey" FOREIGN KEY (customer_id) REFERENCES customer(id)
 
The customer table has 4 columns, and most of our example will deal with the ‘name’ column, which is of type JSONB.
A typical customer record looks like this:
 
jsonb=# \x
Expanded display is on.
jsonb=# select * from customer where id = 'cust000190943';
-[ RECORD 1 ]--+----------------------------------------------------------------------------------
id             | cust000190943
name           | {"id": "cust000190943", "initial": "R", "last name": "Smith", "first name": "Hans"}
customer_since | 17-JUN-16 11:25:53.44615
 
Alternatively, I can also use the jsonb_pretty function to make the output more readable:
 
jsonb=# SELECT id, customer_since, jsonb_pretty(name)
from customer where id = 'cust000190943';
-[ RECORD 1 ]--+---------------------------
id             | cust000190943
customer_since | 17-JUN-16 11:25:53.44615
jsonb_pretty   | {                         +
               |     "id": "cust000190943",+
               |     "initial": "R",       +
               |     "last name": "Smith", +
               |     "first name": "Hans"  +
               | }

 

My sample data set has 1,000,000 records. For illustration purposes, I have replicated the primary key ‘id’ in the JSONB column ‘name’. Please note: all query performance is measured on a VM on OSX running Windows 10.

Query performance for JSONB columns

When I run a query based on the primary key ‘id’, then response time is shorter than  0.6 milliseconds, as the primary key is supported by a btree index.

jsonb=# \timing
Timing is on.
jsonb=# select * from customer where id = 'cust000000001';
-[ RECORD 1 ]--+----------------------------------------------------------------------------------------------------
id             | cust000000001
name           | {"id": "cust000000001", "title": "Dr.", "initial": "Q", "last name": "Kordon", "first name": "Bob"}
customer_since |
 
Time: 0.569 ms

When I run the same query against the value of the key ‘id’ in the JSONB column ‘name’, I get an average performance of approximately 430 milliseconds, as that column is not indexed.

jsonb=# select * from customer where name->>'id' = 'cust000000001';
-[ RECORD 1 ]--+----------------------------------------------------------------------------------------------------
id             | cust000000001
name           | {"id": "cust000000001", "title": "Dr.", "initial": "Q", "last name": "Kordon", "first name": "Bob"}
customer_since |
 
Time: 430.838 ms

Once I index that column with a btree index, I get similar performance on the JSONB data element as I get in the primary key defined as character varying(40).

jsonb=# CREATE INDEX customer_jsonbid_idx ON customer ((name->>'id'));
 
jsonb=# select * from customer where name->>'id' = 'cust000000001';
-[ RECORD 1 ]--+----------------------------------------------------------------------------------------------------
id             | cust000000001
name           | {"id": "cust000000001", "title": "Dr.", "initial": "Q", "last name": "Kordon", "first name": "Bob"}
customer_since |
 
Time: 0.640 ms

This allows me to use JSONB data in complex queries efficiently and effectively, just as if it were a conventional SQL data type. In the sample query below, I am joining on a JSONB field (c.name->>id) and a conventional SQL data type (co.customer_id) to find all the customers who have bought products made in New Zealand.

SELECT
       c.name->>'title' "Title",
       c.name->>'first name' "First Name",
       c.name->>'last name' "Last Name",
       to_char(co.order_date, 'Day - DD, Mon, YYYY') "Purchase Date",
       col.quantity "Quantity",
       p.name "Product Name"
FROM customer c
JOIN customer_order co ON c.name->>'id' = co.customer_id
JOIN customer_order_line col ON co.id = col.customer_order_id
JOIN product p ON p.id = col.product_id
WHERE p.country_iso3 = 'NZL';
 
 Title | First Name | Last Name |       Purchase Date       | Quantity | Product Name
-------+------------+-----------+---------------------------+----------+--------------
 Mr.   | Haley      | Horn      | Friday    - 24, Jun, 2016 |       24 | Green
 Dr.   | Joe        | Kordon    | Tuesday   - 31, May, 2016 |       52 | Green
 Dr.   | Sue        | Smith     | Sunday    - 05, Jun, 2016 |       74 | Yellow
 Hon.  | Breanna    | Brown     | Sunday    - 15, May, 2016 |       62 | Green
 Dr.   | Marc       | Wagner    | Wednesday - 18, May, 2016 |       84 | White
 Mr.   | Haley      | Tinker    | Sunday    - 26, Jun, 2016 |        9 | White
 Mr.   | Sue        | Tinker    | Saturday  - 18, Jun, 2016 |       31 | Green
 Mr.   | Sue        | Tinker    | Saturday  - 18, Jun, 2016 |       50 | White
       | Kahleesi   | Horn      | Thursday  - 26, May, 2016 |       16 | Green
 Dr.   | Kahleesi   | Wagner    | Thursday  - 28, Apr, 2016 |       42 | Yellow
 Ms.   | Marc       | Kordon    | Tuesday   - 31, May, 2016 |       58 | White
 Hon.  | Joe        | Wagner    | Saturday  - 16, Apr, 2016 |       63 | Yellow

 

Integrity Constraints on JSONB columns

JSONB in Postgres 9.5 also supports integrity constraints. In the customer table defined above, we have a check constraint ‘validate_customer_name’ to make sure that customers cannot have first names or last names that are undefined, i.e., the JSONB keys ‘first name’ and ‘last name’ cannot be null.

Check constraints:

    "validate_customer_name" CHECK ((name ->> 'first name'::text) IS NOT NULL AND (name ->> 'last name'::text) IS NOT NULL)

If I run the following query:

INSERT INTO customer VALUES (
       'test2',
       '{"id": "test2","first name": "Hans"}')

I get an error as that constraint is violated

ERROR:  new row for relation "customer" violates check constraint "validate_customer_name"
DETAIL:  Failing row contains (test2, {"id": "test2", "first name": "Hans"}, null, null).

This query, which adds a first name and a last name, succeeds, as it meets the check constraint

INSERT INTO customer VALUES (
       'test2',
       '{"id": "test2","first name": "Hans", "last name": "Prince"}')

 

JSONB data manipulation

Postgres 9.5 also provides numerous capabilities to manipulate JSONB data elements. The previous queries created a record with the following data:

jsonb=# SELECT id, jsonb_pretty(name)from customer where id = 'test2';
  id   |        jsonb_pretty
-------+----------------------------
 test2 | {                         +
       |     "id": "test2",        +
       |     "last name": "Prince",+
       |     "first name": "Hans"  +
       | }

I now add a new key ‘title’ with value ‘Hon’ using ‘||’ operator to concatenate two JSONB values.

UPDATE customer
       SET name = name || '{"title": "Hon"}'
       WHERE id = 'test2';

This  results in:

jsonb=# SELECT id, jsonb_pretty(name)from customer where id = 'test2';

  id   |        jsonb_pretty
-------+----------------------------
 test2 | {                         +
       |     "id": "test2",        +
       |     "title": "Hon",       +
       |     "last name": "Prince",+
       |     "first name": "Hans"  +
       | }

If I want to change the value of the key ‘title’, I run a simple update query that replaces the value for the key ‘title’:

UPDATE customer

       SET name = name || '{"title": "His Excellence"}'
       WHERE id = 'test2';

Resulting in

jsonb=# SELECT id, jsonb_pretty(name)from customer where id = 'test2';
  id   |          jsonb_pretty
-------+--------------------------------
 test2 | {                             +
       |     "id": "test2",            +
       |     "title": "His Excellence",+
       |     "last name": "Prince",    +
       |     "first name": "Hans"      +
       | }
(1 row)

The same operator can be used to add nested documents

UPDATE customer
       SET name = name ||
'{"contact": {"fax": "617-123-5678", "cell": "617-123-4567"}}'
       WHERE id = 'test2';
 
jsonb=# SELECT id, jsonb_pretty(name)from customer where id = 'test2';
  id   |          jsonb_pretty
-------+--------------------------------
 test2 | {                             +
       |     "id": "test2",            +
       |     "title": "His Excellence",+
       |     "contact": {              +
       |         "fax": "617-123-5678",+
       |         "cell": "617-123-4567"+
       |     },                        +
       |     "last name": "Prince",    +
       |     "first name": "Hans"      +
       | }
(1 row)

The operator jsonb_set allows me to update values inside the nested documents

UPDATE customer
       SET name = jsonb_set (name, '{contact, cell}', '"617-123-0101"')
              WHERE id = 'test2';
 
jsonb=# SELECT id, jsonb_pretty(name)from customer where id = 'test2';
  id   |          jsonb_pretty
-------+--------------------------------
 test2 | {                             +
       |     "id": "test2",            +
       |     "title": "His Excellence",+
       |     "contact": {              +
       |         "fax": "617-123-5678",+
       |         "cell": "617-123-0101"+
       |     },                        +
       |     "last name": "Prince",    +
       |     "first name": "Hans"      +
       | }
(1 row)

The ‘#-’ operator allows me to delete keys and their values inside a nested JSONB document.

UPDATE customer
       SET name = name #- '{contact, fax}'
              WHERE id = 'test2';
 
jsonb=# SELECT id, jsonb_pretty(name)from customer where id = 'test2';
  id   |          jsonb_pretty
-------+--------------------------------
 test2 | {                             +
       |     "id": "test2",            +
       |     "title": "His Excellence",+
       |     "contact": {              +
       |         "cell": "617-123-0101"+
       |     },                        +
       |     "last name": "Prince",    +
       |     "first name": "Hans"      +
       | }
(1 row)

 

Make JSONB data look like a standard SQL table

The following query uses the LATERAL command (see https://www.postgresql.org/docs/9.5/static/sql-select.html), which allows columns in the FROM clause in a JOIN or a sub-select to refer to columns in the SELECT clause, combined with jsonb_to_record, which builds a record from a JSONB object.

SELECT 
      c.id "Customer Id",
      l.title "Title",
      l."first name" "First Name",
      l.initial "Initials",
      l."last name" "Last Name",
      l."postfix" "Postfix"
FROM  customer c
JOIN  LATERAL
      ( SELECT *
             FROM jsonb_to_record(name)
                      AS detail(
                             id text,
                             title text,
                             initial text,
                             postfix text,
                             "last name" text,
                             "first name" text
                             )) l
ON (l.id = c.id)
ORDER BY c.id ASC
LIMIT 20;
 
  Customer Id  | Title | First Name | Initials | Last Name | Postfix
---------------+-------+------------+----------+-----------+---------
 cust000000001 | Dr.   | Bob        | Q        | Kordon    |
 cust000000002 | Mr.   | James      | F        | Schiller  |
 cust000000003 | Dr.   | Haley      | V        | Buchholz  | III
 cust000000004 | Hon.  | Marc       | V        | Goethe    | Sr
 cust000000005 |       | James      | W        | Schiller  | II
 cust000000006 | Ms.   | Breanna    | J        | Horn      | II
 cust000000007 | Ms.   | Haley      | H        | Cleland   | Jr.
 cust000000008 | Mr.   | James      | E        | Tinker    | Jr.
 cust000000009 |       | Hans       | V        | Kordon    | Jr.

 

Conclusions

JSONB in Postgres 9.5 provides a rich array of capabilities that makes it easy to implement solutions that need the flexibility of NoSQL, without giving up on the strengths of the relational model. The fact that standard SQL and JSONB can be used in the same transactional context, with a tight integration that allows converting JSONB to records (and vice-versa), creates a perfect NoSQL (Not only SQL) solution.

For a complete definition of the JSONB capabilities in Postgres 9.5, please review the JSON and JSONB Functions and Operators page in the Postgres documentation. 

Marc Linster, Ph.D., is Senior Vice President, Products and Services, at EnterpriseDB.

Tags: PostgreSQL Postgres

Marc Linster

Written by Marc Linster

Marc Linster, Ph.D., is EDB’s Senior Vice President of Products and Services and leads EDB’s engineering divisions and product development groups, as well as guides EDB’s professional services practices. Marc has an extensive background in engineering, technology and logistics with 20 years of management experience. Before joining EDB, Marc spent four years at Polycom, the leading maker of video communications equipment, where most recently he was a Senior Director, Engineering for Cloud and Hosted Solutions. Before Polycom, Marc was Co-founder and President of TriPoint Interactive, a global supply chain consulting and systems integration company. He spent six years at Avicon Group, first as CTO and then as Vice President of Operations. Marc holds a Ph.D. (Dr. rer. nat) in Computer Sciences from the University of Kaiserslautern in Germany.