Postgres introduced JSON and JSONB in versions 9.3 and 9.4, respectively, with JSONB (Binary Javascript Object Notation) being the canonical form of JSON, stored as binary objects with improved compression and more capabilities. Early releases of JSONB had a limited set of operators, but 9.5 introduced a complete set of JSONB capabilities, including operators for updating and deleting, and very good indexing capabilities.
In this blog, I will review some of those capabilities and show that in Postgres, JSONB is a first order data type that can be used easily and efficiently in standard SQL queries, indexes, and constraints.
To illustrate these capabilities, I have created a simple data model of customers, products, orders and order lines. This example primarily addresses the customer table.
jsonb=# \d customer
Table "example.customer"
Column | Type | Modifiers
----------------+-----------------------------+-----------
id | character varying(40) | not null
name | jsonb |
customer_since | timestamp without time zone |
Indexes:
"customer_pkey" PRIMARY KEY, btree (id)
Check constraints:
"validate_customer_name" CHECK ((name ->> 'first name'::text) IS NOT NULL AND (name ->> 'last name'::text) IS NOT NULL)
Referenced by:
TABLE "customer_order" CONSTRAINT "customer_order_customer_id_fkey" FOREIGN KEY (customer_id) REFERENCES customer(id)
The customer table has 4 columns, and most of our example will deal with the ‘name’ column, which is of type JSONB.
A typical customer record looks like this:
jsonb=# \x
Expanded display is on.
jsonb=# select * from customer where id = 'cust000190943';
-[ RECORD 1 ]--+----------------------------------------------------------------------------------
id | cust000190943
name | {"id": "cust000190943", "initial": "R", "last name": "Smith", "first name": "Hans"}
customer_since | 17-JUN-16 11:25:53.44615
Alternatively, I can also use the jsonb_pretty function to make the output more readable:
jsonb=# SELECT id, customer_since, jsonb_pretty(name)
from customer where id = 'cust000190943';
-[ RECORD 1 ]--+---------------------------
id | cust000190943
customer_since | 17-JUN-16 11:25:53.44615
jsonb_pretty | { +
| "id": "cust000190943",+
| "initial": "R", +
| "last name": "Smith", +
| "first name": "Hans" +
| }
My sample data set has 1,000,000 records. For illustration purposes, I have replicated the primary key ‘id’ in the JSONB column ‘name’. Please note: all query performance is measured on a VM on OSX running Windows 10.
Query performance for JSONB columns
When I run a query based on the primary key ‘id’, then response time is shorter than 0.6 milliseconds, as the primary key is supported by a btree index.
jsonb=# \timing
Timing is on.
jsonb=# select * from customer where id = 'cust000000001';
-[ RECORD 1 ]--+----------------------------------------------------------------------------------------------------
id | cust000000001
name | {"id": "cust000000001", "title": "Dr.", "initial": "Q", "last name": "Kordon", "first name": "Bob"}
customer_since |
Time: 0.569 ms
When I run the same query against the value of the key ‘id’ in the JSONB column ‘name’, I get an average performance of approximately 430 milliseconds, as that column is not indexed.
jsonb=# select * from customer where name->>'id' = 'cust000000001';
-[ RECORD 1 ]--+----------------------------------------------------------------------------------------------------
id | cust000000001
name | {"id": "cust000000001", "title": "Dr.", "initial": "Q", "last name": "Kordon", "first name": "Bob"}
customer_since |
Time: 430.838 ms
Once I index that column with a btree index, I get similar performance on the JSONB data element as I get in the primary key defined as character varying(40).
jsonb=# CREATE INDEX customer_jsonbid_idx ON customer ((name->>'id'));
jsonb=# select * from customer where name->>'id' = 'cust000000001';
-[ RECORD 1 ]--+----------------------------------------------------------------------------------------------------
id | cust000000001
name | {"id": "cust000000001", "title": "Dr.", "initial": "Q", "last name": "Kordon", "first name": "Bob"}
customer_since |
Time: 0.640 ms
This allows me to use JSONB data in complex queries efficiently and effectively, just as if it were a conventional SQL data type. In the sample query below, I am joining on a JSONB field (c.name->>id) and a conventional SQL data type (co.customer_id) to find all the customers who have bought products made in New Zealand.
SELECT
c.name->>'title' "Title",
c.name->>'first name' "First Name",
c.name->>'last name' "Last Name",
to_char(co.order_date, 'Day - DD, Mon, YYYY') "Purchase Date",
col.quantity "Quantity",
p.name "Product Name"
FROM customer c
JOIN customer_order co ON c.name->>'id' = co.customer_id
JOIN customer_order_line col ON co.id = col.customer_order_id
JOIN product p ON p.id = col.product_id
WHERE p.country_iso3 = 'NZL';
Title | First Name | Last Name | Purchase Date | Quantity | Product Name
-------+------------+-----------+---------------------------+----------+--------------
Mr. | Haley | Horn | Friday - 24, Jun, 2016 | 24 | Green
Dr. | Joe | Kordon | Tuesday - 31, May, 2016 | 52 | Green
Dr. | Sue | Smith | Sunday - 05, Jun, 2016 | 74 | Yellow
Hon. | Breanna | Brown | Sunday - 15, May, 2016 | 62 | Green
Dr. | Marc | Wagner | Wednesday - 18, May, 2016 | 84 | White
Mr. | Haley | Tinker | Sunday - 26, Jun, 2016 | 9 | White
Mr. | Sue | Tinker | Saturday - 18, Jun, 2016 | 31 | Green
Mr. | Sue | Tinker | Saturday - 18, Jun, 2016 | 50 | White
| Kahleesi | Horn | Thursday - 26, May, 2016 | 16 | Green
Dr. | Kahleesi | Wagner | Thursday - 28, Apr, 2016 | 42 | Yellow
Ms. | Marc | Kordon | Tuesday - 31, May, 2016 | 58 | White
Hon. | Joe | Wagner | Saturday - 16, Apr, 2016 | 63 | Yellow
…
Integrity Constraints on JSONB columns
JSONB in Postgres 9.5 also supports integrity constraints. In the customer table defined above, we have a check constraint ‘validate_customer_name’ to make sure that customers cannot have first names or last names that are undefined, i.e., the JSONB keys ‘first name’ and ‘last name’ cannot be null.
Check constraints:
"validate_customer_name" CHECK ((name ->> 'first name'::text) IS NOT NULL AND (name ->> 'last name'::text) IS NOT NULL)
If I run the following query:
INSERT INTO customer VALUES (
'test2',
'{"id": "test2","first name": "Hans"}')
I get an error as that constraint is violated
ERROR: new row for relation "customer" violates check constraint "validate_customer_name"
DETAIL: Failing row contains (test2, {"id": "test2", "first name": "Hans"}, null, null).
This query, which adds a first name and a last name, succeeds, as it meets the check constraint
INSERT INTO customer VALUES (
'test2',
'{"id": "test2","first name": "Hans", "last name": "Prince"}')
JSONB data manipulation
Postgres 9.5 also provides numerous capabilities to manipulate JSONB data elements. The previous queries created a record with the following data:
jsonb=# SELECT id, jsonb_pretty(name)from customer where id = 'test2';
id | jsonb_pretty
-------+----------------------------
test2 | { +
| "id": "test2", +
| "last name": "Prince",+
| "first name": "Hans" +
| }
I now add a new key ‘title’ with value ‘Hon’ using ‘||’ operator to concatenate two JSONB values.
UPDATE customer
SET name = name || '{"title": "Hon"}'
WHERE id = 'test2';
This results in:
jsonb=# SELECT id, jsonb_pretty(name)from customer where id = 'test2';
id | jsonb_pretty
-------+----------------------------
test2 | { +
| "id": "test2", +
| "title": "Hon", +
| "last name": "Prince",+
| "first name": "Hans" +
| }
If I want to change the value of the key ‘title’, I run a simple update query that replaces the value for the key ‘title’:
UPDATE customer
SET name = name || '{"title": "His Excellence"}'
WHERE id = 'test2';
Resulting in
jsonb=# SELECT id, jsonb_pretty(name)from customer where id = 'test2';
id | jsonb_pretty
-------+--------------------------------
test2 | { +
| "id": "test2", +
| "title": "His Excellence",+
| "last name": "Prince", +
| "first name": "Hans" +
| }
(1 row)
The same operator can be used to add nested documents
UPDATE customer
SET name = name ||
'{"contact": {"fax": "617-123-5678", "cell": "617-123-4567"}}'
WHERE id = 'test2';
jsonb=# SELECT id, jsonb_pretty(name)from customer where id = 'test2';
id | jsonb_pretty
-------+--------------------------------
test2 | { +
| "id": "test2", +
| "title": "His Excellence",+
| "contact": { +
| "fax": "617-123-5678",+
| "cell": "617-123-4567"+
| }, +
| "last name": "Prince", +
| "first name": "Hans" +
| }
(1 row)
The operator jsonb_set allows me to update values inside the nested documents
UPDATE customer
SET name = jsonb_set (name, '{contact, cell}', '"617-123-0101"')
WHERE id = 'test2';
jsonb=# SELECT id, jsonb_pretty(name)from customer where id = 'test2';
id | jsonb_pretty
-------+--------------------------------
test2 | { +
| "id": "test2", +
| "title": "His Excellence",+
| "contact": { +
| "fax": "617-123-5678",+
| "cell": "617-123-0101"+
| }, +
| "last name": "Prince", +
| "first name": "Hans" +
| }
(1 row)
The ‘#-’ operator allows me to delete keys and their values inside a nested JSONB document.
UPDATE customer
SET name = name #- '{contact, fax}'
WHERE id = 'test2';
jsonb=# SELECT id, jsonb_pretty(name)from customer where id = 'test2';
id | jsonb_pretty
-------+--------------------------------
test2 | { +
| "id": "test2", +
| "title": "His Excellence",+
| "contact": { +
| "cell": "617-123-0101"+
| }, +
| "last name": "Prince", +
| "first name": "Hans" +
| }
(1 row)
Make JSONB data look like a standard SQL table
The following query uses the LATERAL command (see https://www.postgresql.org/docs/9.5/static/sql-select.html), which allows columns in the FROM clause in a JOIN or a sub-select to refer to columns in the SELECT clause, combined with jsonb_to_record, which builds a record from a JSONB object.
SELECT
c.id "Customer Id",
l.title "Title",
l."first name" "First Name",
l.initial "Initials",
l."last name" "Last Name",
l."postfix" "Postfix"
FROM customer c
JOIN LATERAL
( SELECT *
FROM jsonb_to_record(name)
AS detail(
id text,
title text,
initial text,
postfix text,
"last name" text,
"first name" text
)) l
ON (l.id = c.id)
ORDER BY c.id ASC
LIMIT 20;
Customer Id | Title | First Name | Initials | Last Name | Postfix
---------------+-------+------------+----------+-----------+---------
cust000000001 | Dr. | Bob | Q | Kordon |
cust000000002 | Mr. | James | F | Schiller |
cust000000003 | Dr. | Haley | V | Buchholz | III
cust000000004 | Hon. | Marc | V | Goethe | Sr
cust000000005 | | James | W | Schiller | II
cust000000006 | Ms. | Breanna | J | Horn | II
cust000000007 | Ms. | Haley | H | Cleland | Jr.
cust000000008 | Mr. | James | E | Tinker | Jr.
cust000000009 | | Hans | V | Kordon | Jr.
Conclusions
JSONB in Postgres 9.5 provides a rich array of capabilities that makes it easy to implement solutions that need the flexibility of NoSQL, without giving up on the strengths of the relational model. The fact that standard SQL and JSONB can be used in the same transactional context, with a tight integration that allows converting JSONB to records (and vice-versa), creates a perfect NoSQL (Not only SQL) solution.
For a complete definition of the JSONB capabilities in Postgres 9.5, please review the JSON and JSONB Functions and Operators page in the Postgres documentation.
Marc Linster, Ph.D., is Senior Vice President, Products and Services, at EnterpriseDB.
Start the discussion at forums.toadworld.com