PostgreSQL JSON: When to Use JSONB and When to Normalize

PostgreSQL's JSONB support is genuinely excellent, and it gets used in two ways: appropriately as a tool for genuinely flexible data, and inappropriately as a way to avoid schema design. The consequences of misuse are slow queries, complex maintenance, and lost query planner intelligence.

Here is how to use JSONB well and know when not to use it at all.

JSON vs JSONB

PostgreSQL has two JSON types:

JSON stores the raw JSON string, preserving whitespace and key order. Parsing happens at query time. It is essentially TEXT with JSON validation.

JSONB stores JSON in a decomposed binary format. It is parsed at insert time, key order is not preserved, duplicate keys are discarded, and it supports indexing and operators. JSONB is almost always the right choice.

The only case to reach for plain JSON is when you specifically need to preserve key ordering or duplicate keys, which is rare.

When JSONB Makes Sense

Schema-less or highly variable attributes. Product attributes in an e-commerce system are a classic example. A shirt has size, color, and material. An electronics item has voltage, frequency, and certification. Storing every possible attribute as a column would require hundreds of nullable columns. JSONB handles this elegantly:

CREATE TABLE products (
 id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
 name TEXT NOT NULL,
 category TEXT NOT NULL,
 price NUMERIC(10, 2) NOT NULL,
 attributes JSONB NOT NULL DEFAULT '{}'
);

-- Electronics product
INSERT INTO products (name, category, price, attributes) VALUES (
 'Laptop Pro X',
 'electronics',
 1299.99,
 '{"voltage": "100-240V", "wattage": 65, "ports": ["USB-C", "HDMI", "USB-A"], "certifications": ["CE", "FCC"]}'
);

-- Clothing product
INSERT INTO products (name, category, price, attributes) VALUES (
 'Cotton T-Shirt',
 'apparel',
 29.99,
 '{"sizes": ["S", "M", "L", "XL"], "material": "100% cotton", "care": "Machine wash cold"}'
);

Configuration and settings. Application configuration that varies per user or per tenant, especially when the schema of the configuration is expected to evolve:

CREATE TABLE user_settings (
 user_id UUID REFERENCES users(id),
 settings JSONB NOT NULL DEFAULT '{}',
 PRIMARY KEY (user_id)
);

Audit logs and event stores. When you want to store the entire state of an object at a point in time, JSONB lets you store the serialized object without defining a schema for every possible past version.

Third-party API responses. When you receive JSON from an external API and need to store it for later processing or reference, JSONB avoids the need to define a schema for every field.

Querying JSONB

-- Access a top-level key
SELECT attributes->>'voltage' FROM products;
-- Returns the value as text

-- Access a nested key
SELECT attributes->'dimensions'->>'width' FROM products;

-- Filter by a key value
SELECT * FROM products WHERE attributes->>'material' = '100% cotton';

-- Check if a key exists
SELECT * FROM products WHERE attributes ? 'voltage';

-- Check if all keys exist
SELECT * FROM products WHERE attributes ?& ARRAY['voltage', 'wattage'];

-- JSONB contains: left operand contains right operand
SELECT * FROM products WHERE attributes @> '{"certifications": ["CE"]}';

The @> containment operator is particularly powerful — it checks whether one JSONB document contains another, supporting deep nesting:

-- Find all products with USB-C port
SELECT * FROM products WHERE attributes @> '{"ports": ["USB-C"]}';

-- Find all products with certifications including CE
SELECT * FROM products WHERE attributes @> '{"certifications": ["CE"]}';

Indexing JSONB

Without indexes, JSONB queries require sequential scans. Three index types are available:

GIN index on the entire column — supports @>, ?, ?&, ?| operators:

CREATE INDEX idx_products_attributes ON products USING GIN(attributes);

This is the most flexible option but creates a large index. Use it when you query many different paths.

GIN index with jsonb_path_ops — more compact, only supports @>:

CREATE INDEX idx_products_attributes_path ON products USING GIN(attributes jsonb_path_ops);

Smaller and faster for @> queries. Use this when containment queries are your primary pattern.

B-tree index on a specific path — for equality and range queries on a specific key:

CREATE INDEX idx_products_voltage ON products((attributes->>'voltage'));
CREATE INDEX idx_products_wattage ON products(((attributes->>'wattage')::numeric));

Expression indexes are the most efficient for querying a specific well-known path. Use them when you know which JSONB paths you will query frequently.

When NOT to Use JSONB

Columns you filter, join, or aggregate on regularly. If you find yourself running WHERE attributes->>'status' = 'active' on millions of rows frequently, that should be a real column with an index. JSONB operators are more expensive than native column comparisons.

Data with a stable, well-known schema. If the schema is known and stable, normalize it. Normalized data has better query planner support, cleaner constraints, and easier joins.

Foreign key targets. You cannot define a foreign key that references a value inside a JSONB column. If you have attributes->>'category_id' that should reference the categories table, that belongs as a proper column.

Aggregations. SUM((attributes->>'price')::numeric) is slower and syntactically ugly compared to SUM(price). If you need to aggregate on a field, it belongs as a column.

The Decision Matrix

Use JSONB when:

The structure varies significantly between rows
The schema will evolve in ways you cannot predict
You are storing third-party data with its own schema
The data is read as a whole object more often than queried by individual fields

Use normalized columns when:

You filter, sort, join, or aggregate on the field
You need foreign key constraints
The field has a fixed schema across all rows
Type safety and constraints are important (NOT NULL, CHECK constraints, etc.)

Combining Both Approaches

The best designs often combine normalized and JSONB fields:

CREATE TABLE products (
 id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
 name TEXT NOT NULL, -- Always needed, indexed
 slug TEXT UNIQUE NOT NULL, -- Always needed, indexed
 category_id UUID REFERENCES categories(id), -- Join target
 price NUMERIC(10, 2) NOT NULL, -- Aggregated
 status TEXT NOT NULL DEFAULT 'draft', -- Filtered
 attributes JSONB NOT NULL DEFAULT '{}', -- Variable schema data
 created_at TIMESTAMP DEFAULT NOW()
);

The core fields that you filter, join, and aggregate are proper columns. The variable per-product attributes live in JSONB. This gives you the best of both worlds.

Validation With CHECK Constraints

You can validate JSONB structure at the database level:

-- Ensure the attributes object has required keys
ALTER TABLE products ADD CONSTRAINT attributes_required_keys
 CHECK (
 attributes ? 'weight'
 AND attributes ? 'dimensions'
 )
 WHERE category = 'physical';

-- Ensure a specific key has the right type
ALTER TABLE products ADD CONSTRAINT attributes_price_numeric
 CHECK (jsonb_typeof(attributes->'price') = 'number');

These constraints enforce structure where it matters without requiring a fully normalized schema for every attribute.

JSONB in PostgreSQL is a genuinely powerful feature. Used well, it solves real schema flexibility problems without the operational complexity of a separate document store. Used carelessly, it turns your structured relational data into a bag of blobs that is hard to query and harder to maintain.

Designing a data model that needs to balance relational structure with schema flexibility? I help teams make these trade-off decisions. Book a call: calendly.com/jamesrossjr.