SQL Mid-level (2–5 yrs) Concepts Very frequent SQL clustered index non-clustered index SQL Server PostgreSQL

What is the difference between a clustered and a non-clustered index?

sql-mid-001

Your answer

Answer as you would in a real interview — explain your thinking, not just the conclusion.

Model answer

A clustered index determines the physical order of rows on disk — the table's data pages are sorted according to the clustered index key. Because of this, a table can only have one clustered index, and it is usually the primary key. A range scan on the clustered key is very efficient because the rows are physically adjacent. A non-clustered index is a separate structure that stores the index key plus a pointer (row ID or clustered key) back to the actual row, so you pay an extra lookup to fetch non-indexed columns. When a query can be satisfied entirely from the non-clustered index columns it becomes a 'covering index' and avoids the lookup altogether. In SQL Server the heap (table without a clustered index) uses RID lookups; in PostgreSQL all indexes are non-clustered but CLUSTER command can physically reorder the heap once.

Code example

-- SQL Server: clustered index is created on PK by default
CREATE TABLE orders (
    id          INT PRIMARY KEY CLUSTERED, -- clustered
    customer_id INT,
    status      VARCHAR(20),
    created_at  DATETIME2
);

-- Non-clustered index — separate structure, pointer to row
CREATE NONCLUSTERED INDEX idx_nc_status
    ON orders(status)
    INCLUDE (customer_id, created_at); -- covering adds columns

-- PostgreSQL equivalent
CREATE INDEX idx_status ON orders(status) INCLUDE (customer_id, created_at);

Follow-up

When a non-clustered index lookup performs a 'key lookup' back to the base table, what is the performance implication, and how would you fix it?