Tell me about a time you significantly improved database performance in production.
sql-sen-003
Your answer
Answer as you would in a real interview — explain your thinking, not just the conclusion.
Model answer
Our main product listing API was taking three to four seconds under normal load and spiking to fifteen seconds during peak. I started by enabling pg_stat_statements to find the top ten queries by cumulative execution time, not just per-call time. The worst offender was a correlated subquery counting cart items inside a SELECT over half a million products: the subquery re-ran once per product row. I rewrote it as a LEFT JOIN against a CTE that grouped and counted cart items once, which dropped that query from two seconds to forty milliseconds. I also discovered that autovacuum had stopped running on the products table because of a misconfigured lock timeout, leaving dead row versions that bloated the heap. Fixing the autovacuum settings and running a manual VACUUM ANALYZE brought the planner's row estimates back in line with reality. Together the changes reduced p99 API latency by eighty percent. I wrote up a runbook and added the slow queries as performance regression tests against our staging database in CI.
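The diagnosis and autovacuum fix mentioned above amount to one diagnostic query plus per-table storage settings. A sketch in PostgreSQL syntax: pg_stat_statements and the autovacuum storage parameters are real PostgreSQL features, but the table name and thresholds here are illustrative, not the actual production values (and on PostgreSQL 12 and earlier the timing columns are named total_time/mean_time rather than total_exec_time/mean_exec_time):

```sql
-- Rank queries by cumulative time, not per-call time
-- (requires the pg_stat_statements extension).
SELECT query, calls, total_exec_time, mean_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

-- Hypothetical per-table setting; the real threshold would be tuned
-- to the table's write rate.
ALTER TABLE products SET (autovacuum_vacuum_scale_factor = 0.05);

-- One-off cleanup: reclaim dead row versions and refresh the
-- planner statistics.
VACUUM (VERBOSE, ANALYZE) products;
```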
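The correlated-subquery-to-JOIN rewrite described above can be sketched in miniature. This uses Python's built-in sqlite3 so it runs anywhere; the schema (products, cart_items) and column names are illustrative stand-ins for the production tables, not the actual schema:

```python
import sqlite3

# Hypothetical schema standing in for the production tables:
# products(id, name) and cart_items(id, product_id).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE cart_items (id INTEGER PRIMARY KEY, product_id INTEGER);
    INSERT INTO products VALUES (1, 'widget'), (2, 'gadget'), (3, 'gizmo');
    INSERT INTO cart_items (product_id) VALUES (1), (1), (3);
""")

# Before: correlated subquery, re-executed once per product row.
slow = conn.execute("""
    SELECT p.id,
           (SELECT COUNT(*) FROM cart_items c
            WHERE c.product_id = p.id) AS in_carts
    FROM products p
    ORDER BY p.id
""").fetchall()

# After: count once in a CTE, then LEFT JOIN so products with no
# cart items still appear (COALESCE turns the missing count into 0).
fast = conn.execute("""
    WITH counts AS (
        SELECT product_id, COUNT(*) AS n
        FROM cart_items
        GROUP BY product_id
    )
    SELECT p.id, COALESCE(counts.n, 0) AS in_carts
    FROM products p
    LEFT JOIN counts ON counts.product_id = p.id
    ORDER BY p.id
""").fetchall()

print(slow)   # [(1, 2), (2, 0), (3, 1)]
assert slow == fast
```

The key design point is that both forms must agree on zero-count rows: an inner join against the CTE would silently drop products that appear in no carts, which is why the rewrite uses LEFT JOIN plus COALESCE.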
Follow-up
How did you validate that your changes were safe to deploy without taking the site down? What rollback strategy did you prepare?