Tell me about a time you optimised a slow Python service in production
py-mid-005
Your answer
Answer as you would in a real interview — explain your thinking, not just the conclusion.
Model answer
STAR structure:
Situation: a Django REST endpoint for generating personalised product recommendations was timing out under load (p99 > 5s).
Task: get p99 under 500ms without increasing server count.
Action: used cProfile and py-spy (a sampling profiler) to find the hot path: a nested loop joining product data from two ORM calls, a classic N+1 query. Three fixes: (1) select_related / prefetch_related to collapse the N+1 into two queries; (2) replaced a pure-Python Jaccard similarity loop with a vectorised NumPy operation (6x speedup); (3) cached the top-100 recommendations per user segment in Redis with a 5-minute TTL.
Result: p99 dropped from 5200ms to 180ms.
Key lesson: always profile before optimising, because the slowest code is almost never where you think it is. py-spy is invaluable here since it can attach to a running CPython process in production without restarts or code changes.
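To show what the N+1 fix amounts to at the SQL level, here is a minimal sketch using stdlib sqlite3 (the table names and data are illustrative, not from the original service). Django's select_related emits a JOIN equivalent to the second function:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE recommendation (id INTEGER PRIMARY KEY,
                                 product_id INTEGER REFERENCES product(id));
    INSERT INTO product VALUES (1, 'widget'), (2, 'gadget');
    INSERT INTO recommendation VALUES (10, 1), (11, 2), (12, 1);
""")

def fetch_n_plus_one(conn):
    """The anti-pattern: 1 query for the list, then 1 query per row."""
    rows = conn.execute("SELECT id, product_id FROM recommendation").fetchall()
    out = []
    for rec_id, pid in rows:  # N extra round-trips to the database
        name = conn.execute(
            "SELECT name FROM product WHERE id = ?", (pid,)
        ).fetchone()[0]
        out.append((rec_id, name))
    return out

def fetch_joined(conn):
    """The fix: a single JOIN, which is what select_related generates."""
    return conn.execute("""
        SELECT r.id, p.name
        FROM recommendation r
        JOIN product p ON p.id = r.product_id
    """).fetchall()
```

Both return the same rows; the JOIN version does it in one round-trip regardless of how many recommendations there are.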
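The Jaccard vectorisation step can be sketched like this, assuming binary feature vectors (the actual feature encoding in the service isn't specified). The loop version scores one pair at a time; the NumPy version scores one user against a whole product matrix at once:

```python
import numpy as np

def jaccard_loop(a, b):
    """Pure-Python Jaccard over two binary vectors (the original hot loop)."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return inter / union if union else 0.0

def jaccard_matrix(user_vec, product_mat):
    """Vectorised: one boolean user vector vs an (N, d) boolean matrix."""
    inter = (product_mat & user_vec).sum(axis=1)
    union = (product_mat | user_vec).sum(axis=1)
    # Guard the empty-union case instead of dividing by zero.
    return np.divide(inter, union,
                     out=np.zeros(len(union), dtype=float),
                     where=union > 0)
```

The speedup comes from moving the per-element work into C loops inside NumPy; the 6x figure will vary with vector length and matrix size.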
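The caching step can be sketched with an in-process stand-in for Redis; in production this would be Redis itself (SETEX or SET with EX gives the TTL behaviour), and the names here are illustrative:

```python
import time

class TTLCache:
    """Minimal in-memory sketch of a Redis-style cache with expiry."""

    def __init__(self, ttl_seconds=300):  # 5-minute TTL, as in the answer
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # lazy expiry on read
            return None
        return value

    def set(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)

def get_recommendations(segment_id, cache, compute_fn):
    """Cache-aside: serve from cache, fall back to the expensive path."""
    cached = cache.get(segment_id)
    if cached is not None:
        return cached
    recs = compute_fn(segment_id)  # ORM queries + similarity scoring
    cache.set(segment_id, recs)
    return recs
```

Keying by user segment rather than individual user is what makes the cache hit rate high: many users share a segment, so one expensive computation serves them all for the TTL window.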
Follow-up
How do you balance the risk of adding a caching layer (stale data, invalidation complexity) against the performance gain?