Tell me about a time you improved the reliability or performance of a production system.
go-mid-003
Your answer
Answer as you would in a real interview — explain your thinking, not just the conclusion.
Model answer
Our Go API was timing out intermittently under load. Profiling with pprof showed that a shared map protected by a sync.RWMutex was a contention hotspot: 60% of goroutines were blocked on that lock during cache invalidation. I moved the hot read path to sync.Map and rewrote the invalidation logic to apply updates in batches instead of invalidating entries one at a time. P99 latency dropped from 450ms to 35ms under the same load. The key lesson was to profile before optimising: the bottleneck was lock contention, not allocation, which I would not have guessed without data.
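The shape of that change can be sketched roughly as follows. This is a minimal illustration, not the original system's code: the Cache type, its method names, and the string key/value types are all assumptions made for the example. Reads go through sync.Map and never touch a mutex, while invalidations are queued and applied together in Flush.

```go
package main

import (
	"fmt"
	"sync"
)

// Cache is a hypothetical read-mostly cache. The type and its methods are
// illustrative assumptions, not taken from the system described above.
type Cache struct {
	entries sync.Map // key string -> value string; Get takes no mutex

	mu      sync.Mutex // guards pending only; the read path never acquires it
	pending []string   // keys queued for invalidation
}

// Get is the hot read path. It only calls sync.Map.Load.
func (c *Cache) Get(key string) (string, bool) {
	v, ok := c.entries.Load(key)
	if !ok {
		return "", false
	}
	return v.(string), true
}

// Set stores or overwrites a single entry.
func (c *Cache) Set(key, value string) {
	c.entries.Store(key, value)
}

// Invalidate queues a key for removal without touching the entries yet.
func (c *Cache) Invalidate(key string) {
	c.mu.Lock()
	c.pending = append(c.pending, key)
	c.mu.Unlock()
}

// Flush applies every queued invalidation and returns how many keys
// were in the batch. The mutex is held only while swapping the slice.
func (c *Cache) Flush() int {
	c.mu.Lock()
	batch := c.pending
	c.pending = nil
	c.mu.Unlock()

	for _, k := range batch {
		c.entries.Delete(k)
	}
	return len(batch)
}

func main() {
	c := &Cache{}
	c.Set("a", "1")
	c.Set("b", "2")

	c.Invalidate("a")
	_, ok := c.Get("a")
	fmt.Println("readable before flush:", ok)

	n := c.Flush()
	_, ok = c.Get("a")
	fmt.Println("flushed:", n, "readable after flush:", ok)
}
```

In this sketch a stale entry stays readable until the next Flush, so it only suits a cache that can tolerate briefly stale reads. A real service would likely call Flush from a ticker or once the queue reaches a size threshold.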
Follow-up
When would you choose sync.Map over a plain map with a mutex? What are sync.Map's trade-offs?