The query kept timing out and no one knew why.
Edge access control was working fine at small scale, but as soon as real traffic hit, the DynamoDB query patterns exposed hidden flaws. Investigation showed that each request was touching more partitions than expected. Access checks at the edge were fast, but not fast enough to meet the SLA. The index design was sound on paper. The problem was in how policies were stored and retrieved.
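When building edge access control with DynamoDB, the table design often decides your fate. The fastest systems push as much filtering as possible into a single partition and avoid scatter-gather queries. Keys need to reflect both the user's access scope and the resource the user is targeting, so that one access check maps to one key lookup or one narrow range query. Avoid queries that require reading thousands of small items spread across multiple keys. On-demand capacity removes the need to provision throughput, but it does not remove the cost of those extra reads, so latency still grows with the number of items and partitions a request touches.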
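As a rough illustration of the key design described above, the sketch below builds the parameters for a single-partition policy lookup. The key layout (`TENANT#...#USER#...` as partition key, `RES#...` as sort key), the attribute names `pk` and `sk`, and the table name are all assumptions made for this example, not a layout taken from any particular system. The function only builds a dictionary shaped like the arguments to the low-level DynamoDB `Query` operation; it does not call AWS.

```python
from typing import Any, Dict


def policy_pk(tenant_id: str, user_id: str) -> str:
    """Partition key: the user's access scope (hypothetical layout)."""
    return f"TENANT#{tenant_id}#USER#{user_id}"


def policy_sk(resource_path: str) -> str:
    """Sort key: the resource being targeted (hypothetical layout)."""
    return f"RES#{resource_path}"


def build_policy_query(
    table_name: str, tenant_id: str, user_id: str, resource_prefix: str
) -> Dict[str, Any]:
    """Build Query parameters that stay inside one partition.

    The partition key pins the request to a single user scope, and
    begins_with on the sort key narrows it to one resource subtree,
    so no scatter-gather across keys is needed.
    """
    return {
        "TableName": table_name,
        "KeyConditionExpression": "#pk = :pk AND begins_with(#sk, :sk)",
        "ExpressionAttributeNames": {"#pk": "pk", "#sk": "sk"},
        "ExpressionAttributeValues": {
            ":pk": {"S": policy_pk(tenant_id, user_id)},
            ":sk": {"S": policy_sk(resource_prefix)},
        },
        # Ask DynamoDB to report what the query cost, for later diagnosis.
        "ReturnConsumedCapacity": "TOTAL",
    }


if __name__ == "__main__":
    params = build_policy_query("edge-policies", "acme", "u42", "/reports/")
    print(params["ExpressionAttributeValues"])
```

With boto3, a dictionary like this could be passed as `client.query(**params)`. The point of the layout is that every policy relevant to one user and one resource subtree sits under the same partition key, so the access check is a single range read.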
Query runbooks reduce firefighting time. Instead of guessing at causes in the middle of an incident, a well-written DynamoDB query runbook shows exactly what to check first: partition key patterns, consumed capacity metrics, throttling events, hot keys, and failed conditional checks. The runbook should contain pre-written queries for metrics, steps for reproducing the problem, and instructions for making changes without breaking production.
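A pre-written metric query for such a runbook might look like the sketch below. It builds the arguments for CloudWatch's `GetMetricStatistics` operation against the `AWS/DynamoDB` namespace. The list of first checks, the 15-minute window, the 60-second period, and the table name are assumptions chosen for illustration; a real runbook would tune them to its own SLA. As before, the code only builds request dictionaries and does not call AWS.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List, Optional

# Metrics to look at first, in order (an assumed ordering for this sketch).
FIRST_CHECKS: List[str] = [
    "ReadThrottleEvents",
    "ConsumedReadCapacityUnits",
    "ConditionalCheckFailedRequests",
]


def metric_request(
    table_name: str,
    metric_name: str,
    minutes: int = 15,
    now: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Build GetMetricStatistics parameters for one table-level metric."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/DynamoDB",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "TableName", "Value": table_name}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,  # one datapoint per minute
        "Statistics": ["Sum"],
    }


def runbook_requests(
    table_name: str, now: Optional[datetime] = None
) -> List[Dict[str, Any]]:
    """One request per first-check metric, all over the same time window."""
    end = now or datetime.now(timezone.utc)
    return [metric_request(table_name, name, now=end) for name in FIRST_CHECKS]


if __name__ == "__main__":
    for req in runbook_requests("edge-policies"):
        print(req["MetricName"], req["StartTime"], req["EndTime"])
```

With boto3, each dictionary could be passed as `cloudwatch.get_metric_statistics(**req)`. Keeping the requests as plain data means the runbook can be reviewed and tested without touching production.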