Why We Replaced Our VPN Infrastructure with Service-Level Access Controls
Back in 2017, while leading infrastructure at a fast-growing fintech, I watched our engineering teams waste thousands of hours wrestling with VPN infrastructure. The same patterns kept emerging - what started as a simple OpenVPN setup inevitably morphed into a complex maze of certificates, routing tables, and security patches. This experience shaped my views on access management, and I'm glad to see how the industry has evolved since then.
I want to share our journey of replacing traditional VPN infrastructure with service-level access controls, and the surprising benefits we discovered along the way.
The Breaking Point
Our breaking point came during a midnight incident in 2018. An ex-employee's VPN credentials somehow remained active (a classic offboarding miss), which led to unauthorized database access. While no data was compromised, the incident exposed how fragile our access control system had become. Looking back now, I wish a solution like hoop.dev had existed then; it would have saved us months of building custom tooling.
Our VPN setup had all the classic problems:
# The "simple" process for a dev to access our staging DB
$ sudo openvpn --config staging-vpn.ovpn
$ ssh-add ~/.ssh/jump-key
$ ssh -A jump-host
$ psql -h internal-staging-db...
# What it looked like during incidents
$ tail -f /var/log/openvpn/auth.log | grep -i failed
# (good luck figuring out which service they were trying to access)
The Real Cost Nobody Talks About
While everyone focuses on the obvious VPN problems (slow connections, certificate management hell), the hidden costs were actually worse:
1. Developer productivity:
   - Average 15 minutes lost per VPN reconnect
   - ~3 reconnects per dev per day
   - 100 developers = 75 hours lost per day
2. Security team overhead:
   - Managing split tunneling policies
   - Auditing network access logs
   - Maintaining jump hosts
   - Rotating compromised certificates
3. Infrastructure complexity:
   - Separate VPN concentrators per region
   - Complex routing between cloud providers
   - Constant security patches
   - DNS resolution nightmares
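For anyone who wants to plug in their own team's numbers, the developer productivity estimate above can be reproduced with a few lines of Python. This is just the arithmetic from the list; the function and constant names are mine, and the inputs are the rough averages quoted above, not measured data.

```python
# Figures taken from the developer productivity estimate above.
MINUTES_LOST_PER_RECONNECT = 15
RECONNECTS_PER_DEV_PER_DAY = 3
DEVELOPERS = 100


def hours_lost_per_day(minutes_per_reconnect, reconnects_per_dev, developers):
    """Return total engineering hours lost to VPN reconnects per day."""
    total_minutes = minutes_per_reconnect * reconnects_per_dev * developers
    return total_minutes / 60


daily_hours = hours_lost_per_day(
    MINUTES_LOST_PER_RECONNECT, RECONNECTS_PER_DEV_PER_DAY, DEVELOPERS
)
print(f"{daily_hours:.0f} hours lost per day")  # 75 hours lost per day
```

Swap in your own headcount and reconnect rate; even with much smaller per-reconnect losses the total adds up quickly.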
The Alternative Approach
We eventually built our own service-based access model (which took 6 months of engineering time I wish we hadn't spent). Today, teams can achieve the same results in hours using modern solutions like hoop.dev. The key insight was realizing we didn't need network-level access - we needed service-level access with strong identity controls.
The flow we eventually built looked like:
# Connect to staging DB
$ hoop connect staging-db
Connected: postgresql://127.0.0.1:5432
# Access production logs
$ hoop connect prod-logs
Connected: Production logs available at http://localhost:8080
Behind the scenes:
- Authentication happens via our existing SSO (Okta)
- Each connection is audited and recorded
- Access can be revoked instantly
- No network-level access is granted
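To make those four properties concrete, here is a minimal sketch of the idea in Python. It is not our actual implementation and not how hoop.dev works internally; `AccessBroker` and its methods are hypothetical names, and the SSO step is assumed to have already produced a verified user identity.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AccessBroker:
    """Hypothetical broker that grants access per service, not per network."""

    # Maps an SSO-verified identity (e.g. an email) to the services it may reach.
    grants: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def grant(self, user: str, service: str) -> None:
        self.grants.setdefault(user, set()).add(service)

    def revoke_user(self, user: str) -> None:
        # Dropping the grants takes effect on the very next connect attempt.
        self.grants.pop(user, None)

    def connect(self, user: str, service: str) -> bool:
        allowed = service in self.grants.get(user, set())
        # Every attempt is recorded, allowed or not, with the target service.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "service": service,
            "allowed": allowed,
        })
        return allowed


broker = AccessBroker()
broker.grant("dev@example.com", "staging-db")
print(broker.connect("dev@example.com", "staging-db"))  # True
print(broker.connect("dev@example.com", "prod-logs"))   # False
broker.revoke_user("dev@example.com")
print(broker.connect("dev@example.com", "staging-db"))  # False
```

Note that the audit log names the service on every attempt, which is exactly what the OpenVPN auth log could not tell us during incidents.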
The Unexpected Benefits
1. Better Security
   - True zero-trust: every request is authenticated
   - No long-lived credentials
   - Complete audit trail of who accessed what
   - Instant access revocation that actually works
2. Developer Experience
   - No more VPN disconnects
   - Local-feeling development environment
   - Works seamlessly across cloud providers
   - Self-service access requests
3. Operational Simplicity
   - No more certificate management
   - No more routing tables
   - No more jump hosts
   - No more split tunneling nightmares
Show Me The Numbers
After 6 months:
- VPN support tickets: -92%
- Average time to access services: -84%
- Security incidents: -76%
- Developer satisfaction: +89%
The Migration Process
We didn't do a big bang migration. Instead:
- Started with non-critical services
- Moved development environments next
- Gradually shifted staging environments
- Finally migrated production access
Each step validated our assumptions and gave us confidence to proceed.
What We Learned
The biggest lesson? VPNs were never the right tool for service access control. They were just the best tool we had at the time. It's like using SSH as a poor man's service mesh - it works, but the operational burden becomes unsustainable at scale.
The other key insight was that separating network access from service access drastically simplified our security model. When every service authenticates every request, network-level security becomes one layer of defense in depth rather than your primary control.
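As a rough illustration of what "every service authenticates every request" can mean in practice, here is a sketch using short-lived, service-scoped tokens signed with HMAC from Python's standard library. The token format, the `issue_token` and `authenticate` names, the five-minute lifetime, and the hard-coded secret are all assumptions made for the example; a real system would more likely use a standard such as OIDC or signed JWTs issued by the identity provider.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET = b"demo-only-secret"  # assumption: really loaded from a secret store
TOKEN_TTL_SECONDS = 300       # assumption: five-minute credential lifetime


def _sign(payload: str) -> str:
    return hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()


def issue_token(user: str, service: str, now: Optional[float] = None) -> str:
    """Issue a token bound to one user, one service, and an expiry time."""
    issued_at = time.time() if now is None else now
    expires = int(issued_at + TOKEN_TTL_SECONDS)
    payload = f"{user}|{service}|{expires}"
    return f"{payload}|{_sign(payload)}"


def authenticate(token: str, service: str, now: Optional[float] = None) -> bool:
    """Check signature, target service, and expiry on every request."""
    try:
        user, token_service, expires, sig = token.split("|")
    except ValueError:
        return False  # malformed token
    expected = _sign(f"{user}|{token_service}|{expires}")
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return False  # tampered or forged
    if token_service != service:
        return False  # token was issued for a different service
    current = time.time() if now is None else now
    return current < int(expires)


token = issue_token("dev@example.com", "staging-db", now=1000.0)
print(authenticate(token, "staging-db", now=1100.0))  # True
print(authenticate(token, "prod-logs", now=1100.0))   # False
print(authenticate(token, "staging-db", now=1301.0))  # False
```

Because the token names a single service and expires on its own, a leaked credential is far less useful than a VPN certificate that opens the whole network until someone remembers to revoke it.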
Is This Right For You?
Consider moving away from a VPN if you:
- Have more than 50 developers
- Operate across multiple cloud providers
- Need granular access controls
- Want real audit trails
- Are tired of managing certificates
Keep your VPN if you:
- Need actual network-level access
- Have simple, static infrastructure
- Operate entirely in one cloud region
- Have regulatory requirements mandating VPN usage
What's Next?
Looking back at our 2017 struggles, it's encouraging to see how the zero-trust landscape has evolved. While we had to build custom solutions for service-level access control back then, teams today can implement these patterns in hours rather than months using modern tools.
I'd love to hear others' experiences moving away from VPNs. What worked? What didn't? What alternatives did you consider? Has anyone else gone through the build-vs-buy decision on this?