
Why We Replaced Our VPN Infrastructure with Service-Level Access Controls

Andrios Robert

Nov 4, 2024 • 3 min read

Back in 2017, while leading infrastructure at a fast-growing fintech, I watched our engineering teams waste thousands of hours wrestling with VPN infrastructure. The same patterns kept emerging - what started as a simple OpenVPN setup inevitably morphed into a complex maze of certificates, routing tables, and security patches. This experience shaped my views on access management, and I'm glad to see how the industry has evolved since then.

I want to share our journey of replacing traditional VPN infrastructure with service-level access controls, and the surprising benefits we discovered along the way.

The Breaking Point

Our breaking point came during a midnight incident in 2018. An ex-employee's VPN credentials somehow remained active (classic offboarding miss), leading to unauthorized database access. While no data was compromised, the incident exposed how fragile our access control system had become. Looking back now, I wish solutions like hoop.dev had existed then - it would have saved us months of building custom tooling.

Our VPN setup had all the classic problems:

# The "simple" process for a dev to access our staging DB
$ sudo openvpn --config staging-vpn.ovpn
$ ssh-add ~/.ssh/jump-key
$ ssh -A jump-host
$ psql -h internal-staging-db...

# What it looked like during incidents
$ tail -f /var/log/openvpn/auth.log | grep -i failed
# (good luck figuring out which service they were trying to access)

The Real Cost Nobody Talks About

While everyone focuses on the obvious VPN problems (slow connections, certificate management hell), the hidden costs were actually worse:

1. Developer productivity:

Average 15 minutes lost per VPN reconnect
~3 reconnects per dev per day
100 developers × 45 minutes each = 75 hours lost per day

2. Security team overhead:

Managing split tunneling policies
Auditing network access logs
Maintaining jump hosts
Rotating compromised certificates

3. Infrastructure complexity:

Separate VPN concentrators per region
Complex routing between cloud providers
Constant security patches
DNS resolution nightmares
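
The developer productivity estimate in item 1 is simple multiplication, and it is worth rerunning with your own numbers. Here is a minimal sketch in Python; the function name `hours_lost_per_day` is made up for illustration, and the inputs are the rough figures from the list above, not measured data.

```python
def hours_lost_per_day(minutes_per_reconnect: float,
                       reconnects_per_dev: float,
                       developers: int) -> float:
    """Total engineering hours lost to VPN reconnects in one working day."""
    total_minutes = minutes_per_reconnect * reconnects_per_dev * developers
    return total_minutes / 60


if __name__ == "__main__":
    # Figures from the list above: 15 minutes, ~3 reconnects, 100 developers.
    print(hours_lost_per_day(15, 3, 100))  # 75.0
```

Even if you halve the reconnect time, the total is still more than four full engineering days lost every day at this team size.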

The Alternative Approach

We eventually built our own service-based access model (which took 6 months of engineering time I wish we hadn't spent). Today, teams can achieve the same results in hours using modern solutions like hoop.dev. The key insight was realizing we didn't need network-level access - we needed service-level access with strong identity controls.

The flow we eventually built looked like:

# Connect to staging DB
$ hoop connect staging-db
Connected: postgresql://127.0.0.1:5432

# Access production logs
$ hoop connect prod-logs
Connected: Production logs available at http://localhost:8080

Behind the scenes:

Authentication happens via our existing SSO (Okta)
Each connection is audited and recorded
Access can be revoked instantly
No network-level access is granted
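
To make those four properties concrete, here is a toy model of a service-level access broker in Python. It is a sketch under stated assumptions, not our actual implementation and not hoop.dev's API: the `AccessBroker` class, its method names, and the `sso_verified` flag (standing in for a real identity-provider check) are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AccessBroker:
    """Toy stand-in for a service-level access gateway (hypothetical)."""

    # Maps a user to the set of named services they may connect to.
    grants: dict[str, set[str]] = field(default_factory=dict)
    # Append-only record of every connection attempt, allowed or denied.
    audit_log: list[dict] = field(default_factory=list)

    def grant(self, user: str, service: str) -> None:
        self.grants.setdefault(user, set()).add(service)

    def revoke_user(self, user: str) -> None:
        # Offboarding: drop every grant at once; the next request is denied.
        self.grants.pop(user, None)

    def connect(self, user: str, service: str, sso_verified: bool) -> bool:
        # Access is decided per named service, never per network segment.
        allowed = bool(sso_verified and service in self.grants.get(user, set()))
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "service": service,
            "allowed": allowed,
        })
        return allowed


if __name__ == "__main__":
    broker = AccessBroker()
    broker.grant("alice", "staging-db")
    print(broker.connect("alice", "staging-db", sso_verified=True))  # True
    broker.revoke_user("alice")
    print(broker.connect("alice", "staging-db", sso_verified=True))  # False
    print(len(broker.audit_log))  # 2
```

Every attempt is checked and logged against a named service, so the audit trail answers "who tried to reach what" directly, which the VPN auth log could not.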

The Unexpected Benefits

1. Better Security

True zero-trust: every request is authenticated
No long-lived credentials
Complete audit trail of who accessed what
Instant access revocation that actually works

2. Developer Experience

No more VPN disconnects
Local-feeling development environment
Works seamlessly across cloud providers
Self-service access requests

3. Operational Simplicity

No more certificate management
No more routing tables
No more jump hosts
No more split tunneling nightmares

Show Me The Numbers

After 6 months:

VPN support tickets: -92%
Average time to access services: -84%
Security incidents: -76%
Developer satisfaction: +89%

The Migration Process

We didn't do a big-bang migration. Instead, we moved in four phases:

Started with non-critical services
Moved development environments next
Gradually shifted staging environments
Finally migrated production access

Each step validated our assumptions and gave us confidence to proceed.

What We Learned

The biggest lesson? VPNs were never the right tool for service access control. They were just the best tool we had at the time. It's like using SSH as a poor man's service mesh - it works, but the operational burden becomes unsustainable at scale.

The other key insight was that separating network access from service access drastically simplified our security model. When every service authenticates every request, network-level security becomes a defense-in-depth layer rather than your primary control.

Is This Right For You?

Consider moving away from a VPN if you:

Have more than 50 developers
Operate across multiple cloud providers
Need granular access controls
Want real audit trails
Are tired of managing certificates

Keep your VPN if you:

Need actual network-level access
Have simple, static infrastructure
Operate entirely in one cloud region
Have regulatory requirements mandating VPN usage

What's Next?

Looking back at our 2017 struggles, it's encouraging to see how the zero-trust landscape has evolved. While we had to build custom solutions for service-level access control back then, teams today can implement these patterns in hours rather than months using modern tools.

I'd love to hear others' experiences moving away from VPNs. What worked? What didn't? What alternatives did you consider? Has anyone else gone through the build-vs-buy decision on this?