Packet Manipulation: The Missing Link in Network Security Nobody's Talking About
Why is SSH still around if we have so many better options? Because they're not 10x better. Solving SSO and recording sessions isn't enough to make people move. With SSH, people already get secure connections, and if they do a good job of securing their passwords, everything is fine. Passwords aren't good, but they've been around forever and people got used to them. The transition to passwordless hasn't even happened in easier places like web apps, a hint that it won't happen just because it's safer. What do you need to make the transition? A 10x better experience.
Let's look at some alternatives. Which problems exist that could be solved inside SSH? They're hard to see, because there's a whole ecosystem of tools and systems around remote access that doesn't look connected to it at first glance. To visualize it, you need to look at the next problem: what happens after you solve the secure connection problem? Authorization, automation, collaboration, reliability and security guard-rails.
What happens after SSH?
1 - Authorization: can a person perform a specific action on a given resource? This problem can be further simplified by breaking it down into reads and writes. This simplification is a game-changer as we'll see in a minute.
- For reads, the core issue boils down to limiting sensitive data access. There are no problems with people accessing data in general—we actually want to encourage this—the problem arises with sensitive data.
- Writes, on the other hand, present a more complex challenge. Sadly, security teams waste thousands of hours writing policies, but this is ultimately a failed solution. Sometimes they even push for developers to write these policies. However, at the scale of cloud computing, it is impossible to build least-access policies effectively. Instead, we should abandon traditional policies and take an intent-based approach, implementing a granular permissions system that distinguishes between different types of writes (e.g., append vs. modify vs. delete).
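To make the read/write split above concrete, here's a rough sketch of intent-based permissions for SQL traffic. Everything in it is an assumption for illustration: the `Intent` names, the keyword table, and the `ROLE_GRANTS` roles are hypothetical, and matching on the leading keyword is a stand-in for real statement parsing.

```python
import re
from enum import Enum


class Intent(Enum):
    READ = "read"
    APPEND = "append"
    MODIFY = "modify"
    DELETE = "delete"
    UNKNOWN = "unknown"


# Hypothetical mapping from the leading SQL keyword to an intent.
KEYWORD_INTENT = {
    "SELECT": Intent.READ,
    "INSERT": Intent.APPEND,
    "UPDATE": Intent.MODIFY,
    "DELETE": Intent.DELETE,
    "DROP": Intent.DELETE,
    "TRUNCATE": Intent.DELETE,
}

# Hypothetical grants: which intents each role may perform.
ROLE_GRANTS = {
    "analyst": {Intent.READ},
    "developer": {Intent.READ, Intent.APPEND},
    "oncall": {Intent.READ, Intent.APPEND, Intent.MODIFY},
}


def classify(statement: str) -> Intent:
    """Guess the intent of a statement from its first keyword."""
    match = re.match(r"\s*([A-Za-z]+)", statement)
    if not match:
        return Intent.UNKNOWN
    return KEYWORD_INTENT.get(match.group(1).upper(), Intent.UNKNOWN)


def is_allowed(role: str, statement: str) -> bool:
    """Deny by default: unknown roles and unknown intents are rejected."""
    return classify(statement) in ROLE_GRANTS.get(role, set())
```

With these made-up grants, a developer can `INSERT` but not `DELETE`, and nobody needs a hand-written policy per table. Three or four intents are a lot easier to reason about than thousands of policy lines.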
2 - Automation is crucial when dealing with resources that share the same structure, because repeated actions on them should be automated. However, automating repeated actions is a time-consuming process. CI/CD and rollout pipelines, for instance, require extensive building and maintenance. Multiply that by the different sets of technologies in use, and it becomes virtually impossible to automate everything. The complexity and diversity of technology ecosystems block automation, despite its clear benefits in principle. This gives rise to the break-glass anti-pattern, where people get emergency manual access that bypasses the normal process. As the name suggests, it's something you'd better avoid.
3 - Collaboration's critical in high-stakes environments. Someone needs to watch what I'm doing. Peer reviews or seniors checking juniors' work. I once caused a 1-hour payment outage for millions by accidentally nuking a prod Kubernetes namespace. Wrong cluster, oops. Most say: "You shouldn't have that access." Wrong answer. As a startup, we needed it. Most companies do this. It's why people stick to SSH. Don't throw requirements at people, adapt to their workflow.
Deleting the prod Kubernetes namespace wasn't a security or reliability problem, it was a collaboration problem. Why didn't someone get a notification about my execution? We need a smart system alerting relevant folks based on potential impact, using ML to gauge criticality. Why didn't my Kubernetes client warn me about deleting stuff in a high-usage environment? We need an AI-powered impact tool simulating consequences before execution, giving clear warnings. That's the ticket.
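Here's a minimal sketch of the kind of pre-execution warning I wish I'd had. It swaps the ML-based criticality scoring for plain rules, so treat it as a baseline, not the real thing. The context names in `PROD_CONTEXTS` and the verb list are hypothetical; the only real kubectl detail used is the `--context` flag.

```python
import shlex
from dataclasses import dataclass, field
from typing import List

# Hypothetical names: replace with the contexts your team treats as production.
PROD_CONTEXTS = {"prod-us", "prod-eu"}
DESTRUCTIVE_VERBS = {"delete", "drain"}


@dataclass
class Assessment:
    risky: bool
    reasons: List[str] = field(default_factory=list)


def effective_context(tokens: List[str], ambient: str) -> str:
    """An explicit --context flag overrides the ambient (current) context."""
    context = ambient
    for i, tok in enumerate(tokens):
        if tok == "--context" and i + 1 < len(tokens):
            context = tokens[i + 1]
        elif tok.startswith("--context="):
            context = tok.split("=", 1)[1]
    return context


def assess(command: str, ambient_context: str) -> Assessment:
    """Flag a command as risky when a destructive verb targets production."""
    tokens = shlex.split(command)
    context = effective_context(tokens, ambient_context)
    verbs = DESTRUCTIVE_VERBS.intersection(tokens)
    reasons = []
    if verbs:
        reasons.append("destructive verb: " + ", ".join(sorted(verbs)))
    if context in PROD_CONTEXTS:
        reasons.append("targets production context: " + context)
    return Assessment(risky=bool(verbs) and context in PROD_CONTEXTS, reasons=reasons)
```

A wrapper around the client could call `assess` before running anything, then ask for confirmation and ping a reviewer whenever `risky` is true. The check is deliberately conservative: any token that matches a destructive verb counts.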
4 - Guard-rails are a pain to build, so teams default to: deny access. How many times have I seen access-tightening crusades kick off after someone fat-fingers a bad command in prod and takes systems down? Can only imagine the shitstorm inside Google Cloud after that recent manual storage object deletion fiasco.
There's all this high-minded talk about blameless culture and how we shouldn't go down these paths, but everybody's got a plan 'til they get punched in the face. Reality check: most companies won't do blameless. Circles back to the automation cost thing - it's cheaper to yank access than build out all the fancy automations.
Packet Manipulation: A New Hope
First we need to flip our thinking on secure connections. We've got all the tools we need to crack these problems inside these pipes. We can squeeze insights and patterns out of encrypted traffic without compromising security. Only one thing's missing: the ability to tap into it without breaking a sweat. We need a unified API that lets devs access and manipulate connection data across different protocols and security layers. That's the missing piece.
Packet manipulation's been a headache for ages, and there's a good reason it hasn't been solved yet: it's devilishly complex. Trying to understand network packets, let alone modify them, is like herding cats. And the protocol situation is a nightmare - the secure access world alone has MySQL, Postgres, Kubernetes, SSH, and HTTP/REST, to name a few. But here's the kicker: building on the amazing work of protocol-agnostic packet analysis tools, folks are building a universal packet manipulation library that works with any protocol - just a clean, consistent API across the board. It's the holy grail of network engineering, potentially making cross-protocol development a walk in the park.
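To give a feel for what "one API across protocols" could look like, here's a small sketch that pulls the SQL text out of two different wire formats behind a single function. This is not the API of any existing library: `DECODERS` and `extract_command` are hypothetical names. The frame layouts follow the PostgreSQL simple Query message and the classic MySQL COM_QUERY packet, and the sketch assumes unencrypted, complete frames with no MySQL query attributes.

```python
import struct
from typing import Callable, Dict, Optional


def decode_postgres(frame: bytes) -> Optional[str]:
    """Simple Query: b'Q', int32 length (counts itself, not the tag), NUL-terminated SQL."""
    if len(frame) < 5 or frame[0:1] != b"Q":
        return None
    (length,) = struct.unpack("!I", frame[1:5])
    return frame[5:1 + length].rstrip(b"\x00").decode("utf-8")


def decode_mysql(frame: bytes) -> Optional[str]:
    """COM_QUERY: 3-byte little-endian payload length, 1-byte sequence id, 0x03, SQL text."""
    if len(frame) < 5:
        return None
    length = int.from_bytes(frame[0:3], "little")
    payload = frame[4:4 + length]
    if payload[:1] != b"\x03":
        return None
    return payload[1:].decode("utf-8")


# Hypothetical registry: supporting a new protocol means adding one decoder here.
DECODERS: Dict[str, Callable[[bytes], Optional[str]]] = {
    "postgres": decode_postgres,
    "mysql": decode_mysql,
}


def extract_command(protocol: str, frame: bytes) -> Optional[str]:
    """Return the command text carried by a frame, or None if it is not a query."""
    decoder = DECODERS.get(protocol)
    return decoder(frame) if decoder else None
```

Callers only ever see `extract_command`, so whatever sits on top (masking, review, alerting) doesn't care which database is on the other end.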
Why bother?
1 - Middlewares are the unsung heroes of network infrastructure, but we've barely scratched the surface of their potential. Picture this: a "benevolent MITM" system that beefs up security and functionality without breaking encryption. We're talking about slapping custom logic right onto the network layer, similar to what API gateways and reverse proxies do for HTTP. But here's the million-dollar question: why don't we have this for other protocols?
It's high time we started designing protocol-specific middleware for the bread-and-butter infrastructure protocols like SSH, RDP, and SFTP. Imagine implementing policy enforcement at the middleware layer. And the holy grail? An access gateway that can apply all those nifty API gateway features - rate limiting, authentication, you name it - to any infrastructure access protocol under the sun.
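As a rough illustration of API-gateway features applied to an arbitrary protocol, here's a sketch of a middleware chain with per-user rate limiting. The `Handler` and `Middleware` shapes, `build_gateway`, and the `RateLimited` exception are all hypothetical names for this example. The limiter is a standard token bucket, and payloads are treated as opaque bytes so the same chain could sit in front of SSH, RDP, or SFTP traffic.

```python
import time
from typing import Callable, Dict, List, Tuple

Handler = Callable[[str, bytes], bytes]    # (user, payload) -> response
Middleware = Callable[[Handler], Handler]  # wraps one handler in another


class RateLimited(Exception):
    """Raised when a user has no tokens left."""


def rate_limit(rate: float, capacity: int,
               clock: Callable[[], float] = time.monotonic) -> Middleware:
    """Token bucket per user: refills `rate` tokens per second, holds at most `capacity`."""
    buckets: Dict[str, Tuple[float, float]] = {}  # user -> (tokens, last refill time)

    def middleware(next_handler: Handler) -> Handler:
        def handler(user: str, payload: bytes) -> bytes:
            now = clock()
            tokens, last = buckets.get(user, (float(capacity), now))
            tokens = min(capacity, tokens + (now - last) * rate)
            if tokens < 1:
                buckets[user] = (tokens, now)
                raise RateLimited(user)
            buckets[user] = (tokens - 1, now)
            return next_handler(user, payload)
        return handler
    return middleware


def build_gateway(upstream: Handler, middlewares: List[Middleware]) -> Handler:
    """Compose middlewares so the first one in the list runs first."""
    handler = upstream
    for mw in reversed(middlewares):
        handler = mw(handler)
    return handler
```

Authentication, policy checks, or audit logging would just be more entries in the `middlewares` list. The clock is injectable mostly so the limiter can be tested without sleeping.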
2 - Instrumentation is the secret sauce of site reliability for applications, but we're dropping the ball when it comes to infrastructure connections. It's like we've got this killer recipe and we're only using half the ingredients. Why aren't we bringing this concept to the nuts and bolts of our systems?
Visibility and actionability are game changers. We're talking about developing real-time monitoring and alerting for infrastructure connections - imagine catching issues before they blow up in your face. And let's not forget about creating dashboards that actually make sense of all those connection metrics and patterns.
It's like giving your ops team x-ray vision into the guts of your infrastructure. No more playing whack-a-mole with issues. We've got the tech, we've got the know-how - we just need to pull our heads out of the sand and apply these principles across the board. Do this right, and you'll be running your infrastructure like a well-oiled machine, instead of constantly putting out fires.
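A bare-bones sketch of what instrumenting infrastructure connections might look like: count connections, bytes, and errors per protocol, and flag a protocol when its error rate crosses a threshold. The class and method names are hypothetical and the 20% default threshold is arbitrary. A real setup would export these numbers to whatever metrics system you already run.

```python
from collections import defaultdict
from dataclasses import asdict, dataclass


@dataclass
class ProtocolStats:
    connections: int = 0
    errors: int = 0
    bytes_in: int = 0
    bytes_out: int = 0


class ConnectionMetrics:
    """In-memory counters, keyed by protocol name."""

    def __init__(self) -> None:
        self._stats = defaultdict(ProtocolStats)

    def record(self, protocol: str, bytes_in: int, bytes_out: int, ok: bool = True) -> None:
        """Record one finished connection."""
        stats = self._stats[protocol]
        stats.connections += 1
        stats.bytes_in += bytes_in
        stats.bytes_out += bytes_out
        if not ok:
            stats.errors += 1

    def error_rate(self, protocol: str) -> float:
        """Fraction of failed connections; 0.0 for protocols never seen."""
        stats = self._stats.get(protocol)
        if stats is None or stats.connections == 0:
            return 0.0
        return stats.errors / stats.connections

    def should_alert(self, protocol: str, max_error_rate: float = 0.2) -> bool:
        return self.error_rate(protocol) > max_error_rate

    def snapshot(self) -> dict:
        """Plain-dict view, ready to feed a dashboard or exporter."""
        return {name: asdict(stats) for name, stats in self._stats.items()}
```

Calling `record` from the proxy each time a session closes is enough to get the first dashboard and the first alert.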
3 - Extensibility is the game changer. Here's the deal: we need an API that makes adding new middlewares easy. The key is to keep it simple enough that any developer can roll out of bed, chug a coffee, and create a new middleware without wanting to throw their laptop out the window. It should be a plug-and-play situation - once you've got the basics for reading and tweaking packets, everything else just falls into place like dominoes.
4 - Artificial Intelligence isn't just hype - it's a game-changer for network infrastructure. Once we've got our protocols implemented, what we're really dealing with is just text data. And guess what AI excels at? Chewing through text like a beaver on steroids. We're sitting on a goldmine of potential.
Imagine implementing AI-driven anomaly detection for connection patterns. It's like having a hyper-intelligent guard dog that not only barks at intruders but tells you exactly what kind of shifty business they're up to. We're talking about catching weird things in your network before they become a full-blown clusterfuck.
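For a sense of the simplest possible starting point, here's a sketch that flags an unusual connection count against recent history. To be clear, this is a plain z-score baseline, not a learned model: it's the kind of check you'd want in place before reaching for anything smarter. The function name and the threshold of 3 standard deviations are assumptions.

```python
from statistics import mean, pstdev
from typing import Sequence


def is_anomalous(history: Sequence[float], value: float, threshold: float = 3.0) -> bool:
    """Compare a new observation against a baseline window of past counts.

    Returns True when `value` is more than `threshold` standard deviations
    away from the mean of `history`. With fewer than two past points there
    is no baseline, so nothing is flagged.
    """
    if len(history) < 2:
        return False
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # Perfectly flat history: any change at all is worth a look.
        return value != mu
    return abs(value - mu) / sigma > threshold
```

The new value is kept out of the baseline on purpose. If you fold a huge spike into the mean and deviation it's being judged against, it can hide itself.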
And let's not stop there. How about developing natural language interfaces for infrastructure management? Picture this: instead of wrestling with arcane command-line instructions, you're chatting with your infrastructure like it's your best friend. "Hey, what's the deal with these SSH connections?"
This isn't just about making things easier - it's about unlocking a whole new level of infrastructure management. We're talking about systems that don't just respond to commands, but actually understand context, predict issues, and suggest optimizations. It's like giving your infrastructure a brain transplant and upgrading it from a flip phone to a quantum computer.
The AI revolution is here, and it's time we stopped using it just for chatbots and started applying it to the backbone of our digital world. Get this right, and we'll be running networks so smart they make Skynet look like a pocket calculator.
Real-world Applications
We've got real-world examples that are knocking it out of the park. Let's break this down:
- Command and Query Copilot. This AI system doesn't just understand protocol-specific commands and configs - it generates them across multiple languages and systems. It's like a universal translator for your infrastructure, making cross-platform management a breeze.
- AI data masking is solving the read authorization problem with a plug-and-play solution. This system dynamically adjusts masking levels based on the user's context and authorization. It's like having a bouncer that knows exactly how much to let each person see, no manual tweaking required.
- The review of critical commands is a game-changer for automation. CI/CD rollouts become nearly invisible, working out of the box for any tech stack. We're talking about a universal adapter system that integrates the review process seamlessly with any toolchain.
- Runbooks are getting a major upgrade. Imagine running a versioned script against any piece of infrastructure with a simple REST API. We've got AI-powered runbook generators creating and updating based on system logs and operator actions. And with low-code/no-code interfaces, we're slashing the cost and complexity of building robust automations. It's like giving everyone in your org the power of a senior DevOps engineer.
- The Access IDE is a game-changer built from the ground up for these new use cases. We're talking about a collaborative, cloud-based IDE where multiple team members can work on infrastructure code, with built-in security checks and real-time impact analysis. It's like Google Docs met your infrastructure and had a super-smart baby.
- Collaborative troubleshooting enables real-time collaboration during incident response across multiple protocols.
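To ground the data masking item in the list above, here's a minimal sketch of context-dependent masking applied to query results on their way back to the user. It uses plain regular expressions instead of an AI model, so it only catches the two patterns it knows about. The role names and the `UNMASKED_ROLES` set are hypothetical.

```python
import re

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# 13 to 16 digits, optionally separated by single spaces or dashes.
CARD = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")

# Hypothetical: roles allowed to see raw values.
UNMASKED_ROLES = {"admin"}


def _mask_email(match: re.Match) -> str:
    """Hide the local part, keep the domain."""
    return "***@" + match.group().split("@", 1)[1]


def _mask_card(match: re.Match) -> str:
    """Keep only the last four digits."""
    digits = re.sub(r"\D", "", match.group())
    return "*" * (len(digits) - 4) + digits[-4:]


def mask(text: str, role: str) -> str:
    """Mask sensitive values in response text unless the role is trusted."""
    if role in UNMASKED_ROLES:
        return text
    text = EMAIL.sub(_mask_email, text)
    return CARD.sub(_mask_card, text)
```

Because this runs on the response stream, the database and the client stay untouched. That's what makes the read-authorization fix plug-and-play.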
These are real solutions solving real problems right now. We're not just talking about incremental improvements; this is a paradigm shift in how we manage and optimize our infrastructure. The future isn't just coming - it's here, and it's time we all got on board.