You finally get Vertex AI to train a model the way you want, but now you need it running inside your k3s cluster. Then the permissions nightmare begins. Identity tokens, secure endpoints, flaky service accounts. It feels less like automation and more like plumbing.
Google’s Vertex AI handles model training and prediction serving as managed services. k3s brings lightweight Kubernetes to almost any node. Connect the two and you get on-demand inference at the edge without paying the full data-center tax. The trick is wiring access and workloads cleanly so both sides trust each other.
The workflow starts with identity. Vertex AI must call your k3s services securely, so map its Google service account to a role in your cluster’s RBAC and use OIDC to federate identity between Google Cloud and k3s. Then apply network policies that restrict which pods can reach Vertex endpoints. Keep credentials short-lived and rotate secrets automatically. Once your cluster and Vertex AI trust each other’s token signatures, the rest becomes straightforward: scheduled retraining jobs in Vertex push newly built models directly into your k3s deployment pipeline.
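To make the trust check concrete, here is a minimal sketch of the receiving side in Python, assuming Vertex AI workloads call your k3s service with a Google-signed OIDC ID token in the `Authorization` header. The audience URL and service account email are placeholders, and the verification leans on the google-auth library, which checks the signature, issuer, expiry, and audience for you.

```python
# Minimal sketch: validate a Google-issued ID token before serving a request.
# AUDIENCE and ALLOWED_SA are assumptions; substitute your own values.
from google.auth.transport import requests as google_requests
from google.oauth2 import id_token

AUDIENCE = "https://inference.k3s.example.internal"  # hypothetical k3s ingress URL
ALLOWED_SA = "vertex-jobs@my-project.iam.gserviceaccount.com"  # hypothetical SA

def verify_vertex_caller(bearer_token: str) -> dict:
    """Validate a Google-signed ID token and pin it to one service account."""
    claims = id_token.verify_oauth2_token(
        bearer_token, google_requests.Request(), audience=AUDIENCE
    )
    # verify_oauth2_token already rejects bad signatures, wrong issuers,
    # expired tokens, and mismatched audiences; we add a caller check on top.
    if claims.get("email") != ALLOWED_SA:
        raise PermissionError(f"unexpected caller: {claims.get('email')}")
    return claims
```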
Treat CI/CD as the junction. Model versions flow through containers tagged by the model artifact’s SHA rather than a mutable version number. Vertex AI handles the learning logic, k3s deploys updated containers, and the pipeline enforces immutable artifacts. That’s how you move from “hope it works” to deterministic infrastructure.
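A sketch of what deterministic tagging can look like in the pipeline, assuming the exported model is a single artifact file; the registry name and file path are hypothetical.

```python
# Derive an immutable container tag from the exact bytes of the model artifact.
import hashlib
import pathlib

REGISTRY = "registry.example.internal/inference"  # hypothetical private registry

def model_sha(artifact: pathlib.Path) -> str:
    """Hash the model artifact so the tag is bound to its exact contents."""
    digest = hashlib.sha256()
    with artifact.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()[:12]

def image_tag(artifact: pathlib.Path) -> str:
    return f"{REGISTRY}:model-{model_sha(artifact)}"

# Prints something like registry.example.internal/inference:model-3f9d2ab417c0
print(image_tag(pathlib.Path("exported/model.bin")))
```

Because the tag is a pure function of the artifact bytes, rebuilding the same model always yields the same tag, and a changed model can never silently reuse an old one.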
Common issues usually boil down to misaligned tokens or expired keys. If predictions stall, check whether your k3s node clocks match Google’s time service; JWT expiration is brutal when clocks drift. Always test your identity federation before scaling out nodes; nothing slows a deployment like one worker stuck waiting for a credential refresh.
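One quick way to spot drift is to compare a freshly minted token’s issued-at claim against the local clock. This sketch decodes the payload without verifying the signature, purely for diagnostics; the token file path and the 30-second threshold are assumptions.

```python
# Diagnostic only: decode a JWT payload (no signature check) and compare
# the issuer's clock (iat claim) against this node's clock.
import base64
import json
import time

def clock_skew_seconds(jwt: str) -> float:
    payload_b64 = jwt.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    # iat was stamped by the token issuer; a large gap means local drift.
    return time.time() - claims["iat"]

skew = clock_skew_seconds(open("token.jwt").read().strip())  # hypothetical path
print(f"local clock is {skew:+.1f}s relative to token issuance")
if abs(skew) > 30:
    print("warning: drift this large can make fresh tokens look expired or not yet valid")
```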