The moment your infrastructure team starts spinning up cloud data platforms across environments, the once-clean boundaries between compute, storage, and identity turn into spaghetti. You need a way to define and manage Databricks workspaces like any other cloud resource. That is where Crossplane Databricks comes in.
Crossplane manages infrastructure as code, but rather than relying on Terraform modules or CLI calls, it runs inside Kubernetes as a set of controllers that continuously reconcile declared state. Think of it as a universal control plane that speaks cloud APIs while respecting policy and identity boundaries. Databricks, on the other hand, is a data engineering powerhouse built for collaborative analytics and machine learning. Combine them, and you get repeatable workspace automation baked directly into your cluster orchestration layer.
Connecting Crossplane to Databricks lets you declare a Databricks workspace, users, and tokens alongside your cloud storage and network config, all version-controlled. You can define how jobs and clusters should be provisioned, enforce permissions through Kubernetes RBAC, and treat workspace definitions like any other manifest. Once applied, Crossplane spins up Databricks environments exactly as specified, with credentials stored securely through Kubernetes secrets or external vaults.
Here is how the integration usually works. Crossplane uses providers to talk to cloud APIs. The Crossplane Databricks provider interfaces with the Databricks REST API so you can manage users, tokens, and clusters declaratively. Authentication is handled using OIDC or service principals from systems like Okta, Azure AD (now Microsoft Entra ID), or AWS IAM. Each configuration file becomes an auditable, repeatable piece of infrastructure that lives with your application code. If someone asks how that workspace was built, you have a clear manifest to show.
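To make the "clear manifest" idea concrete, here is a minimal sketch of a cluster definition built in Python and printed as JSON, which `kubectl apply -f` accepts alongside YAML. The `spec.forProvider` and `spec.providerConfigRef` layout follows the usual Crossplane managed-resource convention, but the API group, kind, and field names below are assumptions for illustration, not the schema of any particular Databricks provider. Check the CRDs your provider installs before using any of them.

```python
import json


def build_cluster_manifest(name, provider_config, spark_version, node_type, workers):
    """Return a Crossplane-style managed resource as a plain dict."""
    return {
        # ASSUMPTION: placeholder API group, version, and kind. Replace them
        # with whatever your installed provider actually registers.
        "apiVersion": "databricks.example.org/v1alpha1",
        "kind": "Cluster",
        "metadata": {"name": name},
        "spec": {
            # ASSUMPTION: field names inside forProvider are hypothetical.
            "forProvider": {
                "clusterName": name,
                "sparkVersion": spark_version,
                "nodeTypeId": node_type,
                "numWorkers": workers,
            },
            # Points at the ProviderConfig that holds the credentials reference.
            "providerConfigRef": {"name": provider_config},
        },
    }


if __name__ == "__main__":
    manifest = build_cluster_manifest(
        name="analytics-dev",
        provider_config="databricks-dev",
        spark_version="14.3.x-scala2.12",
        node_type="i3.xlarge",
        workers=2,
    )
    # Commit this output to version control next to your application code.
    print(json.dumps(manifest, indent=2))
```

Generating manifests from a small function like this keeps per-environment differences (names, node sizes, provider configs) in one reviewable place, while the applied artifact stays a plain declarative document.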
Common friction points are identity synchronization and secret rotation. Make sure your configuration points to short-lived credentials, ideally issued via an external identity provider. Automate token expiry and reissue on a schedule, for example with Kubernetes CronJobs. Lock down roles so your controllers have only the minimal permissions needed to create, list, and tag resources. Fail fast on authentication errors instead of retrying endlessly. It keeps logs clean and alerts meaningful.
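The fail-fast advice can be sketched as a small retry wrapper. This is a minimal illustration, not how any provider implements it: the `request` callable, the exception names, and the choice of which HTTP status codes count as authentication failures (401, 403) versus transient failures (429 and common 5xx codes) are all assumptions made for the example.

```python
import time


class AuthError(Exception):
    """Raised immediately when credentials are rejected."""


class TransientError(Exception):
    """Raised when retryable failures exhaust the attempt budget."""


# ASSUMPTION: this status classification is illustrative, not provider-defined.
FAIL_FAST_STATUSES = {401, 403}
RETRYABLE_STATUSES = {429, 500, 502, 503, 504}


def call_with_fail_fast(request, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call request() -> (status, body) with bounded, selective retries.

    Authentication failures raise at once; transient failures are retried
    with exponential backoff up to max_attempts; anything else is returned
    to the caller unchanged.
    """
    for attempt in range(1, max_attempts + 1):
        status, body = request()
        if status in FAIL_FAST_STATUSES:
            # Retrying with the same bad credentials only adds log noise.
            raise AuthError(f"authentication failed with HTTP {status}")
        if status in RETRYABLE_STATUSES:
            if attempt == max_attempts:
                raise TransientError(f"gave up after {attempt} attempts (HTTP {status})")
            # Backoff doubles each time: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
            continue
        return status, body
```

Passing `sleep` as a parameter makes the backoff easy to test without real delays. A rejected credential then surfaces as a single `AuthError` that an alert can key on, rather than a stream of repeated failures.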
Top results you will see after wiring this up: