Introduction
- Best practices for GKE multi-tenancy with enterprise organisations
- Assume teams deploy workloads through Kubernetes API without platform team’s input
- Definitions of a tenant:
- Team responsible for 1+ workloads
- Set of related workloads
- Single workload
Networking
- Shared VPC for each cluster/environment
- In Cluster Networking folder
- Managed by central networking team
- Tenant shared VPC per environment
HA and Reliability
- One cluster admin project per cluster
- Prevents misconfigurations affecting all clusters
- Private clusters
- Disable access to nodes and manage control plane access
- Regional clusters: control plane and nodes replicated across multiple zones
- Utilise autoscaling
- Schedule maintenance windows
- Set up shared Ingress/load balancer
Security
- Network policies: deny-all for cross-namespace traffic by default
- GKE Sandbox
- User-space kernel
- Stops malicious tenants from affecting others
- Policy-based admission controls
- Prevent pods violating security policies
- Options:
- Policy Controller (based on Gatekeeper OPA): requires GKE Enterprise
- PodSecurity admission controller
- Workload Identity Federation for GKE
- Access to GCP services
- Maps Kubernetes service account names to virtual Google Cloud service account handles, which are then assigned IAM roles
- Authorized Networks
- Restrict IPs which have access to control plane
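The default-deny network policy and PodSecurity admission items above could be sketched roughly as follows. This is a minimal illustration, not a definitive setup: the namespace name `tenant-a`, the policy name `deny-cross-namespace`, and the choice of the `restricted` profile are assumptions made for the example. The script only builds manifests as Python dictionaries and prints them as JSON, which `kubectl apply -f` also accepts.

```python
import json


def deny_cross_namespace_policy(namespace: str) -> dict:
    """Build a NetworkPolicy that only allows ingress from the same namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        # The policy name is a hypothetical choice for this example.
        "metadata": {"name": "deny-cross-namespace", "namespace": namespace},
        "spec": {
            # An empty podSelector selects every pod in the namespace.
            "podSelector": {},
            "policyTypes": ["Ingress"],
            # A peer with only a podSelector matches pods in the policy's
            # own namespace, so traffic from other namespaces is dropped.
            "ingress": [{"from": [{"podSelector": {}}]}],
        },
    }


def restricted_namespace(name: str) -> dict:
    """Build a Namespace whose label makes PodSecurity admission enforce a profile."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": name,
            # Assumption: the tenant should run under the 'restricted' profile.
            "labels": {"pod-security.kubernetes.io/enforce": "restricted"},
        },
    }


if __name__ == "__main__":
    manifests = {
        "apiVersion": "v1",
        "kind": "List",
        "items": [
            restricted_namespace("tenant-a"),
            deny_cross_namespace_policy("tenant-a"),
        ],
    }
    print(json.dumps(manifests, indent=2))
```

Network policies only take effect if network policy enforcement is enabled on the cluster, and egress is left unrestricted in this sketch.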
Provisioning
- Namespace per tenant
- Tenant admin manages users within the namespace
- Standardise namespace names across environments to simplify configuration, CI/CD scripts, etc.
- Project per tenant for non-cluster resources
- Including logs, monitoring, service accounts etc.
- Kubernetes RBAC—fine-grained access to namespaces
- Create tenant-specific service account for each workload
- Improves security: tenants manage service accounts only for their own workloads
- Map to Kubernetes service accounts via Workload Identity Federation
- Create resource quota per namespace—CPU and memory
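The per-tenant provisioning steps above (resource quota, RBAC, and a workload service account mapped through Workload Identity Federation) might look something like the sketch below. All concrete names are hypothetical placeholders: `tenant-a`, `checkout`, `my-project`, the e-mail addresses, and the quota values. Binding the tenant admin to the built-in `admin` ClusterRole is one possible choice, not a requirement of the notes above.

```python
import json


def resource_quota(namespace: str, cpu: str, memory: str) -> dict:
    """Build a ResourceQuota capping total CPU and memory in a namespace."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": "tenant-quota", "namespace": namespace},
        "spec": {
            "hard": {
                "requests.cpu": cpu,
                "requests.memory": memory,
                "limits.cpu": cpu,
                "limits.memory": memory,
            }
        },
    }


def tenant_admin_binding(namespace: str, admin_email: str) -> dict:
    """Build a RoleBinding granting a user admin rights in one namespace only."""
    rbac_group = "rbac.authorization.k8s.io"
    return {
        "apiVersion": f"{rbac_group}/v1",
        "kind": "RoleBinding",
        "metadata": {"name": "tenant-admin", "namespace": namespace},
        "subjects": [{"kind": "User", "name": admin_email, "apiGroup": rbac_group}],
        # Referencing a ClusterRole from a RoleBinding scopes it to the namespace.
        "roleRef": {"kind": "ClusterRole", "name": "admin", "apiGroup": rbac_group},
    }


def workload_service_account(namespace: str, workload: str, gsa_email: str) -> dict:
    """Build a Kubernetes service account annotated with its IAM service account."""
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": workload,
            "namespace": namespace,
            "annotations": {"iam.gke.io/gcp-service-account": gsa_email},
        },
    }


def workload_identity_member(project_id: str, namespace: str, ksa_name: str) -> str:
    """Return the IAM member string that identifies a Kubernetes service account."""
    return f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]"


if __name__ == "__main__":
    items = [
        resource_quota("tenant-a", cpu="10", memory="20Gi"),
        tenant_admin_binding("tenant-a", "admin@example.com"),
        workload_service_account(
            "tenant-a", "checkout", "checkout@my-project.iam.gserviceaccount.com"
        ),
    ]
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2))
    print(workload_identity_member("my-project", "tenant-a", "checkout"))
```

The member string printed at the end is what would be granted `roles/iam.workloadIdentityUser` on the IAM service account; that IAM binding is made outside Kubernetes and is not shown here.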
Monitoring, Logging and Usage
- GKE Cost Allocation
- Cost breakdown by namespace and label
- Not supported by Autopilot
- Tenant-specific logs
- Log Router—sink to export to log bucket in tenant projects
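As a rough illustration of the tenant-specific log export above, the sketch below assembles the fields of a Log Router sink that sends one namespace's container logs to a log bucket in the tenant's project. The sink name, project ID, bucket ID, and the `global` location are hypothetical placeholders, and the function only builds the sink definition; creating it (for example through the Cloud Logging API or `gcloud logging sinks create`) is left out.

```python
import json


def tenant_log_sink(namespace: str, tenant_project: str, bucket: str,
                    location: str = "global") -> dict:
    """Build a sink definition exporting one namespace's container logs."""
    return {
        # Hypothetical naming scheme: one sink per tenant namespace.
        "name": f"{namespace}-logs",
        # Log bucket in the tenant's own project.
        "destination": (
            f"logging.googleapis.com/projects/{tenant_project}"
            f"/locations/{location}/buckets/{bucket}"
        ),
        # Only container logs from the tenant's namespace match this filter.
        "filter": (
            'resource.type="k8s_container" AND '
            f'resource.labels.namespace_name="{namespace}"'
        ),
    }


if __name__ == "__main__":
    sink = tenant_log_sink("tenant-a", "tenant-a-project", "tenant-a-logs")
    print(json.dumps(sink, indent=2))
```

The sink's writer identity also needs permission to write to the destination bucket, which this sketch does not cover.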