Introduction

Golden Signals

  • Workload rightsizing (requests vs actuals)
    • Third-party tools (e.g. Kubecost) often use requests to apportion cost for chargeback, so inaccurate requests skew each team's share
  • Demand-based down-scaling
    • Ensure apps can handle frequent scaling: fast startup and clean shutdown
  • Cluster bin packing (requests vs allocatable)
  • Discount coverage (percentage of cluster covered by e.g. spot or sustained use discounts)
    • Elite performers use discounts heavily; they understand what is going on within their clusters
    • Don't purchase commitments before rightsizing: without usage data you could over-commit
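
The ratio-based signals above can be sketched as simple arithmetic. This is a minimal illustration, not a definitive formula: the `ClusterSample` type, its field names, and the sample numbers are hypothetical, and only CPU is considered. Demand-based down-scaling is left out because it needs a time series rather than a single snapshot.

```python
from dataclasses import dataclass


@dataclass
class ClusterSample:
    """Hypothetical point-in-time CPU totals for one cluster, in cores."""
    cpu_used: float         # cores actually consumed by workloads
    cpu_requested: float    # sum of container CPU requests
    cpu_allocatable: float  # sum of node allocatable CPU
    cpu_discounted: float   # cores on nodes covered by a discount (e.g. spot)


def ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def golden_signals(s: ClusterSample) -> dict:
    """Compute three of the signals as fractions between 0 and 1."""
    return {
        # requests vs actuals: how much of what was requested is really used
        "workload_rightsizing": ratio(s.cpu_used, s.cpu_requested),
        # requests vs allocatable: how tightly pods are packed onto nodes
        "cluster_bin_packing": ratio(s.cpu_requested, s.cpu_allocatable),
        # share of the cluster covered by discounts
        "discount_coverage": ratio(s.cpu_discounted, s.cpu_allocatable),
    }


# Made-up sample values, purely for illustration.
sample = ClusterSample(cpu_used=30.0, cpu_requested=60.0,
                       cpu_allocatable=80.0, cpu_discounted=20.0)
signals = golden_signals(sample)
for name, value in signals.items():
    print(f"{name}: {value:.0%}")
```

With these made-up numbers, half of the requested CPU is actually used, which suggests rightsizing should come before any commitment purchase.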

General Best Practices

  • At a minimum, set resource requests
  • Use labels and PodDisruptionBudgets
  • With multi-zonal clusters, watch for inter-zone egress cost
    • It can add up, so be aware of cluster and application deployment topology
    • Istio can help (e.g. locality-aware load balancing keeps traffic within a zone where possible)
  • With multi-regional failover clusters
    • Ensure failover region can warm up before scheduled failover
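
As a rough sketch of the first two practices, the snippet below builds a Deployment with resource requests and labels, plus a matching PodDisruptionBudget, as plain Python dictionaries and prints them as JSON. The function name `build_manifests`, the `team` label, the image, and all sizes are hypothetical placeholders; appropriate request values depend on measured usage.

```python
import json


def build_manifests(name, image, team, cpu="250m", memory="256Mi",
                    replicas=3, min_available=2):
    """Return (deployment, pdb) dicts; all default values are placeholders."""
    selector = {"app": name}
    # Extra labels make it easier to attribute cost to an owner.
    labels = {"app": name, "team": team}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": selector},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        # Explicit requests: the minimum recommended above.
                        "resources": {
                            "requests": {"cpu": cpu, "memory": memory},
                        },
                    }],
                },
            },
        },
    }
    pdb = {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": f"{name}-pdb", "labels": labels},
        "spec": {
            # Keep at least this many pods up during voluntary disruptions,
            # such as node drains triggered by scale-down.
            "minAvailable": min_available,
            "selector": {"matchLabels": selector},
        },
    }
    return deployment, pdb


deployment, pdb = build_manifests("checkout", "example.com/checkout:1.0",
                                  team="payments")
print(json.dumps(deployment, indent=2))
print(json.dumps(pdb, indent=2))
```

Sharing one selector between the Deployment and the PodDisruptionBudget keeps the budget attached to the same pods the Deployment manages.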

GKE Best Practices

  • Use the GKE Cost Optimization tab
  • Dashboards available at organizational level
  • If using Autopilot, it applies its own default minimum requests when none are set
    • Check before deployment: the defaults could under- or over-provision the workload
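
One way to run the pre-deployment check is to scan manifests for containers with no explicit requests, since those are the ones that would receive defaults. The sketch below is a hypothetical helper, not part of any GKE tooling: it assumes the manifest is already loaded as a Python dictionary and only handles Pods and workload kinds that keep their pod template at `spec.template` (so not CronJobs).

```python
def containers_missing_requests(manifest):
    """Return [(container_name, [missing_resources])] for one manifest dict."""
    spec = manifest.get("spec", {})
    if manifest.get("kind") == "Pod":
        pod_spec = spec
    else:
        # Deployments, StatefulSets, DaemonSets and Jobs nest the pod spec here.
        pod_spec = spec.get("template", {}).get("spec", {})

    containers = (pod_spec.get("containers", [])
                  + pod_spec.get("initContainers", []))
    missing = []
    for container in containers:
        requests = container.get("resources", {}).get("requests", {})
        absent = [r for r in ("cpu", "memory") if r not in requests]
        if absent:
            missing.append((container.get("name", "<unnamed>"), absent))
    return missing


# Illustrative manifest: one container sets only CPU, the other sets nothing.
example = {
    "kind": "Deployment",
    "spec": {"template": {"spec": {"containers": [
        {"name": "api", "resources": {"requests": {"cpu": "500m"}}},
        {"name": "sidecar"},
    ]}}},
}
for name, absent in containers_missing_requests(example):
    print(f"{name}: no explicit request for {', '.join(absent)}")
```

A check like this could run in CI so that defaults are only ever applied deliberately.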


