Kubernetes doesn't save money automatically
The pitch for Kubernetes includes, somewhere, the idea that bin-packing workloads onto shared compute will reduce your cloud bill. At sufficient scale, this is true. But "sufficient scale" is higher than most teams expect, and on the way there, Kubernetes introduces its own set of cost patterns that are easy to miss until they're large.
Here are the ones we find consistently when we do FinOps audits on EKS and AKS clusters.
The control plane isn't free
EKS charges $0.10 per hour per cluster — around $73 per month — just for the control plane, before a single workload runs. This is easy to absorb for a production cluster. It becomes significant when you have a proliferation of clusters: dev, staging, pre-prod, and per-team environments.
A team that creates six clusters "just for isolation" is spending roughly $440/month on control planes alone, before any nodes are added. Namespaces with RBAC provide workload isolation within a cluster at no additional cost. This is the right default for most non-production environments.
AKS is priced differently: the Free tier has no control plane charge, while the Standard tier bills a per-cluster hourly fee. On either tier, as on EKS, you pay for node pool VMs for as long as they run, regardless of utilisation, so idle dev clusters on AKS still cost money.
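To put numbers on cluster sprawl, here is a minimal sketch of the arithmetic. It assumes the $0.10 hourly rate quoted above and a 730-hour average month; the function name is ours, for illustration only.

```python
HOURS_PER_MONTH = 730  # assumption: average month (8,760 hours / 12)
EKS_CONTROL_PLANE_USD_PER_HOUR = 0.10  # rate quoted in the text


def control_plane_monthly_cost(cluster_count: int) -> float:
    """Monthly control plane spend in USD, before any nodes are added."""
    hourly = cluster_count * EKS_CONTROL_PLANE_USD_PER_HOUR
    return round(hourly * HOURS_PER_MONTH, 2)


for count in (1, 6):
    print(f"{count} cluster(s): ${control_plane_monthly_cost(count):.2f}/month")
```

Six clusters come to $438/month under these assumptions, which is where the "roughly $440" figure comes from.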
Node groups that never scale down
The Cluster Autoscaler adds nodes automatically when pods can't be scheduled. It does not always remove them as aggressively.
On both EKS and AKS, the Cluster Autoscaler does scale down, but only when a node has been underutilised for a sustained period (default: 10 minutes) and only when the pods on it can be safely evicted. Pods with local storage, pods covered by an overly conservative `PodDisruptionBudget`, pods not managed by a controller such as a Deployment, and pods annotated `cluster-autoscaler.kubernetes.io/safe-to-evict: "false"` can all prevent scale-down.
What we often find: a cluster that scaled up to handle a load spike six weeks ago and never fully scaled back down. The node count is stable, utilisation is low, and the team hasn't noticed because everything is "working."
Fix: audit your PodDisruptionBudgets, ensure non-critical workloads have appropriate disruption tolerance, and verify the Cluster Autoscaler logs actually show scale-down events happening.
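One way to start the PodDisruptionBudget audit is to list the budgets that currently allow zero disruptions, since those are the ones that stop a node from being drained. The sketch below works on data shaped like the output of `kubectl get pdb --all-namespaces -o json`; the sample namespaces and names are hypothetical, and the function name is ours.

```python
def blocking_pdbs(pdb_list: dict) -> list[str]:
    """Return namespace/name for each PDB that currently allows zero disruptions."""
    blocked = []
    for item in pdb_list.get("items", []):
        # status.disruptionsAllowed is how many pods may be evicted right now.
        # A missing status is treated as zero, so the PDB gets flagged for review.
        allowed = item.get("status", {}).get("disruptionsAllowed", 0)
        if allowed == 0:
            meta = item["metadata"]
            blocked.append(f"{meta['namespace']}/{meta['name']}")
    return blocked


# Hypothetical sample in the shape of `kubectl get pdb --all-namespaces -o json`.
SAMPLE_PDBS = {
    "items": [
        {"metadata": {"namespace": "payments", "name": "api"},
         "spec": {"minAvailable": 3},
         "status": {"currentHealthy": 3, "disruptionsAllowed": 0}},
        {"metadata": {"namespace": "batch", "name": "worker"},
         "spec": {"maxUnavailable": 1},
         "status": {"currentHealthy": 5, "disruptionsAllowed": 1}},
    ]
}

print(blocking_pdbs(SAMPLE_PDBS))
```

Against a live cluster, the same function could be fed `json.loads` of the real kubectl output. A PDB that shows up here is not necessarily wrong, but each one deserves a look: `minAvailable` equal to the replica count means no pod can ever be evicted voluntarily.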
Resource requests set too high
Kubernetes schedules pods based on resource requests, not actual usage. If your deployments request 2 CPU and 4Gi memory but routinely use 0.3 CPU and 800Mi, you are paying for the requested resources while only consuming a fraction of them.
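As a rough illustration of the gap, the sketch below computes the share of a request that is reserved but unused, using the example figures from the paragraph above. The function name is ours, and memory is expressed in Mi (4Gi = 4096Mi).

```python
def unused_fraction(requested: float, used: float) -> float:
    """Share of a resource request that is reserved but not consumed."""
    if requested <= 0:
        raise ValueError("requested must be positive")
    # Clamp at zero: a pod using more than it requested has no unused share.
    return max(0.0, 1 - used / requested)


cpu_unused = unused_fraction(requested=2.0, used=0.3)      # CPU cores
memory_unused = unused_fraction(requested=4096, used=800)  # Mi

print(f"CPU unused: {cpu_unused:.0%}, memory unused: {memory_unused:.0%}")
```

For the example deployment, about 85% of the requested CPU and about 80% of the requested memory is capacity the scheduler has set aside but the workload never touches.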
This happens because engineers (reasonably) set requests high to avoid OOMKills and CPU throttling. The correct fix is not to lower requests blindly — it's to right-size them based on observed usage data.
Vertical Pod Autoscaler (VPA) in recommendation mode (`updateMode: "Off"`) reports what your pods actually use over time without making any changes. Run it for two weeks, review the recommendations, and you'll have data to set appropriate requests. In our experience, this exercise typically shows that 20–35% of cluster capacity is wasted.
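A recommendation-only VPA object is small. The sketch below builds one as a Python dictionary and prints it as JSON, which `kubectl apply -f -` accepts. It assumes the VPA components are already installed in the cluster (they are an add-on, not part of core Kubernetes); the Deployment name `checkout` and namespace `shop` are hypothetical.

```python
import json


def vpa_recommendation_manifest(deployment: str, namespace: str) -> dict:
    """Build a VPA object that only records recommendations for a Deployment."""
    return {
        "apiVersion": "autoscaling.k8s.io/v1",
        "kind": "VerticalPodAutoscaler",
        "metadata": {"name": f"{deployment}-vpa", "namespace": namespace},
        "spec": {
            "targetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            # "Off" means: compute recommendations, never evict or resize pods.
            "updatePolicy": {"updateMode": "Off"},
        },
    }


# Hypothetical workload names, for illustration only.
print(json.dumps(vpa_recommendation_manifest("checkout", "shop"), indent=2))
```

Once the object has been in place for a while, `kubectl describe vpa checkout-vpa -n shop` shows the recommended requests per container.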
Orphaned persistent volumes
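When a PersistentVolumeClaim is deleted, or when a namespace is torn down, the underlying cloud storage (an EBS volume on AWS, an Azure Disk or Azure Files share on AKS) is not always deleted with it. The reclaim policy decides what happens. Dynamically provisioned volumes inherit it from their StorageClass, where the default is `Delete`, but a StorageClass set to `Retain`, or a manually created PersistentVolume (which defaults to `Retain`), leaves the disk behind, and those disks accumulate.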
A single orphaned EBS gp3 volume at 100Gi costs about $8/month. A cluster that has been running for a year with regular deployments and teardowns can accumulate dozens of these. We've found clusters with $300–500/month in orphaned volumes that no workload was using.
Fix: run a monthly audit. `kubectl get pv | grep Released` shows volumes that are no longer bound to a claim. Cross-reference with your cloud provider's disk inventory to find any that aren't showing up in Kubernetes at all.
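For the cross-referencing step, it helps to pull the cloud volume ID out of each released volume rather than just its Kubernetes name. The sketch below does that for data shaped like `kubectl get pv -o json`. It assumes CSI-provisioned volumes, where the cloud ID is in `spec.csi.volumeHandle`; the sample volume names and IDs are made up.

```python
def released_volumes(pv_list: dict) -> list[dict]:
    """Summarise PersistentVolumes whose claim is gone (phase == "Released")."""
    released = []
    for pv in pv_list.get("items", []):
        if pv.get("status", {}).get("phase") != "Released":
            continue
        spec = pv.get("spec", {})
        released.append({
            "name": pv["metadata"]["name"],
            "capacity": spec.get("capacity", {}).get("storage"),
            "reclaim_policy": spec.get("persistentVolumeReclaimPolicy"),
            # For CSI volumes this is the cloud-side ID (e.g. an EBS volume ID).
            "volume_handle": spec.get("csi", {}).get("volumeHandle"),
        })
    return released


# Hypothetical sample in the shape of `kubectl get pv -o json`.
SAMPLE_PVS = {
    "items": [
        {"metadata": {"name": "pvc-aaa"},
         "spec": {"capacity": {"storage": "100Gi"},
                  "persistentVolumeReclaimPolicy": "Retain",
                  "csi": {"volumeHandle": "vol-0123456789abcdef0"}},
         "status": {"phase": "Released"}},
        {"metadata": {"name": "pvc-bbb"},
         "spec": {"capacity": {"storage": "20Gi"},
                  "persistentVolumeReclaimPolicy": "Delete",
                  "csi": {"volumeHandle": "vol-0fedcba9876543210"}},
         "status": {"phase": "Bound"}},
    ]
}

for volume in released_volumes(SAMPLE_PVS):
    print(volume)
```

The resulting volume handles can then be compared with the provider's disk inventory. Disks that exist on the cloud side but match no PersistentVolume at all are the ones Kubernetes has completely lost track of.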
Data transfer costs
This is the one that surprises teams the most because it doesn't show up in compute line items.
Kubernetes pods communicating across availability zones generate cross-AZ data transfer charges — $0.01/GB on AWS, each direction. At low traffic this is negligible. At scale, a microservices architecture where services routinely call other services across AZs can generate thousands of dollars per month in transfer costs that appear as a single line item in your bill.
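A back-of-the-envelope estimate shows how quickly this adds up. The sketch below assumes the $0.01/GB rate quoted above, charged once in each direction; the traffic volume is a hypothetical example and the function name is ours.

```python
CROSS_AZ_USD_PER_GB = 0.01  # rate quoted in the text, charged in each direction


def cross_az_monthly_cost(gb_per_month: float) -> float:
    """Estimated monthly charge in USD for traffic crossing an AZ boundary."""
    # Each GB is billed twice: once leaving one AZ, once entering the other.
    return round(gb_per_month * CROSS_AZ_USD_PER_GB * 2, 2)


# Hypothetical volume: 50 TB of service-to-service traffic per month.
print(cross_az_monthly_cost(50_000))
```

At 50 TB of cross-AZ traffic per month, the estimate is $1,000/month for traffic that never leaves the region.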
Topology-aware routing (introduced as alpha "topology-aware hints" in Kubernetes 1.21 and enabled by default since 1.23) prefers routing traffic to endpoints in the same availability zone. It is opt-in per Service, and enabling it for high-traffic internal services can meaningfully reduce cross-AZ transfer. On EKS, also audit your VPC endpoints: traffic from private subnets to AWS services (S3, DynamoDB, ECR) that goes out through a NAT gateway rather than through a VPC endpoint typically pays data processing charges it doesn't need to.
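Topology-aware routing is switched on per Service with an annotation, and the annotation name changed between releases. The sketch below builds a patch body for either case, assuming the upstream annotation names: `service.kubernetes.io/topology-mode` from Kubernetes 1.27 and `service.kubernetes.io/topology-aware-hints` before that. The function names are ours.

```python
import json


def topology_annotation(minor_version: int) -> dict[str, str]:
    """Pick the topology-aware routing annotation for a 1.<minor_version> cluster."""
    if minor_version >= 27:
        return {"service.kubernetes.io/topology-mode": "Auto"}
    # Older clusters use the original "hints" annotation.
    return {"service.kubernetes.io/topology-aware-hints": "auto"}


def service_patch(minor_version: int) -> dict:
    """Patch body that adds the annotation to an existing Service."""
    return {"metadata": {"annotations": topology_annotation(minor_version)}}


print(json.dumps(service_patch(29)))
```

The printed JSON can be passed to `kubectl patch service <name> -p '<json>'`. Kubernetes may still fall back to cluster-wide routing when a zone has too few endpoints, so it is worth confirming the effect on the transfer line item rather than assuming it.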
The right approach to Kubernetes cost management
These problems share a root cause: Kubernetes abstracts resource consumption, which makes it easy to forget that the underlying infrastructure still has a cost model.
The fix is to make costs visible. Tag your cloud resources (clusters, node groups) and label your Kubernetes objects (namespaces, workloads) so you can allocate spend to teams and workloads. Tools like Kubecost (open-source tier is useful) or the native cost visibility features in EKS/AKS can give you workload-level cost attribution. Once teams can see what their workloads cost, they tend to right-size them.
Cost optimisation in Kubernetes isn't a one-time event. It's a recurring practice — which is why we treat it as an ongoing part of managed infrastructure work, not a project with an end date.