Kubernetes Cost Optimization on AWS EKS: Reduce Infrastructure Costs by 66% with Proven Strategies
Kubernetes cost optimization on AWS EKS delivers 50-70% infrastructure cost reductions through systematic right-sizing, spot instance strategies, intelligent autoscaling, and comprehensive cost allocation. This guide explores proven optimization techniques including a real case study where we reduced a production EKS cluster from $414/month to $138/month—a 66% cost savings.
For organizations running containerized workloads on Amazon EKS, infrastructure costs can quickly spiral out of control without proper optimization strategies. The combination of over-provisioned nodes, inefficient resource allocation, and lack of cost visibility leads to an average of 60% waste in Kubernetes infrastructure spending.
The EKS Cost Optimization Opportunity
The Hidden Cost of Unoptimized Kubernetes
Common Cost Drivers in EKS Clusters:
Over-Provisioned Worker Nodes (40% of waste):
- Nodes sized for peak load but running at 30-40% average utilization
- Fixed node groups without auto-scaling creating permanent over-capacity
- Lack of right-sizing based on actual workload requirements
- “Just in case” capacity planning leading to resource waste
Inefficient Resource Requests (25% of waste):
- Pod resource requests set too high “to be safe”
- No resource limits allowing pods to consume excessive resources
- Lack of Vertical Pod Autoscaler leading to static, inefficient allocations
- Copy-paste resource specifications without measurement
Lack of Cost Visibility (20% of waste):
- No cost allocation by team, application, or environment
- Inability to track cost trends and identify expensive workloads
- Missing cost anomaly detection and alerting
- No accountability for infrastructure spending
Underutilization of Cost-Saving Features (15% of waste):
- Not using Spot instances (70% cost savings opportunity)
- Missing Savings Plans or Reserved Instances for stable workloads
- Inefficient data transfer patterns between availability zones
- Unused Load Balancers and EBS volumes
Industry Benchmark Data:
- Average EKS cluster waste: 60% of infrastructure spending
- Typical CPU utilization: 25-35% across Kubernetes clusters
- Memory utilization: 40-50% average across pods
- Cost per vCPU/month: $30-40 on-demand, $4-12 optimized with Spot
Real-World EKS Cost Optimization Results
Case Study: SaaS Platform (Series A Startup, 50 Employees)
Initial State (Pre-Optimization):
- Monthly EKS cost: $414
- Cluster configuration: 3x m5.large nodes (2 vCPU, 8GB RAM each)
- Node pricing: On-demand instances at $0.096/hour per node
- Average CPU utilization: 28% across all nodes
- Average memory utilization: 45% across all pods
- Autoscaling: None (fixed 3-node cluster)
- Cost visibility: No cost allocation or tracking by application
Optimization Analysis:
- Identified 50+ pods with over-provisioned resource requests
- Discovered opportunity for Spot instances (stateless workloads)
- Found 40% of workloads could run on ARM-based Graviton instances
- Detected opportunity for Cluster Autoscaler and right-sizing
Post-Optimization Results (Month 3):
- Monthly EKS cost: $138 (66% reduction, $276 monthly savings)
- Cluster configuration: Managed node groups with Karpenter auto-scaling
- Spot instances: 75% of capacity at $0.029/hour average
- On-demand instances: 25% baseline capacity for critical workloads
- Graviton-based instances (t4g, c7g): 20% cost savings vs. x86
- Average CPU utilization: 65% across all nodes (improved packing)
- Average memory utilization: 70% across all pods (right-sized requests)
- Autoscaling: Dynamic scaling from 2-6 nodes based on demand
- Cost visibility: Full cost allocation with AWS Split Cost Allocation
Annual Impact:
- $3,312 annual savings from infrastructure optimization
- 10x ROI on optimization investment (paid for itself in 1 month)
- Improved reliability: Auto-scaling handles traffic spikes without manual intervention
- Better resource utilization: 2.3x improvement in cluster efficiency
AWS EKS Cost Optimization Strategies
Strategy 1: Spot Instances for EKS Worker Nodes (70% Savings)
Understanding AWS Spot Instances:
- Spare EC2 capacity available at up to 90% discount vs. on-demand pricing
- Can be reclaimed by AWS with 2-minute warning when capacity needed
- Highly available when using diversification strategies (95%+ availability)
- Perfect for fault-tolerant, stateless Kubernetes workloads
Spot Instance Best Practices for Kubernetes:
1. Diversification Across Instance Types
- Request 4-6 different instance types across multiple families
- Example: m5.large, m5a.large, m5n.large, m6i.large, c5.large, c5a.large
- Each instance type has independent spot capacity pool
- Diversification reduces interruption risk by 80-90%
2. Multi-AZ Distribution
- Spread nodes across all 3 availability zones
- Each AZ has independent Spot capacity pools
- Kubernetes scheduler automatically distributes workloads across AZs
- Load balancers route traffic away from interrupted nodes
3. Spot Interruption Handling
- AWS Node Termination Handler: Automatically drains nodes on interruption warning
- Pod Disruption Budgets: Prevent too many replicas terminating simultaneously
- Multiple replicas: Ensure 2+ replicas of each critical application
- Graceful shutdown: Configure pre-stop hooks for clean pod termination
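The Pod Disruption Budget piece can be sketched as follows; a minimal example assuming a workload whose pods carry the label app: my-app (the name is illustrative):

```yaml
# Keep at least 2 replicas of my-app running during voluntary
# disruptions such as Spot node drains.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app
```

With this in place, a node drain triggered by the termination handler will evict pods only as fast as the budget allows.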
4. Mixed Instances Policy
- On-demand baseline: 20-30% of total capacity for critical workloads
- Spot capacity: 70-80% of total capacity for fault-tolerant workloads
- Automatic failover: Kubernetes reschedules pods from terminated Spot nodes
- Cost optimization: 50-60% total savings with high availability
Implementation with Karpenter (Recommended):
```yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot", "on-demand"]
    - key: kubernetes.io/arch
      operator: In
      values: ["amd64", "arm64"]
  limits:
    resources:
      cpu: 1000
      memory: 1000Gi
  providerRef:
    name: default
  weight: 50
```
Spot vs. On-Demand Cost Comparison:
- On-demand m5.large: $0.096/hour = $70/month
- Spot m5.large: $0.029/hour average = $21/month (70% savings)
- 10-node cluster savings: $490/month = $5,880 annually
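The comparison above is simple arithmetic; a quick check assuming a 730-hour month:

```python
# Monthly Spot vs. on-demand cost for an m5.large node, using the
# hourly rates quoted above and an assumed 730-hour month.
HOURS_PER_MONTH = 730

on_demand = round(0.096 * HOURS_PER_MONTH)   # ~$70/month
spot = round(0.029 * HOURS_PER_MONTH)        # ~$21/month
savings_pct = round((on_demand - spot) / on_demand * 100)

nodes = 10
monthly_savings = (on_demand - spot) * nodes
annual_savings = monthly_savings * 12

print(on_demand, spot, savings_pct)    # 70 21 70
print(monthly_savings, annual_savings) # 490 5880
```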
When NOT to Use Spot Instances:
- Stateful applications (databases, caching, message queues)
- Single-replica deployments without redundancy
- Applications that can’t tolerate interruptions
- Workloads requiring guaranteed capacity (SLA commitments)
Strategy 2: Right-Sizing Pods and Nodes
Pod Resource Right-Sizing:
Step 1: Analyze Current Resource Usage
- Use Kubernetes Metrics Server for real-time resource metrics
- Analyze 30-day historical usage with Prometheus or CloudWatch Container Insights
- Identify pods with high request-to-usage ratios (over-provisioned)
- Target: Resource requests within 20-30% of actual peak usage
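As a sketch of Step 1, the request-to-usage check can be run offline against exported metrics; the pod names and numbers below are illustrative, and in practice the data would come from Prometheus or Container Insights:

```python
# Flag pods whose CPU request is far above observed peak usage.
# Sample data is illustrative, not from a real cluster.
pods = [
    {"name": "api", "cpu_request_m": 1000, "cpu_peak_m": 300},
    {"name": "worker", "cpu_request_m": 500, "cpu_peak_m": 420},
]

def over_provisioned(pod, headroom=1.3):
    """True if the request exceeds peak usage by more than ~30%."""
    return pod["cpu_request_m"] > pod["cpu_peak_m"] * headroom

flagged = [p["name"] for p in pods if over_provisioned(p)]
print(flagged)  # ['api']
```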
Step 2: Implement Vertical Pod Autoscaler (VPA)
- VPA recommends optimal CPU and memory requests based on historical usage
- Automatic mode ("Auto"): VPA updates pod specs and restarts pods to apply changes (use cautiously in production)
- Recommendation-only mode ("Off"): VPA computes recommendations without applying changes
- Best practice: Start with recommendation-only mode, then manually apply validated changes
Example VPA Configuration:
```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off"  # recommendation-only; "Auto" applies changes
  resourcePolicy:
    containerPolicies:
      - containerName: '*'
        minAllowed:
          cpu: 100m
          memory: 128Mi
        maxAllowed:
          cpu: 2000m
          memory: 2Gi
```
Step 3: Set Appropriate Resource Limits
- Resource requests: Guaranteed resources for pod scheduling
- Resource limits: Maximum resources pod can consume (prevent noisy neighbor issues)
- Best practice: Set limits 1.5-2x higher than requests for burstable workloads
- Memory limits: Set conservatively to prevent OOMKilled errors
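A container spec following these guidelines might look like the sketch below; the values are illustrative, with the CPU limit at 2x the request and the memory limit kept close to the request:

```yaml
resources:
  requests:
    cpu: 250m        # guaranteed for scheduling
    memory: 512Mi
  limits:
    cpu: 500m        # 2x request allows bursting
    memory: 640Mi    # conservative memory limit to avoid OOMKilled
```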
Real-World Example:
- Before optimization: Request 1 CPU, 2GB RAM; actual usage: 0.3 CPU, 800MB RAM
- After right-sizing: Request 0.4 CPU, 1GB RAM (60% resource request reduction)
- Cluster impact: 40% more pods fit on the same node count = 40% cost savings
Node Right-Sizing:
Analyzing Node Utilization:
- Monitor node CPU and memory allocation vs. actual usage
- Target: 70-80% node resource allocation for efficient bin-packing
- Identify over-provisioned nodes (>50% of requested resources unused)
- Consider smaller instance types for better granularity and efficiency
Instance Type Selection Framework:
Small workloads (<50 pods): t3.medium, t3a.medium
Standard workloads (50-150 pods): m5.large, m6i.large, c5.large
Large workloads (150+ pods): m5.xlarge, m6i.xlarge
Memory-intensive: r5.large, r6i.large
Compute-intensive: c5.xlarge, c6i.xlarge
Cost-optimized: t4g (Graviton), m7g, c7g
Graviton (ARM) Instance Savings:
- T4g, M7g, C7g instances: 20% cost savings vs. equivalent x86 instances
- Most containerized workloads compatible without changes
- Performance comparable or better for most applications
- Recommendation: Start with 20-30% ARM nodes, increase as validated
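Steering a workload onto Graviton nodes is typically just a node selector on the standard architecture label; a minimal sketch with a hypothetical app name, assuming the container image is multi-arch:

```yaml
# Pin this Deployment's pods to arm64 (Graviton) nodes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-arm
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app-arm
  template:
    metadata:
      labels:
        app: my-app-arm
    spec:
      nodeSelector:
        kubernetes.io/arch: arm64
      containers:
        - name: my-app
          image: my-app:latest  # must include an arm64 image variant
```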
Strategy 3: Intelligent Autoscaling
Cluster Autoscaler Configuration:
How Cluster Autoscaler Works:
- Detects pods in Pending state (insufficient node capacity)
- Calculates minimum nodes needed to schedule pending pods
- Provisions new nodes from configured Auto Scaling Groups
- Scales down underutilized nodes after 10 minutes (configurable)
Best Practices for Cluster Autoscaler:
- Set min/max node counts per node group to control costs
- Use multiple node groups for different instance types (on-demand, spot, sizes)
- Configure scale-down utilization threshold (default: 50%)
- Set appropriate scan interval (default: 10 seconds)
Example Cluster Autoscaler Configuration:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      containers:
        - image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.28.0
          name: cluster-autoscaler
          command:
            - ./cluster-autoscaler
            - --v=4
            - --cloud-provider=aws
            - --skip-nodes-with-local-storage=false
            - --expander=least-waste
            - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled
            - --scale-down-enabled=true
            - --scale-down-delay-after-add=5m
            - --scale-down-unneeded-time=10m
            - --scale-down-utilization-threshold=0.5
```
Karpenter: Next-Generation Autoscaling (Recommended):
Why Karpenter Over Cluster Autoscaler:
- Node provisioning in seconds (vs. minutes with Cluster Autoscaler)
- No node groups required (directly provisions EC2 instances)
- Better bin-packing with just-in-time node sizing
- Native Spot instance diversification and interruption handling
- 20-30% additional cost savings through superior packing efficiency
Karpenter Benefits:
- Provisions exactly sized nodes for pending pods (no wasted capacity)
- Automatic consolidation replaces nodes with smaller instances when possible
- Weighted provisioning balances cost vs. availability vs. performance
- Built-in Spot interruption handling with automatic failover
Real-World Karpenter Results:
- Cluster autoscaling time: 45 seconds → 15 seconds (3x faster)
- Node utilization improvement: 55% → 75% (36% efficiency gain)
- Cost reduction: 25-35% beyond Cluster Autoscaler savings
- Operational simplicity: No node groups to manage
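The consolidation behavior mentioned above is opt-in on the Provisioner; a sketch extending the earlier example (karpenter.sh/v1alpha5 schema, noting that newer Karpenter releases replace Provisioner with the NodePool API):

```yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  consolidation:
    enabled: true   # replace or remove nodes when pods fit on cheaper capacity
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot", "on-demand"]
  providerRef:
    name: default
```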
Horizontal Pod Autoscaler (HPA):
HPA for Application-Level Scaling:
- Automatically scales pod replicas based on CPU, memory, or custom metrics
- Responds to traffic increases before node capacity exhausted
- Works in conjunction with Cluster Autoscaler/Karpenter for node scaling
- Typical configuration: Target 70% CPU utilization
Example HPA Configuration:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```
Combined Autoscaling Strategy:
- HPA scales pods based on application metrics (CPU, memory, requests/sec)
- Cluster Autoscaler/Karpenter provisions nodes when pods pending
- VPA optimizes resource requests based on actual usage patterns
- Result: Maximum efficiency and cost optimization with automatic scaling
Strategy 4: AWS Split Cost Allocation and FinOps
AWS Split Cost Allocation Data (October 2024 Feature):
What Split Cost Allocation Provides:
- Cost visibility down to individual Kubernetes resources (pods, namespaces)
- Integration with Kubernetes labels for cost allocation by team/application
- EKS resource costs split proportionally based on CPU and memory requests
- Works with AWS Cost Explorer, CUR (Cost and Usage Reports), and FinOps tools
Implementing Cost Allocation:
Step 1: Enable Split Cost Allocation
- Navigate to AWS Billing Console → Cost Allocation Tags
- Enable “Split Cost Allocation Data” for EKS clusters
- Wait 24 hours for initial data population
- Data appears in Cost Explorer and Cost and Usage Reports
Step 2: Implement Kubernetes Labeling Strategy
- Add cost allocation labels to all pods, deployments, and namespaces
- Recommended labels:
  - team: Engineering team responsible for the application
  - application: Application or service name
  - environment: dev, staging, production
  - cost-center: Finance cost center for chargeback
Example Pod Labeling:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  labels:
    app: my-app
    team: platform
    application: api-gateway
    environment: production
    cost-center: engineering
spec:
  containers:
    - name: my-app
      image: my-app:latest
      resources:
        requests:
          cpu: 500m
          memory: 1Gi
```
Step 3: Cost Visibility and Reporting
- AWS Cost Explorer: Filter by Kubernetes label tags
- Grafana dashboards: Real-time cost metrics by team/application
- Monthly cost reports: Automated showback/chargeback to teams
- Cost anomaly detection: Alert on unusual spending patterns
Cost Allocation Use Cases:
Showback Model (Cost Transparency):
- Share infrastructure costs by team/application without budget transfers
- Build cost awareness across engineering organization
- Identify expensive applications and optimization opportunities
- Monthly reports showing cost trends and comparisons
Chargeback Model (Budget Accountability):
- Transfer actual infrastructure costs to team/project budgets
- Hold teams accountable for infrastructure spending
- Incentivize cost optimization at team level
- Requires mature FinOps practices and finance processes
Real-World Cost Allocation Results:
- Identified top 3 cost-driving applications (60% of total EKS costs)
- Discovered development environment consuming 35% of cluster costs (optimization opportunity)
- Enabled team-level cost accountability reducing costs 25% in 6 months
- Improved cost predictability through trend analysis and forecasting
Strategy 5: Network and Storage Optimization
Reducing Data Transfer Costs:
Inter-AZ Data Transfer:
- Cost: $0.01/GB for data transfer between availability zones
- Impact: High-traffic applications can spend $100-500/month on inter-AZ transfer
- Optimization strategies:
- Pod topology spread constraints to prefer same-AZ scheduling
- Node affinity for data-intensive applications
- Service mesh (Istio, Linkerd) with locality-aware load balancing
Example Topology Spread Constraint:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  labels:
    app: my-app  # must match the labelSelector below
spec:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          app: my-app
```
NAT Gateway Cost Optimization:
- NAT Gateway cost: $0.045/hour + $0.045/GB processed = $32/month minimum
- Multiple NAT Gateways (one per AZ): $96/month for 3-AZ cluster
- Optimization: VPC endpoints for AWS services (S3, DynamoDB, ECR)
- Eliminates NAT Gateway data processing charges for AWS API calls
- Typical savings: $50-150/month depending on AWS API usage
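The NAT Gateway math above works out as a quick calculation, assuming a 730-hour month and the listed rates:

```python
# NAT Gateway monthly cost: hourly charge plus per-GB data processing.
HOURS = 730
HOURLY_RATE = 0.045  # $/hour
DATA_RATE = 0.045    # $/GB processed

def nat_monthly_cost(gb_processed, gateways=1):
    return gateways * (HOURS * HOURLY_RATE + gb_processed * DATA_RATE)

print(round(nat_monthly_cost(0)))     # ~33: idle single-gateway baseline
print(round(nat_monthly_cost(0, 3)))  # ~99: one gateway per AZ, 3 AZs
print(round(nat_monthly_cost(500)))   # ~55: 500 GB/month through one gateway
```

VPC endpoints remove the per-GB processing term for traffic to supported AWS services, which is where the bulk of the savings comes from.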
EBS Volume Optimization:
Volume Type Selection:
- gp3 (General Purpose SSD): Default choice, 20% cheaper than gp2
- io2 (Provisioned IOPS): High-performance databases only
- Migrate all gp2 volumes to gp3 for immediate 20% savings
Storage Class Configuration:
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "3000"
  throughput: "125"
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
```
Volume Lifecycle Management:
- Implement PVC cleanup for terminated pods
- Delete unused EBS volumes (orphaned after pod deletion)
- Regular audits for unattached volumes (use AWS Config rules)
- Typical savings: $50-200/month from orphaned volume cleanup
Load Balancer Optimization:
NLB vs. ALB Cost Comparison:
- Network Load Balancer (NLB): $0.0225/hour + $0.006/LCU = $16/month minimum
- Application Load Balancer (ALB): $0.0225/hour + $0.008/LCU = $16/month + data processing
- Optimization: Use single ALB with Ingress controller for multiple services
AWS Load Balancer Controller:
- Deploy AWS Load Balancer Controller for EKS
- Automatically provision ALB for Kubernetes Ingress resources
- Share single ALB across multiple applications using path-based routing
- Cost savings: $200-500/month consolidating 15-30 LoadBalancer services to single ALB
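Sharing a single ALB across services is done with the controller's IngressGroup annotation; a sketch with hypothetical service and group names, assuming the AWS Load Balancer Controller is installed:

```yaml
# Ingresses with the same group.name share one ALB; paths route
# to different backend services.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  annotations:
    alb.ingress.kubernetes.io/group.name: shared-alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
spec:
  ingressClassName: alb
  rules:
    - http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: api-service
                port:
                  number: 80
```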
EKS Cost Monitoring and Governance
Real-Time Cost Visibility Tools
Native AWS Tools:
AWS Cost Explorer:
- Daily cost breakdown by service (EKS, EC2, EBS, data transfer)
- Filter by tags for cost allocation by team/application
- Forecasting for upcoming month cost prediction
- Anomaly detection for unusual spending patterns
CloudWatch Container Insights:
- Real-time resource utilization metrics (CPU, memory, network)
- Container and pod-level performance data
- Integration with CloudWatch Dashboards for visualization
- Alarms for resource utilization thresholds
Third-Party Cost Management Tools:
Kubecost (Popular Open-Source Option):
- Kubernetes-native cost visibility and optimization
- Cost breakdown by namespace, deployment, pod, and label
- Savings recommendations (right-sizing, spot instances, autoscaling)
- Budget alerts and cost anomaly detection
- Free community edition available
Cast.ai, Spot.io, CloudHealth:
- Automated optimization recommendations and implementation
- Multi-cluster cost management and governance
- Showback/chargeback reporting for teams
- Typically 20-30% additional savings beyond manual optimization
- Cost: 5-15% of managed cloud spend
Cost Governance Policies
Resource Quotas and Limits:
Namespace Resource Quotas:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: team-platform
spec:
  hard:
    requests.cpu: "50"
    requests.memory: "100Gi"
    limits.cpu: "100"
    limits.memory: "200Gi"
    persistentvolumeclaims: "20"
    services.loadbalancers: "2"
```
Limit Ranges for Pod Defaults:
```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
spec:
  limits:
    - default:
        cpu: 500m
        memory: 1Gi
      defaultRequest:
        cpu: 250m
        memory: 512Mi
      type: Container
```
Cost Optimization Policies:
Automated Policy Enforcement:
- OPA (Open Policy Agent) for admission control
- Prevent pods without resource requests/limits
- Enforce cost allocation labels on all resources
- Block expensive instance types in non-production clusters
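With OPA Gatekeeper, for example, the cost-label requirement can be enforced with the K8sRequiredLabels constraint from the community gatekeeper-library; the sketch below assumes that constraint template is already installed in the cluster:

```yaml
# Reject Deployments missing the cost allocation labels.
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: require-cost-labels
spec:
  match:
    kinds:
      - apiGroups: ["apps"]
        kinds: ["Deployment"]
  parameters:
    labels:
      - key: team
      - key: cost-center
```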
Monthly Cost Review Process:
- Week 1: Export cost data and identify trends
- Week 2: Review optimization recommendations (right-sizing, Spot, autoscaling)
- Week 3: Team-level cost reviews with showback reports
- Week 4: Implement approved optimizations and track savings
Cost Optimization KPIs:
- Cost per pod per month (target: $5-15 depending on size)
- Average node utilization (target: 65-75%)
- Spot instance coverage (target: 60-80% of total capacity)
- Monthly cost trend (target: <5% growth aligned with business growth)
Implementation Roadmap: 30-Day EKS Cost Optimization
Week 1: Assessment and Quick Wins
Day 1-2: Cost Analysis
- Export 30-60 days of cost data from AWS Cost Explorer
- Analyze cost breakdown by service (EC2, EBS, data transfer, load balancers)
- Identify top cost-driving resources and workloads
- Calculate current cluster efficiency (node utilization, resource requests vs. usage)
Day 3-4: Quick Win Implementation
- Migrate all gp2 EBS volumes to gp3 (20% savings, 30 minutes)
- Delete unused EBS volumes and snapshots (immediate savings)
- Implement VPC endpoints for S3, ECR, and other AWS services
- Consolidate multiple LoadBalancer services to single ALB with Ingress
Day 5: Spot Instance Pilot
- Create Spot instance node group for non-critical workloads
- Configure AWS Node Termination Handler for graceful shutdown
- Deploy test applications with multiple replicas to Spot nodes
- Validate interruption handling and failover behavior
Week 1 Expected Savings: 15-25% cost reduction from quick wins
Week 2: Right-Sizing and Autoscaling
Day 6-8: Pod Right-Sizing
- Deploy Vertical Pod Autoscaler in recommendation mode
- Analyze VPA recommendations for top 20 resource-consuming pods
- Implement right-sized resource requests for identified workloads
- Monitor application performance after right-sizing changes
Day 9-10: Cluster Autoscaling
- Deploy Cluster Autoscaler or Karpenter (Karpenter recommended)
- Configure appropriate min/max node counts for node groups
- Set scale-down thresholds and timers
- Test autoscaling behavior with load tests
Day 11-12: Horizontal Pod Autoscaling
- Implement HPA for applications with variable traffic
- Configure appropriate scaling metrics (CPU, memory, custom metrics)
- Set conservative min replicas to ensure availability
- Validate HPA behavior during traffic spikes
Week 2 Expected Incremental Savings: 20-30% additional cost reduction
Week 3: Advanced Optimization
Day 13-15: Graviton (ARM) Instance Migration
- Identify applications compatible with ARM architecture (80%+ typically compatible)
- Create Graviton-based node groups (t4g, m7g, c7g instances)
- Gradually migrate workloads to Graviton nodes with canary deployments
- Validate performance and stability on ARM instances
Day 16-18: Network Optimization
- Implement topology spread constraints for same-AZ pod placement
- Deploy VPC endpoints for all frequently used AWS services
- Audit and eliminate unnecessary inter-AZ traffic patterns
- Optimize service mesh configuration for locality-aware routing
Day 19-20: Storage Optimization
- Audit all PersistentVolumes and identify unused volumes
- Implement automated PVC cleanup policies
- Optimize EBS volume sizes based on actual usage
- Configure storage class defaults with gp3 volumes
Week 3 Expected Incremental Savings: 10-20% additional cost reduction
Week 4: Governance and Continuous Optimization
Day 21-23: Cost Allocation Implementation
- Enable AWS Split Cost Allocation Data in billing console
- Implement comprehensive labeling strategy across all Kubernetes resources
- Deploy cost monitoring dashboards (Kubecost or custom Grafana dashboards)
- Configure cost anomaly alerts and budget notifications
Day 24-26: Policy and Governance
- Implement resource quotas for all namespaces
- Deploy OPA policies for admission control (resource requests required, cost labels enforced)
- Create monthly cost review process and assign owners
- Document optimization best practices and team training materials
Day 27-30: Validation and Optimization
- Compare costs before/after optimization (target: 50-70% reduction)
- Validate application performance and stability maintained
- Create ongoing optimization roadmap (quarterly reviews)
- Train teams on cost-conscious Kubernetes practices
Total 30-Day Cost Reduction: 50-70% depending on initial optimization state
Ready to Optimize Your EKS Costs?
Daily DevOps specializes in AWS EKS cost optimization that delivers 50-70% infrastructure cost reductions while improving performance, reliability, and operational efficiency. Our proven methodologies balance immediate savings with long-term optimization practices.
Schedule Your Free EKS Cost Audit:
- Comprehensive analysis of your current EKS infrastructure costs
- Identification of immediate optimization opportunities (quick wins)
- Projected savings calculation with 30-60-90 day implementation roadmap
- Right-sizing, Spot instance, and autoscaling recommendations
What You’ll Receive:
- 90-minute consultation reviewing your EKS cluster architecture
- Detailed cost optimization report with specific recommendations
- Custom implementation plan with timeline and effort estimates
- Ongoing optimization strategy for sustained cost management
Contact Jon Price:
- Email: jon@jonprice.io
- LinkedIn: linkedin.com/in/jonpricelinux
- Location: Pacific Northwest (serving Western US and remote clients)
Transform your Kubernetes infrastructure from cost burden to optimized platform. Let’s unlock your 50-70% cost savings opportunity together.
This article is part of our AWS Cost Optimization and Kubernetes series. For more insights on EKS best practices, container cost management, and AWS infrastructure optimization, explore our comprehensive resource library and case studies.