AWS Fargate Cost Optimization: Why Your Containers Cost More Than They Should

April 12, 2026 — Nick Allevato

Fargate is appealing for exactly the reason that makes it expensive: you don’t manage servers. You define a CPU and memory allocation, Fargate runs your container, and AWS bills you per-second for the vCPU and memory you provisioned. No idle EC2 instances sitting around. No over-provisioning a node group.

The problem is that “no idle instances” doesn’t mean no waste. Fargate charges for what you specify, not what your container actually uses. Specify 2 vCPU and 4GB for a container that uses 0.3 vCPU and 1GB at steady state, and you’re paying for 1.7 vCPU and 3GB of capacity that does nothing.

Most Fargate workloads have 30-60% provisioned headroom that’s never consumed.


How Fargate pricing works

Fargate charges two dimensions independently (us-east-1 Linux/x86 On-Demand rates):

  • vCPU: $0.04048 per vCPU-hour
  • Memory: $0.004445 per GB-hour

Doing the math: 1 vCPU + 2GB for 720 hours (one month) ≈ $35.55/month per task. A fleet of 20 tasks at that size ≈ $711/month in pure compute.

This is before data transfer, load balancer, and CloudWatch costs. Those add up fast for busy services.

Fargate also constrains allocation sizes: the smallest task is 0.25 vCPU + 0.5GB (≈ $8.89/month at the rates above), and the largest is 16 vCPU + 120GB.
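The per-task math above can be sketched as a small calculator. The rates are the us-east-1 figures quoted in this section; the 720-hour month and the function name are assumptions for illustration:

```python
# Estimate monthly On-Demand Fargate task cost from the rates quoted above.
VCPU_HOUR = 0.04048   # USD per vCPU-hour (us-east-1)
GB_HOUR = 0.004445    # USD per GB-hour (us-east-1)
HOURS_PER_MONTH = 720

def monthly_task_cost(vcpu: float, memory_gb: float) -> float:
    """Monthly On-Demand cost for one Fargate task."""
    return (vcpu * VCPU_HOUR + memory_gb * GB_HOUR) * HOURS_PER_MONTH

# The 1 vCPU + 2GB task from the example, and the smallest allowed task.
print(round(monthly_task_cost(1, 2), 2))       # 35.55
print(round(monthly_task_cost(0.25, 0.5), 2))  # 8.89
```

Multiply by task count and it is easy to see why fleet-level right-sizing pays off quickly.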


Right-sizing: the first and most impactful fix

The biggest Fargate lever is matching your task definition CPU/memory to actual container resource usage.

Step 1: Get baseline utilization from CloudWatch

ECS emits two key metrics per service:

  • CPUUtilization — percentage of provisioned CPU in use
  • MemoryUtilization — percentage of provisioned memory in use

If your CPUUtilization is consistently 20-30%, you’re paying for 3-4x the CPU you’re actually using. The fix is to reduce the task definition cpu value until the metric climbs to 50-70% at typical load.

Pull 7-day p95 utilization:

aws cloudwatch get-metric-statistics \
  --namespace AWS/ECS \
  --metric-name CPUUtilization \
  --dimensions Name=ServiceName,Value=my-service Name=ClusterName,Value=my-cluster \
  --extended-statistics p95 \
  --period 3600 \
  --start-time $(date -d '7 days ago' +%Y-%m-%dT%H:%M:%S) \
  --end-time $(date +%Y-%m-%dT%H:%M:%S)

Target: provision roughly 2x your p95 usage, so p95 utilization lands near 50% of the new allocation. That leaves enough headroom for bursts without systematic over-provisioning.
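That targeting rule can be expressed as a small helper: convert p95 utilization back to absolute usage, double it, and pick the smallest CPU tier that covers it. The function name and the tier list (standard Fargate CPU-unit sizes) are assumptions for illustration:

```python
# Derive a CPU allocation from observed p95 utilization, targeting ~2x
# headroom so p95 lands near 50% of the new allocation.
FARGATE_CPU_TIERS = [256, 512, 1024, 2048, 4096]  # CPU units (1024 = 1 vCPU)

def recommend_cpu(current_cpu_units: int, p95_utilization_pct: float) -> int:
    """Smallest Fargate CPU tier that is >= 2x the observed p95 usage."""
    p95_usage_units = current_cpu_units * p95_utilization_pct / 100
    target = 2 * p95_usage_units
    for tier in FARGATE_CPU_TIERS:
        if tier >= target:
            return tier
    return FARGATE_CPU_TIERS[-1]

# A 2048-unit (2 vCPU) task at 20% p95 uses ~410 units; 2x headroom
# means ~820 units, so the 1024 tier is enough.
print(recommend_cpu(2048, 20))  # 1024
```

Remember to re-check the memory pairing constraints (next step) before applying the recommendation.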

Step 2: Understand Fargate CPU/memory pairings

Fargate doesn’t let you pick arbitrary CPU and memory values. There are fixed pairings. If you specify 512 CPU units (0.5 vCPU), the minimum memory is 1GB and maximum is 4GB. This matters because you might need to step up to a higher CPU tier just to get the memory you need, or you might be allocating far more CPU than needed to unlock a particular memory size.

Common pairings:

  • 256 (0.25 vCPU): 0.5GB – 2GB memory
  • 512 (0.5 vCPU): 1GB – 4GB memory
  • 1024 (1 vCPU): 2GB – 8GB memory
  • 2048 (2 vCPU): 4GB – 16GB memory

If a service needs 6GB memory but only 0.5 vCPU compute, you’re forced into 1 vCPU to access 6GB — paying for double the CPU you need. The alternative: rearchitect the service to use less memory (split into a smaller service, move memory-heavy operations to a different tier).
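The pairing constraint can be checked mechanically: walk the tiers from cheapest up and take the first whose memory range covers the requirement. A sketch using the pairings listed above (function name is an assumption):

```python
# Pick the lowest Fargate CPU tier whose memory range covers the requirement.
PAIRINGS = {  # CPU units -> (min GB, max GB), from the table above
    256: (0.5, 2), 512: (1, 4), 1024: (2, 8), 2048: (4, 16),
}

def cheapest_pairing(memory_gb: float):
    """Return (cpu_units, memory_gb) for the lowest valid CPU tier, or None."""
    for cpu, (lo, hi) in sorted(PAIRINGS.items()):
        if lo <= memory_gb <= hi:
            return cpu, memory_gb
    return None

# 6GB of memory forces the 1024 (1 vCPU) tier, as described above.
print(cheapest_pairing(6))  # (1024, 6)
```

If the returned CPU tier is far above what the service actually needs, that is the signal to consider rearchitecting for lower memory.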


Auto scaling: stop paying for peak capacity 24/7

Static Fargate services run the same task count regardless of traffic. Most workloads have significant daily variance — evening and overnight traffic can be 80% lower than peak. Running 20 tasks at 3am when you need 4 is 16 tasks of pure waste.
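Back-of-envelope math makes the waste concrete. Assuming a 1 vCPU + 2GB task size and a 12-hour off-peak window where 4 tasks suffice (both figures are illustrative assumptions):

```python
# Compare a static 20-task fleet against one that scales to 4 tasks off-peak.
VCPU_HOUR, GB_HOUR = 0.04048, 0.004445
task_hour = 1 * VCPU_HOUR + 2 * GB_HOUR        # one 1 vCPU / 2GB task-hour

static = 20 * 24 * task_hour                    # 20 tasks, all day
scaled = (20 * 12 + 4 * 12) * task_hour         # 20 tasks peak, 4 off-peak
print(round((static - scaled) / static * 100))  # 40 (% saved per day)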

ECS Application Auto Scaling adjusts task count based on CloudWatch metrics:

{
  "ServiceNamespace": "ecs",
  "ResourceId": "service/my-cluster/my-service",
  "ScalableDimension": "ecs:service:DesiredCount",
  "MinCapacity": 2,
  "MaxCapacity": 20
}

Two scaling approaches:

Target tracking (recommended for most services): Maintain a target metric value, like 60% average CPU utilization. ECS automatically adjusts task count to hold that target.

{
  "TargetValue": 60.0,
  "PredefinedMetricSpecification": {
    "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
  },
  "ScaleInCooldown": 300,
  "ScaleOutCooldown": 60
}

Step scaling: Define explicit scale-out and scale-in steps at specific utilization thresholds. More control, more configuration.

Scale-in cooldown matters: set it too short and your service constantly oscillates between task counts. 300 seconds (5 minutes) for scale-in is a reasonable starting point for most services. Scale-out cooldown should be shorter (60 seconds) so you respond quickly to traffic spikes.

For batch/scheduled workloads: Use scheduled scaling to preemptively add tasks before predictable traffic peaks, and reduce them afterward. This avoids the reaction lag of metric-based scaling for patterns you can predict.


Spot for Fargate: 70% savings for interruptible workloads

Fargate Spot uses spare AWS capacity at a discount of approximately 70% off On-Demand rates. The trade-off: AWS can reclaim a Spot task with a 2-minute warning.

For stateless services with multiple replicas, Fargate Spot interruptions are manageable — ECS drains the task gracefully and replaces it. The key is having enough replicas that losing one or two simultaneously doesn’t degrade service.

Configure mixed On-Demand/Spot capacity providers in your service:

{
  "capacityProviderStrategy": [
    {
      "capacityProvider": "FARGATE",
      "weight": 1,
      "base": 2
    },
    {
      "capacityProvider": "FARGATE_SPOT",
      "weight": 4
    }
  ]
}

This configuration: a minimum of 2 On-Demand tasks (the base); beyond that, tasks are split 1:4 between On-Demand and Spot. For a 10-task service, roughly 4 tasks run On-Demand (2 base plus ~2 from the weight) and 6 run on Spot. Blended cost reduction: ~42% vs all On-Demand.
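The base/weight arithmetic can be sketched as follows. The rounding here is a simplification of ECS's actual placement behavior, and the 70% Spot discount is the approximate figure quoted above:

```python
# How a capacity provider strategy splits tasks: base first, then the
# remainder divided by weight (simplified rounding).
def split_tasks(total: int, base: int, od_weight: int, spot_weight: int):
    """Return (on_demand, spot) task counts."""
    remaining = total - base
    spot = round(remaining * spot_weight / (od_weight + spot_weight))
    return base + (remaining - spot), spot

on_demand, spot = split_tasks(10, base=2, od_weight=1, spot_weight=4)
print(on_demand, spot)                # 4 6

# Blended savings at a ~70% Spot discount:
print(round(spot / 10 * 0.70 * 100))  # 42 (% vs all On-Demand)
```

To push a larger share of tasks onto Spot, lower the On-Demand weight (a weight of 0 with a nonzero base sends all post-base tasks to Spot).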

Use Spot for: stateless API services, workers, batch processors, anything that handles graceful shutdown.

Do not use Spot for: stateful services, anything processing payments or health records where mid-task interruption causes data issues.


Data transfer: the hidden Fargate cost

Fargate tasks in private subnets call AWS services (ECR, S3, SSM, Secrets Manager, DynamoDB) through the NAT Gateway by default. At $0.045/GB processed, this adds up — especially for container image pulls from ECR.

Container image pulls: A 1GB image pulled across 20 task replacements per day = 20GB/day = $0.90/day NAT Gateway charge = $27/month just for image pulls. Add a VPC Interface Endpoint for ECR (com.amazonaws.region.ecr.api and com.amazonaws.region.ecr.dkr) and image traffic stays within the VPC.
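The image-pull figures above, as code. The NAT Gateway data-processing rate ($0.045/GB) and the pull volume are the numbers quoted in this section; the 30-day month is an assumption:

```python
# NAT Gateway data-processing cost for ECR image pulls.
NAT_GB = 0.045  # USD per GB processed

def nat_monthly_cost(image_gb: float, pulls_per_day: int, days: int = 30) -> float:
    """Monthly NAT charge for pulling an image this often."""
    return image_gb * pulls_per_day * days * NAT_GB

print(round(nat_monthly_cost(1.0, 20), 2))  # 27.0
```

A VPC Interface Endpoint for ECR costs roughly $7-8/month per AZ, so for pull volumes like this the endpoint typically pays for itself, and the margin grows with image size and deployment frequency.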

S3 and DynamoDB: VPC Gateway Endpoints are free and eliminate NAT charges for those services.

Cross-AZ traffic: Tasks spread across availability zones generate inter-AZ data transfer at $0.01/GB each direction. For high-request-rate services with frequent service-to-service calls, check the EC2 → Data Transfer line in Cost Explorer. The fix is service discovery that prefers same-AZ endpoints (ECS Service Connect supports this), or moving services that talk frequently into the same AZ when resilience requirements allow.


Container image size

Large container images cost in multiple ways: ECR storage, data transfer on every pull, and slower task start times (which means you need more headroom in your scaling configuration to handle traffic spikes while new tasks warm up).

Practical reductions:

  • Multi-stage builds to separate build dependencies from the final image
  • alpine or distroless base images instead of Ubuntu/Debian
  • .dockerignore to exclude test files, documentation, and development dependencies
  • Layer ordering (frequently-changed layers last) to maximize cache reuse

A Python service that starts at 1.5GB image size can typically get to 300-400MB with these changes. The pull time drops proportionally, which improves autoscaling responsiveness.


Ephemeral storage and Graviton

Two lesser-known billing items:

Ephemeral storage: Fargate tasks get 20GB ephemeral storage by default, included in the task price. If your task definition requests additional storage (up to 200GB), you pay $0.10/GB/month for the overage. Audit your task definitions for ephemeralStorage settings — many teams set these defensively and forget about them.

AWS Graviton (ARM) for Fargate: The same ARM pricing advantage applies to Fargate. Graviton Fargate tasks cost ~20% less than x86 at the same CPU/memory allocation (set "runtimePlatform": {"cpuArchitecture": "ARM64"} in the task definition). Most containerized applications run on ARM without modification: Fargate manages the underlying ARM hosts for you, and multi-arch Docker images are standard practice.


Cost audit approach for Fargate

For a structured review:

  1. Pull Fargate costs in Cost Explorer by service, last 3 months. Look for services with steadily high cost and no obvious traffic justification.
  2. Check CPUUtilization and MemoryUtilization for each service — anything below 40% average is over-provisioned.
  3. Check task counts by hour — do they vary with traffic? If flat, auto scaling is not configured or not effective.
  4. Check Spot adoption — what percentage of tasks are on Fargate Spot?
  5. Review VPC endpoints — ECR, S3, DynamoDB endpoints in place?
  6. Check image sizes in ECR — anything over 1GB warrants a build optimization pass.
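Step 2 of the checklist is easy to automate once you have per-service utilization averages (e.g. from the CloudWatch call shown earlier). A sketch with the 40% threshold used above; the service data is illustrative:

```python
# Flag services whose average CPU or memory utilization is below threshold.
def over_provisioned(services: dict, threshold: float = 40.0) -> list:
    """services maps name -> (avg CPU %, avg memory %); returns flagged names."""
    return [
        name for name, (cpu_pct, mem_pct) in services.items()
        if cpu_pct < threshold or mem_pct < threshold
    ]

sample = {
    "api": (25.0, 60.0),       # CPU heavily under-utilized
    "worker": (55.0, 35.0),    # memory under-utilized
    "frontend": (62.0, 58.0),  # looks right-sized
}
print(over_provisioned(sample))  # ['api', 'worker']
```

Run it across every service in the cluster and you have a prioritized right-sizing worklist for steps 1-2.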

Teams that haven’t done this audit typically find 30-50% cost reduction available, with the bulk from right-sizing and Spot adoption.


Getting help

Fargate cost optimization is often the first engagement that opens a longer relationship — right-sizing and Spot configuration are fast wins that demonstrate ROI clearly. If you want a second set of eyes on your ECS/Fargate spend, I’m available for a cost audit.


Nick Allevato is an AWS Certified Solutions Architect Professional with 20 years of infrastructure experience. He runs Cold Smoke Consulting, an independent AWS consulting practice.