ECS Fargate - Basics to Production

What Is ECS?

ECS stands for Elastic Container Service. It is AWS's container orchestration service — it runs Docker containers and manages where they run, how many copies run, and what happens when one crashes.

Think of ECS as the manager that answers:
  • "Run this Docker image for me"
  • "Keep 10 copies of it running at all times"
  • "If one dies, start a new one automatically"
  • "When traffic spikes, start more; when it drops, stop them"


What Is Fargate?

ECS has two launch types:

| | EC2 Launch Type | Fargate Launch Type |
|---|---|---|
| You manage | EC2 instances (VMs), patching, scaling | Nothing at the host level; AWS manages the infrastructure |
| You pay for | EC2 instance hours (whether used or not) | vCPU + memory configured for each task, billed per second |
| Startup time | Fast (container on existing VM) | Slightly slower (~10-30s cold start) |
| Operational overhead | High | Near-zero |

Fargate = ECS without servers. You define your container, give it CPU/memory, and AWS handles the underlying infrastructure. You never SSH into a VM, never patch an OS, never worry about bin-packing containers onto nodes.


Core Concepts

Task Definition

A blueprint for how to run your container.

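{
  "family": "manga-orchestrator",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "1024",
  "memory": "2048",
  "networkMode": "awsvpc",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
  "containerDefinitions": [
    {
      "name": "orchestrator",
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/manga-orchestrator:latest",
      "portMappings": [{ "containerPort": 8080 }],
      "environment": [
        { "name": "ENV", "value": "production" }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/manga-orchestrator",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      }
    }
  ]
}
  • cpu: 1024 units = 1 vCPU
  • memory: 2048 MB = 2 GB RAM
  • requiresCompatibilities: marks the definition as Fargate-compatible, so ECS validates the CPU/memory pair against Fargate's allowed sizes
  • executionRoleArn: the role ECS itself uses to pull the image from ECR and write logs (the account ID and role name shown are placeholders)
  • awslogs-stream-prefix: required by the awslogs driver on Fargate
  • One task definition can have multiple containers (sidecars)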
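
Fargate only accepts certain CPU/memory pairs, so it can help to check a task size before registering the definition. The following is a minimal sketch in Python; the function name and table layout are illustrative, and the size table reflects the Linux combinations as commonly documented, so verify it against the current AWS documentation before relying on it.

```python
# Valid Fargate memory ranges per CPU value, as (min_mb, max_mb, step_mb).
# Assumption: values reflect the documented Linux task sizes; re-check before use.
FARGATE_MEMORY_RANGES = {
    512: (1024, 4096, 1024),
    1024: (2048, 8192, 1024),
    2048: (4096, 16384, 1024),
    4096: (8192, 30720, 1024),
    8192: (16384, 61440, 4096),
    16384: (32768, 122880, 8192),
}
# The smallest tier (0.25 vCPU) allows only three discrete memory sizes.
FARGATE_SMALLEST_TIER = {256: (512, 1024, 2048)}


def is_valid_fargate_size(cpu_units: int, memory_mb: int) -> bool:
    """Return True if (cpu, memory) is an allowed Fargate task size."""
    if cpu_units in FARGATE_SMALLEST_TIER:
        return memory_mb in FARGATE_SMALLEST_TIER[cpu_units]
    if cpu_units not in FARGATE_MEMORY_RANGES:
        return False
    low, high, step = FARGATE_MEMORY_RANGES[cpu_units]
    return low <= memory_mb <= high and (memory_mb - low) % step == 0
```

The task definition above (1024 CPU units, 2048 MB) passes this check; pairing 1024 CPU units with 1024 MB would not.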

Task

A running instance of a Task Definition. One task = one or more containers launched together from that definition. Tasks are ephemeral: a stopped task is never restarted in place; a replacement task is launched instead.

Service

A Service keeps N tasks running at all times. If a task crashes, the service scheduler starts a replacement.

Service: manga-orchestrator-service
  Desired count: 10
  Running count: 10   ← scheduler ensures these match
  Task definition: manga-orchestrator:42   ← family:revision (revisions are plain integers)
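
The "desired vs running" comparison can be pictured as a small reconciliation step. This is only a conceptual sketch in Python, not how the ECS scheduler is actually implemented; the function name and the rule for choosing which surplus tasks to stop are assumptions made for illustration.

```python
def reconcile(desired_count: int, running_task_ids: list) -> dict:
    """Return the action a scheduler would take to close the gap.

    Conceptual only: a real scheduler also considers AZ balance,
    deployment state, and health when picking tasks.
    """
    gap = desired_count - len(running_task_ids)
    if gap > 0:
        # Too few tasks: start replacements.
        return {"start": gap, "stop": []}
    # Too many (or exactly enough): stop the surplus, here simply the last ones.
    return {"start": 0, "stop": running_task_ids[desired_count:]}
```

With a desired count of 10 and 8 running tasks, this returns an instruction to start 2 more.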

Cluster

A logical grouping of tasks and services. With Fargate, the cluster is just a namespace — there are no EC2 instances to manage inside it.


Networking Model

Every Fargate task gets its own ENI (Elastic Network Interface) — its own private IP inside your VPC. This is the awsvpc network mode.

VPC: 10.0.0.0/16
  Private Subnet A (AZ-1): 10.0.1.0/24
    Task 1: 10.0.1.45  ← Fargate task, its own ENI
    Task 2: 10.0.1.67
  Private Subnet B (AZ-2): 10.0.2.0/24
    Task 3: 10.0.2.12
    Task 4: 10.0.2.34
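
Because every task consumes one private IP, subnet size puts a ceiling on task count. The sketch below uses Python's standard ipaddress module to estimate that ceiling and to map a task IP to its subnet. The subnet labels are taken from the layout above; the helper names are illustrative, and the figure of five reserved addresses per subnet is the standard AWS VPC reservation.

```python
import ipaddress
from typing import Optional

AWS_RESERVED_PER_SUBNET = 5  # AWS reserves the first four addresses and the last one

SUBNETS = {
    "Private Subnet A": "10.0.1.0/24",
    "Private Subnet B": "10.0.2.0/24",
}


def usable_ips(cidr: str) -> int:
    """Upper bound on ENIs (and so Fargate tasks) a subnet can hold."""
    return ipaddress.ip_network(cidr).num_addresses - AWS_RESERVED_PER_SUBNET


def subnet_for(task_ip: str, subnets: dict) -> Optional[str]:
    """Return the name of the subnet containing task_ip, or None."""
    addr = ipaddress.ip_address(task_ip)
    for name, cidr in subnets.items():
        if addr in ipaddress.ip_network(cidr):
            return name
    return None
```

Two /24 subnets leave room for roughly 500 addresses in total, which is comfortable for a 100-task peak as long as other resources (load balancer nodes, VPC endpoints) are not crowding the same subnets.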

Traffic flow for MangaAssist:

User → CloudFront → ALB → Target Group → Fargate Tasks (round-robin)

The ALB distributes requests across all healthy tasks. Because each task has its own IP, the target group uses target type ip, and ECS registers and deregisters task IPs automatically as tasks start and stop.


Auto-Scaling

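A Fargate service scales horizontally: the task count goes up or down while each task keeps the same size. Two policy types are common:

Step Scaling (explicit thresholds)

Add or remove a fixed number of tasks when a metric crosses a threshold for a set period.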

Scale out trigger: CPU > 70% for 3 minutes  →  add 5 tasks
Scale in trigger:  CPU < 30% for 10 minutes →  remove 5 tasks
Min tasks: 10
Max tasks: 100
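
The threshold rules above can be sketched as a pure function. This Python version is a simplification: it takes an already-averaged CPU value and ignores the 3-minute and 10-minute evaluation windows, which in practice are enforced by CloudWatch alarms. The function and parameter names are illustrative.

```python
def step_scale(current_tasks: int, avg_cpu_percent: float,
               min_tasks: int = 10, max_tasks: int = 100,
               step: int = 5) -> int:
    """Apply the scale-out / scale-in rules and clamp to the allowed range."""
    if avg_cpu_percent > 70:
        proposed = current_tasks + step   # scale out
    elif avg_cpu_percent < 30:
        proposed = current_tasks - step   # scale in
    else:
        proposed = current_tasks          # inside the dead band: no change
    return max(min_tasks, min(max_tasks, proposed))
```

The gap between 30% and 70% acts as a dead band that keeps the service from oscillating between adding and removing tasks.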

Target Tracking (simpler)

Set a target value for a metric and Application Auto Scaling adjusts the service's desired count to hold it:

Target: 60% average CPU utilization
Desired count is adjusted automatically to maintain this
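
As a rough mental model, target tracking behaves like a proportional controller: if the metric is 1.5x the target, the task count grows by about 1.5x. The Python sketch below captures only that proportion. The real service also applies cooldowns and scales in more cautiously than it scales out, so treat this as an approximation under those stated assumptions; the function name is illustrative.

```python
import math


def target_tracking_count(current_tasks: int, metric_value: float,
                          target_value: float,
                          min_tasks: int, max_tasks: int) -> int:
    """Approximate the task count needed to bring the metric back to target."""
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    proposed = math.ceil(current_tasks * metric_value / target_value)
    return max(min_tasks, min(max_tasks, proposed))
```

For example, 10 tasks at 90% CPU against a 60% target suggests growing to 15 tasks, while a low reading never drops the service below its minimum.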

For MangaAssist, this handles the 10x traffic pattern:
  • Normal load: 10 tasks
  • Peak (flash sale): auto-scales to ~80 tasks
  • Lambda handles the overflow burst (see Lambda doc)


How MangaAssist Uses ECS Fargate

Chatbot Orchestrator Service

The central coordinator for every chat message runs on ECS Fargate.

Service: chatbot-orchestrator
  Task definition: orchestrator (latest ACTIVE revision)
  CPU per task: 1 vCPU
  Memory per task: 2 GB
  Desired count: 10 (normal), scales to 100 (peak)
  Load balancer: ALB with WebSocket sticky sessions
  Subnets: Private (no public IP)
  Security group: Allow 8080 from ALB only
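
The settings above map fairly directly onto the parameters of the ECS CreateService API. The following Python sketch only builds the request dictionary; it makes no AWS call. The function name and the cluster, subnet, security group, and target group arguments are placeholders supplied by the caller, not values from MangaAssist.

```python
def build_create_service_request(cluster: str, target_group_arn: str,
                                 subnet_ids: list,
                                 security_group_id: str) -> dict:
    """Build keyword arguments for an ECS CreateService call (Fargate)."""
    return {
        "cluster": cluster,
        "serviceName": "chatbot-orchestrator",
        # Family name without a revision resolves to the latest ACTIVE revision.
        "taskDefinition": "orchestrator",
        "desiredCount": 10,
        "launchType": "FARGATE",
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": subnet_ids,                  # private subnets
                "securityGroups": [security_group_id],  # allows 8080 from the ALB only
                "assignPublicIp": "DISABLED",
            }
        },
        "loadBalancers": [{
            "targetGroupArn": target_group_arn,
            "containerName": "orchestrator",
            "containerPort": 8080,
        }],
    }
```

With boto3 installed and credentials configured, the result could be passed as `boto3.client("ecs").create_service(**request)`. Scaling from 10 to 100 tasks is configured separately through Application Auto Scaling rather than in this call.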

Why ECS Fargate for the Orchestrator? - Holds long-lived WebSocket connections (Lambda has a 15-minute timeout; WebSocket sessions can last longer) - Streams tokens back to the user — a persistent container is better than a short-lived function for streaming - Stateless container = easy to scale horizontally - No server management overhead

WebSocket Handler Service

Separate service that maintains WebSocket connections. Uses sticky sessions at the ALB so a user's WebSocket messages always route to the same task.

ALB Listener Rule:
  Path: /ws/*
  Target Group: websocket-tg
  Stickiness: enabled (duration-based, 5 min)
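
Duration-based stickiness is set through target group attributes. The Python sketch below builds the attribute list for a 5-minute load-balancer cookie. The attribute keys are the ones the ELBv2 API uses for ALB stickiness; the function name is illustrative, and nothing here calls AWS.

```python
def stickiness_attributes(duration_seconds: int = 300) -> list:
    """Target group attributes for duration-based (lb_cookie) stickiness."""
    if not 1 <= duration_seconds <= 604800:
        # ALB accepts durations from 1 second up to 7 days.
        raise ValueError("duration_seconds must be between 1 and 604800")
    return [
        {"Key": "stickiness.enabled", "Value": "true"},
        {"Key": "stickiness.type", "Value": "lb_cookie"},
        {"Key": "stickiness.lb_cookie.duration_seconds",
         "Value": str(duration_seconds)},
    ]
```

With boto3, this list would be passed as the `Attributes` argument of `modify_target_group_attributes`, together with the ARN of websocket-tg.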

ECS Fargate vs Alternatives

vs EKS (Kubernetes)

| | ECS Fargate | EKS |
|---|---|---|
| Learning curve | Low | High (K8s concepts) |
| Operational overhead | Near-zero | High (nodes, CNI, upgrades) |
| Burst to Lambda | Native AWS integration | Requires extra work |
| AWS service integration (IAM, ALB, CloudWatch) | First-class | Requires add-ons |
| Multi-cloud portability | AWS only | Portable |

For MangaAssist: ECS Fargate wins because the team gets to focus on the chatbot, not on Kubernetes operations.

vs EC2 directly

ECS Fargate abstracts away the VM entirely. Running containers directly on EC2 without an orchestrator means building scheduling, restarts, and deployments yourself, which is rarely worth it in a production system.


Cost Model

You pay for the vCPU and memory configured for each task, billed per second (one-minute minimum) while the task runs.

Approximate rates (us-east-1, Linux/x86):
  • vCPU: $0.04048 per vCPU-hour
  • Memory: $0.004445 per GB-hour

Example: 10 tasks, each 1 vCPU + 2 GB, running 24/7:

CPU:    10 tasks × 1 vCPU × 24h × $0.04048  = $9.72/day
Memory: 10 tasks × 2 GB   × 24h × $0.004445 = $2.13/day
Total baseline: ~$11.85/day = ~$355/month

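Cost scales linearly with task count and task size. There are no idle EC2 instances to pay for, although a task is billed for its full configured size even when its actual utilization is low.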
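
The arithmetic above is easy to turn into a small calculator for comparing baseline and peak cost. This is a sketch using the approximate rates quoted in this section; the constant and function names are illustrative, and current pricing should be checked before budgeting.

```python
VCPU_HOUR_USD = 0.04048   # approximate us-east-1 rate quoted above
GB_HOUR_USD = 0.004445    # approximate us-east-1 rate quoted above


def fargate_cost(tasks: int, vcpu: float, memory_gb: float,
                 hours: float = 24.0) -> float:
    """Cost in USD for `tasks` identical tasks running for `hours`."""
    per_task_hour = vcpu * VCPU_HOUR_USD + memory_gb * GB_HOUR_USD
    return tasks * hours * per_task_hour


baseline_per_day = fargate_cost(10, 1, 2)        # 10 tasks, all day
peak_hour = fargate_cost(80, 1, 2, hours=1.0)    # 80 tasks for one hour
```

Here `baseline_per_day` comes to about $11.85, matching the worked example, and a one-hour peak at 80 tasks adds roughly $3.95.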


Key Operational Patterns

Blue/Green Deployments

ECS integrates with CodeDeploy for zero-downtime deployments:
  1. New tasks start with the new image (green)
  2. Traffic shifts from old (blue) to green gradually
  3. Old tasks drain connections and terminate
  4. Automatic rollback if health checks fail
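
The gradual shift in step 2 can be pictured as a schedule of traffic percentages over time. The Python sketch below generates a linear schedule, similar in spirit to CodeDeploy's predefined linear configurations (for example, 10 percent every minute). It is an illustration of the idea only; CodeDeploy does the actual shifting at the load balancer, and the function name is an assumption.

```python
def linear_shift_schedule(step_percent: int, interval_minutes: int) -> list:
    """Return (minute, percent_of_traffic_on_green) pairs for a linear shift."""
    if step_percent <= 0 or interval_minutes <= 0:
        raise ValueError("step_percent and interval_minutes must be positive")
    schedule = []
    shifted, minute = 0, 0
    while shifted < 100:
        shifted = min(100, shifted + step_percent)  # never exceed 100%
        schedule.append((minute, shifted))
        minute += interval_minutes
    return schedule
```

A smaller step keeps more traffic on blue for longer, which limits the blast radius if the green tasks start failing health checks and a rollback is triggered.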

Health Checks

ALB health check:
  Path: /health
  Interval: 30s
  Healthy threshold: 2
  Unhealthy threshold: 3

Container health check:
  Command: ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"]
  Interval: 30s
  Retries: 3

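Both checks gate a task: the ALB routes traffic only to targets that pass its health check, and ECS stops and replaces a task whose essential container fails the container health check.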
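
The healthy and unhealthy thresholds work on consecutive results, which is easiest to see as a small state machine. This Python sketch models the ALB settings above (2 consecutive passes to become healthy, 3 consecutive failures to become unhealthy). The class name is illustrative, and the model ignores timeouts and the 30-second interval.

```python
class TargetHealth:
    """Track health state from a stream of check results."""

    def __init__(self, healthy_threshold: int = 2,
                 unhealthy_threshold: int = 3, healthy: bool = False):
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy = healthy
        self._passes = 0    # consecutive successful checks
        self._failures = 0  # consecutive failed checks

    def record(self, check_passed: bool) -> bool:
        """Record one check result and return the resulting health state."""
        if check_passed:
            self._passes += 1
            self._failures = 0
            if self._passes >= self.healthy_threshold:
                self.healthy = True
        else:
            self._failures += 1
            self._passes = 0
            if self._failures >= self.unhealthy_threshold:
                self.healthy = False
        return self.healthy
```

With a 30-second interval, a new task needs about a minute of passing checks before it receives traffic, and a failing task keeps receiving traffic for up to about 90 seconds before it is taken out of rotation.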

IAM Roles for Tasks

Each task gets a Task Role — an IAM role with least-privilege permissions.

manga-orchestrator-task-role:
  Allow: dynamodb:GetItem, PutItem (manga_chatbot_memory table only)
  Allow: bedrock:InvokeModel (Claude 3.5 Sonnet only)
  Allow: sagemaker:InvokeEndpoint (intent-classifier endpoint only)
  Allow: elasticache:Connect (manga-cache cluster only)
  Deny: everything else

The AWS SDK inside the container picks up these credentials automatically from the ECS container credentials endpoint, so no keys are hardcoded or baked into the image.
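
The role summary above could be expressed as two IAM policy documents: a trust policy that lets ECS tasks assume the role, and a permissions policy scoped to specific resources. The Python sketch below builds both as dictionaries. The account ID, region, and Bedrock model ID are caller-supplied placeholders, the function name is illustrative, and the ElastiCache statement is left out because its resource ARNs depend on how the cache and its users are set up.

```python
import json


def task_role_policies(account_id: str, region: str,
                       bedrock_model_id: str) -> tuple:
    """Return (trust_policy, permissions_policy) for the orchestrator task role."""
    trust = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            # Service principal that lets ECS tasks assume the role.
            "Principal": {"Service": "ecs-tasks.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }
    permissions = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["dynamodb:GetItem", "dynamodb:PutItem"],
                "Resource": f"arn:aws:dynamodb:{region}:{account_id}:table/manga_chatbot_memory",
            },
            {
                "Effect": "Allow",
                "Action": "bedrock:InvokeModel",
                # Foundation-model ARNs have an empty account field.
                "Resource": f"arn:aws:bedrock:{region}::foundation-model/{bedrock_model_id}",
            },
            {
                "Effect": "Allow",
                "Action": "sagemaker:InvokeEndpoint",
                "Resource": f"arn:aws:sagemaker:{region}:{account_id}:endpoint/intent-classifier",
            },
        ],
    }
    return trust, permissions


if __name__ == "__main__":
    trust_doc, perm_doc = task_role_policies("123456789012", "us-east-1", "example-model-id")
    print(json.dumps(perm_doc, indent=2))
```

No explicit Deny statement is needed for "everything else": IAM denies any action that is not explicitly allowed.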


Summary

| Concept | One-Line Summary |
|---|---|
| ECS | AWS container orchestration service |
| Fargate | Serverless compute engine for ECS (no EC2 to manage) |
| Task Definition | Blueprint: what image, how much CPU/memory |
| Task | One running instance of a task definition (one or more containers) |
| Service | Keeps N tasks running, handles restarts and scaling |
| Cluster | Logical grouping of tasks and services |
| awsvpc | Each task gets its own IP in your VPC |
| Auto-scaling | Adds/removes tasks based on CPU, memory, or custom metrics |