
Docker Deployment Guide

This guide covers building and deploying the AI Service using Docker, both locally and on AWS Fargate.

Overview

The AI Service uses a flexible Docker setup that supports:

  • CPU-only inference (AWS Fargate, production)
  • GPU-accelerated inference (local development, EC2 GPU instances)
  • Official TensorFlow base images
  • Multi-stage builds for optimization
  • Non-root user for security

Quick Start

Local Development (CPU)

```shell
# From repository root
cd services/ai_service

# Build
docker build -t ai-service:latest .

# Run (mount your models directory)
docker run -p 8080:8080 \
  -v $(pwd)/../../models:/models:ro \
  -e MODEL_DIRECTORY=/models \
  -e DEFAULT_MODEL=gen2a \
  -e USE_GPU=false \
  ai-service:latest

# Test
curl http://localhost:8080/health
```
Docker Compose

```shell
# From repository root
mkdir -p models

# CPU-only mode (matches AWS Fargate)
docker compose up -d

# GPU-enabled mode (requires NVIDIA Container Toolkit)
docker compose -f docker-compose.yml -f docker-compose.gpu.yml up -d

# View logs
docker compose logs -f ai_service

# Stop
docker compose down
```

Building Images

CPU-Only (AWS Fargate Compatible)

```shell
docker build -t ai-service:cpu \
  --build-arg TENSORFLOW_IMAGE=tensorflow/tensorflow:2.18.0 \
  services/ai_service/
```

This is the default build and the recommended configuration for production deployment on AWS Fargate.

Base Image: tensorflow/tensorflow:2.18.0

  • Official TensorFlow CPU-only image
  • Smaller image size (~1.5GB vs ~6GB for GPU)
  • Works on any Docker host (no GPU required)
  • Compatible with AWS Fargate

GPU-Enabled (Local Development)

```shell
docker build -t ai-service:gpu \
  --build-arg TENSORFLOW_IMAGE=tensorflow/tensorflow:2.18.0-gpu \
  services/ai_service/
```

Base Image: tensorflow/tensorflow:2.18.0-gpu

  • Official TensorFlow GPU image with CUDA support
  • Requires NVIDIA GPU + NVIDIA Container Toolkit
  • Larger image size (~6GB)
  • For local development or EC2 GPU instances
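Before running the service itself, it can be useful to confirm that the NVIDIA Container Toolkit and driver are working. The check below is just TensorFlow's standard device query run in the stock GPU base image, not part of this service:

```shell
# Should print a non-empty list of PhysicalDevice entries if the GPU is visible
docker run --rm --gpus all tensorflow/tensorflow:2.18.0-gpu \
  python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
```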

AWS Fargate Deployment

Task Definition Configuration

```json
{
  "family": "ai-service",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "2048",
  "memory": "4096",
  "containerDefinitions": [
    {
      "name": "ai-service",
      "image": "<ECR_REPO_URL>/ai-service:latest",
      "portMappings": [
        { "containerPort": 8080, "protocol": "tcp" }
      ],
      "environment": [
        { "name": "MODEL_DIRECTORY", "value": "/models" },
        { "name": "DEFAULT_MODEL", "value": "gen2a" },
        { "name": "USE_GPU", "value": "false" },
        { "name": "LOG_LEVEL", "value": "INFO" }
      ],
      "mountPoints": [
        { "sourceVolume": "models", "containerPath": "/models", "readOnly": true }
      ],
      "healthCheck": {
        "command": ["CMD-SHELL", "python -c \"import urllib.request; urllib.request.urlopen('http://localhost:8080/health').read()\" || exit 1"],
        "interval": 30,
        "timeout": 10,
        "retries": 3,
        "startPeriod": 90
      },
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/ai-service",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      }
    }
  ],
  "volumes": [
    {
      "name": "models",
      "efsVolumeConfiguration": {
        "fileSystemId": "<EFS_FILE_SYSTEM_ID>",
        "rootDirectory": "/models",
        "transitEncryption": "ENABLED"
      }
    }
  ]
}
```
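Assuming the JSON above is saved locally as `task-definition.json` (the file name is arbitrary), it can be registered with the AWS CLI:

```shell
aws ecs register-task-definition \
  --cli-input-json file://task-definition.json \
  --region us-east-1
```

Each registration creates a new revision; point the ECS service at the latest revision when deploying.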

ECR Push Workflow

```shell
# Authenticate to ECR
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin <AWS_ACCOUNT_ID>.dkr.ecr.us-east-1.amazonaws.com

# Build for production
docker build -t ai-service:latest \
  --build-arg TENSORFLOW_IMAGE=tensorflow/tensorflow:2.18.0 \
  --platform linux/amd64 \
  services/ai_service/

# Tag
docker tag ai-service:latest <ECR_REPO_URL>/ai-service:latest
docker tag ai-service:latest <ECR_REPO_URL>/ai-service:$(git rev-parse --short HEAD)

# Push
docker push <ECR_REPO_URL>/ai-service:latest
docker push <ECR_REPO_URL>/ai-service:$(git rev-parse --short HEAD)
```
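The workflow above assumes the ECR repository already exists. If it does not, it can be created once (the repository name `ai-service` is assumed to match the image name used throughout this guide):

```shell
aws ecr create-repository \
  --repository-name ai-service \
  --image-scanning-configuration scanOnPush=true \
  --region us-east-1
```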

Model Management

Development (Volume Mount)

```shell
# Mount local models directory
docker run -v $(pwd)/models:/models:ro ai-service:latest
```

Directory Structure:

```
models/
ā”œā”€ā”€ gen2a/
│   ā”œā”€ā”€ model/
│   │   └── best_model.h5
│   ā”œā”€ā”€ mlb.pkl
│   ā”œā”€ā”€ model_params.pkl
│   ā”œā”€ā”€ continuous_mean.pkl
│   ā”œā”€ā”€ scaler.pkl
│   └── raw_list_of_field.pkl
└── gen2i/
    └── ...
```

Production (AWS EFS)

  1. Create EFS File System:

     ```shell
     aws efs create-file-system \
       --performance-mode generalPurpose \
       --throughput-mode bursting \
       --encrypted \
       --tags Key=Name,Value=ai-service-models
     ```

  2. Upload Models to EFS:

     ```shell
     # From an EC2 instance with the EFS file system mounted
     aws s3 sync s3://your-models-bucket/models/ /mnt/efs/models/
     ```

  3. Configure Task Definition: see the EFS volume configuration in the task definition above.
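Fargate tasks reach EFS only through mount targets, so one is needed in each subnet the service runs in. A sketch, with placeholder IDs (the security group must allow inbound NFS, TCP 2049, from the task's security group):

```shell
aws efs create-mount-target \
  --file-system-id <EFS_FILE_SYSTEM_ID> \
  --subnet-id <SUBNET_ID> \
  --security-groups <SECURITY_GROUP_ID>
```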

Alternative: S3 Sync on Startup

Create a custom entrypoint script:

```shell
#!/bin/bash
# entrypoint.sh

# Sync models from S3
if [ -n "$S3_MODELS_BUCKET" ]; then
  echo "Syncing models from S3..."
  aws s3 sync s3://${S3_MODELS_BUCKET}/models/ /models/
fi

# Start application
exec uvicorn src.main:app --host 0.0.0.0 --port 8080
```

Then update Dockerfile:

```dockerfile
COPY entrypoint.sh /app/
RUN chmod +x /app/entrypoint.sh
CMD ["/app/entrypoint.sh"]
```
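A quick local check of the sync path might look like the following. The bucket name and credentials are placeholders, and this assumes the image includes the AWS CLI (the stock image may not):

```shell
docker run --rm -p 8080:8080 \
  -e S3_MODELS_BUCKET=your-models-bucket \
  -e AWS_ACCESS_KEY_ID=<ACCESS_KEY> \
  -e AWS_SECRET_ACCESS_KEY=<SECRET_KEY> \
  ai-service:latest
```

On ECS, prefer granting `s3:GetObject`/`s3:ListBucket` via the task role rather than injecting long-lived keys.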

GPU Development (Local)

See the GPU Setup Guide for detailed instructions on setting up GPU support for local development.

Quick Start

```shell
# Install NVIDIA Container Toolkit (one-time setup)
# See GPU Setup Guide for detailed instructions

# Run with GPU
docker run --gpus all -p 8080:8080 \
  -v $(pwd)/models:/models:ro \
  -e USE_GPU=true \
  ai-service:gpu

# Or with Docker Compose
docker compose -f docker-compose.yml -f docker-compose.gpu.yml up -d
```

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| MODEL_DIRECTORY | /models | Path to model files |
| DEFAULT_MODEL | gen2a | Model subdirectory name |
| USE_GPU | true | Enable GPU inference |
| LOG_LEVEL | INFO | Logging level (DEBUG, INFO, WARNING, ERROR) |
| API_PREFIX | /api/v1 | API route prefix |
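For local runs, these can be collected into a single env file instead of repeating `-e` flags. A minimal sketch (the file name is arbitrary):

```shell
# Collect the service's runtime configuration in one env file
cat > ai-service.env <<'EOF'
MODEL_DIRECTORY=/models
DEFAULT_MODEL=gen2a
USE_GPU=false
LOG_LEVEL=INFO
EOF
```

Then run the container with `docker run --env-file ai-service.env -p 8080:8080 ai-service:latest`.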

Troubleshooting

Container Starts but Health Check Fails

```shell
# Check logs
docker compose logs ai_service

# Common issues:

# 1. Models not mounted correctly
docker compose exec ai_service ls -la /models

# 2. Model loading failure (check for all required pickle files)
docker compose exec ai_service ls -la /models/gen2a/

# 3. Service not listening on correct port
docker compose exec ai_service netstat -tlnp
```

Large Image Size

The GPU image is large (~6GB) due to CUDA dependencies. For CPU-only:

```shell
# Build CPU-only image (smaller)
docker build --build-arg TENSORFLOW_IMAGE=tensorflow/tensorflow:2.18.0 -t ai-service:cpu .

# Check size
docker images ai-service
```

Slow Startup on Fargate

Loading the TensorFlow models can take 60-90 seconds. To accommodate this:

  • Health check startPeriod: 90 (allows time for model loading)
  • Use EFS with provisioned throughput for faster model loading
  • Consider model caching or warm containers

Out of Memory

Fargate minimum requirements:

  • CPU: 2 vCPU (2048)
  • Memory: 4 GB (4096)

Adjust based on model size and inference batch size.
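Before settling on task sizes, the container's real memory footprint can be sampled locally under representative load. This uses the compose service name from earlier examples; the actual container name may carry a compose project prefix, which the `ps -q` lookup sidesteps:

```shell
docker stats --no-stream $(docker compose ps -q ai_service)
```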

Performance Tuning

CPU Optimization

```dockerfile
# Set TensorFlow thread count
ENV OMP_NUM_THREADS=4
ENV TF_NUM_INTRAOP_THREADS=4
ENV TF_NUM_INTEROP_THREADS=2
```
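The same knobs can also be set per-run without rebuilding the image; the values below are the same illustrative defaults, to be tuned against the task's vCPU count:

```shell
docker run -p 8080:8080 \
  -e OMP_NUM_THREADS=4 \
  -e TF_NUM_INTRAOP_THREADS=4 \
  -e TF_NUM_INTEROP_THREADS=2 \
  ai-service:latest
```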

Memory Limits

```yaml
# docker-compose.yml
services:
  ai_service:
    deploy:
      resources:
        limits:
          memory: 4G
```

Security Considerations

  1. Non-root User: Container runs as appuser (UID 1000)
  2. Read-only Models: Models mounted as :ro (read-only)
  3. No Secrets in Image: Use environment variables or AWS Secrets Manager
  4. Minimal Base Image: Official TensorFlow images are security-scanned
  5. Health Checks: Ensures container is responsive
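Points 1 and 2 can be spot-checked against a built image. This assumes the image tag from the Quick Start and that the Dockerfile does not define an ENTRYPOINT (otherwise add `--entrypoint` to override it):

```shell
# Should print uid=1000(appuser), not uid=0(root)
docker run --rm ai-service:latest id

# Writes to a :ro mount should fail with "Read-only file system"
docker run --rm -v $(pwd)/models:/models:ro ai-service:latest touch /models/test
```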

CI/CD Integration

Example GitHub Actions workflow:

```yaml
name: Build and Push AI Service

on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v1

      - name: Build and push
        env:
          ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
          ECR_REPOSITORY: ai-service
          IMAGE_TAG: ${{ github.sha }}
        run: |
          docker build -t $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG \
            --build-arg TENSORFLOW_IMAGE=tensorflow/tensorflow:2.18.0 \
            --platform linux/amd64 \
            services/ai_service/
          docker push $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG
          docker tag $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG $ECR_REGISTRY/$ECR_REPOSITORY:latest
          docker push $ECR_REGISTRY/$ECR_REPOSITORY:latest
```