
Setup & Configuration

Prerequisites

  • Python 3.13+
  • uv (Python package manager)
  • TensorFlow-compatible system
  • GPU (optional, for faster inference)

Installation

```shell
cd services/ai_service
uv sync
```

Environment Variables

Required

  • MODEL_DIRECTORY: Path to the model files

Optional

  • USE_GPU: Enable GPU inference (default: true)
  • LOG_LEVEL: Logging level (default: info)
  • PORT: Service port (default: 8080)

Example .env

```shell
MODEL_DIRECTORY=/models
USE_GPU=true
LOG_LEVEL=info
PORT=8080
```
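At startup the service might resolve these variables along the lines of the following sketch; the `load_settings` function and its return shape are illustrative, not the service's actual code:

```python
import os

def load_settings() -> dict:
    """Read the service configuration from the environment,
    applying the documented defaults for optional variables."""
    model_directory = os.environ.get("MODEL_DIRECTORY")
    if model_directory is None:
        # MODEL_DIRECTORY is the only required variable.
        raise RuntimeError("MODEL_DIRECTORY must be set")
    return {
        "model_directory": model_directory,
        "use_gpu": os.environ.get("USE_GPU", "true").lower() == "true",
        "log_level": os.environ.get("LOG_LEVEL", "info"),
        "port": int(os.environ.get("PORT", "8080")),
    }
```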

Model Directory Structure

Models should be stored in the directory specified by MODEL_DIRECTORY:

```
models/
└── <model_name>/
    ├── model/
    │   └── best_model.h5
    ├── mlb.pkl                 # Disease labels
    ├── model_params.pkl        # Model configuration
    ├── continuous_mean.pkl     # Metadata field averages
    ├── scaler.pkl              # MinMax scaler
    └── raw_list_of_field.pkl   # Metadata field list
```

Running the Service

Development

```shell
uv run uvicorn src.main:app --reload --port 8080
```

Production

```shell
uv run uvicorn src.main:app --host 0.0.0.0 --port 8080 --workers 4
```

With GPU

Ensure CUDA is installed and configured:

```shell
# Check GPU availability
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

# Run service
USE_GPU=true uv run uvicorn src.main:app --port 8080
```

CPU-Only Mode

Force CPU inference even if GPU is available:

```shell
USE_GPU=false uv run uvicorn src.main:app --port 8080
```
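One common way to implement such a switch is to hide all CUDA devices from TensorFlow via `CUDA_VISIBLE_DEVICES` before the framework initializes; whether this service uses exactly this mechanism is an assumption, and the function below is a sketch:

```python
import os

def configure_device_visibility(use_gpu: bool) -> None:
    """Force CPU-only inference by hiding CUDA devices from TensorFlow.

    Must run before `import tensorflow`, because TensorFlow reads
    CUDA_VISIBLE_DEVICES once, during its own initialization.
    """
    if not use_gpu:
        os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
```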

Testing

```shell
# Run tests
uv run pytest

# Run with coverage
uv run pytest --cov=src --cov-report=html

# Test specific module
uv run pytest src/core/models/predictor_test.py
```

Deployment

Docker

Build and run with Docker:

```shell
docker build -t dermadetect-ai-service .
docker run -p 8080:8080 \
  -v /path/to/models:/models \
  -e MODEL_DIRECTORY=/models \
  dermadetect-ai-service
```

Kubernetes

Deploy to Kubernetes:

```shell
kubectl apply -f k8s/ai-service-deployment.yaml
kubectl apply -f k8s/ai-service-service.yaml
```

Scale replicas for higher throughput:

```shell
kubectl scale deployment ai-service --replicas=5
```

Health Check

Verify the service is running:

```shell
curl http://localhost:8080/health
```

Expected response:

```json
{
  "status": "healthy",
  "gpu_available": true,
  "models_loaded": true
}
```
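A handler producing this body could be as small as the sketch below. The function name and the choice to report `"unhealthy"` when models are not loaded are assumptions for illustration, not the service's actual logic:

```python
def health_payload(gpu_available: bool, models_loaded: bool) -> dict:
    """Assemble the /health response body with the documented fields.

    Reporting "unhealthy" when models failed to load is an assumption
    made for this sketch.
    """
    return {
        "status": "healthy" if models_loaded else "unhealthy",
        "gpu_available": gpu_available,
        "models_loaded": models_loaded,
    }
```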