ML Models

Model Architecture

Diagnosis Model

Type: Multi-label classification

Framework: TensorFlow 2.x

Architecture:

  • Input layer: Patient metadata + image features
  • Hidden layers: Dense neural network
  • Output layer: Sigmoid activations producing independent per-disease probabilities (one per label, consistent with multi-label classification)

Training Data:

  • 50,000+ annotated dermatological cases
  • Expert physician diagnoses
  • Multi-ethnic dataset

Performance:

  • Top-1 accuracy: 85%
  • Top-3 accuracy: 95%
  • Average inference time: ~250ms
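The top-1 and top-3 figures above follow the standard top-K accuracy definition: a prediction counts as correct if the true label appears among the K highest-probability classes. A minimal sketch (function name and toy data are illustrative, not part of the service):

```python
def top_k_accuracy(probabilities, true_labels, k):
    """Fraction of cases where the true label index is among the k
    highest-probability predictions."""
    hits = 0
    for probs, true in zip(probabilities, true_labels):
        # Indices of the k largest probabilities, highest first.
        top_k = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
        hits += true in top_k
    return hits / len(true_labels)

# Toy example: two cases, three disease classes.
probs = [
    [0.7, 0.2, 0.1],  # top guess is class 0; truth is 0 -> top-1 hit
    [0.1, 0.3, 0.6],  # top guess is class 2; truth is 1 -> top-1 miss, top-2 hit
]
```

With this toy data, top-1 accuracy is 0.5 and top-2 accuracy is 1.0, which mirrors how top-3 can exceed top-1 in the metrics above.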

Model Files

Each model requires the following files:

best_model.h5

Trained Keras model (architecture plus weights) serialized by TensorFlow in HDF5 format.

Location: {MODEL_DIRECTORY}/{model_name}/model/best_model.h5

Size: ~50MB

Format: HDF5

mlb.pkl

MultiLabelBinarizer for disease labels.

Contains:

  • List of all possible diseases
  • Label encoding mapping
  • Disease taxonomy

Example:

['Psoriasis', 'Eczema', 'Melanoma', 'Basal Cell Carcinoma', ...]
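The pickled object is a scikit-learn `MultiLabelBinarizer`. A stdlib-only sketch of the behavior it provides (the class below reimplements the relevant parts for illustration; it is not the object stored in `mlb.pkl`):

```python
class SimpleMultiLabelBinarizer:
    """Stdlib-only stand-in for sklearn's MultiLabelBinarizer."""

    def fit(self, label_sets):
        # Sorted, deduplicated vocabulary of every disease seen in training.
        self.classes_ = sorted({label for labels in label_sets for label in labels})
        self._index = {label: i for i, label in enumerate(self.classes_)}
        return self

    def transform(self, label_sets):
        # One binary indicator row per case, one column per disease.
        rows = []
        for labels in label_sets:
            row = [0] * len(self.classes_)
            for label in labels:
                row[self._index[label]] = 1
            rows.append(row)
        return rows

    def inverse_transform(self, rows):
        # Map binary rows back to disease-name tuples.
        return [
            tuple(c for c, flag in zip(self.classes_, row) if flag)
            for row in rows
        ]

mlb = SimpleMultiLabelBinarizer().fit([["Psoriasis", "Eczema"], ["Melanoma"]])
```

`inverse_transform` is what turns the model's thresholded sigmoid outputs back into disease names.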

model_params.pkl

Model configuration and hyperparameters.

Contains:

  • Input feature list
  • Model version
  • Training timestamp
  • Performance metrics

continuous_mean.pkl

Mean values for continuous features (imputation).

Used for:

  • Handling missing values
  • Feature normalization
  • Consistent preprocessing
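Mean imputation at inference time reduces to a lookup in the saved means. A minimal sketch (the function name and feature names are illustrative):

```python
def impute_continuous(record, continuous_means):
    """Replace missing (None) continuous features with their training-set means,
    so inference-time preprocessing matches training-time preprocessing."""
    return {
        name: continuous_means[name] if record.get(name) is None else record[name]
        for name in continuous_means
    }

# Illustrative means, standing in for the contents of continuous_mean.pkl.
means = {"age": 42.0, "symptom_duration_days": 14.0}
```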

scaler.pkl

MinMaxScaler for feature normalization.

Range: [0, 1]

Applied to:

  • Age
  • Symptom duration
  • Symptom severity
  • Image quality scores
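The pickled object is a scikit-learn `MinMaxScaler`; per feature, the transformation it applies reduces to the formula below (the bounds in the example are illustrative, not the values stored in `scaler.pkl`):

```python
def minmax_scale(value, feature_min, feature_max):
    """Map a raw feature value into [0, 1] using the min/max
    observed for that feature at training time."""
    return (value - feature_min) / (feature_max - feature_min)

# E.g. an age of 40 with an assumed training range of [0, 100] scales to 0.4.
scaled_age = minmax_scale(40, 0, 100)
```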

raw_list_of_field.pkl

Complete list of input features.

Contains:

  • Metadata field names
  • Feature order
  • Data types

Model Versioning

Version Format

v{major}.{minor}.{patch}

Example: v2.1.0

Version History

v2.1.0 (Current)

  • Added red flags detection
  • Improved multi-ethnic performance
  • 5% accuracy improvement

v2.0.0

  • Multi-label classification
  • Image feature integration
  • Complete retraining on expanded dataset

v1.5.0

  • Initial production model
  • Single-label classification
  • Metadata-only features

Model Loading

Models are loaded at service startup:

```python
import tensorflow as tf


class ModelPredictor:
    def __init__(self, model_directory: str):
        self.model = self._load_model(model_directory)
        self.mlb = self._load_pickle("mlb.pkl")
        self.scaler = self._load_pickle("scaler.pkl")
        # ... load other artifacts

    def _load_model(self, directory: str):
        model_path = f"{directory}/model/best_model.h5"
        return tf.keras.models.load_model(model_path)
```

Startup Time: ~5 seconds (CPU) / ~2 seconds (GPU)

Memory Usage: ~500MB

Model Updates

Deployment Process

  1. Train new model with expanded dataset
  2. Evaluate performance on test set
  3. Package model files (all .pkl + .h5)
  4. Upload to model storage (Azure/GCS)
  5. Update MODEL_DIRECTORY environment variable
  6. Restart AI service to load new model
  7. Validate predictions against baseline
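Step 7's baseline validation can be sketched as a per-case comparison of the two models' probability vectors on the same inputs (the function name and the 0.10 threshold are illustrative assumptions):

```python
def validate_against_baseline(new_probs, baseline_probs, max_shift=0.10):
    """Run both models on the same validation inputs and return the indices
    of cases whose per-disease probabilities shifted by more than max_shift.
    A large flagged set warrants manual review before the rollout proceeds."""
    flagged = []
    for i, (new, old) in enumerate(zip(new_probs, baseline_probs)):
        if max(abs(n - o) for n, o in zip(new, old)) > max_shift:
            flagged.append(i)
    return flagged
```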

A/B Testing

Deploy multiple model versions for comparison:

```yaml
# Kubernetes ConfigMap
models:
  v2_1_0:
    path: /models/v2.1.0
    weight: 0.9  # 90% of traffic
  v2_2_0_beta:
    path: /models/v2.2.0-beta
    weight: 0.1  # 10% of traffic
```
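Routing traffic by these weights amounts to a weighted random choice per request. A minimal sketch (the `pick_model` helper and the `rng` hook are illustrative, not part of the service):

```python
import random


def pick_model(weights, rng=random.random):
    """Choose a model name with probability proportional to its traffic weight.

    weights maps model name -> fraction of traffic; fractions sum to 1.0.
    rng is injectable so routing can be made deterministic in tests.
    """
    r = rng()
    cumulative = 0.0
    for name, weight in weights.items():
        cumulative += weight
        if r < cumulative:
            return name
    return name  # guard against floating-point rounding at the top end

weights = {"v2_1_0": 0.9, "v2_2_0_beta": 0.1}
```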

Rollback

Revert to previous model version:

```bash
# Update environment variable
export MODEL_DIRECTORY=/models/v2.0.0

# Restart service
kubectl rollout restart deployment/ai-service
```

Model Monitoring

Metrics to Track

Accuracy Metrics:

  • Prediction accuracy
  • Confidence distribution
  • Top-K accuracy

Performance Metrics:

  • Inference latency (p50, p95, p99)
  • Throughput (requests/second)
  • GPU utilization
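The p50/p95/p99 latency figures above are percentiles over a window of observed inference times; a stdlib-only sketch using the nearest-rank method (function name and toy data are illustrative):

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample that is >= p% of the data."""
    ordered = sorted(samples)
    rank = max(0, math.ceil(p / 100 * len(ordered)) - 1)
    return ordered[rank]

# Toy data: 100 latency samples of 1..100 ms.
latencies_ms = list(range(1, 101))
```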

Data Quality:

  • Feature distribution drift
  • Missing value rates
  • Out-of-range values

Logging

Each prediction is logged with:

  • Input features
  • Output probabilities
  • Model version
  • Inference time
  • Request ID (for tracing)
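A structured (JSON) log line covering those fields can be sketched as follows (the function name and field names are illustrative, not the service's actual log schema):

```python
import json
import uuid


def prediction_log_entry(features, probabilities, model_version, inference_ms,
                         request_id=None):
    """Serialize one prediction as a JSON log line; a request ID is
    generated when the caller does not supply one, to support tracing."""
    entry = {
        "request_id": request_id or str(uuid.uuid4()),
        "model_version": model_version,
        "inference_ms": inference_ms,
        "features": features,
        "probabilities": probabilities,
    }
    return json.dumps(entry)
```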

Alerts

Set up alerts for:

  • Accuracy drops below threshold
  • Latency exceeds SLA
  • High error rates
  • Memory/GPU issues

Model Retraining

Retraining Cadence

  • Quarterly: Regular retraining with new data
  • Ad-hoc: When accuracy degrades or new diseases emerge

Retraining Pipeline

  1. Data Collection: Aggregate new physician-labeled cases
  2. Data Validation: Check for quality and consistency
  3. Feature Engineering: Extract metadata and image features
  4. Model Training: Train on combined old + new data
  5. Evaluation: Test on held-out validation set
  6. Deployment: Package and deploy new model

Continuous Learning

Future enhancement: Online learning with physician feedback

  • Collect physician corrections
  • Retrain model incrementally
  • Improve accuracy over time