Running Experiments
Configuration
Experiments are configured via YAML files in services/training/configs/. The config schema is defined in src/ddtrain/config.py.
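The actual schema lives in src/ddtrain/config.py; as a rough sketch (hypothetical field names mirroring the `training:` section of the YAML below, not the real dataclass), it might look like this:

```python
# Hypothetical sketch of the training-section schema; the real dataclass
# lives in src/ddtrain/config.py and may differ.
from dataclasses import dataclass


@dataclass
class TrainingConfig:
    batch_size: int = 32
    initial_epochs: int = 3     # Phase 1: frozen backbone
    finetune_epochs: int = 15   # Phase 2: full fine-tuning
    base_lr: float = 1e-3
    finetune_lr: float = 5e-4
    optimizer: str = "rmsprop"  # rmsprop | adam | adamw
    seed: int = 17
    num_workers: int = 4

    def __post_init__(self) -> None:
        if self.optimizer not in {"rmsprop", "adam", "adamw"}:
            raise ValueError(f"unknown optimizer: {self.optimizer}")


def training_config_from_dict(raw: dict) -> TrainingConfig:
    """Build from the parsed `training:` section of an experiment YAML."""
    return TrainingConfig(**raw)
```

Unset keys fall back to the defaults shown, and an unsupported optimizer fails fast at load time rather than mid-run.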
```yaml
# configs/gen2a_port.yaml
name: gen2a_port

dataset:
  version: v1
  local_path: ~/.cache/dermadetect/v1

model:
  type: gen2a            # gen2a | image_only
  backbone: resnet50
  pretrained: true
  image_size: 224
  cnn_fc_size: 256       # Image branch output dim
  mlp_hidden_size: 256   # Metadata branch output dim
  final_fc_size: 64      # Dim after concatenation
  dropout: 0.2
  mlp_dropout: 0.4
  num_classes: 26
  multilabel: true

training:
  batch_size: 32
  initial_epochs: 3      # Phase 1: frozen backbone
  finetune_epochs: 15    # Phase 2: full fine-tuning
  base_lr: 0.001         # Phase 1 learning rate
  finetune_lr: 0.0005    # Phase 2 learning rate
  optimizer: rmsprop     # rmsprop | adam | adamw
  seed: 17
  num_workers: 4

wandb:
  enabled: true          # Set false or WANDB_MODE=disabled to skip
  project: dermadetect   # W&B project name
  entity: null           # W&B team/user; null = default account
  tags: [gen2a, port]    # Filterable tags in the W&B dashboard
  notes: null            # Free-text description for this run

output_dir: outputs
```

Choosing Disease Classes
The full dataset has 1,151 unique diagnoses, but most are rare. For training, you’ll want to select a subset. The legacy system used sets of 12-30 classes.
Check the top diagnoses in stats.json:
```shell
uv run python -c "
import json, os
with open(os.path.expanduser('~/.cache/dermadetect/v1/stats.json')) as f:
    stats = json.load(f)
for diag, count in sorted(stats['diagnosis_distribution'].items(), key=lambda x: -x[1])[:30]:
    print(f'{count:>6} {diag}')
"
```

Note that `open()` does not expand `~` on its own, so the path is passed through `os.path.expanduser` first. Create a label file with one disease per line and pass it to training, or define `label_list` in your training script.
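For example, a label file for the most common diagnoses could be generated like this (a sketch; the stats.json path and the `diagnosis_distribution` key follow the snippet above, and `write_label_file` is a hypothetical helper, not part of ddtrain):

```python
# Sketch: write the top-N diagnoses from stats.json to a label file,
# one disease per line. `write_label_file` is a hypothetical helper.
import json
from pathlib import Path


def write_label_file(stats_path: str, out_path: str, top_n: int = 20) -> list[str]:
    stats = json.loads(Path(stats_path).expanduser().read_text())
    dist = stats["diagnosis_distribution"]
    # Most frequent diagnoses first, truncated to top_n classes.
    labels = [diag for diag, _ in sorted(dist.items(), key=lambda x: -x[1])[:top_n]]
    Path(out_path).write_text("\n".join(labels) + "\n")
    return labels
```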
Experiment Tracking with W&B
Weights & Biases is enabled by default. Each training run:
- Logs per-epoch metrics: `train/loss`, `train/accuracy`, `val/loss`, `val/accuracy`
- Logs training metadata: `phase` (frozen/finetune), `lr`, `epoch_time_s`
- Tracks gradient histograms via `wandb.watch()`
- Saves the best model + config as a versioned W&B artifact
- Records `best_val_loss` and `best_epoch` in the run summary
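The per-epoch logging above amounts to something like the following (a sketch, not the trainer's actual code; `log_fn` would be `wandb.log` in a real run, and is injected here so the payload shape is easy to inspect):

```python
# Sketch of the per-epoch logging described above. In the real trainer
# `log_fn` would be wandb.log; metric keys match the list above.
def log_epoch(log_fn, epoch: int, phase: str, lr: float,
              metrics: dict, epoch_time_s: float) -> dict:
    payload = {
        "train/loss": metrics["train_loss"],
        "train/accuracy": metrics["train_acc"],
        "val/loss": metrics["val_loss"],
        "val/accuracy": metrics["val_acc"],
        "phase": phase,              # "frozen" or "finetune"
        "lr": lr,
        "epoch_time_s": epoch_time_s,
    }
    log_fn(payload, step=epoch)      # e.g. wandb.log(payload, step=epoch)
    return payload
```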
Setup:
```shell
# One-time login
wandb login

# Or set the API key as an env var
export WANDB_API_KEY=your_key_here
```

Disabling W&B (e.g., for quick local tests):
```shell
# Via env var (no config change needed)
WANDB_MODE=disabled uv run python -m ddtrain.training.trainer --config configs/gen2a_port.yaml
```

```yaml
# Or in the config YAML
wandb:
  enabled: false
```

Output Structure
Each experiment writes to outputs/{name}/:
```
outputs/gen2a_port/
  best_model.pt   # Best checkpoint (by val loss)
  last_model.pt   # Final epoch checkpoint
  config.json     # Full config used for this run
  tb_logs/        # TensorBoard event files
```

Feature Encoding
The MetadataEncoder converts raw manifest values to fixed-size tensors:
- Continuous (4): MinMax scaled to [0, 1], null → 0.0
- Categorical (varies): One-hot, null → all zeros
- Boolean 3-state (15 × 3 = 45): one-hot over `[true, false, unknown]`
- Multi-select (varies): multi-hot vector
Total feature dimension depends on the schema (~120+ for the full dataset). Check with:
```python
from ddtrain.datasets.features import MetadataEncoder

encoder = MetadataEncoder("~/.cache/dermadetect/v1/feature_schema.json")
print(f"Feature dim: {encoder.dim}")
print(f"Feature names: {encoder.feature_names()[:10]}...")
```

Adding New Architectures
- Create a new model in `src/ddtrain/models/` that extends `DermaDetectModel`
- Implement `forward(images, metadata) -> logits` and optionally `freeze_backbone()` / `unfreeze_backbone()`
- Add a config YAML in `configs/`
- The training loop works with any `DermaDetectModel`; no changes to the trainer are needed

Example: to try EfficientNet instead of ResNet50, create `models/efficientnet.py` and a `configs/efficientnet_b0.yaml` with `model.type: efficientnet`.
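For instance, a hypothetical configs/efficientnet_b0.yaml (every value here is illustrative, not a tested configuration) would reuse the same schema as the gen2a config above:

```yaml
# configs/efficientnet_b0.yaml (hypothetical example, values illustrative)
name: efficientnet_b0
model:
  type: efficientnet        # must match the new model's registered type
  backbone: efficientnet_b0
  pretrained: true
  image_size: 224
  num_classes: 26
  multilabel: true
```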