Running Experiments
Configuration
Experiments are configured via YAML files in services/training/configs/. The config schema is defined in src/ddtrain/config.py.
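The actual schema lives in src/ddtrain/config.py; as a rough sketch (hypothetical field names mirroring the `training:` section of the YAML below, not the real dataclass), it might look like this:

```python
# Hypothetical sketch of the training-section schema; the real dataclass
# lives in src/ddtrain/config.py and may differ.
from dataclasses import dataclass


@dataclass
class TrainingConfig:
    batch_size: int = 32
    initial_epochs: int = 3     # Phase 1: frozen backbone
    finetune_epochs: int = 15   # Phase 2: full fine-tuning
    base_lr: float = 1e-3
    finetune_lr: float = 5e-4
    optimizer: str = "rmsprop"  # rmsprop | adam | adamw
    seed: int = 17
    num_workers: int = 4

    def __post_init__(self) -> None:
        if self.optimizer not in {"rmsprop", "adam", "adamw"}:
            raise ValueError(f"unknown optimizer: {self.optimizer}")


def training_config_from_dict(raw: dict) -> TrainingConfig:
    """Build from the parsed `training:` section of an experiment YAML."""
    return TrainingConfig(**raw)
```

Unset keys fall back to the defaults shown, and an unsupported optimizer fails fast at load time rather than mid-run.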
```yaml
# configs/gen2a_port.yaml
name: gen2a_port

dataset:
  version: v1
  local_path: ~/.cache/dermadetect/v1

model:
  type: gen2a            # gen2a | image_only
  backbone: resnet50
  pretrained: true
  image_size: 224
  cnn_fc_size: 256       # Image branch output dim
  mlp_hidden_size: 256   # Metadata branch output dim
  final_fc_size: 64      # Dim after concatenation
  dropout: 0.2
  mlp_dropout: 0.4
  num_classes: 26
  multilabel: true

training:
  batch_size: 32
  initial_epochs: 3      # Phase 1: frozen backbone
  finetune_epochs: 15    # Phase 2: full fine-tuning
  base_lr: 0.001         # Phase 1 learning rate
  finetune_lr: 0.0005    # Phase 2 learning rate
  optimizer: rmsprop     # rmsprop | adam | adamw
  seed: 17
  num_workers: 4

wandb:
  enabled: true          # Set false or WANDB_MODE=disabled to skip
  project: dermadetect   # W&B project name
  entity: null           # W&B team/user; null = default account
  tags: [gen2a, port]    # Filterable tags in the W&B dashboard
  notes: null            # Free-text description for this run

output_dir: outputs
```

Choosing Disease Classes
The full dataset has 1,151 unique diagnoses, but most are rare. For training, you’ll want to select a subset. The legacy system used sets of 12-30 classes.
Check the top diagnoses in stats.json:
```shell
uv run python -c "
import json, os
with open(os.path.expanduser('~/.cache/dermadetect/v1/stats.json')) as f:
    stats = json.load(f)
for diag, count in sorted(stats['diagnosis_distribution'].items(), key=lambda x: -x[1])[:30]:
    print(f'{count:>6} {diag}')
"
```

Note that `open()` does not expand `~` on its own, so the path is passed through `os.path.expanduser` first. Create a label file with one disease per line and pass it to training, or define `label_list` in your training script.
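For example, a label file for the most common diagnoses could be generated like this (a sketch; the stats.json path and the `diagnosis_distribution` key follow the snippet above, and `write_label_file` is a hypothetical helper, not part of ddtrain):

```python
# Sketch: write the top-N diagnoses from stats.json to a label file,
# one disease per line. `write_label_file` is a hypothetical helper.
import json
from pathlib import Path


def write_label_file(stats_path: str, out_path: str, top_n: int = 20) -> list[str]:
    stats = json.loads(Path(stats_path).expanduser().read_text())
    dist = stats["diagnosis_distribution"]
    # Most frequent diagnoses first, truncated to top_n classes.
    labels = [diag for diag, _ in sorted(dist.items(), key=lambda x: -x[1])[:top_n]]
    Path(out_path).write_text("\n".join(labels) + "\n")
    return labels
```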
Experiment Tracking with W&B
Weights & Biases is enabled by default. Each training run:
- Logs per-epoch metrics: `train/loss`, `train/accuracy`, `val/loss`, `val/accuracy`
- Logs training metadata: `phase` (frozen/finetune), `lr`, `epoch_time_s`
- Tracks gradient histograms via `wandb.watch()`
- Saves the best model + config as a versioned W&B artifact
- Records `best_val_loss` and `best_epoch` in the run summary
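The per-epoch logging above amounts to something like the following (a sketch, not the trainer's actual code; `log_fn` would be `wandb.log` in a real run, and is injected here so the payload shape is easy to inspect):

```python
# Sketch of the per-epoch logging described above. In the real trainer
# `log_fn` would be wandb.log; metric keys match the list above.
def log_epoch(log_fn, epoch: int, phase: str, lr: float,
              metrics: dict, epoch_time_s: float) -> dict:
    payload = {
        "train/loss": metrics["train_loss"],
        "train/accuracy": metrics["train_acc"],
        "val/loss": metrics["val_loss"],
        "val/accuracy": metrics["val_acc"],
        "phase": phase,              # "frozen" or "finetune"
        "lr": lr,
        "epoch_time_s": epoch_time_s,
    }
    log_fn(payload, step=epoch)      # e.g. wandb.log(payload, step=epoch)
    return payload
```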
Setup:
```shell
# One-time login
wandb login

# Or set the API key as an env var
export WANDB_API_KEY=your_key_here
```

Disabling W&B (e.g., for quick local tests):
```shell
# Via env var (no config change needed)
WANDB_MODE=disabled uv run python -m ddtrain.training.trainer --config configs/gen2a_port.yaml
```

```yaml
# Or in the config YAML
wandb:
  enabled: false
```

Output Structure
Each experiment writes to outputs/{name}/:
```
outputs/gen2a_port/
  best_model.pt   # Best checkpoint (by val loss)
  last_model.pt   # Final epoch checkpoint
  config.json     # Full config used for this run
  tb_logs/        # TensorBoard event files
```

Feature Encoding
The MetadataEncoder converts raw manifest values to fixed-size tensors:
- Continuous (4): MinMax scaled to [0, 1], null → 0.0
- Categorical (varies): One-hot, null → all zeros
- Boolean 3-state (15 × 3 = 45): one-hot over `[true, false, unknown]`
- Multi-select (varies): multi-hot vector
Total feature dimension depends on the schema (~120+ for the full dataset). Check with:
```python
from ddtrain.datasets.features import MetadataEncoder

encoder = MetadataEncoder("~/.cache/dermadetect/v1/feature_schema.json")
print(f"Feature dim: {encoder.dim}")
print(f"Feature names: {encoder.feature_names()[:10]}...")
```

Adding New Architectures
- Create a new model in `src/ddtrain/models/` that extends `DermaDetectModel`
- Implement `forward(images, metadata) -> logits` and optionally `freeze_backbone()` / `unfreeze_backbone()`
- Add a config YAML in `configs/`
- The training loop works with any `DermaDetectModel`; no changes to the trainer are needed

Example: to try EfficientNet instead of ResNet50, create `models/efficientnet.py` and a `configs/efficientnet_b0.yaml` with `model.type: efficientnet`.
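For instance, a hypothetical configs/efficientnet_b0.yaml (every value here is illustrative, not a tested configuration) would reuse the same schema as the gen2a config above:

```yaml
# configs/efficientnet_b0.yaml (hypothetical example, values illustrative)
name: efficientnet_b0
model:
  type: efficientnet        # must match the new model's registered type
  backbone: efficientnet_b0
  pretrained: true
  image_size: 224
  num_classes: 26
  multilabel: true
```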