imbalanced-losses

imbalanced-losses is a PyTorch library of training losses for class-imbalanced classification and ranking-metric optimization. It provides Focal Loss for reweighting, plus differentiable surrogates for ranking and operating-point metrics: Smooth Average Precision (Smooth-AP), Recall-at-Quantile, and Partial-AUC-at-Budget. All come with built-in DDP all-gather support for globally correct estimation across multi-GPU training. Class imbalance is the design center (the memory queue and DDP gather exist to keep estimates stable at low positive rates), but the ranking losses also apply to ranking and operating-point objectives more broadly.

When to use it

Use imbalanced-losses when:

Your dataset has significant class imbalance (e.g. fraud detection, rare event classification, object detection with many background anchors)
Standard cross-entropy or BCE loss produces degenerate models that predict the majority class
You are optimizing for ranking-based metrics (Average Precision, Recall at a fixed operating point) rather than accuracy
You are running distributed multi-GPU training and need losses that are globally correct across all workers
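
To make the second point concrete, here is a minimal sketch in plain PyTorch showing why accuracy is a misleading target under imbalance. The toy data (10,000 samples, 1% positives) and the always-negative "model" are assumptions for illustration and do not use this library.

```python
import torch

# Hypothetical toy data: 10,000 samples, of which 1% are positive.
n, n_pos = 10_000, 100
targets = torch.zeros(n)
targets[:n_pos] = 1.0

# A degenerate "model" that always predicts the majority (negative) class.
preds = torch.zeros(n)

# Accuracy looks excellent because 99% of the samples are negative.
accuracy = (preds == targets).float().mean().item()

# Recall on the positive class exposes the failure: no positive is found.
recall = ((preds == 1) & (targets == 1)).sum().item() / n_pos

print(f"accuracy={accuracy:.3f} recall={recall:.3f}")
```

The model is right 99% of the time and still finds none of the positives, which is why the losses in this library target ranking and operating-point metrics instead.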

Installation

Requires Python ≥ 3.10 and PyTorch ≥ 2.8.

pip install imbalanced-losses

For development or contributing:

git clone https://github.com/chris-santiago/imbalanced-losses.git
cd imbalanced-losses
uv sync

Losses at a glance

Loss	Use case
`SigmoidFocalLoss`	Binary / multi-label (sigmoid per logit, classes are independent); drop-in for `BCEWithLogitsLoss`
`SoftmaxFocalLoss`	Mutually-exclusive multiclass (softmax couples all class logits); drop-in for `CrossEntropyLoss`
`SmoothAPLoss`	Directly optimizes Average Precision
`RecallAtQuantileLoss`	Optimizes recall within a fixed alert fraction (top q% of scores)
`PAUCAtBudgetLoss`	Optimizes partial AUC over an FPR band around a target operating point
`LossWarmupWrapper`	Warmup on CE/BCE, then blend/anneal into a ranking loss
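
As a rough illustration of what the focal losses in the table compute, the sketch below implements the standard sigmoid focal loss formula in plain PyTorch. It is not this library's implementation: the function name `sigmoid_focal_loss_sketch` and the defaults `alpha=0.25`, `gamma=2.0` are assumptions taken from the commonly used formulation.

```python
import torch
import torch.nn.functional as F


def sigmoid_focal_loss_sketch(logits, targets, alpha=0.25, gamma=2.0):
    """Illustrative sigmoid focal loss (hypothetical helper, not the library API)."""
    # Per-element binary cross-entropy, computed stably from the logits.
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    # p_t is the predicted probability of the true class for each element.
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)
    # Down-weight easy examples (p_t close to 1) by the factor (1 - p_t)^gamma.
    loss = (1 - p_t) ** gamma * bce
    if alpha is not None:
        # Optional class-balancing weight: alpha for positives, 1 - alpha for negatives.
        alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
        loss = alpha_t * loss
    return loss.mean()


logits = torch.tensor([2.0, -1.0, 0.5])
targets = torch.tensor([1.0, 0.0, 0.0])
focal = sigmoid_focal_loss_sketch(logits, targets)
bce = F.binary_cross_entropy_with_logits(logits, targets)
```

With `gamma=0` and `alpha=None` the sketch reduces to plain BCE-with-logits, which is the sense in which a focal loss can stand in for `BCEWithLogitsLoss`.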

Which loss should I use?

Your situation	Recommended loss
Binary / multi-label, mild-to-moderate imbalance	`SigmoidFocalLoss`
Binary, extreme imbalance (< 1% positives)	`SmoothAPLoss`
Mutually-exclusive multiclass, mild-to-moderate imbalance	`SoftmaxFocalLoss`
Multiclass with tail classes below ~1–2% of data	`SmoothAPLoss`
Maximize recall when only a fixed alert fraction (top q% of scores) can be reviewed	`RecallAtQuantileLoss`
Optimize recall at a fixed false-alarm budget / operating point (e.g. fraud at 50 bps)	`PAUCAtBudgetLoss`
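
To show what "directly optimizing Average Precision" can look like, here is a minimal single-batch sketch of a Smooth-AP style surrogate in plain PyTorch. It replaces the hard ranking comparison with a sigmoid of temperature `tau`. The function name `smooth_ap_loss_sketch` and the default `tau=0.01` are assumptions, and the sketch leaves out the memory queue and DDP gather that the library describes.

```python
import torch


def smooth_ap_loss_sketch(scores, targets, tau=0.01):
    """Illustrative Smooth-AP surrogate for one batch (hypothetical helper)."""
    pos = targets.bool()
    # diff[i, j] = s_j - s_i, so sigmoid(diff / tau) softly indicates "j outranks i".
    diff = scores.unsqueeze(0) - scores.unsqueeze(1)
    soft_gt = torch.sigmoid(diff / tau)
    # An item does not outrank itself, so zero the diagonal.
    eye = torch.eye(scores.numel(), device=scores.device, dtype=scores.dtype)
    soft_gt = soft_gt * (1 - eye)
    # Soft rank among all items, and soft rank among positives only.
    rank_all = 1 + soft_gt.sum(dim=1)
    rank_pos = 1 + (soft_gt * pos.to(scores.dtype).unsqueeze(0)).sum(dim=1)
    # AP is the mean over positives of (rank among positives / rank among all).
    ap = (rank_pos[pos] / rank_all[pos]).mean()
    return 1 - ap


scores = torch.tensor([3.0, 2.0, 1.0, 0.0], requires_grad=True)
good = smooth_ap_loss_sketch(scores, torch.tensor([1.0, 1.0, 0.0, 0.0]))
bad = smooth_ap_loss_sketch(scores, torch.tensor([0.0, 0.0, 1.0, 1.0]))
bad.backward()  # the surrogate is differentiable with respect to the scores
```

When every positive outranks every negative the loss is close to 0, and it grows as positives fall down the ranking. A smaller `tau` tracks true AP more closely but gives weaker gradients.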

For a deeper breakdown, see "Why Imbalanced Losses?". Before deploying, read "Assumptions and Failure Modes".

Documentation sections

Tutorials — hands-on walkthroughs that take you from zero to a working training loop
How-To Guides — goal-oriented recipes for common tasks
Reference — full API documentation for every public class and function
Explanation — background on design decisions, trade-offs, and non-obvious behavior