Overview
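Deep Sets (Zaheer et al., NeurIPS 2017) is a neural architecture for learning functions on sets (inputs with no inherent order). Its core result: under the paper's conditions (for example, elements drawn from a countable universe), a set function f is permutation-invariant if and only if it can be written as f(X) = ρ(Σ φ(x)), with the sum taken over the elements x of X. In practice, one network φ transforms each element independently, the results are summed, and a second network ρ maps the pooled vector to the output. Because summation is commutative, shuffling the input leaves the output unchanged, up to floating-point rounding.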
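The decomposition ρ(Σ φ(x)) can be sketched in a few lines of plain Python. This is an illustrative sketch, not the repository's code: the functions phi, rho, and deep_sets are hypothetical stand-ins with fixed formulas, whereas a real model would use trained networks.

```python
import math
import random

def phi(x):
    """Per-element encoder (hypothetical): maps a scalar to a 3-d feature vector."""
    return [x, x * x, math.tanh(x)]

def rho(z):
    """Decoder (hypothetical): maps the pooled feature vector to a scalar."""
    return 0.5 * z[0] - 0.25 * z[1] + 2.0 * z[2]

def deep_sets(xs):
    """Compute rho(sum of phi(x) over all x in xs)."""
    pooled = [0.0, 0.0, 0.0]
    for x in xs:
        for i, feature in enumerate(phi(x)):
            pooled[i] += feature  # sum pooling: element order does not matter
    return rho(pooled)

elements = [0.3, -1.2, 2.5, 0.7]
shuffled = elements[:]
random.Random(0).shuffle(shuffled)

# The two outputs agree up to floating-point rounding.
assert math.isclose(deep_sets(elements), deep_sets(shuffled),
                    rel_tol=1e-9, abs_tol=1e-12)
```

The comparison uses a tolerance rather than exact equality because floating-point addition is not exactly associative, so a different summation order can change the last few bits.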
What it implements
The repository covers every architectural variant from the paper: invariant models (set to vector), equivariant models and layers (set to set, preserving permutation symmetry), and context-conditioned models with multiple fusion strategies. Variable-size sets are supported throughout via masking. Sum, max, and mean pooling are all available, and each handles the mask so that padded elements do not affect the result.
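A minimal sketch of mask-aware pooling over a padded set follows. The function name masked_pool and its signature are assumptions made for illustration and are not taken from the repository. It treats each row of features as one set element and uses a 0/1 mask to mark padding.

```python
import math

def masked_pool(features, mask, mode="sum"):
    """Pool rows of `features` over the set axis, ignoring rows where mask == 0.

    Hypothetical helper for illustration only.
    """
    dim = len(features[0])
    if mode == "sum":
        # Padded rows are multiplied by 0 and contribute nothing.
        return [sum(m * row[i] for row, m in zip(features, mask))
                for i in range(dim)]
    if mode == "mean":
        # Divide by the number of valid rows, not the padded length.
        count = max(sum(mask), 1)  # guard against an all-padding set
        return [sum(m * row[i] for row, m in zip(features, mask)) / count
                for i in range(dim)]
    if mode == "max":
        # Padded rows are replaced by -inf so they can never win the max.
        return [max(row[i] if m else -math.inf
                    for row, m in zip(features, mask))
                for i in range(dim)]
    raise ValueError(f"unknown pooling mode: {mode}")

# Two real elements followed by one zero-padded row.
features = [[1.0, -2.0], [3.0, -4.0], [0.0, 0.0]]
mask = [1, 1, 0]

print(masked_pool(features, mask, "sum"))   # [4.0, -6.0]
print(masked_pool(features, mask, "mean"))  # [2.0, -3.0]
print(masked_pool(features, mask, "max"))   # [3.0, -2.0]
```

Note that the mean divides by the count of valid elements (2), not by the padded length (3); dividing by the padded length would shrink the result as more padding is added.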
Notes
What makes this more than a toy port is the verification: a dedicated test suite checks structural compliance with the paper’s theorems, and the README numerically demonstrates that shuffling inputs leaves outputs unchanged. The documentation explains the non-obvious design choices, including why masked max-pooling fills padded positions with negative infinity rather than zero, and how the architecture’s cost, which is linear in set size, compares with the quadratic cost of attention-based alternatives.
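The choice of fill value for masked max-pooling can be illustrated with a small example. This is a sketch under assumed names (masked_max is hypothetical, not the repository's API). When every valid value is negative, a zero fill lets a padding slot win the max, while a negative-infinity fill cannot.

```python
import math

def masked_max(values, mask, fill):
    """Max over a padded list, with padded slots replaced by `fill`.

    Hypothetical helper for illustration only.
    """
    return max(v if m else fill for v, m in zip(values, mask))

# Two valid (negative) values followed by two padding slots.
values = [-3.0, -1.5, 0.0, 0.0]
mask = [1, 1, 0, 0]

wrong = masked_max(values, mask, fill=0.0)        # 0.0, a padding value
right = masked_max(values, mask, fill=-math.inf)  # -1.5, the true maximum

print(wrong, right)
```

One caveat of this sketch: if the mask is all zeros, the negative-infinity fill returns -inf, so callers would need to handle empty sets separately.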