Welcome to Tabular SSL Documentation

Tabular SSL is a modular library for self-supervised learning on tabular data, with implementations of the VIME, SCARF, and ReConTab corruption strategies. This documentation follows the Diátaxis framework, which organizes material into tutorials, how-to guides, reference, and explanation.

🚀 Quick Start

New to Tabular SSL? Try our interactive demos to see the library in action:

# Demo corruption strategies (VIME, SCARF, ReConTab)
python demo_corruption_strategies.py

# Demo with real credit card transaction data
python demo_credit_card_data.py

# Train with state-of-the-art SSL methods
python train.py +experiment=vime_ssl     # VIME approach
python train.py +experiment=scarf_ssl    # SCARF approach  
python train.py +experiment=recontab_ssl # ReConTab approach

🎭 New: Corruption Strategies

We've implemented corruption strategies from leading tabular SSL papers:

  • 🎯 VIME - Value imputation and mask estimation (NeurIPS 2020)
  • 🌟 SCARF - Contrastive learning with feature corruption (arXiv 2021)
  • πŸ”§ ReConTab - Multi-task reconstruction-based learning
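
As a rough illustration of what two of these strategies do to a batch, the sketch below shows VIME-style masking (each cell is masked with a fixed probability and replaced by a value drawn from the same column, with the mask kept as a training target) and SCARF-style corruption (a fixed number of features per row is resampled from the column values). The names `vime_corrupt`, `scarf_corrupt`, and `sample_from_column` are assumptions made for this example, not the library's API, and plain lists stand in for tensors to keep it dependency-free.

```python
import random


def sample_from_column(rows, col, rng):
    """Draw one value from the empirical distribution of a column."""
    return rows[rng.randrange(len(rows))][col]


def vime_corrupt(rows, corruption_rate, rng):
    """VIME-style sketch: Bernoulli mask per cell, masked cells resampled.

    Returns (corrupted_rows, mask). The mask can serve as the target for
    mask estimation, the original rows as the target for value imputation.
    """
    corrupted, mask = [], []
    for row in rows:
        row_mask = [1 if rng.random() < corruption_rate else 0 for _ in row]
        corrupted.append([
            sample_from_column(rows, j, rng) if m else value
            for j, (value, m) in enumerate(zip(row, row_mask))
        ])
        mask.append(row_mask)
    return corrupted, mask


def scarf_corrupt(rows, corruption_rate, rng):
    """SCARF-style sketch: resample floor(rate * n_features) features per row."""
    n_features = len(rows[0])
    n_corrupt = int(corruption_rate * n_features)
    corrupted = []
    for row in rows:
        chosen = set(rng.sample(range(n_features), n_corrupt))
        corrupted.append([
            sample_from_column(rows, j, rng) if j in chosen else value
            for j, value in enumerate(row)
        ])
    return corrupted


rng = random.Random(0)  # fixed seed so the example is repeatable
batch = [[0.1, 5.0, 1.0], [0.4, 3.0, 0.0], [0.9, 8.0, 1.0], [0.2, 1.0, 0.0]]
vime_view, vime_mask = vime_corrupt(batch, corruption_rate=0.5, rng=rng)
scarf_view = scarf_corrupt(batch, corruption_rate=0.6, rng=rng)
```

Both functions leave the input batch untouched and only ever insert values that already occur in the same column, so corrupted rows stay within each feature's observed range.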

🏦 Sample Data

Get started immediately with credit card transaction data from the IBM TabFormer project; no data preparation is needed.

Documentation Structure

📚 Tutorials

Learning-oriented guides for newcomers

Step-by-step lessons to help you learn Tabular SSL fundamentals. Start here if you're new to the library.

🛠️ How-to Guides

Problem-oriented solutions for specific tasks

Practical guides for accomplishing specific goals and solving real problems.

📖 Reference

Information-oriented technical documentation

Complete and accurate technical reference for the library's components and APIs.

💡 Explanation

Understanding-oriented discussions of key topics

Background information and conceptual explanations to help you understand the library's design and principles.

Quick Examples

🎭 Demo Scripts

# Interactive corruption strategies demo
python demo_corruption_strategies.py

# Real data demo with credit card transactions
python demo_credit_card_data.py

🧪 SSL Experiments

# VIME: Value imputation + mask estimation
python train.py +experiment=vime_ssl

# SCARF: Contrastive learning with feature corruption
python train.py +experiment=scarf_ssl

# ReConTab: Multi-task reconstruction
python train.py +experiment=recontab_ssl
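
To make the SCARF comment above more concrete: a contrastive objective pulls the embedding of each row toward the embedding of its corrupted view and pushes it away from the other rows in the batch. The sketch below is a minimal InfoNCE-style loss in plain Python. The names `info_nce`, `cosine`, and `temperature` are illustrative assumptions, not the library's API, and the shipped experiment may use a different loss variant.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.5):
    """Mean InfoNCE loss; positives[i] is the corrupted view of anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over every candidate in the batch.
        max_logit = max(logits)
        log_denominator = max_logit + math.log(
            sum(math.exp(value - max_logit) for value in logits)
        )
        # Negative log-probability of picking the matching view.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # each view resembles its own anchor
swapped = [[0.1, 0.9], [0.9, 0.1]]   # each view resembles the other anchor
loss_aligned = info_nce(anchors, aligned)
loss_swapped = info_nce(anchors, swapped)
```

The loss is lower when each corrupted view sits closest to its own anchor, which is the behavior the encoder is trained toward.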

🔧 Custom Configuration

# Mix and match components
python train.py model/sequence_encoder=rnn model/event_encoder=mlp

# Use different corruption strategies
python train.py model/corruption=vime model.corruption.corruption_rate=0.5

# Adjust hyperparameters
python train.py model.learning_rate=1e-3 data.batch_size=64
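
The dotted overrides above address nested configuration keys: `model.learning_rate=1e-3` sets `learning_rate` inside the `model` node. The sketch below mimics that mapping on a plain dictionary to show the idea. It is a simplified illustration rather than Hydra's own parser, and `apply_override`, `parse_value`, and the starting values are assumptions made for the example. Group selections written with a slash, such as `model/corruption=vime`, swap in a whole config file and are not covered here.

```python
import ast


def parse_value(text):
    """Interpret Python literals such as numbers; fall back to the raw string."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def apply_override(config, override):
    """Apply one 'a.b.c=value' override to a nested dict in place."""
    path, _, raw_value = override.partition("=")
    *parents, leaf = path.split(".")
    node = config
    for key in parents:
        # Walk down the tree, creating intermediate nodes when missing.
        node = node.setdefault(key, {})
    node[leaf] = parse_value(raw_value)
    return config


# Hypothetical defaults, not the values shipped with the library.
config = {"model": {"learning_rate": 1e-4}, "data": {"batch_size": 32}}
for item in ["model.learning_rate=1e-3", "data.batch_size=64"]:
    apply_override(config, item)
```

After the loop, `config["model"]["learning_rate"]` holds the float `0.001` and `config["data"]["batch_size"]` holds the integer `64`.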

Installation

git clone https://github.com/yourusername/tabular-ssl.git
cd tabular-ssl
pip install -e .
export PYTHONPATH="$PWD/src"

Key Features

  • 🎭 State-of-the-Art Corruption Strategies - VIME, SCARF, and ReConTab implementations
  • 🏦 Ready-to-Use Sample Data - IBM TabFormer credit card transaction dataset
  • 🧩 Modular Architecture - Mix and match components for custom models
  • βš™οΈ Hydra Configuration - Flexible, hierarchical configuration management
  • πŸ§ͺ Pre-configured SSL Experiments - VIME, SCARF, and ReConTab ready to run
  • 🎬 Interactive Demos - See corruption strategies in action
  • πŸ“Š PyTorch Lightning - Robust training and evaluation framework

Contributing

We welcome contributions! Please see our Contributing Guide for details on how to get involved.

License

This project is licensed under the MIT License - see the LICENSE file for details.