Welcome to Tabular SSL Documentation
Welcome to the documentation for Tabular SSL, a modular library for self-supervised learning on tabular data with state-of-the-art corruption strategies. The documentation follows the Diátaxis framework, which organizes material into tutorials, how-to guides, reference, and explanation.
Quick Start
New to Tabular SSL? Try our interactive demos to see the library in action:
# Demo corruption strategies (VIME, SCARF, ReConTab)
python demo_corruption_strategies.py
# Demo with real credit card transaction data
python demo_credit_card_data.py
# Train with state-of-the-art SSL methods
python train.py +experiment=vime_ssl # VIME approach
python train.py +experiment=scarf_ssl # SCARF approach
python train.py +experiment=recontab_ssl # ReConTab approach
New: Corruption Strategies
We've implemented corruption strategies from leading tabular SSL papers:
- VIME - Value imputation and mask estimation (NeurIPS 2020)
- SCARF - Contrastive learning with feature corruption (arXiv 2021)
- ReConTab - Multi-task reconstruction-based learning
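To make the idea of a corruption strategy concrete, here is a minimal NumPy sketch of VIME-style masking. It is an illustration under stated assumptions, not the library's implementation: the function name `vime_corrupt` and its signature are hypothetical, and only the general recipe (mask random entries, refill them from each feature's empirical marginal, then recompute which entries really changed) follows the VIME paper.

```python
import numpy as np


def vime_corrupt(x, corruption_rate=0.3, rng=None):
    """Hypothetical helper: VIME-style corruption of a 2-D feature matrix.

    Returns the corrupted matrix and a boolean mask of entries that changed.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x)
    # Bernoulli mask: True marks an entry selected for corruption.
    mask = rng.random(x.shape) < corruption_rate
    # Shuffle each column independently so that replacement values follow
    # the empirical marginal distribution of that feature.
    shuffled = np.column_stack(
        [rng.permutation(x[:, j]) for j in range(x.shape[1])]
    )
    corrupted = np.where(mask, shuffled, x)
    # A replacement can coincide with the original value, so the target for
    # mask estimation is recomputed from what actually changed.
    changed = corrupted != x
    return corrupted, changed


rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))
x_tilde, m = vime_corrupt(x, corruption_rate=0.5, rng=rng)
print("fraction of entries changed:", m.mean())
```

In a VIME-style setup, a model would then be trained to recover `x` from `x_tilde` (value imputation) and to predict `m` (mask estimation).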
Sample Data
Get started immediately with real transaction data from the IBM TabFormer project - no data preparation needed!
Documentation Structure
Tutorials
Learning-oriented guides for newcomers
Step-by-step lessons to help you learn Tabular SSL fundamentals. Start here if you're new to the library.
- Getting Started - Your first steps with Tabular SSL
- Basic Usage - Core concepts and workflows
- Custom Components - Creating your own components
How-to Guides
Problem-oriented solutions for specific tasks
Practical guides for accomplishing specific goals and solving real problems.
- Data Preparation - Prepare your datasets
- Model Training - Train models effectively
- Evaluation - Evaluate model performance
- Configuring Experiments - Set up experiments
Reference
Information-oriented technical documentation
Complete and accurate technical reference for the library's components and APIs.
- API Reference - Complete API documentation
- Models - Available model components
- Corruption Strategies - VIME, SCARF, and ReConTab implementations
- Data - Data handling utilities
- Configuration - Configuration system
- Utilities - Helper functions
Explanation
Understanding-oriented discussions of key topics
Background information and conceptual explanations to help you understand the library's design and principles.
- Architecture Overview - System design and principles
- SSL Methods - Self-supervised learning approaches
- Performance - Optimization and best practices
Quick Examples
Demo Scripts
# Interactive corruption strategies demo
python demo_corruption_strategies.py
# Real data demo with credit card transactions
python demo_credit_card_data.py
SSL Experiments
# VIME: Value imputation + mask estimation
python train.py +experiment=vime_ssl
# SCARF: Contrastive learning with feature corruption
python train.py +experiment=scarf_ssl
# ReConTab: Multi-task reconstruction
python train.py +experiment=recontab_ssl
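For intuition about what a SCARF-style experiment optimizes, the following NumPy sketch pairs feature corruption with an InfoNCE contrastive loss. It is a simplified illustration, not the code that `train.py` runs: `scarf_corrupt` and `info_nce` are hypothetical names, and a random linear projection stands in for the encoder.

```python
import numpy as np


def scarf_corrupt(x, corruption_rate=0.6, rng=None):
    """Hypothetical helper: corrupt a fixed number of features in every row."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x)
    n, d = x.shape
    q = int(corruption_rate * d)  # features corrupted per row
    # Sorting random noise yields an independent random permutation of the
    # feature indices for each row; the first q entries are corrupted.
    order = np.argsort(rng.random((n, d)), axis=1)
    mask = np.zeros((n, d), dtype=bool)
    np.put_along_axis(mask, order[:, :q], True, axis=1)
    # Replacement values are drawn from each feature's empirical marginal
    # by copying that feature from a randomly chosen donor row.
    donor_rows = rng.integers(0, n, size=(n, d))
    replacements = x[donor_rows, np.arange(d)]
    return np.where(mask, replacements, x), mask


def info_nce(z_a, z_b, temperature=1.0):
    """InfoNCE loss where row i of z_a and row i of z_b form a positive pair."""
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))


rng = np.random.default_rng(0)
x = rng.normal(size=(16, 8))
x_corrupt, mask = scarf_corrupt(x, corruption_rate=0.5, rng=rng)
w = rng.normal(size=(8, 4))  # stand-in for a learned encoder
print("contrastive loss:", info_nce(x @ w, x_corrupt @ w, temperature=0.5))
```

Training would adjust the encoder so that each row's original and corrupted views are closer to each other than to the views of other rows.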
Custom Configuration
# Mix and match components
python train.py model/sequence_encoder=rnn model/event_encoder=mlp
# Use different corruption strategies
python train.py model/corruption=vime model.corruption.corruption_rate=0.5
# Adjust hyperparameters
python train.py model.learning_rate=1e-3 data.batch_size=64
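The dotted overrides above address keys in a nested configuration. As a rough mental model only, the sketch below applies `key.path=value` strings to a plain dictionary. Hydra's real override grammar is considerably richer (config groups, `+` and `~` prefixes, interpolation), and `apply_override` is a hypothetical helper written for this illustration, not part of Hydra or of Tabular SSL.

```python
import ast


def apply_override(config, override):
    """Hypothetical helper: set a nested key from a 'a.b.c=value' string."""
    key, _, raw = override.partition("=")
    try:
        # Interpret numbers, booleans, and quoted strings as Python literals.
        value = ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        # Anything else (for example a bare word such as rnn) stays a string.
        value = raw
    *parents, leaf = key.split(".")
    node = config
    for name in parents:
        # Create intermediate dictionaries on demand.
        node = node.setdefault(name, {})
    node[leaf] = value
    return config


config = {}
for item in ["model.learning_rate=1e-3", "data.batch_size=64"]:
    apply_override(config, item)
print(config)
```

Each dot in the key descends one level, which is why `model.learning_rate` and `data.batch_size` end up under separate top-level sections.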
Installation
git clone https://github.com/yourusername/tabular-ssl.git
cd tabular-ssl
pip install -e .
export PYTHONPATH=$PWD/src
Key Features
- State-of-the-Art Corruption Strategies - VIME, SCARF, and ReConTab implementations
- Ready-to-Use Sample Data - IBM TabFormer credit card transaction dataset
- Modular Architecture - Mix and match components for custom models
- Hydra Configuration - Flexible, hierarchical configuration management
- Pre-configured SSL Experiments - VIME, SCARF, and ReConTab ready to run
- Interactive Demos - See corruption strategies in action
- PyTorch Lightning - Robust training and evaluation framework
Contributing
We welcome contributions! Please see our Contributing Guide for details on how to get involved.
Support
- GitHub Issues - Bug reports and feature requests
- Discussions - Questions and community support
- Contact - Direct support
License
This project is licensed under the MIT License - see the LICENSE file for details.