Model Evaluation Guide

This guide covers how to evaluate and interpret your Tabular SSL models.

Basic Evaluation

Computing Metrics

from tabular_ssl.utils import evaluate_model

# Evaluate model performance
metrics = evaluate_model(
    model,
    test_data,
    metrics=['accuracy', 'f1', 'precision', 'recall']
)
print(metrics)
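
If you want to sanity-check the reported numbers, the same metrics can be computed directly with scikit-learn. This is a minimal sketch, assuming the model exposes a scikit-learn-style predict() and that the test set is available as a feature matrix X_test and label vector y_test (neither of which is part of the evaluate_model call above):

from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Assumption: model.predict(X_test) returns class labels
y_pred = model.predict(X_test)

reference = {
    'accuracy': accuracy_score(y_test, y_pred),
    'f1': f1_score(y_test, y_pred, average='weighted'),
    'precision': precision_score(y_test, y_pred, average='weighted'),
    'recall': recall_score(y_test, y_pred, average='weighted'),
}
print(reference)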

Cross-Validation

from tabular_ssl.utils import cross_validate

# Perform k-fold cross-validation
cv_results = cross_validate(
    model,
    data,
    n_splits=5,
    metrics=['accuracy', 'f1']
)
print(cv_results)
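
Fold-level scores are easier to read as a mean and spread. A small sketch, assuming cv_results maps each metric name to a list of per-fold scores (the actual return type of cross_validate may differ):

import numpy as np

# Assumption: cv_results looks like {'accuracy': [0.91, 0.89, ...], ...}
for metric, fold_scores in cv_results.items():
    scores = np.asarray(fold_scores)
    print(f"{metric}: {scores.mean():.3f} +/- {scores.std():.3f}")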

Advanced Evaluation

Custom Metrics

from tabular_ssl.utils import CustomMetric

# Define a custom metric: any callable that takes (y_true, y_pred)
# and returns a scalar score
def custom_metric(y_true, y_pred):
    # Example implementation: fraction of exact matches
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Evaluate with custom metric
metrics = evaluate_model(
    model,
    test_data,
    metrics=['accuracy', CustomMetric(custom_metric)]
)

Model Comparison

from tabular_ssl.utils import compare_models

# Compare multiple models
comparison = compare_models(
    models=[model1, model2, model3],
    test_data=test_data,
    metrics=['accuracy', 'f1']
)
print(comparison)
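
For reporting, a tabular view is often clearer than a raw print. A sketch assuming compare_models returns a nested mapping of model name to per-metric scores (check the actual return type in your version):

import pandas as pd

# Assumption: comparison looks like {'model1': {'accuracy': 0.91, 'f1': 0.88}, ...}
df = pd.DataFrame(comparison).T
print(df.round(3).to_string())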

Visualization

Training History

from tabular_ssl.utils import plot_training_history

# Plot training metrics
fig = plot_training_history(history)
fig.show()

Performance Plots

from tabular_ssl.utils import plot_performance

# Plot various performance metrics
fig = plot_performance(
    model,
    test_data,
    plot_types=['confusion_matrix', 'roc_curve', 'precision_recall']
)
fig.show()
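
If you need an individual plot outside this helper, the curves can also be produced directly with scikit-learn. A sketch for a binary task, assuming the model follows the scikit-learn estimator API (predict_proba or decision_function) and that X_test and y_test are available:

import matplotlib.pyplot as plt
from sklearn.metrics import RocCurveDisplay

# Assumption: binary classification and a scikit-learn-compatible model
RocCurveDisplay.from_estimator(model, X_test, y_test)
plt.show()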

Model Interpretation

Feature Importance

from tabular_ssl.utils import get_feature_importance

# Get feature importance scores
importance = get_feature_importance(model, test_data)
print(importance)
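
Printing every feature gets noisy on wide tables; sorting and truncating makes the ranking readable. A sketch assuming importance is a mapping of feature name to score (the actual return type may differ):

# Assumption: importance looks like {'feature_a': 0.31, 'feature_b': 0.07, ...}
top = sorted(importance.items(), key=lambda kv: kv[1], reverse=True)[:10]
for name, score in top:
    print(f"{name:<30s} {score:.4f}")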

SHAP Values

from tabular_ssl.utils import get_shap_values, plot_shap_summary

# Compute SHAP values
shap_values = get_shap_values(model, test_data)

# Plot SHAP summary
plot_shap_summary(shap_values, test_data)
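
If the wrapper does not cover your model type, the shap package can be used directly; its model-agnostic Explainer only needs a prediction function and background data. A sketch assuming X_test is a pandas DataFrame of features and model.predict is available (both assumptions):

import shap

# Model-agnostic explainer: wraps the prediction function and uses
# X_test as background data for masking
explainer = shap.Explainer(model.predict, X_test)
explanation = explainer(X_test)

# Beeswarm plot: per-feature distribution of SHAP values
shap.plots.beeswarm(explanation)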

Error Analysis

Error Distribution

from tabular_ssl.utils import analyze_errors

# Analyze prediction errors
error_analysis = analyze_errors(
    model,
    test_data,
    analysis_types=['distribution', 'correlation']
)
print(error_analysis)
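
It often helps to look at the misclassified rows themselves. A sketch assuming X_test is a DataFrame, y_test an array-like of labels, and a scikit-learn-style predict() (none of which is implied by analyze_errors above):

import numpy as np

# Collect the rows the model got wrong, with true and predicted labels
y_pred = model.predict(X_test)
mask = y_pred != np.asarray(y_test)

errors = X_test[mask].copy()
errors['true_label'] = np.asarray(y_test)[mask]
errors['predicted'] = y_pred[mask]
print(errors.head())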

Error Visualization

from tabular_ssl.utils import plot_errors

# Plot error analysis
fig = plot_errors(
    model,
    test_data,
    plot_types=['residuals', 'error_distribution']
)
fig.show()

Best Practices

  1. Use multiple evaluation metrics rather than relying on a single score
  2. Perform cross-validation for more robust estimates
  3. Compare against simple baseline models
  4. Analyze error patterns, not just aggregate scores
  5. Visualize results to catch issues that summary statistics hide
  6. Consider domain-specific metrics where they exist
  7. Document the evaluation methodology so results are reproducible
  8. Validate comparisons with statistical tests (see the sketch after this list)
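
For point 8, a paired test on per-fold scores is a simple way to check whether a difference between two models is more than noise. A sketch with hypothetical scores (the arrays below are made-up placeholders; use the fold scores from cross_validate):

import numpy as np
from scipy import stats

# Hypothetical per-fold scores for two models evaluated on the SAME splits
scores_a = np.array([0.91, 0.89, 0.93, 0.90, 0.92])
scores_b = np.array([0.88, 0.87, 0.91, 0.89, 0.90])

# Paired t-test: is the per-fold difference significantly nonzero?
t_stat, p_value = stats.ttest_rel(scores_a, scores_b)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")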

Common Issues and Solutions

Imbalanced Data

# Use metrics that are robust to class imbalance
metrics = evaluate_model(
    model,
    test_data,
    metrics=['balanced_accuracy', 'f1']
)
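
Before choosing metrics, check how skewed the labels actually are; balanced accuracy averages per-class recall, so the majority class cannot dominate the score. A sketch assuming y_test and a scikit-learn-style predict() (assumptions, as above):

from collections import Counter
from sklearn.metrics import balanced_accuracy_score

# Inspect the label distribution first
print(Counter(y_test))

# Balanced accuracy = mean of per-class recall
y_pred = model.predict(X_test)
print(balanced_accuracy_score(y_test, y_pred))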

Small Test Sets

from tabular_ssl.utils import bootstrap_evaluation

# Use bootstrapping for small test sets
results = bootstrap_evaluation(
    model,
    test_data,
    n_bootstrap=1000,
    metrics=['accuracy', 'f1']
)
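
The point of bootstrapping is the confidence interval, not a single number. If you want to see what bootstrap_evaluation is doing under the hood, or need it for a metric it does not support, here is a manual sketch, assuming y_test and y_pred are NumPy arrays:

import numpy as np
from sklearn.metrics import accuracy_score

# Resample the test set with replacement and re-score each draw
rng = np.random.default_rng(0)
n = len(y_test)
boot_scores = [
    accuracy_score(y_test[idx], y_pred[idx])
    for idx in (rng.integers(0, n, size=n) for _ in range(1000))
]

# 95% percentile confidence interval
lo, hi = np.percentile(boot_scores, [2.5, 97.5])
print(f"accuracy 95% CI: [{lo:.3f}, {hi:.3f}]")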