Data and model contracts

This page defines the runtime contracts expected by the app.

Input data contract

The uploaded dataset must be in long format: one row per entity per timestep.

Minimum logical schema:

  • entity column: unique sequence id per entity
  • time column: sortable timestep column within each entity
  • target column: binary label (0 or 1) per row
  • feature columns: numeric model inputs

The app sorts by [entity_col, time_col] before windowing.
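
The schema checks above can be sketched as a small validation helper. This is an illustration only, not the app's own code: the function name validate_long_format and the column names in the demo are hypothetical, and pandas is assumed as the dataframe library.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def validate_long_format(df, entity_col, time_col, target_col, feature_cols):
    """Hypothetical helper: check the minimum schema, then sort for windowing."""
    required = [entity_col, time_col, target_col, *feature_cols]
    missing = [c for c in required if c not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    # The target must be a binary label: only 0 or 1 is allowed per row.
    bad_labels = set(df[target_col].unique()) - {0, 1}
    if bad_labels:
        raise ValueError(f"non-binary target values: {list(bad_labels)}")

    # Feature columns must be numeric model inputs.
    non_numeric = [c for c in feature_cols if not is_numeric_dtype(df[c])]
    if non_numeric:
        raise ValueError(f"non-numeric feature columns: {non_numeric}")

    # Same ordering the page describes: entity first, then time.
    return df.sort_values([entity_col, time_col]).reset_index(drop=True)


demo = pd.DataFrame({
    "entity": ["b", "a", "a", "b"],
    "t": [2, 2, 1, 1],
    "y": [1, 0, 0, 0],
    "f1": [0.4, 0.2, 0.1, 0.3],
})
sorted_demo = validate_long_format(demo, "entity", "t", "y", ["f1"])
```

After the call, rows are grouped by entity and ordered by time within each entity, which is the order the windowing step relies on.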

Sequence construction contract

build_sequences(...) creates sliding windows with:

  • input tensor shape: (num_windows, seq_len, n_features)
  • label shape: (num_windows,)
  • label source: the target value at the last timestep of each window

Entities with fewer than seq_len rows are skipped.
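
A minimal sketch of windowing that satisfies this contract is shown below. It is not the actual build_sequences implementation: the name sliding_windows, the stride of 1, and the extra array of owning entity ids are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def sliding_windows(df, entity_col, time_col, target_col, feature_cols, seq_len):
    """Illustrative stand-in for build_sequences (stride 1 is an assumption)."""
    df = df.sort_values([entity_col, time_col])
    xs, ys, owners = [], [], []
    for entity_id, group in df.groupby(entity_col, sort=False):
        if len(group) < seq_len:
            continue  # entities with fewer than seq_len rows are skipped
        feats = group[feature_cols].to_numpy(dtype=np.float32)
        target = group[target_col].to_numpy()
        for start in range(len(group) - seq_len + 1):
            end = start + seq_len
            xs.append(feats[start:end])
            ys.append(target[end - 1])  # label = target at the last timestep
            owners.append(entity_id)
    n_features = len(feature_cols)
    x = np.stack(xs) if xs else np.empty((0, seq_len, n_features), np.float32)
    return x, np.asarray(ys), np.asarray(owners)


demo = pd.DataFrame({
    "entity": ["a"] * 4 + ["b"] * 2,
    "t": [1, 2, 3, 4, 1, 2],
    "y": [0, 0, 1, 0, 1, 1],
    "f1": [0.0, 1.0, 2.0, 3.0, 9.0, 9.0],
    "f2": [10.0, 11.0, 12.0, 13.0, 9.0, 9.0],
})
x, y, owners = sliding_windows(demo, "entity", "t", "y", ["f1", "f2"], seq_len=3)
```

In the demo, entity "a" has four rows and yields two windows, while entity "b" has only two rows and is skipped, so x has shape (2, 3, 2).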

Training split contract

train_val_split_by_entity(...) splits by entity id, not by window index.

Implications:

  • all windows from one entity stay in the same split
  • no entity appears in both train and validation, which prevents entity leakage
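
One way to picture an entity-level split is the sketch below. It is not the real train_val_split_by_entity: the function name, the val_fraction and seed parameters, and the choice to return boolean masks are all assumptions for illustration.

```python
import numpy as np


def split_windows_by_entity(window_entities, val_fraction=0.2, seed=0):
    """Hypothetical helper: return (train_mask, val_mask) over windows."""
    window_entities = np.asarray(window_entities)
    unique_ids = np.unique(window_entities)

    # Shuffle entity ids, not window indices, so a whole entity moves together.
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(unique_ids)
    n_val = max(1, int(round(len(shuffled) * val_fraction)))
    val_ids = shuffled[:n_val]

    val_mask = np.isin(window_entities, val_ids)
    return ~val_mask, val_mask


# One entity id per window, e.g. as produced alongside the window tensor.
owners = np.array(["a", "a", "b", "c", "c", "c", "d", "e"])
train_mask, val_mask = split_windows_by_entity(owners, val_fraction=0.2, seed=0)
```

Because the masks are derived from entity ids, the sets of entities selected by train_mask and val_mask cannot overlap.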

Scaling contract

fit_scaler(...) computes feature-wise mean and std over train windows only.

Runtime scaling:

x_scaled = (x_raw - mean) / std

An uploaded external scaler must meet these requirements:

  • file type .npz
  • must contain arrays mean and std
  • each must have length n_features
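
The scaling rules above can be sketched with NumPy as follows. The names fit_scaler_sketch and load_scaler are hypothetical, and the guard that replaces a near-zero std with 1.0 is an assumption added to avoid division by zero, not something this page specifies.

```python
import io

import numpy as np


def fit_scaler_sketch(x_train, eps=1e-8):
    """Feature-wise mean and std over train windows only."""
    # Collapse (num_windows, seq_len) so statistics are computed per feature.
    flat = x_train.reshape(-1, x_train.shape[-1])
    mean = flat.mean(axis=0)
    std = flat.std(axis=0)
    std = np.where(std < eps, 1.0, std)  # assumption: guard constant features
    return mean, std


def load_scaler(file_obj, n_features):
    """Read an .npz scaler and check it contains mean and std of length n_features."""
    with np.load(file_obj) as data:
        missing = {"mean", "std"} - set(data.files)
        if missing:
            raise ValueError(f"scaler file is missing arrays: {sorted(missing)}")
        mean = np.asarray(data["mean"]).ravel()
        std = np.asarray(data["std"]).ravel()
    for name, arr in (("mean", mean), ("std", std)):
        if arr.size != n_features:
            raise ValueError(f"{name} has length {arr.size}, expected {n_features}")
    return mean, std


rng = np.random.default_rng(0)
x_train = rng.normal(5.0, 2.0, size=(4, 3, 3))  # (num_windows, seq_len, n_features)
mean, std = fit_scaler_sketch(x_train)

# Round-trip through an in-memory .npz file, standing in for an uploaded scaler.
buffer = io.BytesIO()
np.savez(buffer, mean=mean, std=std)
buffer.seek(0)
loaded_mean, loaded_std = load_scaler(buffer, n_features=3)

x_scaled = (x_train - loaded_mean) / loaded_std
```

Broadcasting applies the length-n_features vectors across the window and timestep axes, so x_scaled keeps the shape of x_train.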

Checkpoint model contract

The loaded model must be a torch.nn.Module compatible with:

  • forward(x) or forward(x, h0)
  • x shape (batch, seq_len, input_dim)

Expected return:

  • (logits, seq_hidden, hidden_state) or
  • (logits, seq_hidden)

where:

  • logits shape (batch,) or (batch, 1)
  • seq_hidden shape (batch, seq_len, hidden_dim)
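
A caller can normalize both accepted return forms into one shape, roughly as sketched below. The helper unpack_model_output is hypothetical. It only relies on .shape and .reshape, which NumPy arrays and torch tensors both provide, so the demo uses NumPy arrays as stand-ins for model outputs.

```python
import numpy as np


def unpack_model_output(output, batch_size):
    """Hypothetical helper: return (logits of shape (batch,), seq_hidden, hidden_state)."""
    if not isinstance(output, tuple) or len(output) not in (2, 3):
        raise ValueError("expected (logits, seq_hidden) or (logits, seq_hidden, hidden_state)")

    logits, seq_hidden = output[0], output[1]
    # hidden_state is optional under the contract; None marks its absence.
    hidden_state = output[2] if len(output) == 3 else None

    if tuple(logits.shape) not in {(batch_size,), (batch_size, 1)}:
        raise ValueError(f"unexpected logits shape: {tuple(logits.shape)}")
    if len(seq_hidden.shape) != 3 or seq_hidden.shape[0] != batch_size:
        raise ValueError(f"unexpected seq_hidden shape: {tuple(seq_hidden.shape)}")

    # Flatten (batch, 1) to (batch,) so downstream code sees a single layout.
    return logits.reshape(batch_size), seq_hidden, hidden_state


two_item = (np.zeros((2, 1)), np.zeros((2, 5, 4)))
logits, seq_hidden, hidden_state = unpack_model_output(two_item, batch_size=2)
```

Here logits comes back with shape (2,), seq_hidden keeps shape (2, 5, 4), and hidden_state is None because the two-item form was used.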

TimeSHAP call contract

run_local_timeshap(...) adapts to TimeSHAP local methods and passes:

  • data shape (1, seq_len, n_features)
  • baseline shape (1, n_features) (mean over events)
  • callable f that returns (probabilities, hidden_state)

It resolves and calls:

  • local_pruning
  • local_event
  • local_feat

and converts the pruning index returned by local_pruning into a valid forward timestep index within the window.
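
The pieces passed to TimeSHAP can be sketched with NumPy alone, as below. None of this is the app's run_local_timeshap code, and it does not call TimeSHAP itself. All three helper names are hypothetical. The sketch further assumes that the baseline averages every event in a set of reference windows, that probabilities are returned with shape (batch, 1), and that a negative pruning index counts back from the end of the sequence and is clamped to [0, seq_len - 1].

```python
import numpy as np


def event_baseline(reference_windows):
    """Mean over all events; result has shape (1, n_features)."""
    n_features = reference_windows.shape[-1]
    return reference_windows.reshape(-1, n_features).mean(axis=0, keepdims=True)


def make_f(score_fn):
    """Wrap score_fn(x, h0) -> (logits, hidden) as f -> (probabilities, hidden)."""
    def f(x, hidden_state=None):
        logits, hidden = score_fn(x, hidden_state)
        logits = np.asarray(logits, dtype=np.float64)
        probs = 1.0 / (1.0 + np.exp(-logits))  # sigmoid turns logits into probabilities
        return probs.reshape(-1, 1), hidden
    return f


def to_forward_index(prun_idx, seq_len):
    """Assumed semantics: negative values count from the end of the sequence."""
    idx = seq_len + prun_idx if prun_idx < 0 else prun_idx
    return int(min(max(idx, 0), seq_len - 1))


reference = np.arange(24, dtype=np.float64).reshape(3, 4, 2)
baseline = event_baseline(reference)


def toy_score(x, h0):
    # Stand-in scorer: a zero logit per sequence and no recurrent state.
    return np.zeros(x.shape[0]), None


f = make_f(toy_score)
probs, hidden = f(reference[:1])
forward_idx = to_forward_index(-3, seq_len=10)
```

With a zero logit, f returns a probability of 0.5 for the single input sequence, and an index of -3 on a window of length 10 maps to forward timestep 7 under the assumed semantics.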