Multi-Dimensional Splits¶
The n_dims parameter controls how many features Pilz combines when it evaluates a split. This is Pilz's core idea: rather than splitting on one feature at a time, it can split on several features jointly and so pick up interactions between them.
What is n_dims?¶
n_dims defines how many features are combined in a single split evaluation:
| n_dims | Combinations Evaluated |
|---|---|
| 1 | Single features only |
| 2 | Feature pairs |
| 3 | Feature triplets |
Why Multi-Dimensional Matters¶
The Correlation Problem¶
When features interact, their combination can be more predictive than either feature alone.
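A minimal illustration, using made-up data rather than anything from Pilz: in an XOR-style dataset each feature alone shows the same target rate for both of its values, while the pair separates the classes completely. The names rows and target_rate_by are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: target is 1 exactly when a and b differ (XOR).
rows = [
    {"a": 0, "b": 0, "target": 0},
    {"a": 0, "b": 1, "target": 1},
    {"a": 1, "b": 0, "target": 1},
    {"a": 1, "b": 1, "target": 0},
] * 25


def target_rate_by(rows, keys):
    """Return the share of target rows for each value combination of `keys`."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for row in rows:
        group = tuple(row[k] for k in keys)
        totals[group] += 1
        hits[group] += row["target"]
    return {group: hits[group] / totals[group] for group in totals}


single_a = target_rate_by(rows, ["a"])      # one feature at a time
single_b = target_rate_by(rows, ["b"])
pair_ab = target_rate_by(rows, ["a", "b"])  # both features jointly
print(single_a, single_b, pair_ab)
```

Neither single-feature view distinguishes the classes, but every (a, b) cell is purely target or purely non-target.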
Traditional vs Pilz¶
A traditional decision tree needs multiple sequential splits to capture a pairwise correlation. Pilz captures it in a single multi-dimensional cut.
How It Works¶
The counter() method finds the best split by evaluating single features and multi-dimensional combinations:
# src/pilz/service/train.py:153-190
def counter(self, train_df: TrainDataframes) -> tuple[Filter, Filter | None, Filter]:
    # Step 1: Score individual features (n_dims=1)
    sorted_train_feats = sorted(
        train_df.train_features,
        key=lambda x: x.calc_diff(),
        reverse=True,
    )
    best_feat = sorted_train_feats[0]
    best_feat_diff = best_feat.calc_diff()
    # Step 2: Try multi-feature combinations for higher n_dims
    for dim in range(2, self.settings.n_dims + 1):
        counter = 0
        for comb in itertools.combinations(sorted_train_feats, r=dim):
            counter += 1
            akt_feature = CombinedCategorizedFeature(
                comb,
                non_target_size=train_df.n_count_non_target,
                target_size=train_df.n_count_target,
                neutral_faktor=self.settings.neutral_faktor,
            )
            if akt_feature.calc_diff() > best_feat_diff:
                best_feat = akt_feature
                best_feat_diff = akt_feature.calc_diff()
            if (
                self.settings.calcs_per_dim
                and counter > self.settings.calcs_per_dim
            ):
                break
    return best_feat.get_left_right_filter()
Step 1: Score Individual Features¶
Each feature is scored by calc_diff():
# src/pilz/model/dataframes.py:153-167
def calc_diff(self) -> float:
    target_diff = self.diff_df.filter(
        pl.col("diff") > self.neutral_faktor
    )["diff"].sum()
    non_target_diff = abs(
        self.diff_df.filter(pl.col("diff") < -self.neutral_faktor)["diff"].sum()
    )
    return max(target_diff, non_target_diff)
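To make the scoring concrete, here is a plain-Python sketch of the same arithmetic on a hypothetical diff column. It assumes, as set_diff_df() suggests, that diff is the target proportion minus the non-target proportion for each feature value. The function name calc_diff_sketch and the numbers are illustrative and not part of Pilz.

```python
def calc_diff_sketch(diffs, neutral_faktor):
    """Mirror the arithmetic of calc_diff() on a plain list of per-value diffs."""
    # Sum of diffs that lean clearly towards the target class.
    target_diff = sum(d for d in diffs if d > neutral_faktor)
    # Absolute sum of diffs that lean clearly towards the non-target class.
    non_target_diff = abs(sum(d for d in diffs if d < -neutral_faktor))
    # The score is whichever side is separated more strongly.
    return max(target_diff, non_target_diff)


# Hypothetical diffs for a feature with four categories.
diffs = [0.30, 0.02, -0.25, -0.07]
score = calc_diff_sketch(diffs, neutral_faktor=0.05)
print(round(score, 2))
```

In this example the 0.02 value falls inside the neutral band and is ignored, and the non-target side (0.25 + 0.07) outweighs the target side (0.30).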
Feature Sorting¶
Before building combinations, counter() sorts all features by their calc_diff() score descending. The single best feature becomes sorted_train_feats[0]:
# src/pilz/service/train.py:156-161
sorted_train_feats = sorted(
    train_df.train_features,
    key=lambda x: x.calc_diff(),
    reverse=True,
)
best_feat = sorted_train_feats[0]
This ordering determines combination priority. itertools.combinations emits combinations in the positional order of its input, so combinations built from the strongest features come first.
When calcs_per_dim limits the number of combinations, the cutoff drops the combinations at the end of the iteration order, which are built from weaker features. Key implications:
- The best feature (sorted_train_feats[0]) leads the iteration order, so it appears in the most evaluated combinations when the cutoff applies
- Weaker features may never be evaluated in a combination when calcs_per_dim cuts early
- Combination order is deterministic: it always follows the sorted calc_diff() order within each counter() call
Step 2: Try Feature Combinations¶
For each dimension from 2 to n_dims, Pilz generates all combinations using itertools.combinations and wraps them in CombinedCategorizedFeature. Since features are pre-sorted by calc_diff(), combinations follow the same priority — (best, second_best) is evaluated before (best, weakest):
# src/pilz/model/dataframes.py:334-362
class CombinedCategorizedFeature(CategorizedFeatureMixin):
    def __init__(self, train_features, non_target_size, target_size, neutral_faktor):
        group_by = [train.feature.name for train in train_features]
        # Build joint contingency table
        non_target_df = pl.DataFrame([train.non_target_sr for train in train_features])
        df_count_non_target = (
            non_target_df.group_by(group_by)
            .len(name="proportion")
            .with_columns((pl.col("proportion") / non_target_size))
        )
        target_df = pl.DataFrame([train.target_sr for train in train_features])
        df_count_target = (
            target_df.group_by(group_by)
            .len(name="proportion")
            .with_columns((pl.col("proportion") / target_size))
        )
        # Compute diff for each combination value
        self.set_diff_df(
            df_count_target=df_count_target,
            df_count_non_target=df_count_non_target,
            join_on=group_by,
        )
The set_diff_df() method computes the difference between target and non-target proportions:
# src/pilz/model/dataframes.py:169-184
def set_diff_df(self, df_count_target, df_count_non_target, join_on):
    self.diff_df = (
        df_count_non_target.join(df_count_target, on=join_on, how="outer")
        .fill_null(0)
        .with_columns(
            (pl.col("proportion_right") - pl.col("proportion")).alias("diff"),
            pl.max_horizontal(
                pl.col("proportion"), pl.col("proportion_right")
            ).alias("max_proportion"),
        )
    )
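The same computation can be sketched without Polars. The following assumes two categorical features and hypothetical row data; joint_proportions and diff_table are illustrative names, not Pilz functions. It counts each value combination per class, normalizes by class size, and takes target minus non-target, treating combinations missing from one class as 0 (the role fill_null(0) plays after the outer join).

```python
from collections import Counter


def joint_proportions(rows, class_size):
    """Share of a class's rows that fall into each value combination."""
    counts = Counter(rows)
    return {combo: n / class_size for combo, n in counts.items()}


def diff_table(target_rows, non_target_rows):
    """Per-combination diff: target proportion minus non-target proportion."""
    target = joint_proportions(target_rows, len(target_rows))
    non_target = joint_proportions(non_target_rows, len(non_target_rows))
    # The union of keys stands in for the outer join; missing cells count as 0.
    combos = set(target) | set(non_target)
    return {c: target.get(c, 0.0) - non_target.get(c, 0.0) for c in combos}


# Hypothetical (colour, size) values per row, split by class.
target_rows = [("red", "L"), ("red", "L"), ("red", "S"), ("blue", "L")]
non_target_rows = [("blue", "S"), ("blue", "S"), ("red", "S"), ("blue", "L")]

diffs = diff_table(target_rows, non_target_rows)
for combo in sorted(diffs):
    print(combo, diffs[combo])
```

Here ("red", "L") occurs only among target rows and ("blue", "S") only among non-target rows, so they receive the largest positive and negative diffs.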
Step 3: Determine Split¶
The get_left_right_filter() method classifies each combination based on diff:
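The body of get_left_right_filter() is not reproduced on this page. One plausible reading, based on the neutral_faktor thresholds in calc_diff() and the three-element return type of counter(), is that each value combination lands in one of three groups. The sketch below is an assumption along those lines, not the Pilz implementation; classify_by_diff and the group names are hypothetical.

```python
def classify_by_diff(diffs, neutral_faktor):
    """Assumed three-way grouping of value combinations by their diff."""
    groups = {"target": [], "neutral": [], "non_target": []}
    for combo, diff in diffs.items():
        if diff > neutral_faktor:
            # Clearly more common among target rows.
            groups["target"].append(combo)
        elif diff < -neutral_faktor:
            # Clearly more common among non-target rows.
            groups["non_target"].append(combo)
        else:
            # Within the neutral band: no clear lean either way.
            groups["neutral"].append(combo)
    return groups


# Hypothetical diffs keyed by (colour, size) combination.
diffs = {
    ("red", "L"): 0.5,
    ("red", "S"): 0.0,
    ("blue", "L"): 0.02,
    ("blue", "S"): -0.5,
}
groups = classify_by_diff(diffs, neutral_faktor=0.05)
print(groups)
```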
The Combination Explosion¶
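The number of combinations at dimension k is the binomial coefficient C(n, k) for n features, so it grows quickly with both the feature count and n_dims. Each column below counts the combinations at that dimension alone; a run with a given n_dims evaluates every dimension up to it: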
| Features | n_dims=1 | n_dims=2 | n_dims=3 | n_dims=4 |
|---|---|---|---|---|
| 4 | 4 | 6 | 4 | 1 |
| 10 | 10 | 45 | 120 | 210 |
| 20 | 20 | 190 | 1,140 | 4,845 |
| 50 | 50 | 1,225 | 19,600 | 230,300 |
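The table values are binomial coefficients and can be reproduced with math.comb from the standard library. The helper name combos_per_dim is illustrative.

```python
import math


def combos_per_dim(n_features, max_dim):
    """Number of feature combinations at each dimension from 1 to max_dim."""
    return [math.comb(n_features, k) for k in range(1, max_dim + 1)]


for n in (4, 10, 20, 50):
    per_dim = combos_per_dim(n, max_dim=4)
    # The running total is what an uncapped search up to that dimension would cost.
    print(n, per_dim, "total up to n_dims=4:", sum(per_dim))
```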
The calcs_per_dim Parameter¶
To keep training time bounded, calcs_per_dim limits how many combinations are tried per dimension:
# src/pilz/model/settings.py:32-35
calcs_per_dim: int | None = Field(
    description="Maximum calculations per dimension",
    default=5000,
)
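To see what the cap means in practice, the sketch below replays only the loop structure of counter() as excerpted earlier, without scoring anything, and counts how many combinations each dimension would visit. The function name evaluated_per_dim is hypothetical.

```python
import itertools


def evaluated_per_dim(n_features, n_dims, calcs_per_dim):
    """Count combinations visited per dimension, following the loop in counter()."""
    feats = range(n_features)
    evaluated = {}
    for dim in range(2, n_dims + 1):
        counter = 0
        for _ in itertools.combinations(feats, r=dim):
            counter += 1
            # (the combination would be scored here)
            if calcs_per_dim and counter > calcs_per_dim:
                break
        evaluated[dim] = counter
    return evaluated


capped = evaluated_per_dim(n_features=50, n_dims=3, calcs_per_dim=5000)
uncapped = evaluated_per_dim(n_features=50, n_dims=3, calcs_per_dim=None)
print(capped, uncapped)
```

As the excerpt is written, the check runs after the increment and uses a strict comparison, so a capped dimension visits calcs_per_dim + 1 combinations, and a value of None (or 0) disables the cap because the condition is falsy.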
Practical Guidelines¶
When to Use Higher n_dims¶
| n_dims | Best For |
|---|---|
| 1 | Simple datasets, many features, baseline |
| 2 | Most cases — captures pairwise correlations |
| 3 | Complex interactions, fewer features |
Recommendations¶
Summary¶
| Concept | Description |
|---|---|
| n_dims=1 | Single feature splits — fast, no correlation capture |
| n_dims=2 | Feature pairs — captures pairwise correlations |
| n_dims=3+ | Higher-order combinations — for complex interactions |
| calcs_per_dim | Limits computation to prevent exhaustive search |
| counter() at train.py:153 | Main split-finding method |
| Feature sorting at train.py:156-160 | Features sorted by calc_diff() descending; strongest prioritized in combinations |
| Combination order | itertools.combinations follows sorted order; calcs_per_dim cuts off the weaker combinations at the end |
| CombinedCategorizedFeature at dataframes.py:334 | Builds joint contingency tables |
Next Steps¶
- Three-Way Splits — How the split is used in tree building
- Feature Categorization — How features are binned first
- Training Internals — Full algorithm reference