Settings Reference
Complete reference for all training and evaluation settings.
TrainSettings
Required Parameters
| Parameter | Type | Description |
|---|---|---|
| n | int | Number of trees to build per target class |
| out_folder | string | Directory to save trained models |
Optional Parameters
| Parameter | Default | Description |
|---|---|---|
| max_depth | 10 | Maximum tree depth (limits overfitting) |
| frac_eval_cat | 0.5 | Fraction of data for bin evaluation vs weight-based grouping |
| max_eval_fit | 1000 | Maximum rows sampled per node |
| min_eval_fit | 10 | Minimum samples a node needs; below this, recursion stops |
| n_dims | 3 | Feature combinations to evaluate (1=single, 2=pairs, 3=triplets) |
| n_cat | 3 | Number of bins per feature |
| calcs_per_dim | 5000 | Maximum combinations to evaluate per dimension |
| neutral_faktor | 0.0 | Threshold for the neutral branch (values within [-neutral_faktor, neutral_faktor] go to Neutral) |
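The neutral band described for neutral_faktor can be pictured with a short Python sketch. It assumes the band is closed on both ends, as the bracket notation suggests. The function name `route` and the branch labels "positive" and "negative" are hypothetical, and this page does not say which quantity the tree actually compares against the threshold.

```python
def route(value: float, neutral_faktor: float = 0.0) -> str:
    """Pick a branch for a value (illustrative only).

    Only the closed band [-neutral_faktor, neutral_faktor] comes from the
    parameter description; the other branch names are assumptions.
    """
    if neutral_faktor < 0:
        raise ValueError("neutral_faktor must be non-negative")
    if -neutral_faktor <= value <= neutral_faktor:
        return "neutral"
    return "positive" if value > 0 else "negative"


# Show how a band of 0.1 splits a few sample values.
for v in (-0.3, -0.05, 0.0, 0.05, 0.3):
    print(v, route(v, neutral_faktor=0.1))
```

Under this reading, the default of 0.0 sends only a value of exactly zero to the neutral branch.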
Example: TrainSettings
# Minimum required
n: 5
out_folder: my_model
# With all options
n: 10
out_folder: my_model
max_depth: 15
frac_eval_cat: 0.5
max_eval_fit: 1000
min_eval_fit: 50
n_dims: 3
n_cat: 5
calcs_per_dim: 5000
neutral_faktor: 0.0
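To make the required and optional parameters easier to see side by side, here is a minimal Python sketch that mirrors the documented defaults in a dataclass. The class name `TrainSettingsSketch` and every range check in `__post_init__` are assumptions for illustration, not the library's own class or validation rules. The `None` value for calcs_per_dim follows the Parameter Guide, which lists null as "no limit".

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class TrainSettingsSketch:
    # Required parameters
    n: int
    out_folder: str
    # Optional parameters, defaults copied from the Optional Parameters table
    max_depth: int = 10
    frac_eval_cat: float = 0.5
    max_eval_fit: int = 1000
    min_eval_fit: int = 10
    n_dims: int = 3
    n_cat: int = 3
    calcs_per_dim: Optional[int] = 5000  # None stands for "no limit"
    neutral_faktor: float = 0.0

    def __post_init__(self) -> None:
        # Assumed sanity checks; the real loader may accept other ranges.
        if self.n < 1:
            raise ValueError("n must be at least 1")
        if self.max_depth < 1:
            raise ValueError("max_depth must be at least 1")
        if not 0.0 <= self.frac_eval_cat <= 1.0:
            raise ValueError("frac_eval_cat must lie in [0, 1]")
        if self.min_eval_fit > self.max_eval_fit:
            raise ValueError("min_eval_fit must not exceed max_eval_fit")
        if self.n_dims < 1 or self.n_cat < 2:
            raise ValueError("n_dims must be >= 1 and n_cat >= 2")
        if self.calcs_per_dim is not None and self.calcs_per_dim < 1:
            raise ValueError("calcs_per_dim must be positive or None")
        if self.neutral_faktor < 0:
            raise ValueError("neutral_faktor must be non-negative")


# The minimal example: only the two required fields are given.
settings = TrainSettingsSketch(n=5, out_folder="my_model")
print(asdict(settings))
```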
EvalSettings
Required Parameters
| Parameter | Type | Description |
|---|---|---|
| in_folders | list | Directories containing trained models |
| out_folder | string | Directory for evaluation results |
Optional Parameters
| Parameter | Default | Description |
|---|---|---|
| out_file | null | Output file path (.csv or .parquet) |
| keep_cols | [] | Columns to include in output |
| max_parallel_where | 100 | Split the SQL into batches when it has more conditions than this |
Example: EvalSettings
# Minimum required
in_folders:
- model
out_folder: results
# With all options
in_folders:
- model_v1
- model_v2
out_folder: results
out_file: predictions.csv
keep_cols:
- customer_id
- date
max_parallel_where: 500
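The batching that max_parallel_where describes can be sketched as plain list chunking. This is only an illustration of the idea in Python. The helper `batch_conditions` and the `leaf_id` column are hypothetical, and how the evaluator actually builds and combines its SQL is not documented on this page.

```python
from typing import Iterator, List


def batch_conditions(
    conditions: List[str], max_parallel_where: int = 100
) -> Iterator[List[str]]:
    """Yield consecutive groups of at most max_parallel_where conditions."""
    if max_parallel_where < 1:
        raise ValueError("max_parallel_where must be at least 1")
    for start in range(0, len(conditions), max_parallel_where):
        yield conditions[start:start + max_parallel_where]


# 250 made-up conditions with the default limit of 100.
conditions = [f"leaf_id = {i}" for i in range(250)]
batches = list(batch_conditions(conditions, max_parallel_where=100))
print([len(b) for b in batches])  # [100, 100, 50]
```

A larger limit means fewer, bigger statements, and a smaller one means more round trips with shorter statements.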
Parameter Guide
n: Number of Trees
| Value | Effect |
|---|---|
| 1 | Fast, baseline |
| 3-5 | Good balance |
| 10+ | More accurate, slower |
max_depth
| Value | Effect |
|---|---|
| 5-10 | Shallow, fast, general |
| 10-15 | Medium |
| 15+ | Deep, may overfit |
n_dims
| Value | Effect |
|---|---|
| 1 | Single features only |
| 2 | Feature pairs |
| 3+ | Complex interactions |
n_cat
| Value | Effect |
|---|---|
| 2-3 | Few bins, general (3 is the default) |
| 5 | Medium |
| 10+ | Many bins, specific |
calcs_per_dim
| Value | Effect |
|---|---|
| null | No limit |
| 1000 | Quick |
| 10000 | Thorough |
| 100000+ | Exhaustive |
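To get a feel for how n_dims and calcs_per_dim interact, the Python sketch below counts feature combinations per dimension and applies the cap. It rests on two assumptions that this page does not confirm: that a "combination" is an unordered set of distinct features, and that the cap applies separately to each dimension from 1 up to n_dims. The function name `evaluations_per_dim` is hypothetical.

```python
from math import comb
from typing import Dict, Optional


def evaluations_per_dim(
    n_features: int, n_dims: int, calcs_per_dim: Optional[int]
) -> Dict[int, int]:
    """Map each dimension k to the number of combinations evaluated."""
    result: Dict[int, int] = {}
    for k in range(1, n_dims + 1):
        total = comb(n_features, k)  # unordered k-feature subsets
        # None stands for "no limit", matching null in the table above.
        result[k] = total if calcs_per_dim is None else min(total, calcs_per_dim)
    return result


print(evaluations_per_dim(50, 3, 5000))  # {1: 50, 2: 1225, 3: 5000}
print(evaluations_per_dim(50, 3, None))  # {1: 50, 2: 1225, 3: 19600}
```

With 50 features, the default cap of 5000 only bites at triplets, where the uncapped count would be 19600.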
Quick Reference
flowchart LR
subgraph "Simple Problem"
S1[n: 1] --> S2[n_dims: 1] --> S3[n_cat: 3] --> S4[max_depth: 10]
end
subgraph "Normal Problem"
N1[n: 3-5] --> N2[n_dims: 2] --> N3[n_cat: 5] --> N4[max_depth: 15]
end
subgraph "Complex Problem"
C1[n: 10+] --> C2[n_dims: 3+] --> C3[n_cat: 8+] --> C4[max_depth: 20]
end
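The three starting points in the flowchart can also be kept as data and rendered into a settings file. The Python sketch below is one way to do that. Where the flowchart gives a range such as 3-5 or 10+, the sketch picks the lower bound, which is a choice made here and not a recommendation from the source. The names `PRESETS` and `render_settings` are hypothetical.

```python
from typing import Dict

# Lower bounds taken from the flowchart ranges (3-5 -> 3, 10+ -> 10, and so on).
PRESETS: Dict[str, Dict[str, int]] = {
    "simple": {"n": 1, "n_dims": 1, "n_cat": 3, "max_depth": 10},
    "normal": {"n": 3, "n_dims": 2, "n_cat": 5, "max_depth": 15},
    "complex": {"n": 10, "n_dims": 3, "n_cat": 8, "max_depth": 20},
}


def render_settings(preset: str, out_folder: str) -> str:
    """Return 'key: value' lines for a preset plus the required out_folder."""
    values = {**PRESETS[preset], "out_folder": out_folder}
    return "\n".join(f"{key}: {value}" for key, value in values.items())


print(render_settings("normal", "my_model"))
```

Any parameter left out of a preset keeps the default listed under Optional Parameters.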