Example: Bank Term Deposit

This example demonstrates binary classification on a bank marketing dataset: predicting whether a client will subscribe to a term deposit.

Dataset

Source: Kaggle Bank Term Deposit via thedevastator/bank-term-deposit-predictions
Task: Predict term deposit subscription (yes/no)
Features: 16 (demographics, campaign information, economic context)
Classes: 2 (yes, no)
Training samples: 45,211
Test samples: 4,521
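
The two row counts match the UCI Bank Marketing release, where the 4,521-row file is a random 10% sample of the 45,211-row file. Whether this Kaggle split has the same property is an assumption rather than something stated here, so it may be worth checking for rows shared between the two files before trusting test metrics. A minimal standard-library sketch (the helper name `count_shared_rows` is hypothetical, and the delimiter may need adjusting to the actual files):

```python
import csv


def count_shared_rows(train_path, test_path, delimiter=","):
    """Return (shared, total): how many test rows also occur verbatim in train."""
    with open(train_path, newline="") as f:
        reader = csv.reader(f, delimiter=delimiter)
        next(reader, None)  # skip the header row
        train_rows = {tuple(row) for row in reader}

    shared = 0
    total = 0
    with open(test_path, newline="") as f:
        reader = csv.reader(f, delimiter=delimiter)
        next(reader, None)  # skip the header row
        for row in reader:
            total += 1
            if tuple(row) in train_rows:
                shared += 1
    return shared, total
```

If most test rows turn out to be shared, the reported test accuracy is closer to a training-set accuracy and should be read with that in mind.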

Quick Start

The config files for this example are in examples/bank_term_deposit/:

# 1. Download data (requires kagglehub)
pip install kagglehub
python3 -c "
import kagglehub
path = kagglehub.dataset_download('thedevastator/bank-term-deposit-predictions')
print(f'Downloaded to: {path}')
"

# 2. Point the datacard to your downloaded data
#    Edit examples/bank_term_deposit/bank_term_deposit.yaml and update:
#      train_files:
#        - <kagglehub_path>/train.csv
#      test_files:
#        - <kagglehub_path>/test.csv

# 3. Train
pilz train \
  --datacard examples/bank_term_deposit/bank_term_deposit.yaml \
  --trainsettings examples/bank_term_deposit/train_settings.yaml

# 4. Evaluate
pilz eval \
  --datacard examples/bank_term_deposit/bank_term_deposit.yaml \
  --evalsettings examples/bank_term_deposit/eval_settings.yaml

Or use the provided script:

cd examples/bank_term_deposit
# After downloading data and updating bank_term_deposit.yaml paths
bash run.sh
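
Step 2 of the quick start needs the absolute paths of the downloaded CSV files. As a convenience, they can be located programmatically. The sketch below assumes the download directory contains files named train.csv and test.csv somewhere beneath it (as the datacard comments suggest); the helper names `find_csv_paths` and `datacard_snippet` are hypothetical and not part of pilz or kagglehub.

```python
from pathlib import Path


def find_csv_paths(download_dir):
    """Locate train.csv and test.csv anywhere below the download directory."""
    root = Path(download_dir)
    found = {}
    for name in ("train.csv", "test.csv"):
        matches = sorted(root.rglob(name))
        if not matches:
            raise FileNotFoundError(f"{name} not found under {root}")
        found[name] = matches[0].resolve()
    return found


def datacard_snippet(paths):
    """Format the two paths as train_files/test_files entries for the datacard."""
    return (
        "train_files:\n"
        f"  - {paths['train.csv']}\n"
        "test_files:\n"
        f"  - {paths['test.csv']}\n"
    )
```

Calling `print(datacard_snippet(find_csv_paths(path)))` with the `path` returned by `kagglehub.dataset_download` prints the lines to paste into bank_term_deposit.yaml.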

DataCard Structure

features:
  - name: y
    statistical: categorial
    type: str
  - name: job
    statistical: categorial
    type: str
  - name: marital
    statistical: categorial
    type: str
  - name: education
    statistical: categorial
    type: str
  - name: default
    statistical: categorial
    type: str
  - name: balance
    statistical: numerical
    type: int
  - name: housing
    statistical: categorial
    type: str
  - name: loan
    statistical: categorial
    type: str
  - name: contact
    statistical: categorial
    type: str
  - name: day
    statistical: numerical
    type: int
  - name: month
    statistical: categorial
    type: str
  - name: duration
    statistical: numerical
    type: int
  - name: campaign
    statistical: numerical
    type: int
  - name: pdays
    statistical: numerical
    type: int
  - name: previous
    statistical: numerical
    type: int
  - name: poutcome
    statistical: categorial
    type: str

target:
  feature_name: y
  values:
    - "yes"
    - "no"

train_files:
  - /path/to/train.csv
test_files:
  - /path/to/test.csv
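
Writing the features block by hand is error-prone, so a first draft can be generated from the CSV itself. The sketch below is an illustration, not a pilz feature: it treats a column as `numerical`/`int` when every sampled value parses as an integer and as `categorial`/`str` otherwise, using the keywords shown in the DataCard above. The helper name `draft_features` is hypothetical, and the result should be reviewed manually.

```python
import csv


def draft_features(csv_path, delimiter=",", sample_rows=1000):
    """Guess a features: block from the first rows of a CSV file."""
    with open(csv_path, newline="") as f:
        reader = csv.DictReader(f, delimiter=delimiter)
        columns = reader.fieldnames or []
        all_int = {name: True for name in columns}
        for i, row in enumerate(reader):
            if i >= sample_rows:
                break
            for name in columns:
                try:
                    int(row[name])
                except (TypeError, ValueError):
                    # Non-integer or missing value: treat the column as text.
                    all_int[name] = False

    lines = ["features:"]
    for name in columns:
        numeric = all_int[name]
        lines.append(f"  - name: {name}")
        lines.append(f"    statistical: {'numerical' if numeric else 'categorial'}")
        lines.append(f"    type: {'int' if numeric else 'str'}")
    return "\n".join(lines)
```

Integer-coded categories (for example a numeric ID column) would be misclassified as numerical by this heuristic, which is one reason the draft needs a manual pass.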

Settings (Quick Start)

# train_settings.yaml
n: 1                # 1 tree per class (2 trees total)
out_folder: test
max_depth: 5        # Shallow trees for speed
frac_eval_cat: 0.8
max_eval_fit: 500
min_eval_fit: 5
n_dims: 2           # Pairwise feature combinations
n_cat: 3            # 3 bins per numerical feature
calcs_per_dim: 200  # Limited calculations per dimension

# eval_settings.yaml
in_folders:
  - test
out_folder: eval

Training Time

With quick-start settings on a modern laptop (Apple Silicon):

Training: ~10 seconds
Evaluation: < 1 second

Actual Results

Overall Accuracy: 90.5%

Per-Class Accuracy

Class	Accuracy
no	96.7%
yes	42.8%

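The "no" class is predicted very accurately, but it is also the majority class (~88% of samples), so a model that always answers "no" would already reach about 88% overall accuracy. The "yes" class is much harder to predict: because of the strong class imbalance and the difficulty of modeling rare subscriber behavior, only 42.8% of "yes" samples are classified correctly.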
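
The relationship between the per-class numbers and the overall figure is simple arithmetic, sketched below. The class shares are the rounded ~88% / ~12% values quoted in this document, so the result only approximates the reported 90.5%.

```python
# Approximate class shares and per-class accuracies quoted in this section.
share = {"no": 0.88, "yes": 0.12}
per_class_accuracy = {"no": 0.967, "yes": 0.428}

# Overall accuracy is the share-weighted mean of the per-class accuracies.
overall = sum(share[c] * per_class_accuracy[c] for c in share)

# Balanced accuracy weights both classes equally, regardless of their size.
balanced = sum(per_class_accuracy.values()) / len(per_class_accuracy)

# A model that always answers "no" is right exactly as often as "no" occurs.
baseline = share["no"]

print(f"overall:  {overall:.2%}")
print(f"balanced: {balanced:.2%}")
print(f"baseline: {baseline:.2%}")
```

The weighted mean comes out at about 90.2%, only slightly above the 88% always-"no" baseline, while the balanced accuracy of about 69.75% reflects the weak "yes" class much more directly.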

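ROC Curve

The ROC plots are written to the eval folder as HTML files (yes_roc.html, no_roc.html, all_roc.html).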

Output Files

test/
├── yes/0.json    # Model for predicting "yes" subscription
└── no/0.json     # Model for predicting "no" subscription

eval/
├── yes_roc.html
├── no_roc.html
├── all_roc.html
└── multi_class_result.html
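
After a run, it can be useful to confirm that the expected files were actually written. The sketch below builds the expected paths from the layout shown above. Two things are assumptions: that a run with `n` trees per class produces `0.json` through `<n-1>.json` (only `0.json` is shown here), and the helper names, which are hypothetical.

```python
from pathlib import Path


def expected_model_files(train_out, classes, n):
    """Model paths, assuming one <class>/<index>.json file per tree."""
    return [Path(train_out) / c / f"{i}.json" for c in classes for i in range(n)]


def expected_eval_files(eval_out, classes):
    """Evaluation reports: one ROC page per class plus the two summary pages."""
    names = [f"{c}_roc.html" for c in classes]
    names += ["all_roc.html", "multi_class_result.html"]
    return [Path(eval_out) / name for name in names]


def missing_files(paths):
    """Return the subset of paths that does not exist on disk."""
    return [p for p in paths if not p.exists()]
```

For the quick-start run, `missing_files(expected_model_files("test", ["yes", "no"], 1) + expected_eval_files("eval", ["yes", "no"]))` should return an empty list when executed from the directory containing both folders.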

Key Findings

- Duration is the strongest predictor
  - Longer calls strongly correlate with subscription
  - Very short calls (< 2 min) almost never convert
- Previous campaign outcome matters
  - Previous success → high likelihood of repeat success
  - No previous contact → low likelihood
- Contact type influences results
  - Cellular contacts outperform telephone
- Month shows seasonal patterns
  - Subscriptions peak in certain months (e.g., May, June)
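
Findings like these can be checked directly against the training CSV by computing the share of "yes" rows per group. The sketch below is independent of pilz. It assumes `duration` is recorded in seconds (as in the UCI version of this dataset); the 2-minute boundary comes from the finding above, while the 10-minute boundary and both helper names are arbitrary choices for illustration.

```python
from collections import defaultdict


def duration_bucket(seconds):
    """Map a call duration in seconds to a coarse bucket label."""
    if seconds < 120:
        return "< 2 min"
    if seconds < 600:
        return "2-10 min"
    return ">= 10 min"


def conversion_rates(rows, group_of, target="y", positive="yes"):
    """Share of rows with target == positive, per group returned by group_of(row)."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for row in rows:
        group = group_of(row)
        if row[target] == positive:
            counts[group][0] += 1
        counts[group][1] += 1
    return {group: pos / total for group, (pos, total) in counts.items()}
```

With `rows` loaded through `csv.DictReader`, `conversion_rates(rows, lambda r: duration_bucket(int(r["duration"])))` gives the rate per duration bucket, and `conversion_rates(rows, lambda r: r["poutcome"])` does the same for the previous campaign outcome; `contact` and `month` work analogously.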

Tips

Quick Start Settings (current)

These are reduced for fast iteration. Training takes ~10 seconds and gives ~90.5% accuracy (mostly from the majority class).

For Better Accuracy

Increase these settings in train_settings.yaml:

n: 5                # Ensemble of 5 trees per class
max_depth: 10       # Deeper trees
n_dims: 3           # Triple feature combinations
n_cat: 5            # Finer bins
calcs_per_dim: 1000 # More thorough search
max_eval_fit: 5000  # More training samples

With these settings, expect:

- Training time: 2-5 minutes
- Yes-class recall: significantly improved
- Overall accuracy: similar (the "no" class is already close to saturation)
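
A rough sense of why training takes longer comes from counting trees and feature combinations. The sketch below only does that arithmetic. It assumes the 15 non-target entries of the DataCard are all candidates, and it does not claim that pilz enumerates every combination (the `calcs_per_dim` setting suggests a bounded search); the function name `summarize` is hypothetical.

```python
from math import comb


def summarize(n, n_dims, n_features=15, n_classes=2):
    """Return (total trees, number of distinct n_dims-sized feature combinations)."""
    total_trees = n * n_classes
    feature_combinations = comb(n_features, n_dims)
    return total_trees, feature_combinations


for label, n, n_dims in (("quick start", 1, 2), ("better accuracy", 5, 3)):
    trees, combos = summarize(n, n_dims)
    print(f"{label}: {trees} trees, {combos} possible feature combinations")
```

Moving from pairs to triples raises the number of possible combinations from 105 to 455, and the tree count goes from 2 to 10, which is consistent with training time growing from seconds to minutes.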

Dealing with Class Imbalance

The dataset has ~88% "no" and ~12% "yes" samples. To improve minority-class predictions:

1. Start with the quick-start settings to verify the pipeline
2. Increase max_depth and n_dims to capture subtle patterns for the minority class
3. Add more trees (n: 5 or n: 10) for ensemble stability
4. Monitor the "yes" class accuracy specifically, not just overall accuracy
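
Monitoring the "yes" class is easiest with per-class recall and precision computed side by side. The sketch below is a generic helper, not part of pilz: it assumes the true labels and predictions are available as two equal-length lists, and how to export predictions from pilz is not covered here. The per-class accuracy reported earlier presumably corresponds to the recall computed this way.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Return {class: {"recall": ..., "precision": ..., "support": ...}}."""
    support = Counter(y_true)      # actual samples per class
    predicted = Counter(y_pred)    # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    report = {}
    for c in sorted(set(y_true) | set(y_pred)):
        report[c] = {
            # Share of actual class-c samples that were predicted as c.
            "recall": correct[c] / support[c] if support[c] else None,
            # Share of class-c predictions that were actually c.
            "precision": correct[c] / predicted[c] if predicted[c] else None,
            "support": support[c],
        }
    return report
```

Comparing `per_class_report(...)["yes"]` between the quick-start run and a run with larger settings shows whether the changes help the minority class, even when overall accuracy barely moves.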