API Reference

partition_tree.sklearn

Standard scikit-learn–compatible estimators for classification and regression.

from partition_tree.sklearn import (
    PartitionTreeClassifier,
    PartitionForestClassifier,
    PartitionTreeRegressor,
    PartitionForestRegressor,
)

PartitionTreeClassifier

A single Partition Tree for classification tasks.

Inherits: sklearn.base.ClassifierMixin, sklearn.base.BaseEstimator

Constructor

PartitionTreeClassifier(
    max_leaves=101,
    boundaries_expansion_factor=0.1,
    min_samples_xy=1.0,
    min_samples_x=1.0,
    min_samples_y=1.0,
    min_gain=0.0,
    min_volume_fraction=0.0,
    max_depth=5,
    min_samples_split=2.0,
)

Methods

| Method | Description |
| --- | --- |
| fit(X, y, sample_weights=None) | Fit the tree on training data |
| predict(X) | Predict class labels |
| predict_proba(X) | Predict class probabilities; returns np.ndarray of shape (n_samples, n_classes) |
| apply(X) | Return the leaf index for each sample |
| get_leaves_info() | Return metadata for each leaf |

Attributes

| Attribute | Description |
| --- | --- |
| classes_ | Array of class labels learned during fit |
| partition_tree_ | Internal tree object (available after fit) |

PartitionForestClassifier

Ensemble of Partition Trees for classification (bagging with density averaging).

Inherits: sklearn.base.ClassifierMixin, sklearn.base.BaseEstimator

Constructor

PartitionForestClassifier(
    n_estimators=100,
    max_leaves=101,
    boundaries_expansion_factor=0.1,
    min_samples_xy=1.0,
    min_samples_x=1.0,
    min_samples_y=1.0,
    min_gain=0.0,
    min_volume_fraction=0.0,
    max_depth=5,
    min_samples_split=2.0,
    max_samples=0.8,
    max_features=0.8,
    random_state=42,
)

Methods

| Method | Description |
| --- | --- |
| fit(X, y, sample_weights=None) | Fit the forest on training data |
| predict(X) | Predict class labels (the class with the highest probability after averaging across trees) |
| predict_proba(X) | Predict class probabilities; returns np.ndarray of shape (n_samples, n_classes) |
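
To make the averaging step concrete, here is a minimal pure-Python sketch. It assumes that the forest averages the per-tree class probabilities and that predict returns the class with the highest averaged probability. The helper names and the example numbers are hypothetical and are not part of the partition_tree API.

```python
def average_probabilities(per_tree_probs):
    """Average class-probability rows across trees.

    per_tree_probs[t][i][k] is tree t's probability of class k for sample i.
    """
    n_trees = len(per_tree_probs)
    n_samples = len(per_tree_probs[0])
    n_classes = len(per_tree_probs[0][0])
    return [
        [sum(tree[i][k] for tree in per_tree_probs) / n_trees
         for k in range(n_classes)]
        for i in range(n_samples)
    ]


def predict_labels(avg_probs, classes):
    """Pick the class with the highest averaged probability for each sample."""
    return [classes[max(range(len(row)), key=row.__getitem__)]
            for row in avg_probs]


# Hypothetical outputs of three trees for one sample with classes "a" and "b".
per_tree = [
    [[0.9, 0.1]],
    [[0.4, 0.6]],
    [[0.4, 0.6]],
]
avg = average_probabilities(per_tree)
labels = predict_labels(avg, ["a", "b"])
```

In this example two of the three trees favour "b", yet the averaged probabilities favour "a". Averaging probabilities and counting hard votes can therefore disagree.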

PartitionTreeRegressor

A single Partition Tree for regression tasks.

Inherits: sklearn.base.RegressorMixin, sklearn.base.BaseEstimator

Constructor

PartitionTreeRegressor(
    max_leaves=101,
    boundaries_expansion_factor=0.1,
    min_samples_xy=1.0,
    min_samples_x=1.0,
    min_samples_y=1.0,
    min_gain=0.0,
    min_volume_fraction=0.0,
    max_depth=5,
    min_samples_split=2.0,
)

Methods

| Method | Description |
| --- | --- |
| fit(X, y, sample_weights=None) | Fit the tree on training data |
| predict(X) | Predict target values; returns np.ndarray |

PartitionForestRegressor

Ensemble of Partition Trees for regression.

Inherits: sklearn.base.RegressorMixin, sklearn.base.BaseEstimator

Constructor

PartitionForestRegressor(
    n_estimators=100,
    max_leaves=101,
    boundaries_expansion_factor=0.1,
    min_samples_xy=1.0,
    min_samples_x=1.0,
    min_samples_y=1.0,
    min_gain=0.0,
    min_volume_fraction=0.0,
    max_depth=5,
    min_samples_split=2.0,
    max_samples=0.8,
    max_features=0.8,
    random_state=42,
)

Methods

| Method | Description |
| --- | --- |
| fit(X, y, sample_weights=None) | Fit the forest on training data |
| predict(X) | Predict target values; returns np.ndarray |

partition_tree.skpro

Probabilistic regressors that return full predictive distributions, based on skpro’s BaseProbaRegressor.

from partition_tree.skpro import (
    PartitionTreeRegressor,
    PartitionForestRegressor,
    IntervalDistribution,
)

PartitionTreeRegressor (skpro)

A single Partition Tree for probabilistic regression.

Inherits: skpro.regression.base.BaseProbaRegressor

Constructor

PartitionTreeRegressor(
    max_leaves=101,
    boundaries_expansion_factor=0.1,
    min_samples_xy=1.0,
    min_samples_x=1.0,
    min_samples_y=1.0,
    min_gain=0.0,
    min_volume_fraction=0.0,
    max_depth=5,
    min_samples_split=2.0,
)

Methods

| Method | Returns | Description |
| --- | --- | --- |
| fit(X, y) | self | Fit the tree |
| predict(X) | pd.DataFrame | Point predictions (posterior mean) |
| predict_proba(X) | IntervalDistribution | Full predictive distribution |
| apply(X) | array | Leaf index per sample |
| get_leaves_info() | list[dict] | Metadata for each leaf |
| get_feature_importances(normalize=True) | dict | Feature importances from split gains |

PartitionForestRegressor (skpro)

Ensemble of Partition Trees for probabilistic regression.

Inherits: skpro.regression.base.BaseProbaRegressor

Constructor

PartitionForestRegressor(
    n_estimators=100,
    max_leaves=101,
    boundaries_expansion_factor=0.1,
    min_samples_xy=1.0,
    min_samples_x=1.0,
    min_samples_y=1.0,
    min_gain=0.0,
    min_volume_fraction=0.0,
    max_depth=5,
    min_samples_split=2.0,
    seed=42,
)

Methods

| Method | Returns | Description |
| --- | --- | --- |
| fit(X, y) | self | Fit the forest |
| predict(X) | pd.DataFrame | Point predictions |
| predict_proba(X) | IntervalDistribution | Mixed predictive distribution (averaged across trees) |
| predict_proba_per_tree(X) | list[IntervalDistribution] | Individual distribution from each tree |

IntervalDistribution

Piecewise-uniform distribution over disjoint intervals. Returned by predict_proba on the skpro estimators.

Inherits: skpro.distributions.base.BaseDistribution

Constructor

IntervalDistribution(
    intervals,          # list[list[tuple(low, high)]] per instance
    pdf_values=None,    # list[array] of densities per interval
    index=None,         # pd.Index
    columns=None,       # pd.Index
)

Methods

| Method | Signature | Description |
| --- | --- | --- |
| mean() | () -> pd.DataFrame | Posterior mean |
| var() | () -> pd.DataFrame | Posterior variance |
| pdf(x) | (x) -> pd.DataFrame | Density at points x |
| log_pdf(x) | (x) -> pd.DataFrame | Log-density at points x |
| cdf(x) | (x) -> pd.DataFrame | CDF at points x |
| ppf(q) | (q) -> pd.DataFrame | Quantile function (inverse CDF) |
| sample(n_samples) | (int) -> pd.DataFrame | Random samples |
| energy(x) | (x) -> pd.DataFrame | Energy score |
| plot(ax, colors, alpha) | (...) -> Axes | Plot the piecewise-constant PDF |
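
The sketch below shows how the mean, variance, CDF and quantile function of a piecewise-uniform distribution can be computed for a single instance. It is a pure-Python illustration, not the library's implementation. It assumes the intervals are disjoint, sorted, and that density times width sums to 1. All function names here are hypothetical.

```python
def masses(intervals, densities):
    """Probability mass of each interval: density times width."""
    return [d * (hi - lo) for (lo, hi), d in zip(intervals, densities)]


def pu_mean(intervals, densities):
    # Each uniform piece contributes its mass times its midpoint.
    return sum(m * (lo + hi) / 2
               for (lo, hi), m in zip(intervals, masses(intervals, densities)))


def pu_var(intervals, densities):
    # E[X^2] of a uniform on [lo, hi] is (lo^2 + lo*hi + hi^2) / 3.
    second = sum(m * (lo * lo + lo * hi + hi * hi) / 3
                 for (lo, hi), m in zip(intervals, masses(intervals, densities)))
    return second - pu_mean(intervals, densities) ** 2


def pu_cdf(x, intervals, densities):
    total = 0.0
    for (lo, hi), d in zip(intervals, densities):
        if x >= hi:
            total += d * (hi - lo)   # whole interval lies below x
        elif x > lo:
            total += d * (x - lo)    # x falls inside this interval
    return total


def pu_ppf(q, intervals, densities):
    acc = 0.0
    for (lo, hi), d in zip(intervals, densities):
        m = d * (hi - lo)
        if m > 0 and q <= acc + m:
            return lo + (q - acc) / d  # invert the linear CDF segment
        acc += m
    return intervals[-1][1]


# Hypothetical instance: half the mass on [0, 1], half on [2, 4].
intervals = [(0.0, 1.0), (2.0, 4.0)]
densities = [0.5, 0.25]
mu = pu_mean(intervals, densities)
sigma2 = pu_var(intervals, densities)
```

Because the CDF is linear within each interval and flat in the gaps, pu_ppf inverts it one segment at a time.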

Class Methods

| Method | Description |
| --- | --- |
| IntervalDistribution.from_mixture(distributions, weights, index, columns) | Create a mixture distribution by weighted averaging |
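
As an illustration of what weighted averaging of piecewise-uniform densities involves, the following sketch mixes two components for a single instance. It is not the body of from_mixture. It assumes each component is a pair (intervals, densities) with disjoint intervals, and that weights are normalised to sum to 1. The helper names are hypothetical.

```python
def density_at(x, intervals, densities):
    """Density of one piecewise-uniform component at point x."""
    for (lo, hi), d in zip(intervals, densities):
        if lo <= x < hi:
            return d
    return 0.0


def mix(components, weights):
    """Weighted average of piecewise-uniform densities.

    The result is again piecewise-uniform on the union of all breakpoints.
    """
    total = sum(weights)
    # Every interval endpoint of every component is a potential breakpoint.
    edges = sorted({edge
                    for intervals, _ in components
                    for interval in intervals
                    for edge in interval})
    out_intervals, out_densities = [], []
    for lo, hi in zip(edges, edges[1:]):
        mid = (lo + hi) / 2  # each density is constant between breakpoints
        d = sum(w / total * density_at(mid, ivs, ds)
                for (ivs, ds), w in zip(components, weights))
        if d > 0:            # drop gaps where no component has mass
            out_intervals.append((lo, hi))
            out_densities.append(d)
    return out_intervals, out_densities


# Two hypothetical per-tree distributions for the same instance.
tree_a = ([(0.0, 2.0)], [0.5])
tree_b = ([(1.0, 3.0)], [0.5])
mixed_intervals, mixed_densities = mix([tree_a, tree_b], [1.0, 1.0])
```

The overlap [1, 2] receives density from both components, so the mixture is highest there while the total mass stays at 1.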

Common Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| max_leaves | int | 101 | Maximum number of tree leaves |
| max_depth | int | 5 | Maximum depth of the tree |
| min_samples_split | float | 2.0 | Minimum samples required to split a node |
| min_samples_xy | float | 1.0 | Minimum joint-cell sample count |
| min_samples_x | float | 1.0 | Minimum \(X\)-projection sample count |
| min_samples_y | float | 1.0 | Minimum \(Y\)-projection sample count |
| min_gain | float | 0.0 | Minimum log-loss gain to accept a split |
| min_volume_fraction | float | 0.0 | Minimum fraction of root \(Y\)-volume for a leaf |
| boundaries_expansion_factor | float | 0.1 | Padding for the outcome bounding box |
| n_estimators | int | 100 | Number of trees in the forest |
| max_samples | float | 0.8 | Fraction of training samples per tree |
| max_features | float | 0.8 | Fraction of features per split |
| seed / random_state | int \| None | 42 | Random seed for reproducibility |
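
The fractional parameters max_samples and max_features, together with the seed, can be pictured with the sketch below. It rests on assumptions: the fraction is turned into a count by truncation, and indices are drawn without replacement. The library may do either differently. sample_fraction is a hypothetical helper and is not part of partition_tree.

```python
import random


def sample_fraction(n_items, fraction, rng):
    """Draw a sorted subset of indices covering `fraction` of `n_items`.

    Assumption: truncate to an integer count (at least 1) and sample
    without replacement.
    """
    count = max(1, int(fraction * n_items))
    return sorted(rng.sample(range(n_items), count))


rng = random.Random(42)                # fixed seed for reproducible draws
rows = sample_fraction(10, 0.8, rng)   # e.g. rows used to fit one tree
feats = sample_fraction(5, 0.8, rng)   # e.g. features considered at one split
```

Re-creating the generator with the same seed reproduces the same subsets, which is what makes a seeded forest repeatable.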