MLutils module
-
class MLutils.Classifier(n_jobs=-1)
Bases: MLutils.Model
-
compute_metrics_and_graphs(pred, actual, output_path='outputs/plots/mlflow_artifacts')
Evaluate the model’s performance and store the resulting plots.
- Parameters
pred (list) – List of predictions
actual (list) – List of labels
output_path (str, optional) – Path indicating where to store the plots. Defaults to “outputs/plots/mlflow_artifacts”.
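
The metrics this method reports are not listed here. As a rough sketch only, assuming accuracy and per-class precision and recall are among them, the computation from the pred and actual lists could look like this. The helper name classification_metrics is hypothetical and not part of MLutils, and plot generation is left out.

```python
from collections import Counter


def classification_metrics(pred, actual):
    """Return accuracy plus per-class precision and recall (hypothetical helper)."""
    if len(pred) != len(actual):
        raise ValueError("pred and actual must have the same length")
    if not actual:
        raise ValueError("need at least one sample")

    # Count (actual, predicted) pairs: a sparse confusion matrix.
    confusion = Counter(zip(actual, pred))
    labels = sorted(set(actual) | set(pred))

    correct = sum(confusion[(label, label)] for label in labels)
    metrics = {"accuracy": correct / len(actual), "per_class": {}}

    for label in labels:
        true_positive = confusion[(label, label)]
        predicted_as_label = sum(confusion[(other, label)] for other in labels)
        actually_label = sum(confusion[(label, other)] for other in labels)
        metrics["per_class"][label] = {
            "precision": true_positive / predicted_as_label if predicted_as_label else 0.0,
            "recall": true_positive / actually_label if actually_label else 0.0,
        }
    return metrics


result = classification_metrics(pred=[1, 0, 1, 1], actual=[1, 0, 0, 1])
print(result["accuracy"])  # 3 of 4 predictions match, so 0.75
```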
-
fit(x, y)
Fit a classifier using a cross-validated grid search.
- Parameters
x (DataFrame) – Features
y (DataFrame) – Target
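
To illustrate what a cross-validated grid search does, here is a minimal sketch built on scikit-learn's GridSearchCV. The estimator, the parameter grid, the number of folds, and the scoring metric are all assumptions chosen for the example; the values MLutils actually uses are not documented here. The target is passed as a Series to avoid scikit-learn's column-vector warning.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic data standing in for the real features and target.
features, labels = make_classification(n_samples=120, n_features=6, random_state=0)
x = pd.DataFrame(features, columns=[f"f{i}" for i in range(6)])
y = pd.Series(labels, name="target")

# Hypothetical grid: every combination is scored on each cross-validation fold.
param_grid = {"n_estimators": [10, 30], "max_depth": [3, 6]}

search = GridSearchCV(
    estimator=RandomForestClassifier(random_state=0),
    param_grid=param_grid,
    cv=3,                # 3-fold cross-validation (assumed)
    scoring="accuracy",  # assumed metric
    n_jobs=1,            # -1, the documented Classifier default, would use all cores
)
search.fit(x, y)

# By default the best combination is refit on the full data set.
best_model = search.best_estimator_
print(search.best_params_, round(search.best_score_, 3))
```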
-
-
class MLutils.Model
Bases: object
Parent class of all model objects. Two classes inherit from it: Regressor and Classifier.
-
static getModel(task, n_jobs=1)
-
static
-
class MLutils.Preprocess(scaler=None, numeric_na_fill_method=None, category_na_fill_method=None, one_hot_encoding=True)
Bases: object
Preprocess data (scaling, imputation, one-hot encoding).
- Parameters
scaler (str, optional) – Type of scaling to use. Defaults to None.
numeric_na_fill_method (str, optional) – Imputation method for numerical variables. Defaults to None.
category_na_fill_method (str, optional) – Imputation method for categorical variables. Defaults to None.
one_hot_encoding (bool, optional) – Whether to one-hot encode non-numeric categorical variables. Defaults to True.
-
fit(df)
-
fit_transform(df)
-
transform(df, verbose=True)
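
The three methods above follow the usual fit/transform split. As a sketch of how such a preprocessor might behave, assuming median imputation for numeric columns, most-frequent imputation for categorical columns, standard scaling, and one-hot encoding: the class SimplePreprocess below is hypothetical, and the strings Preprocess actually accepts for scaler and the fill methods are not documented here.

```python
import pandas as pd


class SimplePreprocess:
    """Hypothetical stand-in: median/mode imputation, standard scaling, one-hot encoding."""

    def fit(self, df):
        # Learn every statistic from the training data only.
        self.numeric_cols_ = df.select_dtypes(include="number").columns.tolist()
        self.category_cols_ = [c for c in df.columns if c not in self.numeric_cols_]
        self.medians_ = df[self.numeric_cols_].median()
        self.modes_ = {c: df[c].mode().iloc[0] for c in self.category_cols_}
        filled = df[self.numeric_cols_].fillna(self.medians_)
        self.means_ = filled.mean()
        self.stds_ = filled.std(ddof=0).replace(0, 1.0)  # avoid division by zero
        encoded = pd.get_dummies(df[self.category_cols_].fillna(self.modes_))
        self.dummy_cols_ = encoded.columns.tolist()
        return self

    def transform(self, df):
        # Apply the stored statistics, so new data is treated like the training data.
        numeric = (df[self.numeric_cols_].fillna(self.medians_) - self.means_) / self.stds_
        encoded = pd.get_dummies(df[self.category_cols_].fillna(self.modes_), dtype=float)
        # Drop categories unseen during fit and add missing ones as all-zero columns.
        encoded = encoded.reindex(columns=self.dummy_cols_, fill_value=0.0)
        return pd.concat([numeric, encoded], axis=1)

    def fit_transform(self, df):
        return self.fit(df).transform(df)


train = pd.DataFrame({
    "age": [20.0, None, 40.0, 30.0],
    "city": ["Paris", "Lyon", "Paris", None],
})
test = pd.DataFrame({"age": [50.0, None], "city": ["Nice", "Lyon"]})

prep = SimplePreprocess()
train_out = prep.fit_transform(train)
test_out = prep.transform(test)
print(train_out.columns.tolist())
```

Keeping the learned statistics on the object is what lets transform be applied to unseen data without leaking information from it.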
-
class MLutils.Regressor(n_jobs=-1)
Bases: MLutils.Model
-
compute_metrics_and_graphs(pred, actual, output_path='outputs/plots/mlflow_artifacts')
Evaluate the model’s performance and store the resulting plots.
- Parameters
pred (list) – List of predictions
actual (list) – List of labels
output_path (str, optional) – Path indicating where to store the plots. Defaults to “outputs/plots/mlflow_artifacts”.
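
For the regression case, the reported metrics are likewise not listed here. A minimal sketch, assuming mean absolute error, root mean squared error, and the coefficient of determination are of interest, could be written as follows. The helper name regression_metrics is hypothetical, and plotting is omitted.

```python
import math


def regression_metrics(pred, actual):
    """Return MAE, RMSE and R^2 for two equally long lists (hypothetical helper)."""
    if len(pred) != len(actual) or not actual:
        raise ValueError("pred and actual must be non-empty and of equal length")

    n = len(actual)
    errors = [p - a for p, a in zip(pred, actual)]
    squared_error_sum = sum(e * e for e in errors)

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(squared_error_sum / n)

    # R^2 compares the model against always predicting the mean of the labels.
    mean_actual = sum(actual) / n
    total_sum_of_squares = sum((a - mean_actual) ** 2 for a in actual)
    if total_sum_of_squares:
        r2 = 1.0 - squared_error_sum / total_sum_of_squares
    else:
        r2 = float("nan")  # undefined when all labels are identical

    return {"mae": mae, "rmse": rmse, "r2": r2}


scores = regression_metrics(pred=[2.5, 0.0, 2.0, 8.0], actual=[3.0, -0.5, 2.0, 7.0])
print(scores)
```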
-
fit(x, y)
-
-
MLutils.explain(x, model, task, path='outputs/plots/mlflow_artifacts/shap', n_features=5)
Explain a model’s decisions based on SHAP value approximation. The SHAP algorithm is quadratic in the depth of the trees, so be careful not to go over 12 for max_depth.
- Parameters
x (DataFrame) – Input data
model ([type]) – Model to explain
task (str) – Task to perform. Available: regression, classification.
path (str, optional) – Path indicating where to store the SHAP plots. Defaults to “outputs/plots/mlflow_artifacts/shap”.
n_features (int, optional) – Number of most important features for which to generate a partial dependence plot. Defaults to 5.
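
SHAP values are built on Shapley values from cooperative game theory. To make the idea concrete, the sketch below computes exact Shapley values for a tiny hand-written model by enumerating every feature coalition. This is not how explain works internally (that is not documented here). The function shapley_values and the model toy_model are hypothetical, and replacing absent features with a fixed baseline is a simplifying assumption.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance, by enumerating all feature coalitions.

    Features outside a coalition are replaced by their baseline value.
    The cost grows as 2**n_features, so this only works for tiny inputs.
    """
    n = len(instance)

    def value(coalition):
        mixed = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    contributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Probability weight of a coalition of this size preceding feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {i}) - value(members))
        contributions.append(total)
    return contributions


def toy_model(features):
    # Hypothetical model: linear terms plus one interaction between features 0 and 1.
    return 2.0 * features[0] + features[1] + features[0] * features[1] + 0.5 * features[2]


instance = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, instance, baseline)
print([round(v, 6) for v in phi])  # [3.0, 3.0, 2.0]
```

The contributions sum to the difference between the prediction for the instance (8.0) and the prediction for the baseline (0.0), and the interaction term is split evenly between the two features involved. Exhaustive enumeration is exponential in the number of features, which is why practical tools rely on faster, model-specific algorithms for tree ensembles.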