ML_AI_training/earlier_versions/GSCV_base

# Logistic regression:
    pnca
    input: numerical features
    output: dm/om: target

Grid search over a single model with a grid of hyperparameter choices: returns the best hyperparameter combination based on a SINGLE metric!
    -- question: which metric should be optimised?
Grid search over multiple models, each with its own hyperparameters: returns the OVERALL best model-hyperparameter combination, again based on a single score.
    -- question: which metric should be optimised?
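
One way to approach the "which metric" question is to score every candidate on all metrics in a single run and select on only one of them. The sketch below is illustrative, not the script behind these notes: the synthetic data, the "scale" step, cv=5, the grid values and the choice of refit="mcc" are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, matthews_corrcoef
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the numerical features and the binary target (assumption).
X, y = make_classification(n_samples=200, n_features=10, random_state=42)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(random_state=42)),
])

# Illustrative grid; the key names follow the 'clf__...' pattern seen in the results.
param_grid = {
    "clf__solver": ["liblinear", "saga"],
    "clf__max_iter": [100, 200],
}

# Every candidate is scored on all five metrics.
scoring = {
    "accuracy": "accuracy",
    "f1": "f1",
    "recall": "recall",
    "roc_auc": "roc_auc",
    "mcc": make_scorer(matthews_corrcoef),
}

# refit names the single metric that decides best_params_ and best_score_.
gscv = GridSearchCV(pipe, param_grid, scoring=scoring, refit="mcc", cv=5)
gscv.fit(X, y)

# Mean cross-validated score of the selected candidate on each metric.
best_scores = {
    name: gscv.cv_results_[f"mean_test_{name}"][gscv.best_index_]
    for name in scoring
}

print("Best model:\n", gscv.best_params_)
for name, value in best_scores.items():
    print(name, value)
```

With multi-metric scoring, best_score_ is the mean score on the refit metric only; the other metrics are reported in cv_results_ but do not influence the selection.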


# Demonstration

###################
# Metric1: accuracy
###################

Best model:
 {'clf__max_iter': 100, 'clf__solver': 'liblinear'}

Best models score:
 0.7145320197044336


###################
# Metric2: F1
###################
Best model:
 {'clf__max_iter': 100, 'clf__solver': 'saga'}
Best models score:
 0.7550294183111348


###################
# Metric3: Recall
###################
Best model:
 {'clf__max_iter': 100, 'clf__solver': 'saga'}
Best models score:
 0.8216666666666667


###################
# Metric4: ROC_AUC
###################

Best model:
 {'clf__max_iter': 200, 'clf__solver': 'sag'}
Best models score:
 0.7711904761904762

###################
# Metric5: MCC
###################

Best model:
 {'clf__max_iter': 100, 'clf__solver': 'saga'}
Best models score:
 0.4322970173039572

 sklearn/linear_model/_sag.py:354: ConvergenceWarning: The max_iter was reached which means the coef_ did not converge
ConvergenceWarning,
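
The warning above means the sag/saga solver stopped at max_iter before the coefficients settled. Two common remedies are a larger max_iter and feature scaling, since these solvers are only guaranteed to converge quickly on features of similar scale. A minimal sketch on synthetic data; the data, the "scale" step and max_iter=5000 are assumptions, and whether scaling was missing in the runs above is not recorded here.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption).
X, y = make_classification(n_samples=200, n_features=10, random_state=42)

pipe = Pipeline([
    ("scale", StandardScaler()),  # put features on a comparable scale
    ("clf", LogisticRegression(solver="saga", max_iter=5000, random_state=42)),
])
pipe.fit(X, y)

# n_iter_ reports how many iterations the solver actually ran.
n_iter_used = int(pipe.named_steps["clf"].n_iter_[0])
print("iterations used:", n_iter_used, "of", pipe.named_steps["clf"].max_iter)
```

If n_iter_used equals max_iter, the solver still hit the limit and the budget (or the preprocessing) needs another look.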

#####################################
# Same thing but using: CLFSwitcher()
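
The definition of CLFSwitcher is not included in these notes. The sketch below shows one common way to write such a wrapper so that the estimator itself becomes a grid parameter. The class body, the synthetic data, the second model and the grid values are hypothetical.

```python
from sklearn.base import BaseEstimator, ClassifierMixin, clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier


class ClfSwitcher(ClassifierMixin, BaseEstimator):
    """Hypothetical wrapper: holds whichever classifier the grid assigns."""

    def __init__(self, estimator=None):
        self.estimator = estimator

    def fit(self, X, y):
        # Fall back to a default model if the grid did not set one.
        base = self.estimator if self.estimator is not None else LogisticRegression()
        # Fit a clone so the object stored as a hyperparameter stays unfitted.
        self.estimator_ = clone(base).fit(X, y)
        self.classes_ = self.estimator_.classes_
        return self

    def predict(self, X):
        return self.estimator_.predict(X)

    def predict_proba(self, X):
        # Needed by probability-based scorers such as roc_auc.
        return self.estimator_.predict_proba(X)


# Synthetic stand-in data (assumption).
X, y = make_classification(n_samples=200, n_features=10, random_state=42)

pipe = Pipeline([("clf", ClfSwitcher())])

# One dict per model, so each model is only paired with its own hyperparameters.
param_grid = [
    {
        "clf__estimator": [LogisticRegression(random_state=42)],
        "clf__estimator__solver": ["liblinear", "newton-cg"],
        "clf__estimator__max_iter": [100, 200],
    },
    {
        "clf__estimator": [DecisionTreeClassifier(random_state=42)],
        "clf__estimator__max_depth": [2, 4],
    },
]

gscv = GridSearchCV(pipe, param_grid, scoring="roc_auc", cv=5)
gscv.fit(X, y)

print("Best model:\n", gscv.best_params_)
print("Best models score:\n", gscv.best_score_)
```

A nan best score, as in the ROC_AUC run below, generally means every candidate failed during fitting or scoring (the default error_score is nan). One possible cause, not confirmed by these notes, is a wrapper that does not expose predict_proba or decision_function; passing error_score="raise" to GridSearchCV surfaces the underlying exception.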


###################
# Metric1: Accuracy
###################

Best model:
 {'clf__estimator': LogisticRegression(random_state=42, solver='liblinear'), 'clf__estimator__max_iter': 100, 'clf__estimator__solver': 'liblinear'}
Best models score:
 0.7219298245614035

###################
# Metric2: F1
###################
Best model:
 {'clf__estimator': LogisticRegression(random_state=42, solver='liblinear'), 'clf__estimator__max_iter': 100, 'clf__estimator__solver': 'liblinear'}

print('Best models score:\n', gscv.best_score_)
Best models score:
 0.7585724070894442

###################
# Metric3: Recall
###################
Best model:
 {'clf__estimator': LogisticRegression(random_state=42, solver='liblinear'), 'clf__estimator__max_iter': 100, 'clf__estimator__solver': 'liblinear'}
Best models score:
 0.8198610213316095

###################
# Metric4: ROC_AUC
###################
Best model:
 {'clf__estimator': LogisticRegression(solver='newton-cg'), 'clf__estimator__max_iter': 100, 'clf__estimator__solver': 'newton-cg'}

Best models score:
 nan

###################
# Metric5: MCC
###################
Best model:
 {'clf__estimator': LogisticRegression(random_state=42, solver='liblinear'), 'clf__estimator__max_iter': 100, 'clf__estimator__solver': 'liblin

Best models score:
 0.4480248700902755


print('Best model:\n', gs_dt.best_params_)
Best model:
 {'criterion': 'entropy', 'max_depth': 2, 'max_features': None, 'max_leaf_nodes': 10}

print('Best models score:\n', gs_dt.best_score_)
Best models score:
 0.43290518915746007