/home/tanu/git/LSHTM_analysis/scripts/ml/ml_data.py:550: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  mask_check.sort_values(by = ['ligand_distance'], ascending = True, inplace = True)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/xgboost/compat.py:36: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
  from pandas import MultiIndex, Int64Index
1.22.4
1.4.1

aaindex_df contains non-numerical data

Total no. of non-numerical columns: 2

Selecting numerical data only

PASS: successfully selected numerical columns only for aaindex_df

Now checking for NA in the remaining aaindex_cols

Counting aaindex_df cols with NA
ncols with NA: 4 columns
Dropping these...
Original ncols: 127

Revised df ncols: 123

Checking NA in revised df...

PASS: cols with NA successfully dropped from aaindex_df
Proceeding with combining aa_df with other features_df

PASS: ncols match
Expected ncols: 123
Got: 123

Total no. of columns in clean aa_df: 123

Proceeding to merge, expected nrows in merged_df: 1133

PASS: my_features_df and aa_df successfully combined
nrows: 1133
ncols: 274
count of NULL values before imputation

or_mychisq          339
log10_or_mychisq    339
dtype: int64
count of NULL values AFTER imputation

mutationinformation    0
or_rawI                0
logorI                 0
dtype: int64

PASS: OR values imputed, data ready for ML

No. of numerical features: 46
No. of categorical features: 7

index: 0
ind: 1

Mask count check: True

index: 1
ind: 2

Mask count check: True

index: 2
ind: 3

Mask count check: True
Original Data
Counter({0: 282, 1: 275}) Data dim: (557, 53)

-------------------------------------------------------------
Successfully split data: UQ [no aa_index but active site included] training
actual values: training set
imputed values: blind test set
Train data size: (557, 53)
Test data size: (575, 53)
y_train numbers: Counter({0: 282, 1: 275})
y_train ratio: 1.0254545454545454

y_test_numbers: Counter({0: 545, 1: 30})
y_test ratio: 18.166666666666668
-------------------------------------------------------------
Simple Random OverSampling
Counter({0: 282, 1: 282})
(564, 53)
Simple Random UnderSampling
Counter({0: 275, 1: 275})
(550, 53)
Simple Combined Over and UnderSampling
Counter({0: 282, 1: 282})
(564, 53)
SMOTE_NC OverSampling
Counter({0: 282, 1: 282})
(564, 53)

#####################################################################

Running ML analysis: UQ [without AA index but with active site annotations]
Gene name: rpoB
Drug name: rifampicin

Output directory: /home/tanu/git/Data/rifampicin/output/ml/uq_v1/

Sanity checks:
Total input features: 53

Training data size: (557, 53)
Test data size: (575, 53)

Target feature numbers (training data): Counter({0: 282, 1: 275})
Target features ratio (training data): 1.0254545454545454

Target feature numbers (test data): Counter({0: 545, 1: 30})
Target features ratio (test data): 18.166666666666668

#####################################################################


================================================================

Structural features (n): 37
These are:
Common stability features: ['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist']
FoldX columns: ['electro_rr', 'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr', 'disulfide_mm', 'disulfide_sm', 'disulfide_ss', 'hbonds_rr', 'hbonds_mm', 'hbonds_sm', 'hbonds_ss', 'partcov_rr', 'partcov_mm', 'partcov_sm', 'partcov_ss', 'vdwclashes_rr', 'vdwclashes_mm', 'vdwclashes_sm', 'vdwclashes_ss', 'volumetric_rr', 'volumetric_mm', 'volumetric_ss']
Other struc columns: ['rsa', 'kd_values', 'rd_values']
================================================================

Evolutionary features (n): 3
These are:
['consurf_score', 'snap2_score', 'provean_score']
================================================================

Genomic features (n): 6
These are:
['maf', 'logorI']
['lineage_proportion', 'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique']
================================================================

Categorical features (n): 7
These are:
['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site']
================================================================


Pass: No. of features match

#####################################################################


Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
'provean_score', 'maf', 'logorI', 'lineage_proportion',
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegression(random_state=42))])

key: fit_time
value: [0.02604318 0.02941751 0.02129436 0.0241468 0.02635837 0.02950144
 0.02385283 0.02729487 0.02373672 0.02400374]

mean value: 0.025564980506896973

key: score_time
value: [0.01137805 0.01106167 0.01085043 0.0108633 0.01163006 0.01123095
 0.01119161 0.01098061 0.01095009 0.01113772]

mean value: 0.011127448081970215

key: test_mcc
value: [0.93103448 0.82149863 0.89342711 0.82195294 0.71611487 0.85933785
 0.75047877 0.78174603 0.71735629 0.8565805 ]

mean value: 0.8149527494116898

key: train_mcc
value: [0.8246123 0.83651026 0.81662709 0.8246123 0.84078809 0.82921429
 0.8366859 0.8249619 0.81699263 0.82954689]

mean value: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
0.8280551641651794
|
|
|
|
key: test_accuracy
|
|
value: [0.96428571 0.91071429 0.94642857 0.91071429 0.85714286 0.92857143
|
|
0.875 0.89090909 0.85454545 0.92727273]
|
|
|
|
mean value: 0.9065584415584416
|
|
|
|
key: train_accuracy
|
|
value: [0.91217565 0.91816367 0.90818363 0.91217565 0.92015968 0.91417166
|
|
0.91816367 0.9123506 0.90836653 0.91434263]
|
|
|
|
mean value: 0.9138253373730626
|
|
|
|
key: test_fscore
|
|
value: [0.96428571 0.90566038 0.94736842 0.9122807 0.86206897 0.92592593
|
|
0.87719298 0.88888889 0.86206897 0.92307692]
|
|
|
|
mean value: 0.9068817865833584
|
|
|
|
key: train_fscore
|
|
value: [0.9123506 0.91816367 0.908 0.912 0.92031873 0.91485149
|
|
0.91816367 0.9123506 0.90836653 0.91518738]
|
|
|
|
mean value: 0.9139752661367001
|
|
|
|
key: test_precision
|
|
value: [0.93103448 0.92307692 0.93103448 0.89655172 0.83333333 0.96153846
|
|
0.86206897 0.88888889 0.80645161 0.96 ]
|
|
|
|
mean value: 0.8993978874913247
|
|
|
|
key: train_precision
|
|
value: [0.9015748 0.90909091 0.8972332 0.90118577 0.90588235 0.89534884
|
|
0.90551181 0.9015748 0.8976378 0.8957529 ]
|
|
|
|
mean value: 0.9010793179924724
|
|
|
|
key: test_recall
|
|
value: [1. 0.88888889 0.96428571 0.92857143 0.89285714 0.89285714
|
|
0.89285714 0.88888889 0.92592593 0.88888889]
|
|
|
|
mean value: 0.9164021164021164
|
|
|
|
key: train_recall
|
|
value: [0.9233871 0.92741935 0.91902834 0.92307692 0.93522267 0.93522267
|
|
0.93117409 0.9233871 0.91935484 0.93548387]
|
|
|
|
mean value: 0.9272756954420791
|
|
|
|
key: test_roc_auc
|
|
value: [0.96551724 0.90996169 0.94642857 0.91071429 0.85714286 0.92857143
|
|
0.875 0.89087302 0.85582011 0.9265873 ]
|
|
|
|
mean value: 0.9066616493340631
|
|
|
|
key: train_roc_auc
|
|
value: [0.91228643 0.91825513 0.90833307 0.91232586 0.92036724 0.91446173
|
|
0.91834295 0.91248095 0.90849632 0.91459233]
|
|
|
|
mean value: 0.9139942013981738
|
|
|
|
key: test_jcc
|
|
value: [0.93103448 0.82758621 0.9 0.83870968 0.75757576 0.86206897
|
|
0.78125 0.8 0.75757576 0.85714286]
|
|
|
|
mean value: 0.8312943704886141
|
|
|
|
key: train_jcc
|
|
value: [0.83882784 0.84870849 0.83150183 0.83823529 0.85239852 0.84306569
|
|
0.84870849 0.83882784 0.83211679 0.84363636]
|
|
|
|
mean value: 0.8416027146818327
|
|
|
|
MCC on Blind test: 0.28
|
|
|
|
Accuracy on Blind test: 0.71
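The key/value blocks above (fit_time, score_time, test_mcc, train_jcc and so on) have the shape of the dictionary returned by sklearn.model_selection.cross_validate when it is given a scoring dict and return_train_score=True. The following sketch shows how output of that shape could be produced. It is an assumption about the setup, not the original script: the data is synthetic, and the scorer names are chosen only to match the keys in this log.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, matthews_corrcoef
from sklearn.model_selection import cross_validate

# Synthetic stand-in data (assumption: not the dataset from this log).
X, y = make_classification(n_samples=200, n_features=10, random_state=42)

# Scorer names are hypothetical; they become the "test_<name>" and
# "train_<name>" keys of the result dictionary.
scoring = {
    "mcc": make_scorer(matthews_corrcoef),
    "accuracy": "accuracy",
    "fscore": "f1",
    "precision": "precision",
    "recall": "recall",
    "roc_auc": "roc_auc",
    "jcc": "jaccard",
}

scores = cross_validate(
    LogisticRegression(max_iter=1000, random_state=42),
    X,
    y,
    cv=10,                    # ten folds, so ten values per key
    scoring=scoring,
    return_train_score=True,  # adds the train_* keys
)

# Print each metric in the same key / value / mean value layout as the log.
for key, value in scores.items():
    print("key:", key)
    print("value:", value)
    print("mean value:", np.mean(value))
```

The cross-validation means come from folds of the training data only, so they need not agree with the separate "Blind test" figures reported for each model.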
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.6813972 0.78868866 0.81215572 0.72905326 0.82197165 0.74409461
|
|
0.71482921 0.76164103 0.68979573 0.75526428]
|
|
|
|
mean value: 0.7498891353607178
|
|
|
|
key: score_time
|
|
value: [0.01248741 0.01272368 0.01245189 0.01255608 0.01290011 0.01112986
|
|
0.01275086 0.012429 0.01255012 0.01147771]
|
|
|
|
mean value: 0.012345671653747559
|
|
|
|
key: test_mcc
|
|
value: [0.96481304 0.9284802 0.92857143 0.89802651 0.64285714 0.8660254
|
|
0.82195294 0.89153439 0.82337971 0.8565805 ]
|
|
|
|
mean value: 0.8622221274212197
|
|
|
|
key: train_mcc
|
|
value: [0.91621503 0.94017409 0.93217802 0.93212612 0.95608442 0.92815126
|
|
0.92815126 0.94040302 0.916326 0.93624587]
|
|
|
|
mean value: 0.932605508864504
|
|
|
|
key: test_accuracy
|
|
value: [0.98214286 0.96428571 0.96428571 0.94642857 0.82142857 0.92857143
|
|
0.91071429 0.94545455 0.90909091 0.92727273]
|
|
|
|
mean value: 0.9299675324675325
|
|
|
|
key: train_accuracy
|
|
value: [0.95808383 0.97005988 0.96606786 0.96606786 0.97804391 0.96407186
|
|
0.96407186 0.97011952 0.95816733 0.96812749]
|
|
|
|
mean value: 0.9662881408497745
|
|
|
|
key: test_fscore
|
|
value: [0.98113208 0.96296296 0.96428571 0.94915254 0.82142857 0.92307692
|
|
0.9122807 0.94545455 0.9122807 0.92307692]
|
|
|
|
mean value: 0.9295131661638991
|
|
|
|
key: train_fscore
|
|
value: [0.95740365 0.96957404 0.96537678 0.96551724 0.97768763 0.96341463
|
|
0.96341463 0.9694501 0.95757576 0.96774194]
|
|
|
|
mean value: 0.9657156401043632
|
|
|
|
key: test_precision
|
|
value: [1. 0.96296296 0.96428571 0.90322581 0.82142857 1.
|
|
0.89655172 0.92857143 0.86666667 0.96 ]
|
|
|
|
mean value: 0.9303692874504887
|
|
|
|
key: train_precision
|
|
value: [0.96326531 0.9755102 0.97131148 0.96747967 0.9796748 0.96734694
|
|
0.96734694 0.97942387 0.95951417 0.96774194]
|
|
|
|
mean value: 0.9698615308546767
|
|
|
|
key: test_recall
|
|
value: [0.96296296 0.96296296 0.96428571 1. 0.82142857 0.85714286
|
|
0.92857143 0.96296296 0.96296296 0.88888889]
|
|
|
|
mean value: 0.9312169312169312
|
|
|
|
key: train_recall
|
|
value: [0.9516129 0.96370968 0.95951417 0.96356275 0.9757085 0.95951417
|
|
0.95951417 0.95967742 0.95564516 0.96774194]
|
|
|
|
mean value: 0.961620086195638
|
|
|
|
key: test_roc_auc
|
|
value: [0.98148148 0.9642401 0.96428571 0.94642857 0.82142857 0.92857143
|
|
0.91071429 0.9457672 0.91005291 0.9265873 ]
|
|
|
|
mean value: 0.9299557562488597
|
|
|
|
key: train_roc_auc
|
|
value: [0.95801989 0.96999713 0.96597756 0.96603335 0.97801173 0.96400905
|
|
0.96400905 0.96999619 0.95813754 0.96812294]
|
|
|
|
mean value: 0.9662314429920021
|
|
|
|
key: test_jcc
|
|
value: [0.96296296 0.92857143 0.93103448 0.90322581 0.6969697 0.85714286
|
|
0.83870968 0.89655172 0.83870968 0.85714286]
|
|
|
|
mean value: 0.8711021170976677
|
|
|
|
key: train_jcc
|
|
value: [0.91828794 0.94094488 0.93307087 0.93333333 0.95634921 0.92941176
|
|
0.92941176 0.94071146 0.91860465 0.9375 ]
|
|
|
|
mean value: 0.9337625868482374
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
|
|
|
|
MCC on Blind test: 0.23
|
|
|
|
Accuracy on Blind test: 0.65
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02240419 0.00845432 0.00848174 0.0077436 0.00821161 0.00828028
|
|
0.00818753 0.00795102 0.00834084 0.00831008]
|
|
|
|
mean value: 0.009636521339416504
|
|
|
|
key: score_time
|
|
value: [0.01104975 0.00858045 0.00879431 0.0087378 0.00867748 0.0087285
|
|
0.00872374 0.00838089 0.00867343 0.00884008]
|
|
|
|
mean value: 0.0089186429977417
|
|
|
|
key: test_mcc
|
|
value: [0.74266517 0.48372032 0.77459667 0.71611487 0.40574111 0.61065803
|
|
0.55328334 0.68300095 0.74935731 0.74935731]
|
|
|
|
mean value: 0.6468495081997219
|
|
|
|
key: train_mcc
|
|
value: [0.66487805 0.68935419 0.66458942 0.66570983 0.62725669 0.69324149
|
|
0.67986963 0.68418537 0.66184784 0.66877084]
|
|
|
|
mean value: 0.6699703343840876
|
|
|
|
key: test_accuracy
|
|
value: [0.85714286 0.73214286 0.875 0.85714286 0.69642857 0.80357143
|
|
0.76785714 0.83636364 0.87272727 0.87272727]
|
|
|
|
mean value: 0.8171103896103896
|
|
|
|
key: train_accuracy
|
|
value: [0.82634731 0.83832335 0.82634731 0.8243513 0.79840319 0.84231537
|
|
0.83433134 0.83665339 0.8247012 0.82669323]
|
|
|
|
mean value: 0.8278466970441587
|
|
|
|
key: test_fscore
|
|
value: [0.82608696 0.66666667 0.85714286 0.85185185 0.65306122 0.79245283
|
|
0.73469388 0.81632653 0.8627451 0.8627451 ]
|
|
|
|
mean value: 0.7923772991103286
|
|
|
|
key: train_fscore
|
|
value: [0.80536913 0.81879195 0.80449438 0.79816514 0.75662651 0.82560706
|
|
0.81431767 0.81777778 0.80269058 0.80272109]
|
|
|
|
mean value: 0.8046561286055279
|
|
|
|
key: test_precision
|
|
value: [1. 0.83333333 1. 0.88461538 0.76190476 0.84
|
|
0.85714286 0.90909091 0.91666667 0.91666667]
|
|
|
|
mean value: 0.8919420579420579
|
|
|
|
key: train_precision
|
|
value: [0.90452261 0.91959799 0.9040404 0.92063492 0.93452381 0.90776699
|
|
0.91 0.91089109 0.9040404 0.91709845]
|
|
|
|
mean value: 0.9133116666250641
|
|
|
|
key: test_recall
|
|
value: [0.7037037 0.55555556 0.75 0.82142857 0.57142857 0.75
|
|
0.64285714 0.74074074 0.81481481 0.81481481]
|
|
|
|
mean value: 0.7165343915343916
|
|
|
|
key: train_recall
|
|
value: [0.72580645 0.73790323 0.72469636 0.70445344 0.63562753 0.75708502
|
|
0.73684211 0.74193548 0.72177419 0.71370968]
|
|
|
|
mean value: 0.7199833485699361
|
|
|
|
key: test_roc_auc
|
|
value: [0.85185185 0.72605364 0.875 0.85714286 0.69642857 0.80357143
|
|
0.76785714 0.83465608 0.87169312 0.87169312]
|
|
|
|
mean value: 0.8155947819740923
|
|
|
|
key: train_roc_auc
|
|
value: [0.82535382 0.83733106 0.8249466 0.82269916 0.79616022 0.84114094
|
|
0.83298798 0.83553467 0.82348552 0.82535878]
|
|
|
|
mean value: 0.8264998750879309
|
|
|
|
key: test_jcc
|
|
value: [0.7037037 0.5 0.75 0.74193548 0.48484848 0.65625
|
|
0.58064516 0.68965517 0.75862069 0.75862069]
|
|
|
|
mean value: 0.6624279385437617
|
|
|
|
key: train_jcc
|
|
value: [0.6741573 0.69318182 0.67293233 0.66412214 0.60852713 0.70300752
|
|
0.68679245 0.69172932 0.67041199 0.67045455]
|
|
|
|
mean value: 0.6735316546975922
|
|
|
|
MCC on Blind test: 0.34
|
|
|
|
Accuracy on Blind test: 0.78
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00915837 0.008641 0.00853086 0.00855017 0.00856733 0.00867844
|
|
0.00850987 0.00852513 0.00848937 0.00854254]
|
|
|
|
mean value: 0.008619308471679688
|
|
|
|
key: score_time
|
|
value: [0.00892973 0.00885773 0.00864935 0.00866771 0.00876236 0.00893545
|
|
0.0087409 0.00873089 0.00877738 0.00871181]
|
|
|
|
mean value: 0.008776330947875976
|
|
|
|
key: test_mcc
|
|
value: [0.89342711 0.74984143 0.85714286 0.71428571 0.67900461 0.78571429
|
|
0.64285714 0.71049701 0.75878131 0.74935731]
|
|
|
|
mean value: 0.7540908782235038
|
|
|
|
key: train_mcc
|
|
value: [0.76073062 0.76464682 0.75244668 0.78078676 0.77655234 0.75249829
|
|
0.76042979 0.77325226 0.78086182 0.77758373]
|
|
|
|
mean value: 0.7679789125420294
|
|
|
|
key: test_accuracy
|
|
value: [0.94642857 0.875 0.92857143 0.85714286 0.83928571 0.89285714
|
|
0.82142857 0.85454545 0.87272727 0.87272727]
|
|
|
|
mean value: 0.8760714285714286
|
|
|
|
key: train_accuracy
|
|
value: [0.88023952 0.88223553 0.8762475 0.89021956 0.88822355 0.8762475
|
|
0.88023952 0.88645418 0.89043825 0.88844622]
|
|
|
|
mean value: 0.8838991340029105
|
|
|
|
key: test_fscore
|
|
value: [0.94545455 0.86792453 0.92857143 0.85714286 0.84210526 0.89285714
|
|
0.82142857 0.84615385 0.88135593 0.8627451 ]
|
|
|
|
mean value: 0.8745739213310778
|
|
|
|
key: train_fscore
|
|
value: [0.88047809 0.88223553 0.87449393 0.89021956 0.8875502 0.875
|
|
0.87804878 0.88667992 0.88933602 0.88932806]
|
|
|
|
mean value: 0.8833370085701109
|
|
|
|
key: test_precision
|
|
value: [0.92857143 0.88461538 0.92857143 0.85714286 0.82758621 0.89285714
|
|
0.82142857 0.88 0.8125 0.91666667]
|
|
|
|
mean value: 0.8749939686750031
|
|
|
|
key: train_precision
|
|
value: [0.87007874 0.87351779 0.87449393 0.87795276 0.88047809 0.87148594
|
|
0.88163265 0.8745098 0.8875502 0.87209302]
|
|
|
|
mean value: 0.8763792922216086
|
|
|
|
key: test_recall
|
|
value: [0.96296296 0.85185185 0.92857143 0.85714286 0.85714286 0.89285714
|
|
0.82142857 0.81481481 0.96296296 0.81481481]
|
|
|
|
mean value: 0.8764550264550264
|
|
|
|
key: train_recall
|
|
value: [0.89112903 0.89112903 0.87449393 0.90283401 0.89473684 0.87854251
|
|
0.87449393 0.89919355 0.89112903 0.90725806]
|
|
|
|
mean value: 0.8904939924252318
|
|
|
|
key: test_roc_auc
|
|
value: [0.94699872 0.87420179 0.92857143 0.85714286 0.83928571 0.89285714
|
|
0.82142857 0.85383598 0.87433862 0.87169312]
|
|
|
|
mean value: 0.8760353950009122
|
|
|
|
key: train_roc_auc
|
|
value: [0.88034712 0.88232341 0.87622334 0.89039338 0.8883133 0.87627913
|
|
0.88016035 0.88660465 0.89044641 0.8886684 ]
|
|
|
|
mean value: 0.8839759495598507
|
|
|
|
key: test_jcc
|
|
value: [0.89655172 0.76666667 0.86666667 0.75 0.72727273 0.80645161
|
|
0.6969697 0.73333333 0.78787879 0.75862069]
|
|
|
|
mean value: 0.7790411905484208
|
|
|
|
key: train_jcc
|
|
value: [0.78647687 0.78928571 0.77697842 0.80215827 0.79783394 0.77777778
|
|
0.7826087 0.79642857 0.80072464 0.80071174]
|
|
|
|
mean value: 0.7910984634590573
|
|
|
|
MCC on Blind test: 0.29
|
|
|
|
Accuracy on Blind test: 0.72
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.00800991 0.00789165 0.00820088 0.00811696 0.00825524 0.00775909
|
|
0.00819016 0.00785351 0.00778246 0.00778031]
|
|
|
|
mean value: 0.007984018325805664
|
|
|
|
key: score_time
|
|
value: [0.08504152 0.01459813 0.01290846 0.01306796 0.0130856 0.01207209
|
|
0.01181221 0.01277232 0.01141 0.01144505]
|
|
|
|
mean value: 0.01982133388519287
|
|
|
|
key: test_mcc
|
|
value: [0.85696041 0.74984143 0.78772636 0.67900461 0.75047877 0.78571429
|
|
0.64450339 0.74569602 0.65330526 0.78353876]
|
|
|
|
mean value: 0.7436769286015497
|
|
|
|
key: train_mcc
|
|
value: [0.79646836 0.79243629 0.77242951 0.80040802 0.78877235 0.78048897
|
|
0.78837632 0.78487523 0.80887676 0.7817104 ]
|
|
|
|
mean value: 0.7894842220535155
|
|
|
|
key: test_accuracy
|
|
value: [0.92857143 0.875 0.89285714 0.83928571 0.875 0.89285714
|
|
0.82142857 0.87272727 0.81818182 0.89090909]
|
|
|
|
mean value: 0.8706818181818182
|
|
|
|
key: train_accuracy
|
|
value: [0.89820359 0.89620758 0.88622754 0.9001996 0.89421158 0.89021956
|
|
0.89421158 0.89243028 0.90438247 0.89043825]
|
|
|
|
mean value: 0.8946732033940088
|
|
|
|
key: test_fscore
|
|
value: [0.92592593 0.86792453 0.89655172 0.83636364 0.87272727 0.89285714
|
|
0.82758621 0.86792453 0.83333333 0.88461538]
|
|
|
|
mean value: 0.8705809683460952
|
|
|
|
key: train_fscore
|
|
value: [0.89779559 0.89558233 0.88484848 0.89919355 0.89421158 0.88933602
|
|
0.89249493 0.89156627 0.904 0.89151874]
|
|
|
|
mean value: 0.8940547478417012
|
|
|
|
key: test_precision
|
|
value: [0.92592593 0.88461538 0.86666667 0.85185185 0.88888889 0.89285714
|
|
0.8 0.88461538 0.75757576 0.92 ]
|
|
|
|
mean value: 0.8672997002997003
|
|
|
|
key: train_precision
|
|
value: [0.89243028 0.892 0.88306452 0.89558233 0.88188976 0.884
|
|
0.89430894 0.888 0.8968254 0.87258687]
|
|
|
|
mean value: 0.8880688100611991
|
|
|
|
key: test_recall
|
|
value: [0.92592593 0.85185185 0.92857143 0.82142857 0.85714286 0.89285714
|
|
0.85714286 0.85185185 0.92592593 0.85185185]
|
|
|
|
mean value: 0.8764550264550265
|
|
|
|
key: train_recall
|
|
value: [0.90322581 0.89919355 0.88663968 0.90283401 0.90688259 0.89473684
|
|
0.89068826 0.89516129 0.91129032 0.91129032]
|
|
|
|
mean value: 0.9001942666840799
|
|
|
|
key: test_roc_auc
|
|
value: [0.9284802 0.87420179 0.89285714 0.83928571 0.875 0.89285714
|
|
0.82142857 0.8723545 0.82010582 0.89021164]
|
|
|
|
mean value: 0.8706782521437694
|
|
|
|
key: train_roc_auc
|
|
value: [0.89825322 0.89623709 0.88623322 0.9002359 0.89438618 0.89028181
|
|
0.89416303 0.89246253 0.90446406 0.89068453]
|
|
|
|
mean value: 0.8947401572130679
|
|
|
|
key: test_jcc
|
|
value: [0.86206897 0.76666667 0.8125 0.71875 0.77419355 0.80645161
|
|
0.70588235 0.76666667 0.71428571 0.79310345]
|
|
|
|
mean value: 0.772056897564365
|
|
|
|
key: train_jcc
|
|
value: [0.81454545 0.81090909 0.79347826 0.81684982 0.80866426 0.80072464
|
|
0.80586081 0.80434783 0.82481752 0.80427046]
|
|
|
|
mean value: 0.8084468133612275
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.72
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01613522 0.01378536 0.01915169 0.01421118 0.01374722 0.01693225
|
|
0.014081 0.0139122 0.01385522 0.01448727]
|
|
|
|
mean value: 0.01502985954284668
|
|
|
|
key: score_time
|
|
value: [0.00889826 0.00894713 0.00886488 0.00883365 0.00874567 0.00876617
|
|
0.00901341 0.00938988 0.00874805 0.00884128]
|
|
|
|
mean value: 0.008904838562011718
|
|
|
|
key: test_mcc
|
|
value: [0.89342711 0.82149863 0.89342711 0.71428571 0.67900461 0.85714286
|
|
0.71611487 0.71049701 0.71735629 0.74935731]
|
|
|
|
mean value: 0.7752111518648109
|
|
|
|
key: train_mcc
|
|
value: [0.77670104 0.78487855 0.77670104 0.79675795 0.80065667 0.78078676
|
|
0.79658289 0.79328084 0.79284399 0.78122197]
|
|
|
|
mean value: 0.7880411697036507
|
|
|
|
key: test_accuracy
|
|
value: [0.94642857 0.91071429 0.94642857 0.85714286 0.83928571 0.92857143
|
|
0.85714286 0.85454545 0.85454545 0.87272727]
|
|
|
|
mean value: 0.8867532467532467
|
|
|
|
key: train_accuracy
|
|
value: [0.88822355 0.89221557 0.88822355 0.89820359 0.9001996 0.89021956
|
|
0.89820359 0.89641434 0.89641434 0.89043825]
|
|
|
|
mean value: 0.8938755954227005
|
|
|
|
key: test_fscore
|
|
value: [0.94545455 0.90566038 0.94736842 0.85714286 0.84210526 0.92857143
|
|
0.86206897 0.84615385 0.86206897 0.8627451 ]
|
|
|
|
mean value: 0.8859339767965393
|
|
|
|
key: train_fscore
|
|
value: [0.88844622 0.89285714 0.888 0.89820359 0.9 0.89021956
|
|
0.89779559 0.8968254 0.89558233 0.89065606]
|
|
|
|
mean value: 0.893858589263252
|
|
|
|
key: test_precision
|
|
value: [0.92857143 0.92307692 0.93103448 0.85714286 0.82758621 0.92857143
|
|
0.83333333 0.88 0.80645161 0.91666667]
|
|
|
|
mean value: 0.8832434939921036
|
|
|
|
key: train_precision
|
|
value: [0.87795276 0.87890625 0.87747036 0.88582677 0.88932806 0.87795276
|
|
0.88888889 0.8828125 0.892 0.87843137]
|
|
|
|
mean value: 0.8829569713874807
|
|
|
|
key: test_recall
|
|
value: [0.96296296 0.88888889 0.96428571 0.85714286 0.85714286 0.92857143
|
|
0.89285714 0.81481481 0.92592593 0.81481481]
|
|
|
|
mean value: 0.8907407407407407
|
|
|
|
key: train_recall
|
|
value: [0.89919355 0.90725806 0.89878543 0.91093117 0.91093117 0.90283401
|
|
0.90688259 0.91129032 0.89919355 0.90322581]
|
|
|
|
mean value: 0.9050525662792216
|
|
|
|
key: test_roc_auc
|
|
value: [0.94699872 0.90996169 0.94642857 0.85714286 0.83928571 0.92857143
|
|
0.85714286 0.85383598 0.85582011 0.87169312]
|
|
|
|
mean value: 0.8866881043605181
|
|
|
|
key: train_roc_auc
|
|
value: [0.88833195 0.89236421 0.88836909 0.89837897 0.90034748 0.89039338
|
|
0.89832319 0.89659004 0.89644717 0.89058928]
|
|
|
|
mean value: 0.8940134761930483
|
|
|
|
key: test_jcc
|
|
value: [0.89655172 0.82758621 0.9 0.75 0.72727273 0.86666667
|
|
0.75757576 0.73333333 0.75757576 0.75862069]
|
|
|
|
mean value: 0.7975182863113898
|
|
|
|
key: train_jcc
|
|
value: [0.79928315 0.80645161 0.79856115 0.81521739 0.81818182 0.80215827
|
|
0.81454545 0.81294964 0.81090909 0.80286738]
|
|
|
|
mean value: 0.8081124970226548
|
|
|
|
MCC on Blind test: 0.22
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.44565105 1.50711632 1.54303265 1.40061927 1.64806414 1.59813452
|
|
1.42913032 1.62204456 1.88645577 1.49996781]
|
|
|
|
mean value: 1.558021640777588
|
|
|
|
key: score_time
|
|
value: [0.01188302 0.01363969 0.01324034 0.01340175 0.01373863 0.01372957
|
|
0.0137701 0.02073598 0.01149464 0.0171504 ]
|
|
|
|
mean value: 0.014278411865234375
|
|
|
|
key: test_mcc
|
|
value: [0.96490128 0.89342711 0.82195294 0.93094934 0.75047877 0.83484711
|
|
0.82195294 0.81878307 0.79069197 0.8565805 ]
|
|
|
|
mean value: 0.8484565042398484
|
|
|
|
key: train_mcc
|
|
value: [0.96407453 0.96407453 0.97604323 0.96407052 0.97205662 0.97604323
|
|
0.96809206 0.96812294 0.96812294 0.96018795]
|
|
|
|
mean value: 0.9680888542163917
|
|
|
|
key: test_accuracy
|
|
value: [0.98214286 0.94642857 0.91071429 0.96428571 0.875 0.91071429
|
|
0.91071429 0.90909091 0.89090909 0.92727273]
|
|
|
|
mean value: 0.9227272727272727
|
|
|
|
key: train_accuracy
|
|
value: [0.98203593 0.98203593 0.98802395 0.98203593 0.98602794 0.98802395
|
|
0.98403194 0.98406375 0.98406375 0.98007968]
|
|
|
|
mean value: 0.9840422740177016
|
|
|
|
key: test_fscore
|
|
value: [0.98181818 0.94545455 0.9122807 0.96551724 0.87719298 0.90196078
|
|
0.9122807 0.90909091 0.89655172 0.92307692]
|
|
|
|
mean value: 0.9225224695236438
|
|
|
|
key: train_fscore
|
|
value: [0.98181818 0.98181818 0.98785425 0.98174442 0.98580122 0.98785425
|
|
0.98387097 0.98387097 0.98387097 0.97991968]
|
|
|
|
mean value: 0.9838423086546554
|
|
|
|
key: test_precision
|
|
value: [0.96428571 0.92857143 0.89655172 0.93333333 0.86206897 1.
|
|
0.89655172 0.89285714 0.83870968 0.96 ]
|
|
|
|
mean value: 0.9172929710260077
|
|
|
|
key: train_precision
|
|
value: [0.98380567 0.98380567 0.98785425 0.98373984 0.98780488 0.98785425
|
|
0.97991968 0.98387097 0.98387097 0.976 ]
|
|
|
|
mean value: 0.9838526167702565
|
|
|
|
key: test_recall
|
|
value: [1. 0.96296296 0.92857143 1. 0.89285714 0.82142857
|
|
0.92857143 0.92592593 0.96296296 0.88888889]
|
|
|
|
mean value: 0.9312169312169312
|
|
|
|
key: train_recall
|
|
value: [0.97983871 0.97983871 0.98785425 0.97975709 0.98380567 0.98785425
|
|
0.98785425 0.98387097 0.98387097 0.98387097]
|
|
|
|
mean value: 0.983841582865352
|
|
|
|
key: test_roc_auc
|
|
value: [0.98275862 0.94699872 0.91071429 0.96428571 0.875 0.91071429
|
|
0.91071429 0.90939153 0.89219577 0.9265873 ]
|
|
|
|
mean value: 0.9229360518153622
|
|
|
|
key: train_roc_auc
|
|
value: [0.98201422 0.98201422 0.98802161 0.98200453 0.98599732 0.98802161
|
|
0.98408461 0.98406147 0.98406147 0.98012446]
|
|
|
|
mean value: 0.9840405511662667
|
|
|
|
key: test_jcc
|
|
value: [0.96428571 0.89655172 0.83870968 0.93333333 0.78125 0.82142857
|
|
0.83870968 0.83333333 0.8125 0.85714286]
|
|
|
|
mean value: 0.857724488850045
|
|
|
|
key: train_jcc
|
|
value: [0.96428571 0.96428571 0.976 0.96414343 0.972 0.976
|
|
0.96825397 0.96825397 0.96825397 0.96062992]
|
|
|
|
mean value: 0.9682106680887996
|
|
|
|
MCC on Blind test: 0.24
|
|
|
|
Accuracy on Blind test: 0.62
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01453757 0.01242089 0.01000428 0.01067281 0.00980949 0.01040387
|
|
0.01071548 0.0110333 0.01046252 0.01084757]
|
|
|
|
mean value: 0.011090779304504394
|
|
|
|
key: score_time
|
|
value: [0.01091075 0.00837231 0.00808406 0.00912213 0.00790167 0.00789714
|
|
0.00791621 0.00794125 0.00791454 0.00789595]
|
|
|
|
mean value: 0.00839560031890869
|
|
|
|
key: test_mcc
|
|
value: [1. 0.85696041 0.78772636 0.92857143 0.82195294 0.89802651
|
|
0.79385662 0.89153439 1. 0.74935731]
|
|
|
|
mean value: 0.8727985977030033
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.92857143 0.89285714 0.96428571 0.91071429 0.94642857
|
|
0.89285714 0.94545455 1. 0.87272727]
|
|
|
|
mean value: 0.9353896103896104
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.92592593 0.89655172 0.96428571 0.9122807 0.94339623
|
|
0.9 0.94545455 1. 0.8627451 ]
|
|
|
|
mean value: 0.9350639936012812
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.92592593 0.86666667 0.96428571 0.89655172 1.
|
|
0.84375 0.92857143 1. 0.91666667]
|
|
|
|
mean value: 0.9342418126254333
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.92592593 0.92857143 0.96428571 0.92857143 0.89285714
|
|
0.96428571 0.96296296 1. 0.81481481]
|
|
|
|
mean value: 0.9382275132275132
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.9284802 0.89285714 0.96428571 0.91071429 0.94642857
|
|
0.89285714 0.9457672 1. 0.87169312]
|
|
|
|
mean value: 0.9353083378945448
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.86206897 0.8125 0.93103448 0.83870968 0.89285714
|
|
0.81818182 0.89655172 1. 0.75862069]
|
|
|
|
mean value: 0.8810524500527281
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.11
|
|
|
|
Accuracy on Blind test: 0.36
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.10190248 0.09926343 0.10202909 0.10004902 0.10008192 0.10025787
|
|
0.10022664 0.09921432 0.10032749 0.10042095]
|
|
|
|
mean value: 0.10037732124328613
|
|
|
|
key: score_time
|
|
value: [0.01691294 0.01694822 0.0180881 0.01682162 0.01716375 0.01716638
|
|
0.01710129 0.0170927 0.01716757 0.01707959]
|
|
|
|
mean value: 0.017154216766357422
|
|
|
|
key: test_mcc
|
|
value: [0.93103448 0.78544061 0.89342711 0.89342711 0.78571429 0.82195294
|
|
0.82618439 0.85449735 0.82337971 0.78961518]
|
|
|
|
mean value: 0.840467318391085
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.96428571 0.89285714 0.94642857 0.94642857 0.89285714 0.91071429
|
|
0.91071429 0.92727273 0.90909091 0.89090909]
|
|
|
|
mean value: 0.9191558441558442
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.96428571 0.88888889 0.94736842 0.94736842 0.89285714 0.90909091
|
|
0.91525424 0.92592593 0.9122807 0.88 ]
|
|
|
|
mean value: 0.9183320362196365
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.93103448 0.88888889 0.93103448 0.93103448 0.89285714 0.92592593
|
|
0.87096774 0.92592593 0.86666667 0.95652174]
|
|
|
|
mean value: 0.9120857479606331
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.88888889 0.96428571 0.96428571 0.89285714 0.89285714
|
|
0.96428571 0.92592593 0.96296296 0.81481481]
|
|
|
|
mean value: 0.9271164021164021
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96551724 0.89272031 0.94642857 0.94642857 0.89285714 0.91071429
|
|
0.91071429 0.92724868 0.91005291 0.88955026]
|
|
|
|
mean value: 0.9192232256887429
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.93103448 0.8 0.9 0.9 0.80645161 0.83333333
|
|
0.84375 0.86206897 0.83870968 0.78571429]
|
|
|
|
mean value: 0.8501062357646062
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.33
|
|
|
|
Accuracy on Blind test: 0.72
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0078423 0.00767851 0.00772405 0.00781775 0.00766301 0.00781226
|
|
0.00765562 0.00782037 0.00784039 0.00778151]
|
|
|
|
mean value: 0.007763576507568359
|
|
|
|
key: score_time
|
|
value: [0.0079782 0.00796032 0.00801039 0.0080893 0.00807023 0.0080111
|
|
0.00799298 0.00796652 0.00790787 0.00800228]
|
|
|
|
mean value: 0.007998919486999512
|
|
|
|
key: test_mcc
|
|
value: [0.96490128 0.82661701 0.85933785 0.75047877 0.4645821 0.75434227
|
|
0.67900461 0.58684513 0.85695439 0.82269299]
|
|
|
|
mean value: 0.7565756396515464
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98214286 0.91071429 0.92857143 0.875 0.73214286 0.875
|
|
0.83928571 0.78181818 0.92727273 0.90909091]
|
|
|
|
mean value: 0.876103896103896
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98181818 0.9122807 0.93103448 0.87719298 0.72727273 0.86792453
|
|
0.84210526 0.73913043 0.92857143 0.90196078]
|
|
|
|
mean value: 0.87092915151876
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96428571 0.86666667 0.9 0.86206897 0.74074074 0.92
|
|
0.82758621 0.89473684 0.89655172 0.95833333]
|
|
|
|
mean value: 0.8830970193683443
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.96296296 0.96428571 0.89285714 0.71428571 0.82142857
|
|
0.85714286 0.62962963 0.96296296 0.85185185]
|
|
|
|
mean value: 0.8657407407407407
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98275862 0.91251596 0.92857143 0.875 0.73214286 0.875
|
|
0.83928571 0.77910053 0.92791005 0.90806878]
|
|
|
|
mean value: 0.8760353950009123
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96428571 0.83870968 0.87096774 0.78125 0.57142857 0.76666667
|
|
0.72727273 0.5862069 0.86666667 0.82142857]
|
|
|
|
mean value: 0.7794883233655481
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.24
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.32765579 1.30277157 1.36390686 1.35351229 1.34183788 1.42637658
|
|
1.35306215 1.29488921 1.30548406 1.29170752]
|
|
|
|
mean value: 1.3361203908920287
|
|
|
|
key: score_time
|
|
value: [0.0910337 0.09533978 0.09802961 0.09628296 0.09906578 0.09774327
|
|
0.09189868 0.09146214 0.09067702 0.0920558 ]
|
|
|
|
mean value: 0.09435887336730957
|
|
|
|
key: test_mcc
|
|
value: [1. 0.89342711 0.92857143 0.93094934 0.78571429 0.93094934
|
|
0.96490128 0.89153439 1. 0.89139151]
|
|
|
|
mean value: 0.9217438682406724
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.94642857 0.96428571 0.96428571 0.89285714 0.96428571
|
|
0.98214286 0.94545455 1. 0.94545455]
|
|
|
|
mean value: 0.9605194805194806
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.94545455 0.96428571 0.96551724 0.89285714 0.96296296
|
|
0.98245614 0.94545455 1. 0.94339623]
|
|
|
|
mean value: 0.9602384519160193
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.92857143 0.96428571 0.93333333 0.89285714 1.
|
|
0.96551724 0.92857143 1. 0.96153846]
|
|
|
|
mean value: 0.957467475053682
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.96296296 0.96428571 1. 0.89285714 0.92857143
|
|
1. 0.96296296 1. 0.92592593]
|
|
|
|
mean value: 0.9637566137566138
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.94699872 0.96428571 0.96428571 0.89285714 0.96428571
|
|
0.98214286 0.9457672 1. 0.94510582]
|
|
|
|
mean value: 0.960572888159095
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.89655172 0.93103448 0.93333333 0.80645161 0.92857143
|
|
0.96551724 0.89655172 1. 0.89285714]
|
|
|
|
mean value: 0.9250868690078924
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.19
|
|
|
|
Accuracy on Blind test: 0.49
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
|
|
key: fit_time
|
|
value: [1.78958488 0.90520978 0.92161393 0.90113878 1.04687214 0.9353826
|
|
0.92672181 0.89199233 0.91325641 0.93072248]
|
|
|
|
mean value: 1.0162495136260987
|
|
|
|
key: score_time
|
|
value: [0.24019027 0.16753531 0.25853181 0.21053815 0.25056458 0.25201178
|
|
0.21247077 0.21787858 0.26128078 0.26930881]
|
|
|
|
mean value: 0.234031081199646
|
|
|
|
key: test_mcc
|
|
value: [1. 0.89342711 0.92857143 0.93094934 0.85714286 0.93094934
|
|
0.96490128 0.89153439 1. 0.8565805 ]
|
|
|
|
mean value: 0.9254056246826363
|
|
|
|
key: train_mcc
|
|
value: [0.94423549 0.94817282 0.94817035 0.94817035 0.95628198 0.94423372
|
|
0.94817035 0.95231443 0.94043131 0.94434567]
|
|
|
|
mean value: 0.9474526465723194
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.94642857 0.96428571 0.96428571 0.92857143 0.96428571
|
|
0.98214286 0.94545455 1. 0.92727273]
|
|
|
|
mean value: 0.9622727272727273
|
|
|
|
key: train_accuracy
|
|
value: [0.97205589 0.9740519 0.9740519 0.9740519 0.97804391 0.97205589
|
|
0.9740519 0.97609562 0.97011952 0.97211155]
|
|
|
|
mean value: 0.9736689966680185
|
|
|
|
key: test_fscore
|
|
value: [1. 0.94545455 0.96428571 0.96551724 0.92857143 0.96296296
|
|
0.98245614 0.94545455 1. 0.92307692]
|
|
|
|
mean value: 0.9617779501536308
|
|
|
|
key: train_fscore
|
|
value: [0.972 0.9739479 0.97384306 0.97384306 0.97795591 0.97188755
|
|
0.97384306 0.976 0.97005988 0.972 ]
|
|
|
|
mean value: 0.9735380413105856
|
|
|
|
key: test_precision
|
|
value: [1. 0.92857143 0.96428571 0.93333333 0.92857143 1.
|
|
0.96551724 0.92857143 1. 0.96 ]
|
|
|
|
mean value: 0.9608850574712644
|
|
|
|
key: train_precision
|
|
value: [0.96428571 0.96812749 0.968 0.968 0.96825397 0.96414343
|
|
0.968 0.96825397 0.96047431 0.96428571]
|
|
|
|
mean value: 0.9661824589714422
|
|
|
|
key: test_recall
|
|
value: [1. 0.96296296 0.96428571 1. 0.92857143 0.92857143
|
|
1. 0.96296296 1. 0.88888889]
|
|
|
|
mean value: 0.9636243386243386
|
|
|
|
key: train_recall
|
|
value: [0.97983871 0.97983871 0.97975709 0.97975709 0.98785425 0.97975709
|
|
0.97975709 0.98387097 0.97983871 0.97983871]
|
|
|
|
mean value: 0.981010839754473
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.94699872 0.96428571 0.96428571 0.92857143 0.96428571
|
|
0.98214286 0.9457672 1. 0.9265873 ]
|
|
|
|
mean value: 0.9622924648786718
|
|
|
|
key: train_roc_auc
|
|
value: [0.97213279 0.97410908 0.97413051 0.97413051 0.97817909 0.97216201
|
|
0.97413051 0.97618745 0.97023432 0.97220282]
|
|
|
|
mean value: 0.9737599093111167
|
|
|
|
key: test_jcc
|
|
value: [1. 0.89655172 0.93103448 0.93333333 0.86666667 0.92857143
|
|
0.96551724 0.89655172 1. 0.85714286]
|
|
|
|
mean value: 0.9275369458128079
|
|
|
|
key: train_jcc
|
|
value: [0.94552529 0.94921875 0.94901961 0.94901961 0.95686275 0.9453125
|
|
0.94901961 0.953125 0.94186047 0.94552529]
|
|
|
|
mean value: 0.9484488867401317
|
|
|
|
MCC on Blind test: 0.2
|
|
|
|
Accuracy on Blind test: 0.5
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01942348 0.00846815 0.0079608 0.00791335 0.00764561 0.00800776
|
|
0.00769567 0.00761008 0.00771356 0.00770426]
|
|
|
|
mean value: 0.009014272689819336
|
|
|
|
key: score_time
|
|
value: [0.01135564 0.00861812 0.0081501 0.00827861 0.00797629 0.00797629
|
|
0.00789309 0.00791764 0.00794172 0.0080111 ]
|
|
|
|
mean value: 0.008411860466003418
|
|
|
|
key: test_mcc
|
|
value: [0.89342711 0.74984143 0.85714286 0.71428571 0.67900461 0.78571429
|
|
0.64285714 0.71049701 0.75878131 0.74935731]
|
|
|
|
mean value: 0.7540908782235038
|
|
|
|
key: train_mcc
|
|
value: [0.76073062 0.76464682 0.75244668 0.78078676 0.77655234 0.75249829
|
|
0.76042979 0.77325226 0.78086182 0.77758373]
|
|
|
|
mean value: 0.7679789125420294
|
|
|
|
key: test_accuracy
|
|
value: [0.94642857 0.875 0.92857143 0.85714286 0.83928571 0.89285714
|
|
0.82142857 0.85454545 0.87272727 0.87272727]
|
|
|
|
mean value: 0.8760714285714286
|
|
|
|
key: train_accuracy
|
|
value: [0.88023952 0.88223553 0.8762475 0.89021956 0.88822355 0.8762475
|
|
0.88023952 0.88645418 0.89043825 0.88844622]
|
|
|
|
mean value: 0.8838991340029105
|
|
|
|
key: test_fscore
|
|
value: [0.94545455 0.86792453 0.92857143 0.85714286 0.84210526 0.89285714
|
|
0.82142857 0.84615385 0.88135593 0.8627451 ]
|
|
|
|
mean value: 0.8745739213310778
|
|
|
|
key: train_fscore
|
|
value: [0.88047809 0.88223553 0.87449393 0.89021956 0.8875502 0.875
|
|
0.87804878 0.88667992 0.88933602 0.88932806]
|
|
|
|
mean value: 0.8833370085701109
|
|
|
|
key: test_precision
|
|
value: [0.92857143 0.88461538 0.92857143 0.85714286 0.82758621 0.89285714
|
|
0.82142857 0.88 0.8125 0.91666667]
|
|
|
|
mean value: 0.8749939686750031
|
|
|
|
key: train_precision
|
|
value: [0.87007874 0.87351779 0.87449393 0.87795276 0.88047809 0.87148594
|
|
0.88163265 0.8745098 0.8875502 0.87209302]
|
|
|
|
mean value: 0.8763792922216086
|
|
|
|
key: test_recall
|
|
value: [0.96296296 0.85185185 0.92857143 0.85714286 0.85714286 0.89285714
|
|
0.82142857 0.81481481 0.96296296 0.81481481]
|
|
|
|
mean value: 0.8764550264550264
|
|
|
|
key: train_recall
|
|
value: [0.89112903 0.89112903 0.87449393 0.90283401 0.89473684 0.87854251
|
|
0.87449393 0.89919355 0.89112903 0.90725806]
|
|
|
|
mean value: 0.8904939924252318
|
|
|
|
key: test_roc_auc
|
|
value: [0.94699872 0.87420179 0.92857143 0.85714286 0.83928571 0.89285714
|
|
0.82142857 0.85383598 0.87433862 0.87169312]
|
|
|
|
mean value: 0.8760353950009122
|
|
|
|
key: train_roc_auc
|
|
value: [0.88034712 0.88232341 0.87622334 0.89039338 0.8883133 0.87627913
|
|
0.88016035 0.88660465 0.89044641 0.8886684 ]
|
|
|
|
mean value: 0.8839759495598507
|
|
|
|
key: test_jcc
|
|
value: [0.89655172 0.76666667 0.86666667 0.75 0.72727273 0.80645161
|
|
0.6969697 0.73333333 0.78787879 0.75862069]
|
|
|
|
mean value: 0.7790411905484208
|
|
|
|
key: train_jcc
|
|
value: [0.78647687 0.78928571 0.77697842 0.80215827 0.79783394 0.77777778
|
|
0.7826087 0.79642857 0.80072464 0.80071174]
|
|
|
|
mean value: 0.7910984634590573
|
|
|
|
MCC on Blind test: 0.29
|
|
|
|
Accuracy on Blind test: 0.72
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.13241124 0.04881454 0.04856825 0.05027437 0.04823184 0.0521431
|
|
0.05157948 0.04875374 0.05209494 0.05204916]
|
|
|
|
mean value: 0.058492064476013184
|
|
|
|
key: score_time
|
|
value: [0.01028204 0.01021361 0.00997639 0.00988674 0.00968552 0.00974798
|
|
0.00975752 0.00969625 0.01008534 0.00979543]
|
|
|
|
mean value: 0.009912681579589844
|
|
|
|
key: test_mcc
|
|
value: [1. 0.9284802 0.89342711 0.93094934 0.89342711 0.93094934
|
|
0.92857143 0.89153439 1. 0.89139151]
|
|
|
|
mean value: 0.9288730432045526
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.96428571 0.94642857 0.96428571 0.94642857 0.96428571
|
|
0.96428571 0.94545455 1. 0.94545455]
|
|
|
|
mean value: 0.9640909090909091
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.96296296 0.94736842 0.96551724 0.94736842 0.96296296
|
|
0.96428571 0.94545455 1. 0.94339623]
|
|
|
|
mean value: 0.9639316495565854
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.96296296 0.93103448 0.93333333 0.93103448 1.
|
|
0.96428571 0.92857143 1. 0.96153846]
|
|
|
|
mean value: 0.9612760866209142
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.96296296 0.96428571 1. 0.96428571 0.92857143
|
|
0.96428571 0.96296296 1. 0.92592593]
|
|
|
|
mean value: 0.9673280423280424
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.9642401 0.94642857 0.96428571 0.94642857 0.96428571
|
|
0.96428571 0.9457672 1. 0.94510582]
|
|
|
|
mean value: 0.9640827403758438
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.92857143 0.9 0.93333333 0.9 0.92857143
|
|
0.93103448 0.89655172 1. 0.89285714]
|
|
|
|
mean value: 0.9310919540229885
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.1

Accuracy on Blind test: 0.37

Model_name: LDA
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.01652741 0.03145742 0.04161358 0.04130459 0.0417943 0.04165006
|
|
0.03727698 0.04370403 0.04243636 0.04148102]
|
|
|
|
mean value: 0.037924575805664065
|
|
|
|
key: score_time
|
|
value: [0.01045966 0.01935315 0.02035856 0.02224278 0.02057624 0.0211885
|
|
0.01432323 0.01109576 0.01095986 0.01956701]
|
|
|
|
mean value: 0.017012476921081543
|
|
|
|
key: test_mcc
|
|
value: [0.93103448 0.82149863 0.89342711 0.82195294 0.67900461 0.89342711
|
|
0.67900461 0.78174603 0.71735629 0.82269299]
|
|
|
|
mean value: 0.8041144809910427
|
|
|
|
key: train_mcc
|
|
value: [0.86087113 0.84902508 0.84841579 0.8325975 0.86886449 0.85702217
|
|
0.85676029 0.85318007 0.84497964 0.84964116]
|
|
|
|
mean value: 0.8521357324796069
|
|
|
|
key: test_accuracy
|
|
value: [0.96428571 0.91071429 0.94642857 0.91071429 0.83928571 0.94642857
|
|
0.83928571 0.89090909 0.85454545 0.90909091]
|
|
|
|
mean value: 0.9011688311688312
|
|
|
|
key: train_accuracy
|
|
value: [0.93013972 0.9241517 0.9241517 0.91616766 0.93413174 0.92814371
|
|
0.92814371 0.92629482 0.92231076 0.92430279]
|
|
|
|
mean value: 0.9257938306653625
|
|
|
|
key: test_fscore
|
|
value: [0.96428571 0.90566038 0.94736842 0.9122807 0.84210526 0.94545455
|
|
0.84210526 0.88888889 0.86206897 0.90196078]
|
|
|
|
mean value: 0.9012178924941413
|
|
|
|
key: train_fscore
|
|
value: [0.93069307 0.92490119 0.92369478 0.916 0.93439364 0.92857143
|
|
0.92828685 0.92673267 0.92246521 0.92519685]
|
|
|
|
mean value: 0.9260935685934734
|
|
|
|
key: test_precision
|
|
value: [0.93103448 0.92307692 0.93103448 0.89655172 0.82758621 0.96296296
|
|
0.82758621 0.88888889 0.80645161 0.95833333]
|
|
|
|
mean value: 0.895350682461361
|
|
|
|
key: train_precision
|
|
value: [0.91439689 0.90697674 0.91633466 0.90513834 0.91796875 0.91050584
|
|
0.91372549 0.91050584 0.90980392 0.90384615]
|
|
|
|
mean value: 0.9109202621383721
|
|
|
|
key: test_recall
|
|
value: [1. 0.88888889 0.96428571 0.92857143 0.85714286 0.92857143
|
|
0.85714286 0.88888889 0.92592593 0.85185185]
|
|
|
|
mean value: 0.9091269841269841
|
|
|
|
key: train_recall
|
|
value: [0.94758065 0.94354839 0.93117409 0.92712551 0.951417 0.94736842
|
|
0.94331984 0.94354839 0.93548387 0.94758065]
|
|
|
|
mean value: 0.9418146793783466
|
|
|
|
key: test_roc_auc
|
|
value: [0.96551724 0.90996169 0.94642857 0.91071429 0.83928571 0.94642857
|
|
0.83928571 0.89087302 0.85582011 0.90806878]
|
|
|
|
mean value: 0.9012383689107828
|
|
|
|
key: train_roc_auc
|
|
value: [0.93031206 0.92434336 0.92424846 0.91631866 0.93436992 0.92840862
|
|
0.92835283 0.9264986 0.92246634 0.92457772]
|
|
|
|
mean value: 0.925989658944721
|
|
|
|
key: test_jcc
|
|
value: [0.93103448 0.82758621 0.9 0.83870968 0.72727273 0.89655172
|
|
0.72727273 0.8 0.75757576 0.82142857]
|
|
|
|
mean value: 0.8227431874762242
|
|
|
|
key: train_jcc
|
|
value: [0.87037037 0.86029412 0.85820896 0.84501845 0.87686567 0.86666667
|
|
0.866171 0.86346863 0.85608856 0.86080586]
|
|
|
|
mean value: 0.8623958291829558
|
|
|
|
MCC on Blind test: 0.25

Accuracy on Blind test: 0.7

Model_name: Multinomial
Model func: MultinomialNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02321148 0.00787091 0.00767159 0.00766253 0.0082829 0.00821114
|
|
0.00813293 0.00844979 0.00815034 0.00824094]
|
|
|
|
mean value: 0.00958845615386963
|
|
|
|
key: score_time
|
|
value: [0.00829506 0.00817847 0.00787377 0.00799441 0.0085752 0.0085206
|
|
0.00833821 0.00848675 0.00863934 0.00852776]
|
|
|
|
mean value: 0.008342957496643067
|
|
|
|
key: test_mcc
|
|
value: [0.89342711 0.74984143 0.89342711 0.71428571 0.67900461 0.82195294
|
|
0.71611487 0.71049701 0.75878131 0.74935731]
|
|
|
|
mean value: 0.7686689426300658
|
|
|
|
key: train_mcc
|
|
value: [0.76059032 0.77655946 0.76451932 0.78061298 0.78453717 0.7684682
|
|
0.78839993 0.78902126 0.77686055 0.77734028]
|
|
|
|
mean value: 0.7766909479185805
|
|
|
|
key: test_accuracy
|
|
value: [0.94642857 0.875 0.94642857 0.85714286 0.83928571 0.91071429
|
|
0.85714286 0.85454545 0.87272727 0.87272727]
|
|
|
|
mean value: 0.8832142857142857
|
|
|
|
key: train_accuracy
|
|
value: [0.88023952 0.88822355 0.88223553 0.89021956 0.89221557 0.88423154
|
|
0.89421158 0.89442231 0.88844622 0.88844622]
|
|
|
|
mean value: 0.8882891587343242
|
|
|
|
key: test_fscore
|
|
value: [0.94545455 0.86792453 0.94736842 0.85714286 0.84210526 0.9122807
|
|
0.86206897 0.84615385 0.88135593 0.8627451 ]
|
|
|
|
mean value: 0.8824600158777894
|
|
|
|
key: train_fscore
|
|
value: [0.88 0.888 0.88128773 0.88977956 0.89156627 0.88306452
|
|
0.89292929 0.89421158 0.88709677 0.88888889]
|
|
|
|
mean value: 0.8876824599523696
|
|
|
|
key: test_precision
|
|
value: [0.92857143 0.88461538 0.93103448 0.85714286 0.82758621 0.89655172
|
|
0.83333333 0.88 0.8125 0.91666667]
|
|
|
|
mean value: 0.8768002084122773
|
|
|
|
key: train_precision
|
|
value: [0.87301587 0.88095238 0.876 0.88095238 0.88446215 0.87951807
|
|
0.89112903 0.88537549 0.88709677 0.875 ]
|
|
|
|
mean value: 0.8813502159126972
|
|
|
|
key: test_recall
|
|
value: [0.96296296 0.85185185 0.96428571 0.85714286 0.85714286 0.92857143
|
|
0.89285714 0.81481481 0.96296296 0.81481481]
|
|
|
|
mean value: 0.8907407407407407
|
|
|
|
key: train_recall
|
|
value: [0.88709677 0.89516129 0.88663968 0.89878543 0.89878543 0.88663968
|
|
0.89473684 0.90322581 0.88709677 0.90322581]
|
|
|
|
mean value: 0.8941393496147316
|
|
|
|
key: test_roc_auc
|
|
value: [0.94699872 0.87420179 0.94642857 0.85714286 0.83928571 0.91071429
|
|
0.85714286 0.85383598 0.87433862 0.87169312]
|
|
|
|
mean value: 0.8831782521437694
|
|
|
|
key: train_roc_auc
|
|
value: [0.88030728 0.88829211 0.88229622 0.89033759 0.8923061 0.88426472
|
|
0.89421881 0.89452629 0.88843028 0.88862078]
|
|
|
|
mean value: 0.8883600174671026
|
|
|
|
key: test_jcc
|
|
value: [0.89655172 0.76666667 0.9 0.75 0.72727273 0.83870968
|
|
0.75757576 0.73333333 0.78787879 0.75862069]
|
|
|
|
mean value: 0.7916609363939731
|
|
|
|
key: train_jcc
|
|
value: [0.78571429 0.79856115 0.78776978 0.80144404 0.80434783 0.79061372
|
|
0.80656934 0.80866426 0.79710145 0.8 ]
|
|
|
|
mean value: 0.7980785861054747
|
|
|
|
MCC on Blind test: 0.29

Accuracy on Blind test: 0.71

Model_name: Passive Aggresive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0116744 0.01263475 0.0133822 0.01279974 0.01338243 0.01445627
|
|
0.01424313 0.01156998 0.01224852 0.01223946]
|
|
|
|
mean value: 0.012863087654113769
|
|
|
|
key: score_time
|
|
value: [0.00871682 0.00998259 0.00996375 0.01038527 0.01055145 0.01046562
|
|
0.01043034 0.01041961 0.01059127 0.01041579]
|
|
|
|
mean value: 0.010192251205444336
|
|
|
|
key: test_mcc
|
|
value: [0.89827421 0.85696041 0.89342711 0.85714286 0.59628479 0.82195294
|
|
0.79385662 0.78174603 0.85449735 0.81854376]
|
|
|
|
mean value: 0.8172686093718741
|
|
|
|
key: train_mcc
|
|
value: [0.83135263 0.909012 0.87714464 0.89219562 0.81343828 0.85235242
|
|
0.86715942 0.86343244 0.87040305 0.89653312]
|
|
|
|
mean value: 0.8673023622935534
|
|
|
|
key: test_accuracy
|
|
value: [0.94642857 0.92857143 0.94642857 0.92857143 0.78571429 0.91071429
|
|
0.89285714 0.89090909 0.92727273 0.90909091]
|
|
|
|
mean value: 0.9066558441558441
|
|
|
|
key: train_accuracy
|
|
value: [0.91217565 0.95409182 0.93812375 0.94610778 0.9001996 0.9261477
|
|
0.93213573 0.93027888 0.93426295 0.94820717]
|
|
|
|
mean value: 0.9321731039912208
|
|
|
|
key: test_fscore
|
|
value: [0.94736842 0.92592593 0.94736842 0.92857143 0.8125 0.90909091
|
|
0.9 0.88888889 0.92592593 0.90566038]
|
|
|
|
mean value: 0.9091300297866832
|
|
|
|
key: train_fscore
|
|
value: [0.91666667 0.95257732 0.93861386 0.94523327 0.9070632 0.92555332
|
|
0.93385214 0.92631579 0.93110647 0.948 ]
|
|
|
|
mean value: 0.9324982031673843
|
|
|
|
key: test_precision
|
|
value: [0.9 0.92592593 0.93103448 0.92857143 0.72222222 0.92592593
|
|
0.84375 0.88888889 0.92592593 0.92307692]
|
|
|
|
mean value: 0.8915321723295861
|
|
|
|
key: train_precision
|
|
value: [0.86428571 0.97468354 0.91860465 0.94715447 0.83848797 0.92
|
|
0.8988764 0.969163 0.96536797 0.94047619]
|
|
|
|
mean value: 0.9237099909738861
|
|
|
|
key: test_recall
|
|
value: [1. 0.92592593 0.96428571 0.92857143 0.92857143 0.89285714
|
|
0.96428571 0.88888889 0.92592593 0.88888889]
|
|
|
|
mean value: 0.9308201058201058
|
|
|
|
key: train_recall
|
|
value: [0.97580645 0.93145161 0.95951417 0.94331984 0.98785425 0.93117409
|
|
0.97165992 0.88709677 0.89919355 0.95564516]
|
|
|
|
mean value: 0.9442715815593574
|
|
|
|
key: test_roc_auc
|
|
value: [0.94827586 0.9284802 0.94642857 0.92857143 0.78571429 0.91071429
|
|
0.89285714 0.89087302 0.92724868 0.90873016]
|
|
|
|
mean value: 0.9067893632548805
|
|
|
|
key: train_roc_auc
|
|
value: [0.91280441 0.9538681 0.9384185 0.94606937 0.90140744 0.92621697
|
|
0.93268035 0.92976886 0.93384874 0.94829502]
|
|
|
|
mean value: 0.9323377764010412
|
|
|
|
key: test_jcc
|
|
value: [0.9 0.86206897 0.9 0.86666667 0.68421053 0.83333333
|
|
0.81818182 0.8 0.86206897 0.82758621]
|
|
|
|
mean value: 0.8354116482428642
|
|
|
|
key: train_jcc
|
|
value: [0.84615385 0.90944882 0.88432836 0.89615385 0.82993197 0.86142322
|
|
0.87591241 0.8627451 0.87109375 0.90114068]
|
|
|
|
mean value: 0.873833200438617
|
|
|
|
MCC on Blind test: 0.18

Accuracy on Blind test: 0.49

Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01433849 0.01301265 0.0144105 0.01522279 0.0132041 0.01305103
|
|
0.01268101 0.01373744 0.0126369 0.01393795]
|
|
|
|
mean value: 0.013623285293579101
|
|
|
|
key: score_time
|
|
value: [0.01049781 0.01047158 0.01053524 0.0104003 0.01044488 0.01042581
|
|
0.01039124 0.01047635 0.01046515 0.01041508]
|
|
|
|
mean value: 0.010452342033386231
|
|
|
|
key: test_mcc
|
|
value: [0.93069263 0.85951469 0.82618439 0.89802651 0.75047877 0.78571429
|
|
0.73127242 0.85695439 0.92962225 0.8565805 ]
|
|
|
|
mean value: 0.8425040848772626
|
|
|
|
key: train_mcc
|
|
value: [0.87181962 0.83135263 0.85503558 0.88967789 0.91283821 0.88589338
|
|
0.86743952 0.82906495 0.86468284 0.92034415]
|
|
|
|
mean value: 0.8728148763208086
|
|
|
|
key: test_accuracy
|
|
value: [0.96428571 0.92857143 0.91071429 0.94642857 0.875 0.89285714
|
|
0.85714286 0.92727273 0.96363636 0.92727273]
|
|
|
|
mean value: 0.9193181818181818
|
|
|
|
key: train_accuracy
|
|
value: [0.93413174 0.91217565 0.9241517 0.94411178 0.95608782 0.94211577
|
|
0.93213573 0.91035857 0.93027888 0.96015936]
|
|
|
|
mean value: 0.9345706992389723
|
|
|
|
key: test_fscore
|
|
value: [0.96153846 0.92857143 0.90566038 0.94915254 0.87719298 0.89285714
|
|
0.84 0.92857143 0.96153846 0.92307692]
|
|
|
|
mean value: 0.9168159748341358
|
|
|
|
key: train_fscore
|
|
value: [0.93023256 0.91666667 0.91774892 0.94488189 0.95454545 0.94302554
|
|
0.9279661 0.91525424 0.92569002 0.95983936]
|
|
|
|
mean value: 0.9335850744783595
|
|
|
|
key: test_precision
|
|
value: [1. 0.89655172 0.96 0.90322581 0.86206897 0.89285714
|
|
0.95454545 0.89655172 1. 0.96 ]
|
|
|
|
mean value: 0.9325800817647314
|
|
|
|
key: train_precision
|
|
value: [0.97777778 0.86428571 0.98604651 0.91954023 0.97468354 0.91603053
|
|
0.97333333 0.85865724 0.97757848 0.956 ]
|
|
|
|
mean value: 0.940393336471731
|
|
|
|
key: test_recall
|
|
value: [0.92592593 0.96296296 0.85714286 1. 0.89285714 0.89285714
|
|
0.75 0.96296296 0.92592593 0.88888889]
|
|
|
|
mean value: 0.905952380952381
|
|
|
|
key: train_recall
|
|
value: [0.88709677 0.97580645 0.8582996 0.97165992 0.93522267 0.97165992
|
|
0.88663968 0.97983871 0.87903226 0.96370968]
|
|
|
|
mean value: 0.930896565234426
|
|
|
|
key: test_roc_auc
|
|
value: [0.96296296 0.92975734 0.91071429 0.94642857 0.875 0.89285714
|
|
0.85714286 0.92791005 0.96296296 0.9265873 ]
|
|
|
|
mean value: 0.9192323481116584
|
|
|
|
key: train_roc_auc
|
|
value: [0.93366696 0.91280441 0.92324429 0.94449138 0.95580031 0.94252287
|
|
0.93150881 0.9111792 0.92967361 0.9602013 ]
|
|
|
|
mean value: 0.9345093140199082
|
|
|
|
key: test_jcc
|
|
value: [0.92592593 0.86666667 0.82758621 0.90322581 0.78125 0.80645161
|
|
0.72413793 0.86666667 0.92592593 0.85714286]
|
|
|
|
mean value: 0.8484979599613915
|
|
|
|
key: train_jcc
|
|
value: [0.86956522 0.84615385 0.848 0.89552239 0.91304348 0.89219331
|
|
0.86561265 0.84375 0.86166008 0.92277992]
|
|
|
|
mean value: 0.8758280888468557
|
|
|
|
MCC on Blind test: 0.23

Accuracy on Blind test: 0.67

Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.10799265 0.09333324 0.09343123 0.09375167 0.09355068 0.09351802
|
|
0.09371734 0.09361362 0.09355211 0.09383702]
|
|
|
|
mean value: 0.09502975940704346
|
|
|
|
key: score_time
|
|
value: [0.01410651 0.01418447 0.01437783 0.01411939 0.01418042 0.01420903
|
|
0.01414371 0.01427364 0.01410794 0.01541471]
|
|
|
|
mean value: 0.014311766624450684
|
|
|
|
key: test_mcc
|
|
value: [0.96481304 0.89315584 0.96490128 0.89802651 0.85933785 0.93094934
|
|
0.96490128 0.89153439 1. 0.92724868]
|
|
|
|
mean value: 0.9294868200199901
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98214286 0.94642857 0.98214286 0.94642857 0.92857143 0.96428571
|
|
0.98214286 0.94545455 1. 0.96363636]
|
|
|
|
mean value: 0.9641233766233765
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98113208 0.94339623 0.98245614 0.94915254 0.93103448 0.96296296
|
|
0.98245614 0.94545455 1. 0.96296296]
|
|
|
|
mean value: 0.964100807910052
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.96153846 0.96551724 0.90322581 0.9 1.
|
|
0.96551724 0.92857143 1. 0.96296296]
|
|
|
|
mean value: 0.9587333142283087
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96296296 0.92592593 1. 1. 0.96428571 0.92857143
|
|
1. 0.96296296 1. 0.96296296]
|
|
|
|
mean value: 0.9707671957671957
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98148148 0.94572158 0.98214286 0.94642857 0.92857143 0.96428571
|
|
0.98214286 0.9457672 1. 0.96362434]
|
|
|
|
mean value: 0.9640166028097062
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96296296 0.89285714 0.96551724 0.90322581 0.87096774 0.92857143
|
|
0.96551724 0.89655172 1. 0.92857143]
|
|
|
|
mean value: 0.9314742718246611
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.12

Accuracy on Blind test: 0.39

Model_name: Bagging Classifier
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03909659 0.04021478 0.03637838 0.04280972 0.04759765 0.04407358
|
|
0.04242229 0.04342175 0.03089213 0.04606533]
|
|
|
|
mean value: 0.04129722118377686
|
|
|
|
key: score_time
|
|
value: [0.02080131 0.02186036 0.0172112 0.03137207 0.02341676 0.0170188
|
|
0.03378963 0.01608229 0.01641321 0.03630662]
|
|
|
|
mean value: 0.02342722415924072
|
|
|
|
key: test_mcc
|
|
value: [1. 0.85696041 0.92857143 0.93094934 0.89342711 0.96490128
|
|
0.96490128 0.89153439 1. 0.89139151]
|
|
|
|
mean value: 0.9322636750479738
|
|
|
|
key: train_mcc
|
|
value: [0.98803016 0.98403035 0.99204516 0.99204516 0.99204692 0.98803016
|
|
0.99201441 0.99602309 0.98409121 0.99203073]
|
|
|
|
mean value: 0.9900387331545668
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.92857143 0.96428571 0.96428571 0.94642857 0.98214286
|
|
0.98214286 0.94545455 1. 0.94545455]
|
|
|
|
mean value: 0.9658766233766234
|
|
|
|
key: train_accuracy
|
|
value: [0.99401198 0.99201597 0.99600798 0.99600798 0.99600798 0.99401198
|
|
0.99600798 0.99800797 0.99203187 0.99601594]
|
|
|
|
mean value: 0.9950127633179855
|
|
|
|
key: test_fscore
|
|
value: [1. 0.92592593 0.96428571 0.96551724 0.94736842 0.98181818
|
|
0.98245614 0.94545455 1. 0.94339623]
|
|
|
|
mean value: 0.9656222396682281
|
|
|
|
key: train_fscore
|
|
value: [0.99393939 0.99193548 0.99593496 0.99593496 0.99596774 0.99393939
|
|
0.99595142 0.9979798 0.99190283 0.99596774]
|
|
|
|
mean value: 0.9949453723311854
|
|
|
|
key: test_precision
|
|
value: [1. 0.92592593 0.96428571 0.93333333 0.93103448 1.
|
|
0.96551724 0.92857143 1. 0.96153846]
|
|
|
|
mean value: 0.9610206587792794
|
|
|
|
key: train_precision
|
|
value: [0.99595142 0.99193548 1. 1. 0.99196787 0.99193548
|
|
0.99595142 1. 0.99593496 0.99596774]
|
|
|
|
mean value: 0.9959644374521054
|
|
|
|
key: test_recall
|
|
value: [1. 0.92592593 0.96428571 1. 0.96428571 0.96428571
|
|
1. 0.96296296 1. 0.92592593]
|
|
|
|
mean value: 0.9707671957671957
|
|
|
|
key: train_recall
|
|
value: [0.99193548 0.99193548 0.99190283 0.99190283 1. 0.99595142
|
|
0.99595142 0.99596774 0.98790323 0.99596774]
|
|
|
|
mean value: 0.9939418179443646
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.9284802 0.96428571 0.96428571 0.94642857 0.98214286
|
|
0.98214286 0.9457672 1. 0.94510582]
|
|
|
|
mean value: 0.9658638934501004
|
|
|
|
key: train_roc_auc
|
|
value: [0.99399146 0.99201517 0.99595142 0.99595142 0.99606299 0.9940387
|
|
0.9960072 0.99798387 0.99198311 0.99601537]
|
|
|
|
mean value: 0.9950000708407828
|
|
|
|
key: test_jcc
|
|
value: [1. 0.86206897 0.93103448 0.93333333 0.9 0.96428571
|
|
0.96551724 0.89655172 1. 0.89285714]
|
|
|
|
mean value: 0.9345648604269294
|
|
|
|
key: train_jcc
|
|
value: [0.98795181 0.984 0.99190283 0.99190283 0.99196787 0.98795181
|
|
0.99193548 0.99596774 0.98393574 0.99196787]
|
|
|
|
mean value: 0.9899483994224252
|
|
|
|
MCC on Blind test: 0.14
|
|
|
|
Accuracy on Blind test: 0.37
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.17011619 0.17630982 0.1871922 0.19212961 0.17157412 0.15598917
|
|
0.15701985 0.17155218 0.17326117 0.15419507]
|
|
|
|
mean value: 0.17093393802642823
|
|
|
|
key: score_time
|
|
value: [0.02660775 0.02126074 0.02073812 0.01927352 0.02623224 0.02367377
|
|
0.0206089 0.02606893 0.02499342 0.01983643]
|
|
|
|
mean value: 0.02292938232421875
|
|
|
|
key: test_mcc
|
|
value: [0.89342711 0.74984143 0.89342711 0.71428571 0.71428571 0.78571429
|
|
0.68250015 0.78174603 0.72754449 0.81854376]
|
|
|
|
mean value: 0.7761315807091642
|
|
|
|
key: train_mcc
|
|
value: [0.83651026 0.85265474 0.84449262 0.84078809 0.85265708 0.84078809
|
|
0.84078809 0.84907279 0.86501334 0.85318007]
|
|
|
|
mean value: 0.8475945177079469
|
|
|
|
key: test_accuracy
|
|
value: [0.94642857 0.875 0.94642857 0.85714286 0.85714286 0.89285714
|
|
0.83928571 0.89090909 0.85454545 0.90909091]
|
|
|
|
mean value: 0.8868831168831168
|
|
|
|
key: train_accuracy
|
|
value: [0.91816367 0.9261477 0.92215569 0.92015968 0.9261477 0.92015968
|
|
0.92015968 0.92430279 0.93227092 0.92629482]
|
|
|
|
mean value: 0.9235962338271664
|
|
|
|
key: test_fscore
|
|
value: [0.94545455 0.86792453 0.94736842 0.85714286 0.85714286 0.89285714
|
|
0.84745763 0.88888889 0.86666667 0.90566038]
|
|
|
|
mean value: 0.8876563911984611
|
|
|
|
key: train_fscore
|
|
value: [0.91816367 0.92644135 0.92184369 0.92031873 0.9261477 0.92031873
|
|
0.92031873 0.92460317 0.93253968 0.92673267]
|
|
|
|
mean value: 0.9237428122217914
|
|
|
|
key: test_precision
|
|
value: [0.92857143 0.88461538 0.93103448 0.85714286 0.85714286 0.89285714
|
|
0.80645161 0.88888889 0.78787879 0.92307692]
|
|
|
|
mean value: 0.8757660365836116
|
|
|
|
key: train_precision
|
|
value: [0.90909091 0.91372549 0.91269841 0.90588235 0.91338583 0.90588235
|
|
0.90588235 0.91015625 0.91796875 0.91050584]
|
|
|
|
mean value: 0.9105178534156458
|
|
|
|
key: test_recall
|
|
value: [0.96296296 0.85185185 0.96428571 0.85714286 0.85714286 0.89285714
|
|
0.89285714 0.88888889 0.96296296 0.88888889]
|
|
|
|
mean value: 0.901984126984127
|
|
|
|
key: train_recall
|
|
value: [0.92741935 0.93951613 0.93117409 0.93522267 0.93927126 0.93522267
|
|
0.93522267 0.93951613 0.94758065 0.94354839]
|
|
|
|
mean value: 0.9373694005485177
|
|
|
|
key: test_roc_auc
|
|
value: [0.94699872 0.87420179 0.94642857 0.85714286 0.85714286 0.89285714
|
|
0.83928571 0.89087302 0.85648148 0.90873016]
|
|
|
|
mean value: 0.8870142309797482
|
|
|
|
key: train_roc_auc
|
|
value: [0.91825513 0.9262798 0.92227996 0.92036724 0.92632854 0.92036724
|
|
0.92036724 0.92448247 0.93245174 0.9264986 ]
|
|
|
|
mean value: 0.9237677975946037
|
|
|
|
key: test_jcc
|
|
value: [0.89655172 0.76666667 0.9 0.75 0.75 0.80645161
|
|
0.73529412 0.8 0.76470588 0.82758621]
|
|
|
|
mean value: 0.7997256210604375
|
|
|
|
key: train_jcc
|
|
value: [0.84870849 0.86296296 0.85501859 0.85239852 0.86245353 0.85239852
|
|
0.85239852 0.8597786 0.87360595 0.86346863]
|
|
|
|
mean value: 0.8583192321390376
|
|
|
|
MCC on Blind test: 0.29
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.25510955 0.24359798 0.24252176 0.24263096 0.24308658 0.2433722
|
|
0.24473262 0.24413776 0.24478555 0.24284172]
|
|
|
|
mean value: 0.2446816682815552
|
|
|
|
key: score_time
|
|
value: [0.00862026 0.00837231 0.00834084 0.00830126 0.00861955 0.00825286
|
|
0.00839138 0.00829506 0.00852728 0.00854683]
|
|
|
|
mean value: 0.008426761627197266
|
|
|
|
key: test_mcc
|
|
value: [1. 0.9284802 0.92857143 0.93094934 0.85933785 0.96490128
|
|
0.96490128 0.89153439 1. 0.8565805 ]
|
|
|
|
mean value: 0.9325256275611022
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.96428571 0.96428571 0.96428571 0.92857143 0.98214286
|
|
0.98214286 0.94545455 1. 0.92727273]
|
|
|
|
mean value: 0.9658441558441558
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.96296296 0.96428571 0.96551724 0.93103448 0.98181818
|
|
0.98245614 0.94545455 1. 0.92307692]
|
|
|
|
mean value: 0.9656606192087136
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.96296296 0.96428571 0.93333333 0.9 1.
|
|
0.96551724 0.92857143 1. 0.96 ]
|
|
|
|
mean value: 0.961467068053275
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.96296296 0.96428571 1. 0.96428571 0.96428571
|
|
1. 0.96296296 1. 0.88888889]
|
|
|
|
mean value: 0.9707671957671957
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.9642401 0.96428571 0.96428571 0.92857143 0.98214286
|
|
0.98214286 0.9457672 1. 0.9265873 ]
|
|
|
|
mean value: 0.9658023170954205
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.92857143 0.93103448 0.93333333 0.87096774 0.96428571
|
|
0.96551724 0.89655172 1. 0.85714286]
|
|
|
|
mean value: 0.9347404523544679
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.1
|
|
|
|
Accuracy on Blind test: 0.3
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.01189399 0.01387286 0.01450205 0.01408744 0.01389623 0.0166738
|
|
0.01422548 0.01454496 0.01396155 0.01419568]
|
|
|
|
mean value: 0.014185404777526856
|
|
|
|
key: score_time
|
|
value: [0.01086521 0.01088333 0.01085353 0.01082253 0.01088142 0.01098228
|
|
0.01087499 0.01154423 0.01157999 0.01079154]
|
|
|
|
mean value: 0.011007905006408691
|
|
|
|
key: test_mcc
|
|
value: [0.9284802 0.54871911 0.78772636 0.71611487 0.47187011 0.60753044
|
|
0.68250015 0.60268595 0.81878307 0.67602163]
|
|
|
|
mean value: 0.6840431895998464
|
|
|
|
key: train_mcc
|
|
value: [0.80440606 0.81032473 0.79940894 0.79646944 0.70336606 0.78901365
|
|
0.78773489 0.80486309 0.79845601 0.7610531 ]
|
|
|
|
mean value: 0.7855095967948068
|
|
|
|
key: test_accuracy
|
|
value: [0.96428571 0.76785714 0.89285714 0.85714286 0.73214286 0.80357143
|
|
0.83928571 0.8 0.90909091 0.83636364]
|
|
|
|
mean value: 0.8402597402597403
|
|
|
|
key: train_accuracy
|
|
value: [0.90219561 0.90419162 0.89620758 0.89820359 0.83433134 0.89221557
|
|
0.89221557 0.90039841 0.89840637 0.87848606]
|
|
|
|
mean value: 0.88968517148969
|
|
|
|
key: test_fscore
|
|
value: [0.96296296 0.72340426 0.88888889 0.85185185 0.70588235 0.80701754
|
|
0.83018868 0.78431373 0.90909091 0.82352941]
|
|
|
|
mean value: 0.8287130581414772
|
|
|
|
key: train_fscore
|
|
value: [0.90060852 0.89958159 0.88695652 0.89570552 0.8 0.88412017
|
|
0.88510638 0.89361702 0.89352818 0.86993603]
|
|
|
|
mean value: 0.8809159946199812
|
|
|
|
key: test_precision
|
|
value: [0.96296296 0.85 0.92307692 0.88461538 0.7826087 0.79310345
|
|
0.88 0.83333333 0.89285714 0.875 ]
|
|
|
|
mean value: 0.8677557890773783
|
|
|
|
key: train_precision
|
|
value: [0.90612245 0.93478261 0.95774648 0.90495868 0.98809524 0.94063927
|
|
0.93273543 0.94594595 0.92640693 0.92307692]
|
|
|
|
mean value: 0.9360509943174828
|
|
|
|
key: test_recall
|
|
value: [0.96296296 0.62962963 0.85714286 0.82142857 0.64285714 0.82142857
|
|
0.78571429 0.74074074 0.92592593 0.77777778]
|
|
|
|
mean value: 0.7965608465608466
|
|
|
|
key: train_recall
|
|
value: [0.89516129 0.86693548 0.82591093 0.88663968 0.67206478 0.8340081
|
|
0.84210526 0.84677419 0.86290323 0.82258065]
|
|
|
|
mean value: 0.8355083583648949
|
|
|
|
key: test_roc_auc
|
|
value: [0.9642401 0.76309068 0.89285714 0.85714286 0.73214286 0.80357143
|
|
0.83928571 0.7989418 0.90939153 0.83531746]
|
|
|
|
mean value: 0.8395981572705711
|
|
|
|
key: train_roc_auc
|
|
value: [0.9021261 0.90382347 0.89523893 0.89804425 0.83209538 0.8914135
|
|
0.89152507 0.89976505 0.89798705 0.87782576]
|
|
|
|
mean value: 0.8889844552398375
|
|
|
|
key: test_jcc
|
|
value: [0.92857143 0.56666667 0.8 0.74193548 0.54545455 0.67647059
|
|
0.70967742 0.64516129 0.83333333 0.7 ]
|
|
|
|
mean value: 0.7147270755809655
|
|
|
|
key: train_jcc
|
|
value: [0.81918819 0.81749049 0.796875 0.81111111 0.66666667 0.79230769
|
|
0.79389313 0.80769231 0.80754717 0.76981132]
|
|
|
|
mean value: 0.7882583084293304
|
|
|
|
MCC on Blind test: 0.32
|
|
|
|
Accuracy on Blind test: 0.72
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01984334 0.02952814 0.0294714 0.0295279 0.02948236 0.02966619
|
|
0.02946544 0.02954245 0.02954078 0.02948689]
|
|
|
|
mean value: 0.02855548858642578
|
|
|
|
key: score_time
|
|
value: [0.01642299 0.01944399 0.02070069 0.01963282 0.01078868 0.02068901
|
|
0.01969433 0.0106039 0.02023768 0.02003217]
|
|
|
|
mean value: 0.017824625968933104
|
|
|
|
key: test_mcc
|
|
value: [0.93103448 0.82149863 0.89342711 0.78772636 0.67900461 0.85933785
|
|
0.71611487 0.78174603 0.71735629 0.81854376]
|
|
|
|
mean value: 0.8005790004186416
|
|
|
|
key: train_mcc
|
|
value: [0.82071187 0.83279667 0.82071472 0.80065667 0.82921429 0.83720268
|
|
0.82507217 0.81310081 0.82516195 0.81719167]
|
|
|
|
mean value: 0.822182349120051
|
|
|
|
key: test_accuracy
|
|
value: [0.96428571 0.91071429 0.94642857 0.89285714 0.83928571 0.92857143
|
|
0.85714286 0.89090909 0.85454545 0.90909091]
|
|
|
|
mean value: 0.8993831168831169
|
|
|
|
key: train_accuracy
|
|
value: [0.91017964 0.91616766 0.91017964 0.9001996 0.91417166 0.91816367
|
|
0.91217565 0.9063745 0.9123506 0.90836653]
|
|
|
|
mean value: 0.9108329158416235
|
|
|
|
key: test_fscore
|
|
value: [0.96428571 0.90566038 0.94736842 0.89655172 0.84210526 0.92592593
|
|
0.86206897 0.88888889 0.86206897 0.90566038]
|
|
|
|
mean value: 0.900058462320045
|
|
|
|
key: train_fscore
|
|
value: [0.91053678 0.91666667 0.91017964 0.9 0.91485149 0.91881188
|
|
0.91269841 0.90656064 0.91269841 0.90873016]
|
|
|
|
mean value: 0.9111734073355806
|
|
|
|
key: test_precision
|
|
value: [0.93103448 0.92307692 0.93103448 0.86666667 0.82758621 0.96153846
|
|
0.83333333 0.88888889 0.80645161 0.92307692]
|
|
|
|
mean value: 0.8892687981898215
|
|
|
|
key: train_precision
|
|
value: [0.89803922 0.90234375 0.8976378 0.88932806 0.89534884 0.89922481
|
|
0.89494163 0.89411765 0.8984375 0.89453125]
|
|
|
|
mean value: 0.8963950498913893
|
|
|
|
key: test_recall
|
|
value: [1. 0.88888889 0.96428571 0.92857143 0.85714286 0.89285714
|
|
0.89285714 0.88888889 0.92592593 0.88888889]
|
|
|
|
mean value: 0.9128306878306878
|
|
|
|
key: train_recall
|
|
value: [0.9233871 0.93145161 0.92307692 0.91093117 0.93522267 0.93927126
|
|
0.93117409 0.91935484 0.92741935 0.9233871 ]
|
|
|
|
mean value: 0.9264676113360324
|
|
|
|
key: test_roc_auc
|
|
value: [0.96551724 0.90996169 0.94642857 0.89285714 0.83928571 0.92857143
|
|
0.85714286 0.89087302 0.85582011 0.90873016]
|
|
|
|
mean value: 0.899518792191206
|
|
|
|
key: train_roc_auc
|
|
value: [0.91031015 0.91631869 0.91035736 0.90034748 0.91446173 0.91845453
|
|
0.91243744 0.90652781 0.91252858 0.90854394]
|
|
|
|
mean value: 0.9110287700326485
|
|
|
|
key: test_jcc
|
|
value: [0.93103448 0.82758621 0.9 0.8125 0.72727273 0.86206897
|
|
0.75757576 0.8 0.75757576 0.82758621]
|
|
|
|
mean value: 0.8203200104493208
|
|
|
|
key: train_jcc
|
|
value: [0.83576642 0.84615385 0.83516484 0.81818182 0.84306569 0.84981685
|
|
0.83941606 0.82909091 0.83941606 0.83272727]
|
|
|
|
mean value: 0.8368799764712174
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_config.py:122: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
baseline_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_config.py:125: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
baseline_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.19355822 0.24611878 0.21725583 0.18385339 0.20384955 0.20188379
|
|
0.19162393 0.19133949 0.19330359 0.22055078]
|
|
|
|
mean value: 0.2043337345123291
|
|
|
|
key: score_time
|
|
value: [0.02099872 0.01083326 0.02049541 0.02122831 0.02145958 0.01939106
|
|
0.01959395 0.02021599 0.01877356 0.0112946 ]
|
|
|
|
mean value: 0.018428444862365723
|
|
|
|
key: test_mcc
|
|
value: [0.93103448 0.82149863 0.89342711 0.78772636 0.67900461 0.89342711
|
|
0.67900461 0.78174603 0.71735629 0.8565805 ]
|
|
|
|
mean value: 0.8040805738070792
|
|
|
|
key: train_mcc
|
|
value: [0.84902508 0.84902508 0.84856792 0.80065667 0.86474639 0.86116786
|
|
0.85289102 0.8493299 0.83338631 0.84549238]
|
|
|
|
mean value: 0.8454288618434195
|
|
|
|
key: test_accuracy
|
|
value: [0.96428571 0.91071429 0.94642857 0.89285714 0.83928571 0.94642857
|
|
0.83928571 0.89090909 0.85454545 0.92727273]
|
|
|
|
mean value: 0.9012012987012987
|
|
|
|
key: train_accuracy
|
|
value: [0.9241517 0.9241517 0.9241517 0.9001996 0.93213573 0.93013972
|
|
0.9261477 0.92430279 0.91633466 0.92231076]
|
|
|
|
mean value: 0.9224026051482692
|
|
|
|
key: test_fscore
|
|
value: [0.96428571 0.90566038 0.94736842 0.89655172 0.84210526 0.94545455
|
|
0.84210526 0.88888889 0.86206897 0.92307692]
|
|
|
|
mean value: 0.9017566086088156
|
|
|
|
key: train_fscore
|
|
value: [0.92490119 0.92490119 0.924 0.9 0.93227092 0.93069307
|
|
0.92644135 0.92490119 0.91699605 0.92307692]
|
|
|
|
mean value: 0.9228181865350266
|
|
|
|
key: test_precision
|
|
value: [0.93103448 0.92307692 0.93103448 0.86666667 0.82758621 0.96296296
|
|
0.82758621 0.88888889 0.80645161 0.96 ]
|
|
|
|
mean value: 0.8925288433809012
|
|
|
|
key: train_precision
|
|
value: [0.90697674 0.90697674 0.91304348 0.88932806 0.91764706 0.91085271
|
|
0.91015625 0.90697674 0.89922481 0.9034749 ]
|
|
|
|
mean value: 0.9064657505738394
|
|
|
|
key: test_recall
|
|
value: [1. 0.88888889 0.96428571 0.92857143 0.85714286 0.92857143
|
|
0.85714286 0.88888889 0.92592593 0.88888889]
|
|
|
|
mean value: 0.9128306878306878
|
|
|
|
key: train_recall
|
|
value: [0.94354839 0.94354839 0.93522267 0.91093117 0.94736842 0.951417
|
|
0.94331984 0.94354839 0.93548387 0.94354839]
|
|
|
|
mean value: 0.939793652866658
|
|
|
|
key: test_roc_auc
|
|
value: [0.96551724 0.90996169 0.94642857 0.89285714 0.83928571 0.94642857
|
|
0.83928571 0.89087302 0.85582011 0.9265873 ]
|
|
|
|
mean value: 0.9013045064769203
|
|
|
|
key: train_roc_auc
|
|
value: [0.92434336 0.92434336 0.92430425 0.90034748 0.93234563 0.93043291
|
|
0.92638433 0.9245301 0.91656083 0.9225616 ]
|
|
|
|
mean value: 0.9226153848348726
|
|
|
|
key: test_jcc
|
|
value: [0.93103448 0.82758621 0.9 0.8125 0.72727273 0.89655172
|
|
0.72727273 0.8 0.75757576 0.85714286]
|
|
|
|
mean value: 0.8236936483057172
|
|
|
|
key: train_jcc
|
|
value: [0.86029412 0.86029412 0.85873606 0.81818182 0.87313433 0.87037037
|
|
0.86296296 0.86029412 0.84671533 0.85714286]
|
|
|
|
mean value: 0.8568126077904101
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.7
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02480316 0.04043341 0.03790212 0.02258253 0.02496028 0.02353382
|
|
0.02346492 0.02378702 0.02321482 0.02300453]
|
|
|
|
mean value: 0.02676866054534912
|
|
|
|
key: score_time
|
|
value: [0.01101851 0.01310444 0.01055646 0.01048899 0.01055861 0.01072121
|
|
0.01049709 0.01050425 0.01051044 0.01050949]
|
|
|
|
mean value: 0.010846948623657227
|
|
|
|
key: test_mcc
|
|
value: [0.8953202 0.8953202 0.82512315 0.79110556 0.71611487 0.89342711
|
|
0.75434227 0.75047877 0.68250015 0.82195294]
|
|
|
|
mean value: 0.8025685230193058
|
|
|
|
key: train_mcc
|
|
value: [0.82263766 0.83068165 0.82666897 0.83070006 0.83890131 0.81930411
|
|
0.83123063 0.8387452 0.83529327 0.81527029]
|
|
|
|
mean value: 0.8289433160428895
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.94736842 0.9122807 0.89473684 0.85714286 0.94642857
|
|
0.875 0.875 0.83928571 0.91071429]
|
|
|
|
mean value: 0.900532581453634
|
|
|
|
key: train_accuracy
|
|
value: [0.9112426 0.91518738 0.91321499 0.91518738 0.91929134 0.90944882
|
|
0.91535433 0.91929134 0.91732283 0.90748031]
|
|
|
|
mean value: 0.9143021323517992
|
|
|
|
key: test_fscore
|
|
value: [0.94736842 0.94736842 0.9122807 0.9 0.86206897 0.94545455
|
|
0.88135593 0.87272727 0.84745763 0.90909091]
|
|
|
|
mean value: 0.9025172795971652
|
|
|
|
key: train_fscore
|
|
value: [0.9122807 0.91650485 0.9140625 0.91617934 0.92038835 0.91085271
|
|
0.91682785 0.92007797 0.91891892 0.90873786]
|
|
|
|
mean value: 0.9154831064752351
|
|
|
|
key: test_precision
|
|
value: [0.93103448 0.93103448 0.92857143 0.87096774 0.83333333 0.96296296
|
|
0.83870968 0.88888889 0.80645161 0.92592593]
|
|
|
|
mean value: 0.8917880537457845
|
|
|
|
key: train_precision
|
|
value: [0.9034749 0.90421456 0.9034749 0.90384615 0.90804598 0.89694656
|
|
0.90114068 0.91119691 0.90151515 0.89655172]
|
|
|
|
mean value: 0.9030407533340564
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.89655172 0.93103448 0.89285714 0.92857143
|
|
0.92857143 0.85714286 0.89285714 0.89285714]
|
|
|
|
mean value: 0.9149014778325123
|
|
|
|
key: train_recall
|
|
value: [0.92125984 0.92913386 0.92490119 0.92885375 0.93307087 0.92519685
|
|
0.93307087 0.92913386 0.93700787 0.92125984]
|
|
|
|
mean value: 0.9282888798979179
|
|
|
|
key: test_roc_auc
|
|
value: [0.9476601 0.9476601 0.91256158 0.89408867 0.85714286 0.94642857
|
|
0.875 0.875 0.83928571 0.91071429]
|
|
|
|
mean value: 0.9005541871921182
|
|
|
|
key: train_roc_auc
|
|
value: [0.91122281 0.91515981 0.91323799 0.91521428 0.91929134 0.90944882
|
|
0.91535433 0.91929134 0.91732283 0.90748031]
|
|
|
|
mean value: 0.914302387102798
|
|
|
|
key: test_jcc
|
|
value: [0.9 0.9 0.83870968 0.81818182 0.75757576 0.89655172
|
|
0.78787879 0.77419355 0.73529412 0.83333333]
|
|
|
|
mean value: 0.8241718764561139
|
|
|
|
key: train_jcc
|
|
value: [0.83870968 0.84587814 0.84172662 0.84532374 0.85251799 0.83629893
|
|
0.84642857 0.85198556 0.85 0.83274021]
|
|
|
|
mean value: 0.8441609435846644
|
|
|
|
MCC on Blind test: 0.28
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.66862154 0.89965343 0.87873912 0.72013283 0.79384041 0.85204673
|
|
0.71222496 0.79153037 0.77842331 0.70308733]
|
|
|
|
mean value: 0.779830002784729
|
|
|
|
key: score_time
|
|
value: [0.01159906 0.02059031 0.01229548 0.01344728 0.01333761 0.01349497
|
|
0.01231074 0.0122211 0.0130713 0.01268458]
|
|
|
|
mean value: 0.013505244255065918
|
|
|
|
key: test_mcc
|
|
value: [0.93202124 0.92980296 0.92980296 0.85960591 0.78772636 1.
|
|
0.85933785 0.85714286 0.78772636 0.78772636]
|
|
|
|
mean value: 0.8730892854406824
|
|
|
|
key: train_mcc
|
|
value: [0.93294638 0.93691352 0.94480151 0.93691156 0.93703692 0.93703692
|
|
0.92520402 0.9332517 0.92520402 0.94095217]
|
|
|
|
mean value: 0.9350258732625361
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.96491228 0.96491228 0.92982456 0.89285714 1.
|
|
0.92857143 0.92857143 0.89285714 0.89285714]
|
|
|
|
mean value: 0.9360275689223058
|
|
|
|
key: train_accuracy
|
|
value: [0.96646943 0.96844181 0.97238659 0.96844181 0.96850394 0.96850394
|
|
0.96259843 0.96653543 0.96259843 0.97047244]
|
|
|
|
mean value: 0.967495224339561
|
|
|
|
key: test_fscore
|
|
value: [0.96296296 0.96428571 0.96551724 0.93103448 0.89655172 1.
|
|
0.93103448 0.92857143 0.89655172 0.88888889]
|
|
|
|
mean value: 0.9365398649881409
|
|
|
|
key: train_fscore
|
|
value: [0.96646943 0.96837945 0.97222222 0.96825397 0.96837945 0.96837945
|
|
0.96267191 0.96620278 0.96267191 0.9704142 ]
|
|
|
|
mean value: 0.9674044754283551
|
|
|
|
key: test_precision
|
|
value: [1. 0.96428571 0.96551724 0.93103448 0.86666667 1.
|
|
0.9 0.92857143 0.86666667 0.92307692]
|
|
|
|
mean value: 0.934581912340533
|
|
|
|
key: train_precision
|
|
value: [0.96837945 0.97222222 0.97609562 0.97211155 0.97222222 0.97222222
|
|
0.96078431 0.97590361 0.96078431 0.97233202]
|
|
|
|
mean value: 0.9703057542340813
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.96428571 0.96551724 0.93103448 0.92857143 1.
|
|
0.96428571 0.92857143 0.92857143 0.85714286]
|
|
|
|
mean value: 0.9396551724137931
|
|
|
|
key: train_recall
|
|
value: [0.96456693 0.96456693 0.96837945 0.96442688 0.96456693 0.96456693
|
|
0.96456693 0.95669291 0.96456693 0.96850394]
|
|
|
|
mean value: 0.9645404749307522
|
|
|
|
key: test_roc_auc
|
|
value: [0.96428571 0.96490148 0.96490148 0.92980296 0.89285714 1.
|
|
0.92857143 0.92857143 0.89285714 0.89285714]
|
|
|
|
mean value: 0.935960591133005
|
|
|
|
key: train_roc_auc
|
|
value: [0.96647319 0.96844947 0.9723787 0.96843391 0.96850394 0.96850394
|
|
0.96259843 0.96653543 0.96259843 0.97047244]
|
|
|
|
mean value: 0.9674947869658586
|
|
|
|
key: test_jcc
|
|
value: [0.92857143 0.93103448 0.93333333 0.87096774 0.8125 1.
|
|
0.87096774 0.86666667 0.8125 0.8 ]
|
|
|
|
mean value: 0.8826541395201017
|
|
|
|
key: train_jcc
|
|
value: [0.9351145 0.93869732 0.94594595 0.93846154 0.93869732 0.93869732
|
|
0.9280303 0.93461538 0.9280303 0.94252874]
|
|
|
|
mean value: 0.9368818668555441
|
|
|
|
MCC on Blind test: 0.23
|
|
|
|
Accuracy on Blind test: 0.65
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01145101 0.01084852 0.0087235 0.0084331 0.0084486 0.00842762
|
|
0.00777602 0.00863409 0.00846457 0.00798273]
|
|
|
|
mean value: 0.008918976783752442
|
|
|
|
key: score_time
|
|
value: [0.01097488 0.00906825 0.00876808 0.00882697 0.0086298 0.00885534
|
|
0.00874305 0.00842094 0.00866842 0.00866818]
|
|
|
|
mean value: 0.008962392807006836
|
|
|
|
key: test_mcc
|
|
value: [0.77728159 0.68736396 0.77903565 0.56277738 0.43876345 0.49030429
|
|
0.75434227 0.65814518 0.73127242 0.65814518]
|
|
|
|
mean value: 0.6537431378840208
|
|
|
|
key: train_mcc
|
|
value: [0.65218808 0.64992518 0.66460838 0.66501403 0.62068788 0.66768511
|
|
0.6527166 0.71796573 0.66658604 0.66539291]
|
|
|
|
mean value: 0.66227699222193
|
|
|
|
key: test_accuracy
|
|
value: [0.87719298 0.84210526 0.87719298 0.77192982 0.71428571 0.73214286
|
|
0.875 0.82142857 0.85714286 0.82142857]
|
|
|
|
mean value: 0.818984962406015
|
|
|
|
key: train_accuracy
|
|
value: [0.81854043 0.81656805 0.82445759 0.82248521 0.79330709 0.82677165
|
|
0.81889764 0.85629921 0.82677165 0.82480315]
|
|
|
|
mean value: 0.8228901675752069
|
|
|
|
key: test_fscore
|
|
value: [0.85714286 0.83018868 0.8627451 0.74509804 0.68 0.68085106
|
|
0.86792453 0.8 0.84 0.8 ]
|
|
|
|
mean value: 0.7963950265774716
|
|
|
|
key: train_fscore
|
|
value: [0.79735683 0.79379157 0.80266075 0.7972973 0.75294118 0.80701754
|
|
0.79735683 0.84696017 0.80786026 0.80353201]
|
|
|
|
mean value: 0.8006774440728486
|
|
|
|
key: test_precision
|
|
value: [1. 0.88 1. 0.86363636 0.77272727 0.84210526
|
|
0.92 0.90909091 0.95454545 0.90909091]
|
|
|
|
mean value: 0.9051196172248803
|
|
|
|
key: train_precision
|
|
value: [0.905 0.90862944 0.91414141 0.92670157 0.93567251 0.91089109
|
|
0.905 0.9058296 0.90686275 0.91457286]
|
|
|
|
mean value: 0.9133301236007405
|
|
|
|
key: test_recall
|
|
value: [0.75 0.78571429 0.75862069 0.65517241 0.60714286 0.57142857
|
|
0.82142857 0.71428571 0.75 0.71428571]
|
|
|
|
mean value: 0.712807881773399
|
|
|
|
key: train_recall
|
|
value: [0.71259843 0.70472441 0.71541502 0.69960474 0.62992126 0.72440945
|
|
0.71259843 0.79527559 0.72834646 0.71653543]
|
|
|
|
mean value: 0.7139429211664747
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 0.841133 0.87931034 0.77401478 0.71428571 0.73214286
|
|
0.875 0.82142857 0.85714286 0.82142857]
|
|
|
|
mean value: 0.8190886699507389
|
|
|
|
key: train_roc_auc
|
|
value: [0.81874981 0.81678908 0.82424294 0.82224332 0.79330709 0.82677165
|
|
0.81889764 0.85629921 0.82677165 0.82480315]
|
|
|
|
mean value: 0.8228875540755034
|
|
|
|
key: test_jcc
|
|
value: [0.75 0.70967742 0.75862069 0.59375 0.51515152 0.51612903
|
|
0.76666667 0.66666667 0.72413793 0.66666667]
|
|
|
|
mean value: 0.6667466587454074
|
|
|
|
key: train_jcc
|
|
value: [0.66300366 0.65808824 0.67037037 0.66292135 0.60377358 0.67647059
|
|
0.66300366 0.73454545 0.67765568 0.67158672]
|
|
|
|
mean value: 0.6681419301195666
|
|
|
|
MCC on Blind test: 0.34
|
|
|
|
Accuracy on Blind test: 0.78
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00911832 0.00889468 0.00884533 0.00869632 0.00868821 0.00798774
|
|
0.00814915 0.0086484 0.0087316 0.00859666]
|
|
|
|
mean value: 0.008635640144348145
|
|
|
|
key: score_time
|
|
value: [0.00931168 0.00890446 0.00889421 0.00877428 0.00848413 0.00837326
|
|
0.00868559 0.00825214 0.00880623 0.00885201]
|
|
|
|
mean value: 0.008733797073364257
|
|
|
|
key: test_mcc
|
|
value: [0.8953202 0.82512315 0.85960591 0.71921182 0.71611487 0.75047877
|
|
0.67900461 0.75047877 0.64450339 0.82195294]
|
|
|
|
mean value: 0.766179444196459
|
|
|
|
key: train_mcc
|
|
value: [0.76340037 0.76340037 0.76353762 0.75544282 0.77564465 0.77588525
|
|
0.77564465 0.77991449 0.79149195 0.76800824]
|
|
|
|
mean value: 0.7712370421013379
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.9122807 0.92982456 0.85964912 0.85714286 0.875
|
|
0.83928571 0.875 0.82142857 0.91071429]
|
|
|
|
mean value: 0.8827694235588972
|
|
|
|
key: train_accuracy
|
|
value: [0.8816568 0.8816568 0.8816568 0.87771203 0.88779528 0.88779528
|
|
0.88779528 0.88976378 0.89566929 0.88385827]
|
|
|
|
mean value: 0.8855359611113699
|
|
|
|
key: test_fscore
|
|
value: [0.94736842 0.9122807 0.93103448 0.86206897 0.86206897 0.87272727
|
|
0.84210526 0.87272727 0.82758621 0.90909091]
|
|
|
|
mean value: 0.8839058461200022
|
|
|
|
key: train_fscore
|
|
value: [0.8828125 0.8828125 0.8828125 0.87698413 0.88845401 0.88932039
|
|
0.88845401 0.89147287 0.89668616 0.88543689]
|
|
|
|
mean value: 0.8865245960082
|
|
|
|
key: test_precision
|
|
value: [0.93103448 0.89655172 0.93103448 0.86206897 0.83333333 0.88888889
|
|
0.82758621 0.88888889 0.8 0.92592593]
|
|
|
|
mean value: 0.8785312899106003
|
|
|
|
key: train_precision
|
|
value: [0.87596899 0.87596899 0.87258687 0.88047809 0.88326848 0.87739464
|
|
0.88326848 0.8778626 0.88803089 0.87356322]
|
|
|
|
mean value: 0.8788391247569809
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.92857143 0.93103448 0.86206897 0.89285714 0.85714286
|
|
0.85714286 0.85714286 0.85714286 0.89285714]
|
|
|
|
mean value: 0.8900246305418719
|
|
|
|
key: train_recall
|
|
value: [0.88976378 0.88976378 0.89328063 0.87351779 0.89370079 0.9015748
|
|
0.89370079 0.90551181 0.90551181 0.8976378 ]
|
|
|
|
mean value: 0.8943963773303041
|
|
|
|
key: test_roc_auc
|
|
value: [0.9476601 0.91256158 0.92980296 0.85960591 0.85714286 0.875
|
|
0.83928571 0.875 0.82142857 0.91071429]
|
|
|
|
mean value: 0.882820197044335
|
|
|
|
key: train_roc_auc
|
|
value: [0.88164078 0.88164078 0.88167969 0.87770378 0.88779528 0.88779528
|
|
0.88779528 0.88976378 0.89566929 0.88385827]
|
|
|
|
mean value: 0.8855342192897825
|
|
|
|
key: test_jcc
|
|
value: [0.9 0.83870968 0.87096774 0.75757576 0.75757576 0.77419355
|
|
0.72727273 0.77419355 0.70588235 0.83333333]
|
|
|
|
mean value: 0.7939704444827784
|
|
|
|
key: train_jcc
|
|
value: [0.79020979 0.79020979 0.79020979 0.78091873 0.79929577 0.8006993
|
|
0.79929577 0.8041958 0.81272085 0.79442509]
|
|
|
|
mean value: 0.7962180687899996
|
|
|
|
MCC on Blind test: 0.28
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.00799131 0.0081377 0.00847793 0.00799417 0.00966215 0.01012588
|
|
0.00796223 0.00843406 0.00802898 0.00803018]
|
|
|
|
mean value: 0.008484458923339844
|
|
|
|
key: score_time
|
|
value: [0.01276636 0.0125227 0.01176548 0.0133183 0.01890969 0.01576591
|
|
0.01556277 0.01163697 0.01146603 0.01169634]
|
|
|
|
mean value: 0.013541054725646973
|
|
|
|
key: test_mcc
|
|
value: [0.8953202 0.78940887 0.71921182 0.79110556 0.75047877 0.68250015
|
|
0.60753044 0.75047877 0.58501794 0.82195294]
|
|
|
|
mean value: 0.7393005465274064
|
|
|
|
key: train_mcc
|
|
value: [0.78707279 0.78304441 0.77919572 0.79093074 0.79951627 0.78742599
|
|
0.80317451 0.80759374 0.79936749 0.78395685]
|
|
|
|
mean value: 0.79212785011907
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.89473684 0.85964912 0.89473684 0.875 0.83928571
|
|
0.80357143 0.875 0.78571429 0.91071429]
|
|
|
|
mean value: 0.868577694235589
|
|
|
|
key: train_accuracy
|
|
value: [0.89349112 0.89151874 0.88954635 0.89546351 0.8996063 0.89370079
|
|
0.9015748 0.90354331 0.8996063 0.89173228]
|
|
|
|
mean value: 0.8959783503393437
|
|
|
|
key: test_fscore
|
|
value: [0.94736842 0.89285714 0.86206897 0.9 0.87719298 0.83018868
|
|
0.80701754 0.87272727 0.80645161 0.90909091]
|
|
|
|
mean value: 0.8704963529709496
|
|
|
|
key: train_fscore
|
|
value: [0.89453125 0.89151874 0.89019608 0.8950495 0.90097087 0.89411765
|
|
0.90196078 0.90522244 0.9005848 0.89361702]
|
|
|
|
mean value: 0.8967769129948973
|
|
|
|
key: test_precision
|
|
value: [0.93103448 0.89285714 0.86206897 0.87096774 0.86206897 0.88
|
|
0.79310345 0.88888889 0.73529412 0.92592593]
|
|
|
|
mean value: 0.8642209679323466
|
|
|
|
key: train_precision
|
|
value: [0.8875969 0.89328063 0.88326848 0.8968254 0.88888889 0.890625
|
|
0.8984375 0.88973384 0.89189189 0.878327 ]
|
|
|
|
mean value: 0.8898875528234225
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.89285714 0.86206897 0.93103448 0.89285714 0.78571429
|
|
0.82142857 0.85714286 0.89285714 0.89285714]
|
|
|
|
mean value: 0.8793103448275862
|
|
|
|
key: train_recall
|
|
value: [0.9015748 0.88976378 0.8972332 0.89328063 0.91338583 0.8976378
|
|
0.90551181 0.92125984 0.90944882 0.90944882]
|
|
|
|
mean value: 0.9038545330055087
|
|
|
|
key: test_roc_auc
|
|
value: [0.9476601 0.89470443 0.85960591 0.89408867 0.875 0.83928571
|
|
0.80357143 0.875 0.78571429 0.91071429]
|
|
|
|
mean value: 0.8685344827586207
|
|
|
|
key: train_roc_auc
|
|
value: [0.89347515 0.89152221 0.88956148 0.89545921 0.8996063 0.89370079
|
|
0.9015748 0.90354331 0.8996063 0.89173228]
|
|
|
|
mean value: 0.8959781830630855
|
|
|
|
key: test_jcc
|
|
value: [0.9 0.80645161 0.75757576 0.81818182 0.78125 0.70967742
|
|
0.67647059 0.77419355 0.67567568 0.83333333]
|
|
|
|
mean value: 0.7732809753647041
|
|
|
|
key: train_jcc
|
|
value: [0.80918728 0.80427046 0.80212014 0.81003584 0.81978799 0.80851064
|
|
0.82142857 0.82685512 0.81914894 0.80769231]
|
|
|
|
mean value: 0.8129037288551658
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.72
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01498389 0.01484656 0.01499653 0.01508522 0.01529217 0.01519728
|
|
0.0148685 0.01477909 0.01449656 0.01571655]
|
|
|
|
mean value: 0.015026235580444336
|
|
|
|
key: score_time
|
|
value: [0.00945044 0.00991273 0.00937939 0.00918961 0.0093348 0.00939512
|
|
0.00931787 0.00938892 0.0092535 0.00957513]
|
|
|
|
mean value: 0.009419751167297364
|
|
|
|
key: test_mcc
|
|
value: [0.8953202 0.8953202 0.85960591 0.75462449 0.71611487 0.78772636
|
|
0.67900461 0.75047877 0.64450339 0.78772636]
|
|
|
|
mean value: 0.7770425163515529
|
|
|
|
key: train_mcc
|
|
value: [0.77929987 0.77929987 0.78334713 0.79108822 0.79936749 0.78779242
|
|
0.80337378 0.79567034 0.80324922 0.77974514]
|
|
|
|
mean value: 0.7902233465851163
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.94736842 0.92982456 0.87719298 0.85714286 0.89285714
|
|
0.83928571 0.875 0.82142857 0.89285714]
|
|
|
|
mean value: 0.8880325814536341
|
|
|
|
key: train_accuracy
|
|
value: [0.88954635 0.88954635 0.89151874 0.89546351 0.8996063 0.89370079
|
|
0.9015748 0.8976378 0.9015748 0.88976378]
|
|
|
|
mean value: 0.894993321840687
|
|
|
|
key: test_fscore
|
|
value: [0.94736842 0.94736842 0.93103448 0.88135593 0.86206897 0.88888889
|
|
0.84210526 0.87272727 0.82758621 0.88888889]
|
|
|
|
mean value: 0.8889392743144012
|
|
|
|
key: train_fscore
|
|
value: [0.89105058 0.89105058 0.89278752 0.8962818 0.9005848 0.89534884
|
|
0.90272374 0.89922481 0.90234375 0.89105058]
|
|
|
|
mean value: 0.8962446999871675
|
|
|
|
key: test_precision
|
|
value: [0.93103448 0.93103448 0.93103448 0.86666667 0.83333333 0.92307692
|
|
0.82758621 0.88888889 0.8 0.92307692]
|
|
|
|
mean value: 0.8855732390215149
|
|
|
|
key: train_precision
|
|
value: [0.88076923 0.88076923 0.88076923 0.8875969 0.89189189 0.88167939
|
|
0.89230769 0.88549618 0.89534884 0.88076923]
|
|
|
|
mean value: 0.88573978162297
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.93103448 0.89655172 0.89285714 0.85714286
|
|
0.85714286 0.85714286 0.85714286 0.85714286]
|
|
|
|
mean value: 0.8934729064039408
|
|
|
|
key: train_recall
|
|
value: [0.9015748 0.9015748 0.90513834 0.90513834 0.90944882 0.90944882
|
|
0.91338583 0.91338583 0.90944882 0.9015748 ]
|
|
|
|
mean value: 0.9070119199526937
|
|
|
|
key: test_roc_auc
|
|
value: [0.9476601 0.9476601 0.92980296 0.87684729 0.85714286 0.89285714
|
|
0.83928571 0.875 0.82142857 0.89285714]
|
|
|
|
mean value: 0.8880541871921183
|
|
|
|
key: train_roc_auc
|
|
value: [0.88952258 0.88952258 0.89154555 0.89548256 0.8996063 0.89370079
|
|
0.9015748 0.8976378 0.9015748 0.88976378]
|
|
|
|
mean value: 0.8949931530297843
|
|
|
|
key: test_jcc
|
|
value: [0.9 0.9 0.87096774 0.78787879 0.75757576 0.8
|
|
0.72727273 0.77419355 0.70588235 0.8 ]
|
|
|
|
mean value: 0.802377091599103
|
|
|
|
key: train_jcc
|
|
value: [0.80350877 0.80350877 0.80633803 0.81205674 0.81914894 0.81052632
|
|
0.82269504 0.81690141 0.82206406 0.80350877]
|
|
|
|
mean value: 0.8120256834358026
|
|
|
|
MCC on Blind test: 0.22
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.55003285 1.53659248 1.37875795 1.49586344 1.57297134 1.42394137
|
|
1.5258956 1.60407376 1.39470887 1.90794826]
|
|
|
|
mean value: 1.5390785932540894
|
|
|
|
key: score_time
|
|
value: [0.01411986 0.01391315 0.01411891 0.01399922 0.01386738 0.01426673
|
|
0.01415634 0.01177144 0.01455188 0.01436853]
|
|
|
|
mean value: 0.013913345336914063
|
|
|
|
key: test_mcc
|
|
value: [0.8951918 0.92980296 0.82490815 0.8953202 0.75047877 0.89802651
|
|
0.89342711 0.78772636 0.78772636 0.85714286]
|
|
|
|
mean value: 0.851975108089572
|
|
|
|
key: train_mcc
|
|
value: [0.97245522 0.96055211 0.97239383 0.96055211 0.97250878 0.9645744
|
|
0.96463421 0.96850394 0.9645744 0.9645744 ]
|
|
|
|
mean value: 0.9665323428042959
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.96491228 0.9122807 0.94736842 0.875 0.94642857
|
|
0.94642857 0.89285714 0.89285714 0.92857143]
|
|
|
|
mean value: 0.9254072681704261
|
|
|
|
key: train_accuracy
|
|
value: [0.98619329 0.98027613 0.98619329 0.98027613 0.98622047 0.98228346
|
|
0.98228346 0.98425197 0.98228346 0.98228346]
|
|
|
|
mean value: 0.9832545155228378
|
|
|
|
key: test_fscore
|
|
value: [0.94545455 0.96428571 0.91525424 0.94736842 0.87719298 0.94339623
|
|
0.94736842 0.88888889 0.89655172 0.92857143]
|
|
|
|
mean value: 0.9254332589603143
|
|
|
|
key: train_fscore
|
|
value: [0.98613861 0.98031496 0.98613861 0.98023715 0.98613861 0.98224852
|
|
0.98217822 0.98425197 0.98224852 0.98231827]
|
|
|
|
mean value: 0.9832213455229958
|
|
|
|
key: test_precision
|
|
value: [0.96296296 0.96428571 0.9 0.96428571 0.86206897 1.
|
|
0.93103448 0.92307692 0.86666667 0.92857143]
|
|
|
|
mean value: 0.9302952858125272
|
|
|
|
key: train_precision
|
|
value: [0.99203187 0.98031496 0.98809524 0.98023715 0.99203187 0.98418972
|
|
0.98804781 0.98425197 0.98418972 0.98039216]
|
|
|
|
mean value: 0.9853782478667216
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.96428571 0.93103448 0.93103448 0.89285714 0.89285714
|
|
0.96428571 0.85714286 0.92857143 0.92857143]
|
|
|
|
mean value: 0.9219211822660098
|
|
|
|
key: train_recall
|
|
value: [0.98031496 0.98031496 0.98418972 0.98023715 0.98031496 0.98031496
|
|
0.97637795 0.98425197 0.98031496 0.98425197]
|
|
|
|
mean value: 0.9810883570383742
|
|
|
|
key: test_roc_auc
|
|
value: [0.94704433 0.96490148 0.91194581 0.9476601 0.875 0.94642857
|
|
0.94642857 0.89285714 0.89285714 0.92857143]
|
|
|
|
mean value: 0.9253694581280789
|
|
|
|
key: train_roc_auc
|
|
value: [0.98620491 0.98027606 0.98618935 0.98027606 0.98622047 0.98228346
|
|
0.98228346 0.98425197 0.98228346 0.98228346]
|
|
|
|
mean value: 0.9832552674986773
|
|
|
|
key: test_jcc
|
|
value: [0.89655172 0.93103448 0.84375 0.9 0.78125 0.89285714
|
|
0.9 0.8 0.8125 0.86666667]
|
|
|
|
mean value: 0.8624610016420361
|
|
|
|
key: train_jcc
|
|
value: [0.97265625 0.96138996 0.97265625 0.96124031 0.97265625 0.96511628
|
|
0.96498054 0.96899225 0.96511628 0.96525097]
|
|
|
|
mean value: 0.9670055337667078
|
|
|
|
MCC on Blind test: 0.26
|
|
|
|
Accuracy on Blind test: 0.66
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01483965 0.0120194 0.01096416 0.01053548 0.01014924 0.01097488
|
|
0.01111197 0.01054001 0.01057744 0.01146078]
|
|
|
|
mean value: 0.011317300796508788
|
|
|
|
key: score_time
|
|
value: [0.01083517 0.00851774 0.00850797 0.00900292 0.00824666 0.00808549
|
|
0.0081172 0.00822592 0.00818729 0.00816345]
|
|
|
|
mean value: 0.00858898162841797
|
|
|
|
key: test_mcc
|
|
value: [0.93202124 0.8951918 0.85960591 0.8953202 0.75434227 0.96490128
|
|
0.75434227 0.89342711 0.96490128 0.92857143]
|
|
|
|
mean value: 0.8842624793067261
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.94736842 0.92982456 0.94736842 0.875 0.98214286
|
|
0.875 0.94642857 0.98214286 0.96428571]
|
|
|
|
mean value: 0.9414473684210526
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.96296296 0.94545455 0.93103448 0.94736842 0.88135593 0.98181818
|
|
0.88135593 0.94736842 0.98181818 0.96428571]
|
|
|
|
mean value: 0.942482277561025
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.96296296 0.93103448 0.96428571 0.83870968 1.
|
|
0.83870968 0.93103448 1. 0.96428571]
|
|
|
|
mean value: 0.9431022711890342
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.92857143 0.93103448 0.93103448 0.92857143 0.96428571
|
|
0.92857143 0.96428571 0.96428571 0.96428571]
|
|
|
|
mean value: 0.9433497536945813
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96428571 0.94704433 0.92980296 0.9476601 0.875 0.98214286
|
|
0.875 0.94642857 0.98214286 0.96428571]
|
|
|
|
mean value: 0.9413793103448276
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.92857143 0.89655172 0.87096774 0.9 0.78787879 0.96428571
|
|
0.78787879 0.9 0.96428571 0.93103448]
|
|
|
|
mean value: 0.8931454381732469
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.11
|
|
|
|
Accuracy on Blind test: 0.36
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.10266161 0.12123942 0.11855769 0.10826373 0.11115503 0.13033295
|
|
0.11559772 0.10858154 0.10145831 0.10418749]
|
|
|
|
mean value: 0.11220355033874511
|
|
|
|
key: score_time
|
|
value: [0.01758289 0.02243209 0.02058554 0.02079964 0.02117038 0.01818752
|
|
0.01786613 0.01750755 0.01726437 0.01872659]
|
|
|
|
mean value: 0.01921226978302002
|
|
|
|
key: test_mcc
|
|
value: [0.92980296 0.86189955 0.85960591 0.82490815 0.85714286 0.89342711
|
|
0.92857143 0.82195294 0.78571429 0.92857143]
|
|
|
|
mean value: 0.8691596616885752
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.92982456 0.92982456 0.9122807 0.92857143 0.94642857
|
|
0.96428571 0.91071429 0.89285714 0.96428571]
|
|
|
|
mean value: 0.9343984962406016
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.96428571 0.93103448 0.93103448 0.91525424 0.92857143 0.94736842
|
|
0.96428571 0.90909091 0.89285714 0.96428571]
|
|
|
|
mean value: 0.9348068247234632
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96428571 0.9 0.93103448 0.9 0.92857143 0.93103448
|
|
0.96428571 0.92592593 0.89285714 0.96428571]
|
|
|
|
mean value: 0.9302280605728882
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.93103448 0.93103448 0.92857143 0.96428571
|
|
0.96428571 0.89285714 0.89285714 0.96428571]
|
|
|
|
mean value: 0.9397783251231527
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96490148 0.93041872 0.92980296 0.91194581 0.92857143 0.94642857
|
|
0.96428571 0.91071429 0.89285714 0.96428571]
|
|
|
|
mean value: 0.9344211822660099
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.93103448 0.87096774 0.87096774 0.84375 0.86666667 0.9
|
|
0.93103448 0.83333333 0.80645161 0.93103448]
|
|
|
|
mean value: 0.8785240545050056
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.32
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00810385 0.00779867 0.00782776 0.0077281 0.00779986 0.00821972
|
|
0.00770831 0.0078783 0.00787997 0.00846457]
|
|
|
|
mean value: 0.007940912246704101
|
|
|
|
key: score_time
|
|
value: [0.00810742 0.00859261 0.00797725 0.00838828 0.00801611 0.0083127
|
|
0.00793481 0.00823236 0.00814414 0.00799465]
|
|
|
|
mean value: 0.008170032501220703
|
|
|
|
key: test_mcc
|
|
value: [0.8951918 0.78940887 0.68472906 0.8615634 0.5118907 0.65814518
|
|
0.89342711 0.75434227 0.85933785 0.92857143]
|
|
|
|
mean value: 0.7836607672751627
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.89473684 0.84210526 0.92982456 0.75 0.82142857
|
|
0.94642857 0.875 0.92857143 0.96428571]
|
|
|
|
mean value: 0.8899749373433584
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94545455 0.89285714 0.84210526 0.93333333 0.77419355 0.8
|
|
0.94545455 0.86792453 0.92592593 0.96428571]
|
|
|
|
mean value: 0.8891534547158085
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96296296 0.89285714 0.85714286 0.90322581 0.70588235 0.90909091
|
|
0.96296296 0.92 0.96153846 0.96428571]
|
|
|
|
mean value: 0.90399491702338
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.89285714 0.82758621 0.96551724 0.85714286 0.71428571
|
|
0.92857143 0.82142857 0.89285714 0.96428571]
|
|
|
|
mean value: 0.8793103448275862
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.94704433 0.89470443 0.84236453 0.92918719 0.75 0.82142857
|
|
0.94642857 0.875 0.92857143 0.96428571]
|
|
|
|
mean value: 0.8899014778325124
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.89655172 0.80645161 0.72727273 0.875 0.63157895 0.66666667
|
|
0.89655172 0.76666667 0.86206897 0.93103448]
|
|
|
|
mean value: 0.8059843517429431
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.32
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.33799982 1.31312704 1.34472799 1.3418479 1.33652711 1.34488511
|
|
1.34392357 1.36048555 1.3535583 1.39411759]
|
|
|
|
mean value: 1.3471199989318847
|
|
|
|
key: score_time
|
|
value: [0.09937048 0.09200263 0.0983386 0.09501576 0.09319186 0.09341598
|
|
0.09465718 0.09849429 0.09647918 0.09067464]
|
|
|
|
mean value: 0.09516406059265137
|
|
|
|
key: test_mcc
|
|
value: [0.96547546 0.8953202 0.92980296 0.8951918 0.85933785 1.
|
|
0.92857143 0.89342711 0.93094934 0.92857143]
|
|
|
|
mean value: 0.922664756643307
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.94736842 0.96491228 0.94736842 0.92857143 1.
|
|
0.96428571 0.94642857 0.96428571 0.96428571]
|
|
|
|
mean value: 0.9609962406015038
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98181818 0.94736842 0.96551724 0.94915254 0.93103448 1.
|
|
0.96428571 0.94736842 0.96296296 0.96428571]
|
|
|
|
mean value: 0.961379368196865
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.93103448 0.96551724 0.93333333 0.9 1.
|
|
0.96428571 0.93103448 1. 0.96428571]
|
|
|
|
mean value: 0.9589490968801314
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.96551724 0.96551724 0.96428571 1.
|
|
0.96428571 0.96428571 0.92857143 0.96428571]
|
|
|
|
mean value: 0.9645320197044335
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98214286 0.9476601 0.96490148 0.94704433 0.92857143 1.
|
|
0.96428571 0.94642857 0.96428571 0.96428571]
|
|
|
|
mean value: 0.960960591133005
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96428571 0.9 0.93333333 0.90322581 0.87096774 1.
|
|
0.93103448 0.9 0.92857143 0.93103448]
|
|
|
|
mean value: 0.9262452990094814
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.19
|
|
|
|
Accuracy on Blind test: 0.49
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
  warn(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.92916346 0.96192503 0.91043425 0.91636205 0.90995216 0.86325049
|
|
0.90527892 0.94523883 0.87693882 0.92262363]
|
|
|
|
mean value: 0.9141167640686035
|
|
|
|
key: score_time
|
|
value: [0.27692175 0.23238063 0.2464447 0.29479647 0.18034601 0.23226142
|
|
0.23399925 0.25210285 0.25105858 0.17118669]
|
|
|
|
mean value: 0.23714983463287354
|
|
|
|
key: test_mcc
|
|
value: [0.96547546 0.8953202 0.92980296 0.8951918 0.85714286 1.
|
|
0.92857143 0.89342711 0.93094934 0.92857143]
|
|
|
|
mean value: 0.9224452574728608
|
|
|
|
key: train_mcc
|
|
value: [0.94503515 0.95277969 0.94878539 0.95278262 0.95687833 0.94112724
|
|
0.94888508 0.95278544 0.94499908 0.94900279]
|
|
|
|
mean value: 0.9493060812767512
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.94736842 0.96491228 0.94736842 0.92857143 1.
|
|
0.96428571 0.94642857 0.96428571 0.96428571]
|
|
|
|
mean value: 0.9609962406015038
|
|
|
|
key: train_accuracy
|
|
value: [0.97238659 0.97633136 0.97435897 0.97633136 0.97834646 0.97047244
|
|
0.97440945 0.97637795 0.97244094 0.97440945]
|
|
|
|
mean value: 0.9745864976937054
|
|
|
|
key: test_fscore
|
|
value: [0.98181818 0.94736842 0.96551724 0.94915254 0.92857143 1.
|
|
0.96428571 0.94736842 0.96296296 0.96428571]
|
|
|
|
mean value: 0.9611330627781457
|
|
|
|
key: train_fscore
|
|
value: [0.97276265 0.9765625 0.97445972 0.97647059 0.9785575 0.97076023
|
|
0.97455969 0.97647059 0.97265625 0.97465887]
|
|
|
|
mean value: 0.9747918592411458
|
|
|
|
key: test_precision
|
|
value: [1. 0.93103448 0.96551724 0.93333333 0.92857143 1.
|
|
0.96428571 0.93103448 1. 0.96428571]
|
|
|
|
mean value: 0.9618062397372742
|
|
|
|
key: train_precision
|
|
value: [0.96153846 0.96899225 0.96875 0.9688716 0.96911197 0.96138996
|
|
0.9688716 0.97265625 0.96511628 0.96525097]
|
|
|
|
mean value: 0.9670549325084619
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.96551724 0.96551724 0.92857143 1.
|
|
0.96428571 0.96428571 0.92857143 0.96428571]
|
|
|
|
mean value: 0.9609605911330049
|
|
|
|
key: train_recall
|
|
value: [0.98425197 0.98425197 0.98023715 0.98418972 0.98818898 0.98031496
|
|
0.98031496 0.98031496 0.98031496 0.98425197]
|
|
|
|
mean value: 0.9826631601879805
|
|
|
|
key: test_roc_auc
|
|
value: [0.98214286 0.9476601 0.96490148 0.94704433 0.92857143 1.
|
|
0.96428571 0.94642857 0.96428571 0.96428571]
|
|
|
|
mean value: 0.960960591133005
|
|
|
|
key: train_roc_auc
|
|
value: [0.97236314 0.97631571 0.97437055 0.97634683 0.97834646 0.97047244
|
|
0.97440945 0.97637795 0.97244094 0.97440945]
|
|
|
|
mean value: 0.974585291463073
|
|
|
|
key: test_jcc
|
|
value: [0.96428571 0.9 0.93333333 0.90322581 0.86666667 1.
|
|
0.93103448 0.9 0.92857143 0.93103448]
|
|
|
|
mean value: 0.9258151914825997
|
|
|
|
key: train_jcc
|
|
value: [0.9469697 0.95419847 0.95019157 0.95402299 0.95801527 0.94318182
|
|
0.95038168 0.95402299 0.94676806 0.95057034]
|
|
|
|
mean value: 0.9508322885933389
|
|
|
|
MCC on Blind test: 0.2
|
|
|
|
Accuracy on Blind test: 0.5
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00826478 0.00817037 0.00865364 0.00832152 0.00791717 0.00800776
|
|
0.00859976 0.00781083 0.00889087 0.00782251]
|
|
|
|
mean value: 0.00824592113494873
|
|
|
|
key: score_time
|
|
value: [0.00801754 0.00853825 0.01080704 0.00824523 0.00855088 0.00804639
|
|
0.00831342 0.00840735 0.00844622 0.00819445]
|
|
|
|
mean value: 0.008556675910949708
|
|
|
|
key: test_mcc
|
|
value: [0.8953202 0.82512315 0.85960591 0.71921182 0.71611487 0.75047877
|
|
0.67900461 0.75047877 0.64450339 0.82195294]
|
|
|
|
mean value: 0.766179444196459
|
|
|
|
key: train_mcc
|
|
value: [0.76340037 0.76340037 0.76353762 0.75544282 0.77564465 0.77588525
|
|
0.77564465 0.77991449 0.79149195 0.76800824]
|
|
|
|
mean value: 0.7712370421013379
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.9122807 0.92982456 0.85964912 0.85714286 0.875
|
|
0.83928571 0.875 0.82142857 0.91071429]
|
|
|
|
mean value: 0.8827694235588972
|
|
|
|
key: train_accuracy
|
|
value: [0.8816568 0.8816568 0.8816568 0.87771203 0.88779528 0.88779528
|
|
0.88779528 0.88976378 0.89566929 0.88385827]
|
|
|
|
mean value: 0.8855359611113699
|
|
|
|
key: test_fscore
|
|
value: [0.94736842 0.9122807 0.93103448 0.86206897 0.86206897 0.87272727
|
|
0.84210526 0.87272727 0.82758621 0.90909091]
|
|
|
|
mean value: 0.8839058461200022
|
|
|
|
key: train_fscore
|
|
value: [0.8828125 0.8828125 0.8828125 0.87698413 0.88845401 0.88932039
|
|
0.88845401 0.89147287 0.89668616 0.88543689]
|
|
|
|
mean value: 0.8865245960082
|
|
|
|
key: test_precision
|
|
value: [0.93103448 0.89655172 0.93103448 0.86206897 0.83333333 0.88888889
|
|
0.82758621 0.88888889 0.8 0.92592593]
|
|
|
|
mean value: 0.8785312899106003
|
|
|
|
key: train_precision
|
|
value: [0.87596899 0.87596899 0.87258687 0.88047809 0.88326848 0.87739464
|
|
0.88326848 0.8778626 0.88803089 0.87356322]
|
|
|
|
mean value: 0.8788391247569809
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.92857143 0.93103448 0.86206897 0.89285714 0.85714286
|
|
0.85714286 0.85714286 0.85714286 0.89285714]
|
|
|
|
mean value: 0.8900246305418719
|
|
|
|
key: train_recall
|
|
value: [0.88976378 0.88976378 0.89328063 0.87351779 0.89370079 0.9015748
|
|
0.89370079 0.90551181 0.90551181 0.8976378 ]
|
|
|
|
mean value: 0.8943963773303041
|
|
|
|
key: test_roc_auc
|
|
value: [0.9476601 0.91256158 0.92980296 0.85960591 0.85714286 0.875
|
|
0.83928571 0.875 0.82142857 0.91071429]
|
|
|
|
mean value: 0.882820197044335
|
|
|
|
key: train_roc_auc
|
|
value: [0.88164078 0.88164078 0.88167969 0.87770378 0.88779528 0.88779528
|
|
0.88779528 0.88976378 0.89566929 0.88385827]
|
|
|
|
mean value: 0.8855342192897825
|
|
|
|
key: test_jcc
|
|
value: [0.9 0.83870968 0.87096774 0.75757576 0.75757576 0.77419355
|
|
0.72727273 0.77419355 0.70588235 0.83333333]
|
|
|
|
mean value: 0.7939704444827784
|
|
|
|
key: train_jcc
|
|
value: [0.79020979 0.79020979 0.79020979 0.78091873 0.79929577 0.8006993
|
|
0.79929577 0.8041958 0.81272085 0.79442509]
|
|
|
|
mean value: 0.7962180687899996
|
|
|
|
MCC on Blind test: 0.28

Accuracy on Blind test: 0.71

Model_name: XGBoost
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.07053018 0.055233 0.05712581 0.05738688 0.05596948 0.2296176
|
|
0.04953313 0.04858136 0.05947566 0.05467701]
|
|
|
|
mean value: 0.07381300926208496
|
|
|
|
key: score_time
|
|
value: [0.01033711 0.01042008 0.01019645 0.01021481 0.01025701 0.01056623
|
|
0.01241708 0.00986052 0.01005435 0.01003385]
|
|
|
|
mean value: 0.010435748100280761
|
|
|
|
key: test_mcc
|
|
value: [0.96547546 0.92980296 0.96547546 0.96547546 0.89342711 1.
|
|
0.96490128 0.89342711 0.96490128 0.92857143]
|
|
|
|
mean value: 0.9471457541234694
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.96491228 0.98245614 0.98245614 0.94642857 1.
|
|
0.98214286 0.94642857 0.98214286 0.96428571]
|
|
|
|
mean value: 0.9733709273182957
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98181818 0.96428571 0.98305085 0.98305085 0.94736842 1.
|
|
0.98245614 0.94736842 0.98181818 0.96428571]
|
|
|
|
mean value: 0.9735502469579187
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.96428571 0.96666667 0.96666667 0.93103448 1.
|
|
0.96551724 0.93103448 1. 0.96428571]
|
|
|
|
mean value: 0.9689490968801313
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 1. 1. 0.96428571 1.
|
|
1. 0.96428571 0.96428571 0.96428571]
|
|
|
|
mean value: 0.9785714285714285
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98214286 0.96490148 0.98214286 0.98214286 0.94642857 1.
|
|
0.98214286 0.94642857 0.98214286 0.96428571]
|
|
|
|
mean value: 0.9732758620689657
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96428571 0.93103448 0.96666667 0.96666667 0.9 1.
|
|
0.96551724 0.9 0.96428571 0.93103448]
|
|
|
|
mean value: 0.9489490968801314
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.08

Accuracy on Blind test: 0.36

Model_name: LDA
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.01379895 0.04094553 0.04091215 0.04156613 0.04213095 0.04107809
|
|
0.04116917 0.04115582 0.03444266 0.04156756]
|
|
|
|
mean value: 0.03787670135498047
|
|
|
|
key: score_time
|
|
value: [0.01027441 0.021981 0.01984763 0.02085829 0.02091908 0.02200484
|
|
0.01950645 0.01698136 0.02125549 0.01091409]
|
|
|
|
mean value: 0.018454265594482423
|
|
|
|
key: test_mcc
|
|
value: [0.85960591 0.8953202 0.85960591 0.82490815 0.75434227 0.78772636
|
|
0.75434227 0.71611487 0.68250015 0.82195294]
|
|
|
|
mean value: 0.7956419031963872
|
|
|
|
key: train_mcc
|
|
value: [0.86611359 0.85893744 0.84648438 0.84263794 0.86253233 0.85105352
|
|
0.83910959 0.85105352 0.85545187 0.83890131]
|
|
|
|
mean value: 0.8512275503641218
|
|
|
|
key: test_accuracy
|
|
value: [0.92982456 0.94736842 0.92982456 0.9122807 0.875 0.89285714
|
|
0.875 0.85714286 0.83928571 0.91071429]
|
|
|
|
mean value: 0.8969298245614035
|
|
|
|
key: train_accuracy
|
|
value: [0.93293886 0.92899408 0.92307692 0.92110454 0.93110236 0.92519685
|
|
0.91929134 0.92519685 0.92716535 0.91929134]
|
|
|
|
mean value: 0.925335849291028
|
|
|
|
key: test_fscore
|
|
value: [0.92857143 0.94736842 0.93103448 0.91525424 0.88135593 0.88888889
|
|
0.88135593 0.85185185 0.84745763 0.90909091]
|
|
|
|
mean value: 0.898222971102789
|
|
|
|
key: train_fscore
|
|
value: [0.93385214 0.93076923 0.92397661 0.92217899 0.93203883 0.92664093
|
|
0.92069632 0.92664093 0.92898273 0.92038835]
|
|
|
|
mean value: 0.9266165055588382
|
|
|
|
key: test_precision
|
|
value: [0.92857143 0.93103448 0.93103448 0.9 0.83870968 0.92307692
|
|
0.83870968 0.88461538 0.80645161 0.92592593]
|
|
|
|
mean value: 0.8908129595448839
|
|
|
|
key: train_precision
|
|
value: [0.92307692 0.90977444 0.91153846 0.90804598 0.91954023 0.90909091
|
|
0.90494297 0.90909091 0.90636704 0.90804598]
|
|
|
|
mean value: 0.9109513829773443
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.96428571 0.93103448 0.93103448 0.92857143 0.85714286
|
|
0.92857143 0.82142857 0.89285714 0.89285714]
|
|
|
|
mean value: 0.9076354679802956
|
|
|
|
key: train_recall
|
|
value: [0.94488189 0.95275591 0.93675889 0.93675889 0.94488189 0.94488189
|
|
0.93700787 0.94488189 0.95275591 0.93307087]
|
|
|
|
mean value: 0.9428635896797485
|
|
|
|
key: test_roc_auc
|
|
value: [0.92980296 0.9476601 0.92980296 0.91194581 0.875 0.89285714
|
|
0.875 0.85714286 0.83928571 0.91071429]
|
|
|
|
mean value: 0.8969211822660099
|
|
|
|
key: train_roc_auc
|
|
value: [0.93291525 0.92894712 0.92310386 0.92113535 0.93110236 0.92519685
|
|
0.91929134 0.92519685 0.92716535 0.91929134]
|
|
|
|
mean value: 0.9253345678628117
|
|
|
|
key: test_jcc
|
|
value: [0.86666667 0.9 0.87096774 0.84375 0.78787879 0.8
|
|
0.78787879 0.74193548 0.73529412 0.83333333]
|
|
|
|
mean value: 0.8167704919211086
|
|
|
|
key: train_jcc
|
|
value: [0.87591241 0.8705036 0.85869565 0.85559567 0.87272727 0.86330935
|
|
0.85304659 0.86330935 0.86738351 0.85251799]
|
|
|
|
mean value: 0.8633001396827011
|
|
|
|
MCC on Blind test: 0.25

Accuracy on Blind test: 0.7

Model_name: Multinomial
Model func: MultinomialNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02298999 0.00778127 0.00742817 0.00751019 0.00740218 0.00754929
|
|
0.00757813 0.00776744 0.00745249 0.00765824]
|
|
|
|
mean value: 0.009111738204956055
|
|
|
|
key: score_time
|
|
value: [0.00836825 0.00810742 0.00794792 0.00793934 0.00797534 0.0079298
|
|
0.00799894 0.00800514 0.00802684 0.00787163]
|
|
|
|
mean value: 0.00801706314086914
|
|
|
|
key: test_mcc
|
|
value: [0.8953202 0.82512315 0.85960591 0.78940887 0.71611487 0.75047877
|
|
0.67900461 0.75047877 0.64450339 0.82195294]
|
|
|
|
mean value: 0.7731991486299565
|
|
|
|
key: train_mcc
|
|
value: [0.76340037 0.76340037 0.76741581 0.77919572 0.78351922 0.77588525
|
|
0.78749923 0.78361641 0.79139378 0.77186893]
|
|
|
|
mean value: 0.776719508855672
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.9122807 0.92982456 0.89473684 0.85714286 0.875
|
|
0.83928571 0.875 0.82142857 0.91071429]
|
|
|
|
mean value: 0.8862781954887218
|
|
|
|
key: train_accuracy
|
|
value: [0.8816568 0.8816568 0.88362919 0.88954635 0.89173228 0.88779528
|
|
0.89370079 0.89173228 0.89566929 0.88582677]
|
|
|
|
mean value: 0.8882945844787152
|
|
|
|
key: test_fscore
|
|
value: [0.94736842 0.9122807 0.93103448 0.89655172 0.86206897 0.87272727
|
|
0.84210526 0.87272727 0.82758621 0.90909091]
|
|
|
|
mean value: 0.8873541219820712
|
|
|
|
key: train_fscore
|
|
value: [0.8828125 0.8828125 0.88454012 0.89019608 0.89236791 0.88932039
|
|
0.89453125 0.89278752 0.8962818 0.88715953]
|
|
|
|
mean value: 0.8892809598096044
|
|
|
|
key: test_precision
|
|
value: [0.93103448 0.89655172 0.93103448 0.89655172 0.83333333 0.88888889
|
|
0.82758621 0.88888889 0.8 0.92592593]
|
|
|
|
mean value: 0.8819795657726692
|
|
|
|
key: train_precision
|
|
value: [0.87596899 0.87596899 0.87596899 0.88326848 0.88715953 0.87739464
|
|
0.8875969 0.88416988 0.89105058 0.87692308]
|
|
|
|
mean value: 0.8815470072299069
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.92857143 0.93103448 0.89655172 0.89285714 0.85714286
|
|
0.85714286 0.85714286 0.85714286 0.89285714]
|
|
|
|
mean value: 0.8934729064039408
|
|
|
|
key: train_recall
|
|
value: [0.88976378 0.88976378 0.89328063 0.8972332 0.8976378 0.9015748
|
|
0.9015748 0.9015748 0.9015748 0.8976378 ]
|
|
|
|
mean value: 0.8971616196196819
|
|
|
|
key: test_roc_auc
|
|
value: [0.9476601 0.91256158 0.92980296 0.89470443 0.85714286 0.875
|
|
0.83928571 0.875 0.82142857 0.91071429]
|
|
|
|
mean value: 0.8863300492610838
|
|
|
|
key: train_roc_auc
|
|
value: [0.88164078 0.88164078 0.88364819 0.88956148 0.89173228 0.88779528
|
|
0.89370079 0.89173228 0.89566929 0.88582677]
|
|
|
|
mean value: 0.8882947931903769
|
|
|
|
key: test_jcc
|
|
value: [0.9 0.83870968 0.87096774 0.8125 0.75757576 0.77419355
|
|
0.72727273 0.77419355 0.70588235 0.83333333]
|
|
|
|
mean value: 0.7994628687252027
|
|
|
|
key: train_jcc
|
|
value: [0.79020979 0.79020979 0.79298246 0.80212014 0.80565371 0.8006993
|
|
0.80918728 0.80633803 0.81205674 0.7972028 ]
|
|
|
|
mean value: 0.8006660030961745
|
|
|
|
MCC on Blind test: 0.28

Accuracy on Blind test: 0.71

Model_name: Passive Aggresive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00975227 0.01262665 0.01330996 0.01325607 0.01203847 0.01280475
|
|
0.01180601 0.01305366 0.01356101 0.01355767]
|
|
|
|
mean value: 0.012576651573181153
|
|
|
|
key: score_time
|
|
value: [0.00793934 0.01009583 0.01011109 0.01049399 0.01059914 0.01051426
|
|
0.01052403 0.0104363 0.01050282 0.01055479]
|
|
|
|
mean value: 0.010177159309387207
|
|
|
|
key: test_mcc
|
|
value: [0.93202124 0.8953202 0.89952865 0.86189955 0.75047877 0.93094934
|
|
0.79385662 0.78571429 0.56573571 0.78571429]
|
|
|
|
mean value: 0.8201218647363345
|
|
|
|
key: train_mcc
|
|
value: [0.90138807 0.90933566 0.85396037 0.85053095 0.9021413 0.91064232
|
|
0.84093872 0.88232751 0.83427977 0.86279984]
|
|
|
|
mean value: 0.8748344497092373
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.94736842 0.94736842 0.92982456 0.875 0.96428571
|
|
0.89285714 0.89285714 0.76785714 0.89285714]
|
|
|
|
mean value: 0.9075187969924812
|
|
|
|
key: train_accuracy
|
|
value: [0.95069034 0.95463511 0.92504931 0.92504931 0.9507874 0.95472441
|
|
0.91929134 0.94094488 0.91338583 0.92913386]
|
|
|
|
mean value: 0.9363691779651804
|
|
|
|
key: test_fscore
|
|
value: [0.96296296 0.94736842 0.95081967 0.92857143 0.87272727 0.96296296
|
|
0.9 0.89285714 0.8 0.89285714]
|
|
|
|
mean value: 0.9111127006122692
|
|
|
|
key: train_fscore
|
|
value: [0.95069034 0.95445545 0.92830189 0.92607004 0.9498998 0.95353535
|
|
0.92220114 0.94186047 0.91881919 0.93258427]
|
|
|
|
mean value: 0.9378417921178791
|
|
|
|
key: test_precision
|
|
value: [1. 0.93103448 0.90625 0.96296296 0.88888889 1.
|
|
0.84375 0.89285714 0.7027027 0.89285714]
|
|
|
|
mean value: 0.9021303323027461
|
|
|
|
key: train_precision
|
|
value: [0.95256917 0.96015936 0.88808664 0.91187739 0.96734694 0.97925311
|
|
0.89010989 0.92748092 0.86458333 0.88928571]
|
|
|
|
mean value: 0.9230752474313746
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.96428571 1. 0.89655172 0.85714286 0.92857143
|
|
0.96428571 0.89285714 0.92857143 0.89285714]
|
|
|
|
mean value: 0.9253694581280788
|
|
|
|
key: train_recall
|
|
value: [0.9488189 0.9488189 0.97233202 0.94071146 0.93307087 0.92913386
|
|
0.95669291 0.95669291 0.98031496 0.98031496]
|
|
|
|
mean value: 0.9546901745977405
|
|
|
|
key: test_roc_auc
|
|
value: [0.96428571 0.9476601 0.94642857 0.93041872 0.875 0.96428571
|
|
0.89285714 0.89285714 0.76785714 0.89285714]
|
|
|
|
mean value: 0.9074507389162563
|
|
|
|
key: train_roc_auc
|
|
value: [0.95069403 0.9546466 0.92514239 0.92508014 0.9507874 0.95472441
|
|
0.91929134 0.94094488 0.91338583 0.92913386]
|
|
|
|
mean value: 0.9363830879835673
|
|
|
|
key: test_jcc
|
|
value: [0.92857143 0.9 0.90625 0.86666667 0.77419355 0.92857143
|
|
0.81818182 0.80645161 0.66666667 0.80645161]
|
|
|
|
mean value: 0.8402004782851558
|
|
|
|
key: train_jcc
|
|
value: [0.90601504 0.91287879 0.86619718 0.86231884 0.90458015 0.91119691
|
|
0.8556338 0.89010989 0.84982935 0.87368421]
|
|
|
|
mean value: 0.8832444168008685
|
|
|
|
MCC on Blind test: 0.16

Accuracy on Blind test: 0.45

Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01402736 0.01417279 0.01328135 0.01392245 0.01277757 0.01429009
|
|
0.01231146 0.01305246 0.01252604 0.01286674]
|
|
|
|
mean value: 0.013322830200195312
|
|
|
|
key: score_time
|
|
value: [0.0107367 0.0107584 0.01064038 0.01052928 0.01053858 0.01060414
|
|
0.01047802 0.01054215 0.01091051 0.01058197]
|
|
|
|
mean value: 0.010632014274597168
|
|
|
|
key: test_mcc
|
|
value: [0.93202124 0.83703659 0.82942474 0.82490815 0.64951905 0.93094934
|
|
0.82195294 0.85714286 0.59628479 0.92857143]
|
|
|
|
mean value: 0.8207811130466791
|
|
|
|
key: train_mcc
|
|
value: [0.89234379 0.81176962 0.87340231 0.8905544 0.89426234 0.88323242
|
|
0.90174953 0.91732994 0.87948771 0.89075842]
|
|
|
|
mean value: 0.8834890481349534
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.9122807 0.9122807 0.9122807 0.82142857 0.96428571
|
|
0.91071429 0.92857143 0.78571429 0.96428571]
|
|
|
|
mean value: 0.9076754385964912
|
|
|
|
key: train_accuracy
|
|
value: [0.94477318 0.89940828 0.93491124 0.94477318 0.94685039 0.94094488
|
|
0.9507874 0.95866142 0.93897638 0.94488189]
|
|
|
|
mean value: 0.9404968239916756
|
|
|
|
key: test_fscore
|
|
value: [0.96296296 0.90196078 0.90909091 0.91525424 0.83333333 0.96551724
|
|
0.9122807 0.92857143 0.8125 0.96428571]
|
|
|
|
mean value: 0.9105757312979906
|
|
|
|
key: train_fscore
|
|
value: [0.94262295 0.88984881 0.93167702 0.94594595 0.94777563 0.94252874
|
|
0.95126706 0.95874263 0.94072658 0.94615385]
|
|
|
|
mean value: 0.9397289204487953
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.96153846 0.9 0.78125 0.93333333
|
|
0.89655172 0.92857143 0.72222222 0.96428571]
|
|
|
|
mean value: 0.9087752884089091
|
|
|
|
key: train_precision
|
|
value: [0.98290598 0.98564593 0.97826087 0.9245283 0.93155894 0.91791045
|
|
0.94208494 0.95686275 0.91449814 0.92481203]
|
|
|
|
mean value: 0.9459068329016868
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.82142857 0.86206897 0.93103448 0.89285714 1.
|
|
0.92857143 0.92857143 0.92857143 0.96428571]
|
|
|
|
mean value: 0.9185960591133004
|
|
|
|
key: train_recall
|
|
value: [0.90551181 0.81102362 0.88932806 0.96837945 0.96456693 0.96850394
|
|
0.96062992 0.96062992 0.96850394 0.96850394]
|
|
|
|
mean value: 0.9365581525629454
|
|
|
|
key: test_roc_auc
|
|
value: [0.96428571 0.91071429 0.91317734 0.91194581 0.82142857 0.96428571
|
|
0.91071429 0.92857143 0.78571429 0.96428571]
|
|
|
|
mean value: 0.907512315270936
|
|
|
|
key: train_roc_auc
|
|
value: [0.94485077 0.89958296 0.93482151 0.94481964 0.94685039 0.94094488
|
|
0.9507874 0.95866142 0.93897638 0.94488189]
|
|
|
|
mean value: 0.940517724316081
|
|
|
|
key: test_jcc
|
|
value: [0.92857143 0.82142857 0.83333333 0.84375 0.71428571 0.93333333
|
|
0.83870968 0.86666667 0.68421053 0.93103448]
|
|
|
|
mean value: 0.8395323734112813
|
|
|
|
key: train_jcc
|
|
value: [0.89147287 0.80155642 0.87209302 0.8974359 0.90073529 0.89130435
|
|
0.9070632 0.92075472 0.88808664 0.89781022]
|
|
|
|
mean value: 0.8868312626670497
|
|
|
|
MCC on Blind test: 0.23

Accuracy on Blind test: 0.67

Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.10967422 0.09465766 0.0946455 0.09465408 0.09431386 0.09605312
|
|
0.09578514 0.09517407 0.09609222 0.09636235]
|
|
|
|
mean value: 0.09674122333526611
|
|
|
|
key: score_time
|
|
value: [0.01450157 0.01411438 0.01444912 0.0144279 0.0141592 0.01532269
|
|
0.01432991 0.01458526 0.01432729 0.01431394]
|
|
|
|
mean value: 0.014453125
|
|
|
|
key: test_mcc
|
|
value: [0.93202124 0.92980296 0.8953202 0.93202124 0.82618439 0.96490128
|
|
0.96490128 0.89342711 0.96490128 0.89342711]
|
|
|
|
mean value: 0.919690809367707
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.96491228 0.94736842 0.96491228 0.91071429 0.98214286
|
|
0.98214286 0.94642857 0.98214286 0.94642857]
|
|
|
|
mean value: 0.9592105263157894
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.96296296 0.96428571 0.94736842 0.96666667 0.91525424 0.98181818
|
|
0.98245614 0.94736842 0.98181818 0.94545455]
|
|
|
|
mean value: 0.9595453472750529
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.96428571 0.96428571 0.93548387 0.87096774 1.
|
|
0.96551724 0.93103448 1. 0.96296296]
|
|
|
|
mean value: 0.9594537728575548
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.96428571 0.93103448 1. 0.96428571 0.96428571
|
|
1. 0.96428571 0.96428571 0.92857143]
|
|
|
|
mean value: 0.9609605911330049
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96428571 0.96490148 0.9476601 0.96428571 0.91071429 0.98214286
|
|
0.98214286 0.94642857 0.98214286 0.94642857]
|
|
|
|
mean value: 0.9591133004926109
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.92857143 0.93103448 0.9 0.93548387 0.84375 0.96428571
|
|
0.96551724 0.9 0.96428571 0.89655172]
|
|
|
|
mean value: 0.9229480176386461
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.14
|
|
|
|
Accuracy on Blind test: 0.43
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03699589 0.04877973 0.05854607 0.05341744 0.04653335 0.03373337
|
|
0.049371 0.03414083 0.04613113 0.05712962]
|
|
|
|
mean value: 0.04647784233093262
|
|
|
|
key: score_time
|
|
value: [0.02744269 0.02593279 0.03697324 0.0348525 0.0197053 0.0282557
|
|
0.0171566 0.02019095 0.02783036 0.03664637]
|
|
|
|
mean value: 0.027498650550842284
|
|
|
|
key: test_mcc
|
|
value: [0.96547546 0.92980296 0.8953202 0.93202124 0.82618439 1.
|
|
0.96490128 0.89342711 0.93094934 0.92857143]
|
|
|
|
mean value: 0.9266653398520664
|
|
|
|
key: train_mcc
|
|
value: [0.99214142 0.99211042 0.99214118 1. 0.99212598 0.98428248
|
|
0.98825791 1. 0.99212598 0.98819663]
|
|
|
|
mean value: 0.9921382021238081
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.96491228 0.94736842 0.96491228 0.91071429 1.
|
|
0.98214286 0.94642857 0.96428571 0.96428571]
|
|
|
|
mean value: 0.9627506265664161
|
|
|
|
key: train_accuracy
|
|
value: [0.99605523 0.99605523 0.99605523 1. 0.99606299 0.99212598
|
|
0.99409449 1. 0.99606299 0.99409449]
|
|
|
|
mean value: 0.9960606625355263
|
|
|
|
key: test_fscore
|
|
value: [0.98181818 0.96428571 0.94736842 0.96666667 0.91525424 1.
|
|
0.98245614 0.94736842 0.96296296 0.96428571]
|
|
|
|
mean value: 0.9632466459763516
|
|
|
|
key: train_fscore
|
|
value: [0.99604743 0.99606299 0.99603175 1. 0.99606299 0.99209486
|
|
0.99405941 1. 0.99606299 0.99408284]
|
|
|
|
mean value: 0.9960505261077098
|
|
|
|
key: test_precision
|
|
value: [1. 0.96428571 0.96428571 0.93548387 0.87096774 1.
|
|
0.96551724 0.93103448 1. 0.96428571]
|
|
|
|
mean value: 0.9595860479898299
|
|
|
|
key: train_precision
|
|
value: [1. 0.99606299 1. 1. 0.99606299 0.99603175
|
|
1. 1. 0.99606299 0.99604743]
|
|
|
|
mean value: 0.9980268153239739
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.93103448 1. 0.96428571 1.
|
|
1. 0.96428571 0.92857143 0.96428571]
|
|
|
|
mean value: 0.968103448275862
|
|
|
|
key: train_recall
|
|
value: [0.99212598 0.99606299 0.99209486 1. 0.99606299 0.98818898
|
|
0.98818898 1. 0.99606299 0.99212598]
|
|
|
|
mean value: 0.9940913759297875
|
|
|
|
key: test_roc_auc
|
|
value: [0.98214286 0.96490148 0.9476601 0.96428571 0.91071429 1.
|
|
0.98214286 0.94642857 0.96428571 0.96428571]
|
|
|
|
mean value: 0.9626847290640395
|
|
|
|
key: train_roc_auc
|
|
value: [0.99606299 0.99605521 0.99604743 1. 0.99606299 0.99212598
|
|
0.99409449 1. 0.99606299 0.99409449]
|
|
|
|
mean value: 0.9960606579315926
|
|
|
|
key: test_jcc
|
|
value: [0.96428571 0.93103448 0.9 0.93548387 0.84375 1.
|
|
0.96551724 0.9 0.92857143 0.93103448]
|
|
|
|
mean value: 0.9299677220721436
|
|
|
|
key: train_jcc
|
|
value: [0.99212598 0.99215686 0.99209486 1. 0.99215686 0.98431373
|
|
0.98818898 1. 0.99215686 0.98823529]
|
|
|
|
mean value: 0.9921429430133137
|
|
|
|
MCC on Blind test: 0.15
|
|
|
|
Accuracy on Blind test: 0.38
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.15483451 0.16543531 0.17816734 0.18977332 0.14207339 0.17039871
|
|
0.14135075 0.18826556 0.17411089 0.15683222]
|
|
|
|
mean value: 0.16612420082092286
|
|
|
|
key: score_time
|
|
value: [0.01990747 0.02101636 0.02146673 0.02004623 0.02186728 0.02010036
|
|
0.01258993 0.02006197 0.02341485 0.02047729]
|
|
|
|
mean value: 0.020094847679138182
|
|
|
|
key: test_mcc
|
|
value: [0.8953202 0.86189955 0.82512315 0.79110556 0.75047877 0.75047877
|
|
0.68250015 0.75047877 0.64951905 0.85714286]
|
|
|
|
mean value: 0.7814046839086336
|
|
|
|
key: train_mcc
|
|
value: [0.85051239 0.85019923 0.84231823 0.8428767 0.85465533 0.84293789
|
|
0.83890131 0.85513299 0.87062545 0.84677832]
|
|
|
|
mean value: 0.8494937826202889
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.92982456 0.9122807 0.89473684 0.875 0.875
|
|
0.83928571 0.875 0.82142857 0.92857143]
|
|
|
|
mean value: 0.8898496240601503
|
|
|
|
key: train_accuracy
|
|
value: [0.92504931 0.92504931 0.92110454 0.92110454 0.92716535 0.92125984
|
|
0.91929134 0.92716535 0.93503937 0.92322835]
|
|
|
|
mean value: 0.9245457298606905
|
|
|
|
key: test_fscore
|
|
value: [0.94736842 0.93103448 0.9122807 0.9 0.87719298 0.87272727
|
|
0.84745763 0.87272727 0.83333333 0.92857143]
|
|
|
|
mean value: 0.892269352249973
|
|
|
|
key: train_fscore
|
|
value: [0.92635659 0.92578125 0.92156863 0.92248062 0.92815534 0.92248062
|
|
0.92038835 0.92870906 0.93617021 0.92427184]
|
|
|
|
mean value: 0.925636250953157
|
|
|
|
key: test_precision
|
|
value: [0.93103448 0.9 0.92857143 0.87096774 0.86206897 0.88888889
|
|
0.80645161 0.88888889 0.78125 0.92857143]
|
|
|
|
mean value: 0.8786693438035207
|
|
|
|
key: train_precision
|
|
value: [0.91221374 0.91860465 0.91439689 0.90494297 0.91570881 0.90839695
|
|
0.90804598 0.90943396 0.92015209 0.91187739]
|
|
|
|
mean value: 0.9123773428551641
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.89655172 0.93103448 0.89285714 0.85714286
|
|
0.89285714 0.85714286 0.89285714 0.92857143]
|
|
|
|
mean value: 0.9077586206896552
|
|
|
|
key: train_recall
|
|
value: [0.94094488 0.93307087 0.92885375 0.94071146 0.94094488 0.93700787
|
|
0.93307087 0.9488189 0.95275591 0.93700787]
|
|
|
|
mean value: 0.9393187264635399
|
|
|
|
key: test_roc_auc
|
|
value: [0.9476601 0.93041872 0.91256158 0.89408867 0.875 0.875
|
|
0.83928571 0.875 0.82142857 0.92857143]
|
|
|
|
mean value: 0.8899014778325124
|
|
|
|
key: train_roc_auc
|
|
value: [0.9250179 0.92503346 0.92111979 0.92114313 0.92716535 0.92125984
|
|
0.91929134 0.92716535 0.93503937 0.92322835]
|
|
|
|
mean value: 0.9245463882232112
|
|
|
|
key: test_jcc
|
|
value: [0.9 0.87096774 0.83870968 0.81818182 0.78125 0.77419355
|
|
0.73529412 0.77419355 0.71428571 0.86666667]
|
|
|
|
mean value: 0.807374283291029
|
|
|
|
key: train_jcc
|
|
value: [0.86281588 0.86181818 0.85454545 0.85611511 0.86594203 0.85611511
|
|
0.85251799 0.86690647 0.88 0.85920578]
|
|
|
|
mean value: 0.8615982002257956
|
|
|
|
MCC on Blind test: 0.29
|
|
|
|
Accuracy on Blind test: 0.72
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.25657248 0.24637365 0.24630475 0.24570489 0.24687362 0.24694991
|
|
0.24950528 0.24740005 0.24689674 0.24667573]
|
|
|
|
mean value: 0.2479257106781006
|
|
|
|
key: score_time
|
|
value: [0.00848842 0.00830841 0.00831699 0.00836349 0.00849056 0.00834179
|
|
0.00837541 0.00853562 0.00849915 0.00830841]
|
|
|
|
mean value: 0.008402824401855469
|
|
|
|
key: test_mcc
|
|
value: [0.96547546 0.92980296 0.92980296 0.93202124 0.82195294 1.
|
|
0.96490128 0.89342711 0.96490128 0.92857143]
|
|
|
|
mean value: 0.9330856656296584
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.96491228 0.96491228 0.96491228 0.91071429 1.
|
|
0.98214286 0.94642857 0.98214286 0.96428571]
|
|
|
|
mean value: 0.9662907268170426
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98181818 0.96428571 0.96551724 0.96666667 0.9122807 1.
|
|
0.98245614 0.94736842 0.98181818 0.96428571]
|
|
|
|
mean value: 0.9666496963411664
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.96428571 0.96551724 0.93548387 0.89655172 1.
|
|
0.96551724 0.93103448 1. 0.96428571]
|
|
|
|
mean value: 0.9622675989194343
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.96551724 1. 0.92857143 1.
|
|
1. 0.96428571 0.96428571 0.96428571]
|
|
|
|
mean value: 0.971551724137931
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98214286 0.96490148 0.96490148 0.96428571 0.91071429 1.
|
|
0.98214286 0.94642857 0.98214286 0.96428571]
|
|
|
|
mean value: 0.9661945812807883
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96428571 0.93103448 0.93333333 0.93548387 0.83870968 1.
|
|
0.96551724 0.9 0.96428571 0.93103448]
|
|
|
|
mean value: 0.9363684517188411
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.1
|
|
|
|
Accuracy on Blind test: 0.3
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.01195836 0.01368475 0.01422262 0.01405859 0.01395178 0.01564765
|
|
0.01407719 0.01829219 0.02588701 0.01483369]
|
|
|
|
mean value: 0.0156613826751709
|
|
|
|
key: score_time
|
|
value: [0.01075363 0.0107801 0.01071596 0.01084328 0.01077461 0.01080632
|
|
0.01084328 0.01160669 0.01160884 0.01082087]
|
|
|
|
mean value: 0.010955357551574707
|
|
|
|
key: test_mcc
|
|
value: [0.58069726 0.65466436 0.5920535 0.56277738 0.30588765 0.43876345
|
|
0.77459667 0.64116714 0.57735027 0.55339859]
|
|
|
|
mean value: 0.5681356266624504
|
|
|
|
key: train_mcc
|
|
value: [0.6451496 0.68602482 0.64393328 0.68142563 0.57742076 0.7295157
|
|
0.62763342 0.69688549 0.64324077 0.65891447]
|
|
|
|
mean value: 0.6590143937690011
|
|
|
|
key: test_accuracy
|
|
value: [0.75438596 0.8245614 0.77192982 0.77192982 0.64285714 0.71428571
|
|
0.875 0.80357143 0.75 0.75 ]
|
|
|
|
mean value: 0.7658521303258146
|
|
|
|
key: train_accuracy
|
|
value: [0.79684418 0.82840237 0.79487179 0.82248521 0.7519685 0.86220472
|
|
0.78740157 0.83070866 0.79724409 0.80708661]
|
|
|
|
mean value: 0.8079217723524205
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.80769231 0.72340426 0.74509804 0.56521739 0.68
|
|
0.85714286 0.76595745 0.66666667 0.68181818]
|
|
|
|
mean value: 0.7159663812634374
|
|
|
|
key: train_fscore
|
|
value: [0.74816626 0.8 0.74257426 0.78773585 0.671875 0.85355649
|
|
0.73399015 0.79906542 0.74939173 0.76442308]
|
|
|
|
mean value: 0.7650778223767692
|
|
|
|
key: test_precision
|
|
value: [1. 0.875 0.94444444 0.86363636 0.72222222 0.77272727
|
|
1. 0.94736842 1. 0.9375 ]
|
|
|
|
mean value: 0.9062898724082935
|
|
|
|
key: train_precision
|
|
value: [0.98709677 0.96132597 0.99337748 0.97660819 0.99230769 0.91071429
|
|
0.98026316 0.98275862 0.98089172 0.98148148]
|
|
|
|
mean value: 0.9746825369455663
|
|
|
|
key: test_recall
|
|
value: [0.5 0.75 0.5862069 0.65517241 0.46428571 0.60714286
|
|
0.75 0.64285714 0.5 0.53571429]
|
|
|
|
mean value: 0.5991379310344828
|
|
|
|
key: train_recall
|
|
value: [0.6023622 0.68503937 0.59288538 0.66007905 0.50787402 0.80314961
|
|
0.58661417 0.67322835 0.60629921 0.62598425]
|
|
|
|
mean value: 0.6343515607979833
|
|
|
|
key: test_roc_auc
|
|
value: [0.75 0.82327586 0.77524631 0.77401478 0.64285714 0.71428571
|
|
0.875 0.80357143 0.75 0.75 ]
|
|
|
|
mean value: 0.7658251231527093
|
|
|
|
key: train_roc_auc
|
|
value: [0.79722853 0.82868569 0.79447418 0.82216551 0.7519685 0.86220472
|
|
0.78740157 0.83070866 0.79724409 0.80708661]
|
|
|
|
mean value: 0.8079168093118795
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.67741935 0.56666667 0.59375 0.39393939 0.51515152
|
|
0.75 0.62068966 0.5 0.51724138]
|
|
|
|
mean value: 0.5634857965079044
|
|
|
|
key: train_jcc
|
|
value: [0.59765625 0.66666667 0.59055118 0.64980545 0.50588235 0.74452555
|
|
0.57976654 0.66536965 0.59922179 0.61867704]
|
|
|
|
mean value: 0.621812246508153
|
|
|
|
MCC on Blind test: 0.37
|
|
|
|
Accuracy on Blind test: 0.82
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02161765 0.04498935 0.02991724 0.02177143 0.01122379 0.01113892
|
|
0.01118159 0.02967238 0.01116443 0.01118255]
|
|
|
|
mean value: 0.020385932922363282
|
|
|
|
key: score_time
|
|
value: [0.01993227 0.02460504 0.02005219 0.01057243 0.0105114 0.01053619
|
|
0.01052952 0.01053238 0.010499 0.0106318 ]
|
|
|
|
mean value: 0.013840222358703613
|
|
|
|
key: test_mcc
|
|
value: [0.8953202 0.8953202 0.85960591 0.79110556 0.71611487 0.82195294
|
|
0.71611487 0.71611487 0.68250015 0.82195294]
|
|
|
|
mean value: 0.7916102525004516
|
|
|
|
key: train_mcc
|
|
value: [0.81126698 0.82324487 0.81877755 0.81895888 0.82769588 0.81142619
|
|
0.81142619 0.82718204 0.83529327 0.8154727 ]
|
|
|
|
mean value: 0.8200744570697928
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.94736842 0.92982456 0.89473684 0.85714286 0.91071429
|
|
0.85714286 0.85714286 0.83928571 0.91071429]
|
|
|
|
mean value: 0.8951441102756892
|
|
|
|
key: train_accuracy
|
|
value: [0.90532544 0.9112426 0.90927022 0.90927022 0.91338583 0.90551181
|
|
0.90551181 0.91338583 0.91732283 0.90748031]
|
|
|
|
mean value: 0.9097706906459178
|
|
|
|
key: test_fscore
|
|
value: [0.94736842 0.94736842 0.93103448 0.9 0.86206897 0.90909091
|
|
0.86206897 0.85185185 0.84745763 0.90909091]
|
|
|
|
mean value: 0.8967400553050681
|
|
|
|
key: train_fscore
|
|
value: [0.90733591 0.9132948 0.91015625 0.91050584 0.91538462 0.90697674
|
|
0.90697674 0.91472868 0.91891892 0.90909091]
|
|
|
|
mean value: 0.9113369405536723
|
|
|
|
key: test_precision
|
|
value: [0.93103448 0.93103448 0.93103448 0.87096774 0.83333333 0.92592593
|
|
0.83333333 0.88461538 0.80645161 0.92592593]
|
|
|
|
mean value: 0.8873656706248475
|
|
|
|
key: train_precision
|
|
value: [0.89015152 0.89433962 0.8996139 0.89655172 0.89473684 0.89312977
|
|
0.89312977 0.90076336 0.90151515 0.89353612]
|
|
|
|
mean value: 0.8957467777601632
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.93103448 0.93103448 0.89285714 0.89285714
|
|
0.89285714 0.82142857 0.89285714 0.89285714]
|
|
|
|
mean value: 0.9076354679802956
|
|
|
|
key: train_recall
|
|
value: [0.92519685 0.93307087 0.92094862 0.92490119 0.93700787 0.92125984
|
|
0.92125984 0.92913386 0.93700787 0.92519685]
|
|
|
|
mean value: 0.9274983660639258
|
|
|
|
key: test_roc_auc
|
|
value: [0.9476601 0.9476601 0.92980296 0.89408867 0.85714286 0.91071429
|
|
0.85714286 0.85714286 0.83928571 0.91071429]
|
|
|
|
mean value: 0.8951354679802956
|
|
|
|
key: train_roc_auc
|
|
value: [0.90528617 0.91119946 0.90929321 0.90930099 0.91338583 0.90551181
|
|
0.90551181 0.91338583 0.91732283 0.90748031]
|
|
|
|
mean value: 0.9097678254645047
|
|
|
|
key: test_jcc
|
|
value: [0.9 0.9 0.87096774 0.81818182 0.75757576 0.83333333
|
|
0.75757576 0.74193548 0.73529412 0.83333333]
|
|
|
|
mean value: 0.814819734345351
|
|
|
|
key: train_jcc
|
|
value: [0.83038869 0.84042553 0.83512545 0.83571429 0.84397163 0.82978723
|
|
0.82978723 0.84285714 0.85 0.83333333]
|
|
|
|
mean value: 0.8371390533718615
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: /home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_config.py:143: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
smnc_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_config.py:146: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
smnc_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.18555856 0.20449281 0.19151855 0.191679 0.19248724 0.19242501
|
|
0.20449567 0.27775383 0.19228506 0.1919651 ]
|
|
|
|
mean value: 0.20246608257293702
|
|
|
|
key: score_time
|
|
value: [0.02051473 0.01998162 0.02048826 0.02080917 0.02009439 0.01971388
|
|
0.0109446 0.02007937 0.01075292 0.01076293]
|
|
|
|
mean value: 0.017414188385009764
|
|
|
|
key: test_mcc
|
|
value: [0.85960591 0.8953202 0.85960591 0.82490815 0.75434227 0.82195294
|
|
0.71611487 0.71611487 0.68250015 0.82195294]
|
|
|
|
mean value: 0.7952418219423117
|
|
|
|
key: train_mcc
|
|
value: [0.86225372 0.8551535 0.84648438 0.83474492 0.86253233 0.8431734
|
|
0.81142619 0.85105352 0.83529327 0.8154727 ]
|
|
|
|
mean value: 0.8417587938288557
|
|
|
|
key: test_accuracy
|
|
value: [0.92982456 0.94736842 0.92982456 0.9122807 0.875 0.91071429
|
|
0.85714286 0.85714286 0.83928571 0.91071429]
|
|
|
|
mean value: 0.8969298245614035
|
|
|
|
key: train_accuracy
|
|
value: [0.93096647 0.9270217 0.92307692 0.91715976 0.93110236 0.92125984
|
|
0.90551181 0.92519685 0.91732283 0.90748031]
|
|
|
|
mean value: 0.9206098867819037
|
|
|
|
key: test_fscore
|
|
value: [0.92857143 0.94736842 0.93103448 0.91525424 0.88135593 0.90909091
|
|
0.86206897 0.85185185 0.84745763 0.90909091]
|
|
|
|
mean value: 0.8983144764543761
|
|
|
|
key: train_fscore
|
|
value: [0.93203883 0.92898273 0.92397661 0.91828794 0.93203883 0.92277992
|
|
0.90697674 0.92664093 0.91891892 0.90909091]
|
|
|
|
mean value: 0.9219732362977793
|
|
|
|
key: test_precision
|
|
value: [0.92857143 0.93103448 0.93103448 0.9 0.83870968 0.92592593
|
|
0.83333333 0.88461538 0.80645161 0.92592593]
|
|
|
|
mean value: 0.8905602254211821
|
|
|
|
key: train_precision
|
|
value: [0.91954023 0.90636704 0.91153846 0.90421456 0.91954023 0.90530303
|
|
0.89312977 0.90909091 0.90151515 0.89353612]
|
|
|
|
mean value: 0.9063775505468512
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.96428571 0.93103448 0.93103448 0.92857143 0.89285714
|
|
0.89285714 0.82142857 0.89285714 0.89285714]
|
|
|
|
mean value: 0.9076354679802956
|
|
|
|
key: train_recall
|
|
value: [0.94488189 0.95275591 0.93675889 0.93280632 0.94488189 0.94094488
|
|
0.92125984 0.94488189 0.93700787 0.92519685]
|
|
|
|
mean value: 0.9381376241013352
|
|
|
|
key: test_roc_auc
|
|
value: [0.92980296 0.9476601 0.92980296 0.91194581 0.875 0.91071429
|
|
0.85714286 0.85714286 0.83928571 0.91071429]
|
|
|
|
mean value: 0.8969211822660099
|
|
|
|
key: train_roc_auc
|
|
value: [0.93093897 0.92697084 0.92310386 0.91719056 0.93110236 0.92125984
|
|
0.90551181 0.92519685 0.91732283 0.90748031]
|
|
|
|
mean value: 0.920607824219601
|
|
|
|
key: test_jcc
|
|
value: [0.86666667 0.9 0.87096774 0.84375 0.78787879 0.83333333
|
|
0.75757576 0.74193548 0.73529412 0.83333333]
|
|
|
|
mean value: 0.817073522224139
|
|
|
|
key: train_jcc
|
|
value: [0.87272727 0.86738351 0.85869565 0.84892086 0.87272727 0.85663082
|
|
0.82978723 0.86330935 0.85 0.83333333]
|
|
|
|
mean value: 0.8553515317749246
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.7
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03999615 0.02583027 0.02410555 0.02264023 0.02619028 0.02323699
|
|
0.02574015 0.02130461 0.0231998 0.02364349]
|
|
|
|
mean value: 0.0255887508392334
|
|
|
|
key: score_time
|
|
value: [0.01082921 0.01091146 0.01050711 0.01049757 0.01048684 0.01047969
|
|
0.01068377 0.01047754 0.01049089 0.01047397]
|
|
|
|
mean value: 0.010583806037902831
|
|
|
|
key: test_mcc
|
|
value: [0.8953202 0.8953202 0.82512315 0.82490815 0.71611487 0.89342711
|
|
0.71611487 0.75047877 0.68250015 0.85933785]
|
|
|
|
mean value: 0.8058645326851578
|
|
|
|
key: train_mcc
|
|
value: [0.83454496 0.83472439 0.83070006 0.83456039 0.85486752 0.81527029
|
|
0.83505996 0.83076661 0.8355787 0.81511857]
|
|
|
|
mean value: 0.8321191457866195
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.94736842 0.9122807 0.9122807 0.85714286 0.94642857
|
|
0.85714286 0.875 0.83928571 0.92857143]
|
|
|
|
mean value: 0.9022869674185463
|
|
|
|
key: train_accuracy
|
|
value: [0.91715976 0.91715976 0.91518738 0.91715976 0.92716535 0.90748031
|
|
0.91732283 0.91535433 0.91732283 0.90748031]
|
|
|
|
mean value: 0.9158792650918636
|
|
|
|
key: test_fscore
|
|
value: [0.94736842 0.94736842 0.9122807 0.91525424 0.86206897 0.94545455
|
|
0.86206897 0.87272727 0.84745763 0.92592593]
|
|
|
|
mean value: 0.9037975083408656
|
|
|
|
key: train_fscore
|
|
value: [0.91828794 0.91860465 0.91617934 0.91796875 0.92843327 0.90873786
|
|
0.91860465 0.91585127 0.91923077 0.90838207]
|
|
|
|
mean value: 0.9170280567760439
|
|
|
|
key: test_precision
|
|
value: [0.93103448 0.93103448 0.92857143 0.9 0.83333333 0.96296296
|
|
0.83333333 0.88888889 0.80645161 0.96153846]
|
|
|
|
mean value: 0.8977148987048875
|
|
|
|
key: train_precision
|
|
value: [0.90769231 0.90458015 0.90384615 0.90733591 0.91254753 0.89655172
|
|
0.90458015 0.91050584 0.89849624 0.8996139 ]
|
|
|
|
mean value: 0.9045749903664201
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.89655172 0.93103448 0.89285714 0.92857143
|
|
0.89285714 0.85714286 0.89285714 0.89285714]
|
|
|
|
mean value: 0.9113300492610837
|
|
|
|
key: train_recall
|
|
value: [0.92913386 0.93307087 0.92885375 0.92885375 0.94488189 0.92125984
|
|
0.93307087 0.92125984 0.94094488 0.91732283]
|
|
|
|
mean value: 0.9298652391771187
|
|
|
|
key: test_roc_auc
|
|
value: [0.9476601 0.9476601 0.91256158 0.91194581 0.85714286 0.94642857
|
|
0.85714286 0.875 0.83928571 0.92857143]
|
|
|
|
mean value: 0.9023399014778325
|
|
|
|
key: train_roc_auc
|
|
value: [0.9171361 0.91712832 0.91521428 0.91718278 0.92716535 0.90748031
|
|
0.91732283 0.91535433 0.91732283 0.90748031]
|
|
|
|
mean value: 0.9158787463819987
|
|
|
|
key: test_jcc
|
|
value: [0.9 0.9 0.83870968 0.84375 0.75757576 0.89655172
|
|
0.75757576 0.77419355 0.73529412 0.86206897]
|
|
|
|
mean value: 0.8265719548260198
|
|
|
|
key: train_jcc
|
|
value: [0.84892086 0.84946237 0.84532374 0.84837545 0.86642599 0.83274021
|
|
0.84946237 0.84476534 0.85053381 0.83214286]
|
|
|
|
mean value: 0.8468153000998123
|
|
|
|
MCC on Blind test: 0.28
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
'provean_score', 'maf', 'logorI', 'lineage_proportion',
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegressionCV(random_state=42))])

key: fit_time
value: [0.74657226 0.72350025 0.66616726 0.69195914 0.85434008 0.67523313
 0.67365026 0.74283338 0.70560384 0.68593574]

mean value: 0.716579532623291

key: score_time
value: [0.01196027 0.01932144 0.020437 0.01222825 0.01223254 0.01219296
 0.01108098 0.01215911 0.01234174 0.01253176]

mean value: 0.013648605346679688

key: test_mcc
value: [0.93202124 0.92980296 0.92980296 0.85960591 0.78772636 1.
 0.85933785 0.85714286 0.78772636 0.85714286]

mean value: 0.8800309350106305

key: train_mcc
value: [0.93691352 0.93691352 0.94480151 0.93691156 0.93703692 0.93703692
 0.92913386 0.9332517 0.92520402 0.9330781 ]

mean value: 0.9350281642225636

key: test_accuracy
value: [0.96491228 0.96491228 0.96491228 0.92982456 0.89285714 1.
 0.92857143 0.92857143 0.89285714 0.92857143]

mean value: 0.9395989974937343

key: train_accuracy
value: [0.96844181 0.96844181 0.97238659 0.96844181 0.96850394 0.96850394
 0.96456693 0.96653543 0.96259843 0.96653543]

mean value: 0.9674956126046373

key: test_fscore
value: [0.96296296 0.96428571 0.96551724 0.93103448 0.89655172 1.
 0.93103448 0.92857143 0.89655172 0.92857143]

mean value: 0.9405081189563949

key: train_fscore
value: [0.96837945 0.96837945 0.97222222 0.96825397 0.96837945 0.96837945
 0.96456693 0.96620278 0.96267191 0.96646943]

mean value: 0.9673905023176848

key: test_precision
value: [1. 0.96428571 0.96551724 0.93103448 0.86666667 1.
 0.9 0.92857143 0.86666667 0.92857143]

mean value: 0.9351313628899836

key: train_precision
value: [0.97222222 0.97222222 0.97609562 0.97211155 0.97222222 0.97222222
 0.96456693 0.97590361 0.96078431 0.96837945]

mean value: 0.9706730364161126

key: test_recall
value: [0.92857143 0.96428571 0.96551724 0.93103448 0.92857143 1.
 0.96428571 0.92857143 0.92857143 0.92857143]

mean value: 0.9467980295566503

key: train_recall
value: [0.96456693 0.96456693 0.96837945 0.96442688 0.96456693 0.96456693
 0.96456693 0.95669291 0.96456693 0.96456693]

mean value: 0.9641467741433506

key: test_roc_auc
value: [0.96428571 0.96490148 0.96490148 0.92980296 0.89285714 1.
 0.92857143 0.92857143 0.89285714 0.92857143]

mean value: 0.9395320197044336

key: train_roc_auc
value: [0.96844947 0.96844947 0.9723787 0.96843391 0.96850394 0.96850394
 0.96456693 0.96653543 0.96259843 0.96653543]

mean value: 0.9674955650306558

key: test_jcc
value: [0.92857143 0.93103448 0.93333333 0.87096774 0.8125 1.
 0.87096774 0.86666667 0.8125 0.86666667]

mean value: 0.8893208061867683

key: train_jcc
value: [0.93869732 0.93869732 0.94594595 0.93846154 0.93869732 0.93869732
 0.93155894 0.93461538 0.9280303 0.9351145 ]

mean value: 0.9368515883261834

MCC on Blind test: 0.23
|
|
|
|
Accuracy on Blind test: 0.65
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01068711 0.00849342 0.00775218 0.0076437 0.00741124 0.00742817
|
|
0.00740385 0.0076189 0.00729799 0.0073278 ]
|
|
|
|
mean value: 0.007906436920166016
|
|
|
|
key: score_time
|
|
value: [0.0128932 0.00827646 0.00841212 0.00812268 0.00794554 0.00796461
|
|
0.00785041 0.00781918 0.00777936 0.00783634]
|
|
|
|
mean value: 0.008489990234375
|
|
|
|
key: test_mcc
|
|
value: [0.77728159 0.68736396 0.77903565 0.56277738 0.47187011 0.58501794
|
|
0.72168784 0.65814518 0.70082556 0.65814518]
|
|
|
|
mean value: 0.6602150384851577
|
|
|
|
key: train_mcc
|
|
value: [0.66258992 0.65336491 0.67038524 0.68202471 0.62396093 0.66768511
|
|
0.66768511 0.72158618 0.67809175 0.67572951]
|
|
|
|
mean value: 0.6703103372008967
|
|
|
|
key: test_accuracy
|
|
value: [0.87719298 0.84210526 0.87719298 0.77192982 0.73214286 0.78571429
|
|
0.85714286 0.82142857 0.83928571 0.82142857]
|
|
|
|
mean value: 0.8225563909774436
|
|
|
|
key: train_accuracy
|
|
value: [0.82445759 0.81854043 0.82840237 0.83234714 0.79527559 0.82677165
|
|
0.82677165 0.85826772 0.83267717 0.83070866]
|
|
|
|
mean value: 0.8274219975461647
|
|
|
|
key: test_fscore
|
|
value: [0.85714286 0.83018868 0.8627451 0.74509804 0.70588235 0.76
|
|
0.84615385 0.8 0.81632653 0.8 ]
|
|
|
|
mean value: 0.8023537403350309
|
|
|
|
key: train_fscore
|
|
value: [0.80525164 0.79646018 0.80879121 0.81069042 0.75586854 0.80701754
|
|
0.80701754 0.84937238 0.81481481 0.81140351]
|
|
|
|
mean value: 0.8066687790927018
|
|
|
|
key: test_precision
|
|
value: [1. 0.88 1. 0.86363636 0.7826087 0.86363636
|
|
0.91666667 0.90909091 0.95238095 0.90909091]
|
|
|
|
mean value: 0.9077110860154338
|
|
|
|
key: train_precision
|
|
value: [0.90640394 0.90909091 0.91089109 0.92857143 0.93604651 0.91089109
|
|
0.91089109 0.90625 0.91219512 0.91584158]
|
|
|
|
mean value: 0.9147072763613312
|
|
|
|
key: test_recall
|
|
value: [0.75 0.78571429 0.75862069 0.65517241 0.64285714 0.67857143
|
|
0.78571429 0.71428571 0.71428571 0.71428571]
|
|
|
|
mean value: 0.7199507389162562
|
|
|
|
key: train_recall
|
|
value: [0.72440945 0.70866142 0.72727273 0.71936759 0.63385827 0.72440945
|
|
0.72440945 0.7992126 0.73622047 0.72834646]
|
|
|
|
mean value: 0.7226167875260652
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 0.841133 0.87931034 0.77401478 0.73214286 0.78571429
|
|
0.85714286 0.82142857 0.83928571 0.82142857]
|
|
|
|
mean value: 0.8226600985221675
|
|
|
|
key: train_roc_auc
|
|
value: [0.82465532 0.81875759 0.82820329 0.83212474 0.79527559 0.82677165
|
|
0.82677165 0.85826772 0.83267717 0.83070866]
|
|
|
|
mean value: 0.8274213376489994
|
|
|
|
key: test_jcc
|
|
value: [0.75 0.70967742 0.75862069 0.59375 0.54545455 0.61290323
|
|
0.73333333 0.66666667 0.68965517 0.66666667]
|
|
|
|
mean value: 0.6726727719351467
|
|
|
|
key: train_jcc
|
|
value: [0.67399267 0.66176471 0.67896679 0.68164794 0.60754717 0.67647059
|
|
0.67647059 0.73818182 0.6875 0.68265683]
|
|
|
|
mean value: 0.6765199100649822
|
|
|
|
MCC on Blind test: 0.34
|
|
|
|
Accuracy on Blind test: 0.78
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00778794 0.00758052 0.00752497 0.00759459 0.00758076 0.00756574
|
|
0.00753331 0.00752449 0.00757456 0.0075686 ]
|
|
|
|
mean value: 0.00758354663848877
|
|
|
|
key: score_time
|
|
value: [0.00791955 0.00788665 0.00793982 0.00793576 0.00796843 0.00790358
|
|
0.00784445 0.00793791 0.00797367 0.00794005]
|
|
|
|
mean value: 0.007924985885620118
|
|
|
|
key: test_mcc
|
|
value: [0.8953202 0.82512315 0.85960591 0.71921182 0.71611487 0.75047877
|
|
0.64285714 0.75047877 0.64450339 0.82195294]
|
|
|
|
mean value: 0.7625646979424463
|
|
|
|
key: train_mcc
|
|
value: [0.75941547 0.75148224 0.759525 0.75544282 0.77167747 0.77186893
|
|
0.77564465 0.77588525 0.78749923 0.76800824]
|
|
|
|
mean value: 0.7676449294755058
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.9122807 0.92982456 0.85964912 0.85714286 0.875
|
|
0.82142857 0.875 0.82142857 0.91071429]
|
|
|
|
mean value: 0.8809837092731829
|
|
|
|
key: train_accuracy
|
|
value: [0.87968442 0.87573964 0.87968442 0.87771203 0.88582677 0.88582677
|
|
0.88779528 0.88779528 0.89370079 0.88385827]
|
|
|
|
mean value: 0.8837623662426812
|
|
|
|
key: test_fscore
|
|
value: [0.94736842 0.9122807 0.93103448 0.86206897 0.86206897 0.87272727
|
|
0.82142857 0.87272727 0.82758621 0.90909091]
|
|
|
|
mean value: 0.8818381769470699
|
|
|
|
key: train_fscore
|
|
value: [0.88062622 0.8762279 0.88062622 0.87698413 0.88627451 0.88715953
|
|
0.88845401 0.88932039 0.89453125 0.88543689]
|
|
|
|
mean value: 0.8845641057179913
|
|
|
|
key: test_precision
|
|
value: [0.93103448 0.89655172 0.93103448 0.86206897 0.83333333 0.88888889
|
|
0.82142857 0.88888889 0.8 0.92592593]
|
|
|
|
mean value: 0.8779155263638022
|
|
|
|
key: train_precision
|
|
value: [0.87548638 0.8745098 0.87209302 0.88047809 0.8828125 0.87692308
|
|
0.88326848 0.87739464 0.8875969 0.87356322]
|
|
|
|
mean value: 0.8784126109194028
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.92857143 0.93103448 0.86206897 0.89285714 0.85714286
|
|
0.82142857 0.85714286 0.85714286 0.89285714]
|
|
|
|
mean value: 0.8864532019704433
|
|
|
|
key: train_recall
|
|
value: [0.88582677 0.87795276 0.88932806 0.87351779 0.88976378 0.8976378
|
|
0.89370079 0.9015748 0.9015748 0.8976378 ]
|
|
|
|
mean value: 0.8908515141140954
|
|
|
|
key: test_roc_auc
|
|
value: [0.9476601 0.91256158 0.92980296 0.85960591 0.85714286 0.875
|
|
0.82142857 0.875 0.82142857 0.91071429]
|
|
|
|
mean value: 0.8810344827586207
|
|
|
|
key: train_roc_auc
|
|
value: [0.87967228 0.87573527 0.8797034 0.87770378 0.88582677 0.88582677
|
|
0.88779528 0.88779528 0.89370079 0.88385827]
|
|
|
|
mean value: 0.8837617876816781
|
|
|
|
key: test_jcc
|
|
value: [0.9 0.83870968 0.87096774 0.75757576 0.75757576 0.77419355
|
|
0.6969697 0.77419355 0.70588235 0.83333333]
|
|
|
|
mean value: 0.7909401414524755
|
|
|
|
key: train_jcc
|
|
value: [0.78671329 0.77972028 0.78671329 0.78091873 0.79577465 0.7972028
|
|
0.79929577 0.8006993 0.80918728 0.79442509]
|
|
|
|
mean value: 0.7930650467759314
|
|
|
|
MCC on Blind test: 0.28
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.00749707 0.0071063 0.00800991 0.00797606 0.00807238 0.00814319
|
|
0.00821209 0.00825953 0.0080924 0.00823736]
|
|
|
|
mean value: 0.007960629463195801
|
|
|
|
key: score_time
|
|
value: [0.01054406 0.01405478 0.01150608 0.0120914 0.0119555 0.01721978
|
|
0.01335192 0.0119431 0.01190829 0.01170444]
|
|
|
|
mean value: 0.012627935409545899
|
|
|
|
key: test_mcc
|
|
value: [0.8953202 0.78940887 0.71921182 0.79110556 0.75047877 0.68250015
|
|
0.60753044 0.75047877 0.58501794 0.82195294]
|
|
|
|
mean value: 0.7393005465274064
|
|
|
|
key: train_mcc
|
|
value: [0.78308641 0.78304441 0.77919572 0.79093074 0.79951627 0.78742599
|
|
0.80317451 0.80759374 0.80324922 0.78395685]
|
|
|
|
mean value: 0.7921173847894009
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.89473684 0.85964912 0.89473684 0.875 0.83928571
|
|
0.80357143 0.875 0.78571429 0.91071429]
|
|
|
|
mean value: 0.868577694235589
|
|
|
|
key: train_accuracy
|
|
value: [0.89151874 0.89151874 0.88954635 0.89546351 0.8996063 0.89370079
|
|
0.9015748 0.90354331 0.9015748 0.89173228]
|
|
|
|
mean value: 0.8959779620742673
|
|
|
|
key: test_fscore
|
|
value: [0.94736842 0.89285714 0.86206897 0.9 0.87719298 0.83018868
|
|
0.80701754 0.87272727 0.80645161 0.90909091]
|
|
|
|
mean value: 0.8704963529709496
|
|
|
|
key: train_fscore
|
|
value: [0.89236791 0.89151874 0.89019608 0.8950495 0.90097087 0.89411765
|
|
0.90196078 0.90522244 0.90234375 0.89361702]
|
|
|
|
mean value: 0.8967364740693871
|
|
|
|
key: test_precision
|
|
value: [0.93103448 0.89285714 0.86206897 0.87096774 0.86206897 0.88
|
|
0.79310345 0.88888889 0.73529412 0.92592593]
|
|
|
|
mean value: 0.8642209679323466
|
|
|
|
key: train_precision
|
|
value: [0.88715953 0.89328063 0.88326848 0.8968254 0.88888889 0.890625
|
|
0.8984375 0.88973384 0.89534884 0.878327 ]
|
|
|
|
mean value: 0.8901895107400759
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.89285714 0.86206897 0.93103448 0.89285714 0.78571429
|
|
0.82142857 0.85714286 0.89285714 0.89285714]
|
|
|
|
mean value: 0.8793103448275862
|
|
|
|
key: train_recall
|
|
value: [0.8976378 0.88976378 0.8972332 0.89328063 0.91338583 0.8976378
|
|
0.90551181 0.92125984 0.90944882 0.90944882]
|
|
|
|
mean value: 0.9034608322181071
|
|
|
|
key: test_roc_auc
|
|
value: [0.9476601 0.89470443 0.85960591 0.89408867 0.875 0.83928571
|
|
0.80357143 0.875 0.78571429 0.91071429]
|
|
|
|
mean value: 0.8685344827586207
|
|
|
|
key: train_roc_auc
|
|
value: [0.89150664 0.89152221 0.88956148 0.89545921 0.8996063 0.89370079
|
|
0.9015748 0.90354331 0.9015748 0.89173228]
|
|
|
|
mean value: 0.8959781830630855
|
|
|
|
key: test_jcc
|
|
value: [0.9 0.80645161 0.75757576 0.81818182 0.78125 0.70967742
|
|
0.67647059 0.77419355 0.67567568 0.83333333]
|
|
|
|
mean value: 0.7732809753647041
|
|
|
|
key: train_jcc
|
|
value: [0.80565371 0.80427046 0.80212014 0.81003584 0.81978799 0.80851064
|
|
0.82142857 0.82685512 0.82206406 0.80769231]
|
|
|
|
mean value: 0.8128418840416354
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.72
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01613593 0.01766896 0.01700807 0.01466203 0.01586533 0.01819515
|
|
0.01554966 0.01456881 0.01771259 0.01677132]
|
|
|
|
mean value: 0.016413784027099608
|
|
|
|
key: score_time
|
|
value: [0.00918293 0.01023507 0.00923419 0.00916266 0.01018572 0.01020145
|
|
0.00912857 0.00951362 0.0102036 0.00925684]
|
|
|
|
mean value: 0.009630465507507324
|
|
|
|
key: test_mcc
|
|
value: [0.8953202 0.8953202 0.85960591 0.75462449 0.71611487 0.78772636
|
|
0.64285714 0.75047877 0.64450339 0.78772636]
|
|
|
|
mean value: 0.7734277700975402
|
|
|
|
key: train_mcc
|
|
value: [0.77528914 0.77528914 0.77932046 0.78708603 0.79537422 0.78376226
|
|
0.80337378 0.79163927 0.79926835 0.77574087]
|
|
|
|
mean value: 0.7866143511152437
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.94736842 0.92982456 0.87719298 0.85714286 0.89285714
|
|
0.82142857 0.875 0.82142857 0.89285714]
|
|
|
|
mean value: 0.8862468671679198
|
|
|
|
key: train_accuracy
|
|
value: [0.88757396 0.88757396 0.88954635 0.89349112 0.8976378 0.89173228
|
|
0.9015748 0.89566929 0.8996063 0.88779528]
|
|
|
|
mean value: 0.8932201152370747
|
|
|
|
key: test_fscore
|
|
value: [0.94736842 0.94736842 0.93103448 0.88135593 0.86206897 0.88888889
|
|
0.82142857 0.87272727 0.82758621 0.88888889]
|
|
|
|
mean value: 0.8868716051414689
|
|
|
|
key: train_fscore
|
|
value: [0.88888889 0.88888889 0.890625 0.89411765 0.8984375 0.89320388
|
|
0.90272374 0.89708738 0.90019569 0.88888889]
|
|
|
|
mean value: 0.8943057505986216
|
|
|
|
key: test_precision
|
|
value: [0.93103448 0.93103448 0.93103448 0.86666667 0.83333333 0.92307692
|
|
0.82142857 0.88888889 0.8 0.92307692]
|
|
|
|
mean value: 0.8849574754747168
|
|
|
|
key: train_precision
|
|
value: [0.88030888 0.88030888 0.88030888 0.88715953 0.89147287 0.88122605
|
|
0.89230769 0.88505747 0.89494163 0.88030888]
|
|
|
|
mean value: 0.8853400773979657
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.93103448 0.89655172 0.89285714 0.85714286
|
|
0.82142857 0.85714286 0.85714286 0.85714286]
|
|
|
|
mean value: 0.8899014778325123
|
|
|
|
key: train_recall
|
|
value: [0.8976378 0.8976378 0.90118577 0.90118577 0.90551181 0.90551181
|
|
0.91338583 0.90944882 0.90551181 0.8976378 ]
|
|
|
|
mean value: 0.9034655006068906
|
|
|
|
key: test_roc_auc
|
|
value: [0.9476601 0.9476601 0.92980296 0.87684729 0.85714286 0.89285714
|
|
0.82142857 0.875 0.82142857 0.89285714]
|
|
|
|
mean value: 0.886268472906404
|
|
|
|
key: train_roc_auc
|
|
value: [0.88755408 0.88755408 0.88956926 0.89350627 0.8976378 0.89173228
|
|
0.9015748 0.89566929 0.8996063 0.88779528]
|
|
|
|
mean value: 0.8932199433568828
|
|
|
|
key: test_jcc
|
|
value: [0.9 0.9 0.87096774 0.78787879 0.75757576 0.8
|
|
0.6969697 0.77419355 0.70588235 0.8 ]
|
|
|
|
mean value: 0.7993467885687999
|
|
|
|
key: train_jcc
|
|
value: [0.8 0.8 0.8028169 0.80851064 0.81560284 0.80701754
|
|
0.82269504 0.81338028 0.81850534 0.8 ]
|
|
|
|
mean value: 0.808852857567483
|
|
|
|
MCC on Blind test: 0.22
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.39372373 1.54294658 1.46989751 1.48421979 1.53672004 1.60921836
|
|
1.45766068 1.5458262 1.44554925 1.52575564]
|
|
|
|
mean value: 1.5011517763137818
|
|
|
|
key: score_time
|
|
value: [0.01374149 0.01342797 0.01947975 0.01363492 0.01389122 0.02115655
|
|
0.0138762 0.01382446 0.01419258 0.01371765]
|
|
|
|
mean value: 0.015094280242919922
|
|
|
|
key: test_mcc
|
|
value: [0.8951918 0.8953202 0.82490815 0.85960591 0.75047877 0.89802651
|
|
0.85933785 0.78772636 0.78772636 0.85714286]
|
|
|
|
mean value: 0.8415464773043235
|
|
|
|
key: train_mcc
|
|
value: [0.98028353 0.96055211 0.97239383 0.96055211 0.97640822 0.97244848
|
|
0.96463421 0.96850394 0.9645744 0.96853396]
|
|
|
|
mean value: 0.9688884814344612
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.94736842 0.9122807 0.92982456 0.875 0.94642857
|
|
0.92857143 0.89285714 0.89285714 0.92857143]
|
|
|
|
mean value: 0.9201127819548872
|
|
|
|
key: train_accuracy
|
|
value: [0.99013807 0.98027613 0.98619329 0.98027613 0.98818898 0.98622047
|
|
0.98228346 0.98425197 0.98228346 0.98425197]
|
|
|
|
mean value: 0.9844363944151951
|
|
|
|
key: test_fscore
|
|
value: [0.94545455 0.94736842 0.91525424 0.93103448 0.87719298 0.94339623
|
|
0.93103448 0.88888889 0.89655172 0.92857143]
|
|
|
|
mean value: 0.9204747419782038
|
|
|
|
key: train_fscore
|
|
value: [0.99017682 0.98031496 0.98613861 0.98023715 0.98814229 0.98619329
|
|
0.98217822 0.98425197 0.98224852 0.98418972]
|
|
|
|
mean value: 0.9844071562661963
|
|
|
|
key: test_precision
|
|
value: [0.96296296 0.93103448 0.9 0.93103448 0.86206897 1.
|
|
0.9 0.92307692 0.86666667 0.92857143]
|
|
|
|
mean value: 0.9205415912312465
|
|
|
|
key: train_precision
|
|
value: [0.98823529 0.98031496 0.98809524 0.98023715 0.99206349 0.98814229
|
|
0.98804781 0.98425197 0.98418972 0.98809524]
|
|
|
|
mean value: 0.9861673170230888
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.96428571 0.93103448 0.93103448 0.89285714 0.89285714
|
|
0.96428571 0.85714286 0.92857143 0.92857143]
|
|
|
|
mean value: 0.9219211822660098
|
|
|
|
key: train_recall
|
|
value: [0.99212598 0.98031496 0.98418972 0.98023715 0.98425197 0.98425197
|
|
0.97637795 0.98425197 0.98031496 0.98031496]
|
|
|
|
mean value: 0.9826631601879805
|
|
|
|
key: test_roc_auc
|
|
value: [0.94704433 0.9476601 0.91194581 0.92980296 0.875 0.94642857
|
|
0.92857143 0.89285714 0.89285714 0.92857143]
|
|
|
|
mean value: 0.9200738916256158
|
|
|
|
key: train_roc_auc
|
|
value: [0.99013414 0.98027606 0.98618935 0.98027606 0.98818898 0.98622047
|
|
0.98228346 0.98425197 0.98228346 0.98425197]
|
|
|
|
mean value: 0.9844355917960848
|
|
|
|
key: test_jcc
|
|
value: [0.89655172 0.9 0.84375 0.87096774 0.78125 0.89285714
|
|
0.87096774 0.8 0.8125 0.86666667]
|
|
|
|
mean value: 0.8535511017532709
|
|
|
|
key: train_jcc
|
|
value: [0.98054475 0.96138996 0.97265625 0.96124031 0.9765625 0.97276265
|
|
0.96498054 0.96899225 0.96511628 0.9688716 ]
|
|
|
|
mean value: 0.9693117081673194
|
|
|
|
MCC on Blind test: 0.26
|
|
|
|
Accuracy on Blind test: 0.66
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01417804 0.01202106 0.01139545 0.01080203 0.01007557 0.01062059
|
|
0.01066804 0.01060534 0.01126242 0.01187325]
|
|
|
|
mean value: 0.011350178718566894
|
|
|
|
key: score_time
|
|
value: [0.01092696 0.00883508 0.00887847 0.00816321 0.00810766 0.00819612
|
|
0.00795102 0.00797558 0.00864434 0.00838518]
|
|
|
|
mean value: 0.008606362342834472
|
|
|
|
key: test_mcc
|
|
value: [0.93202124 0.8951918 0.85960591 0.8953202 0.75434227 0.96490128
|
|
0.75434227 0.89342711 0.96490128 0.92857143]
|
|
|
|
mean value: 0.8842624793067261
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.94736842 0.92982456 0.94736842 0.875 0.98214286
|
|
0.875 0.94642857 0.98214286 0.96428571]
|
|
|
|
mean value: 0.9414473684210526
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.96296296 0.94545455 0.93103448 0.94736842 0.88135593 0.98181818
|
|
0.88135593 0.94736842 0.98181818 0.96428571]
|
|
|
|
mean value: 0.942482277561025
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.96296296 0.93103448 0.96428571 0.83870968 1.
|
|
0.83870968 0.93103448 1. 0.96428571]
|
|
|
|
mean value: 0.9431022711890342
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.92857143 0.93103448 0.93103448 0.92857143 0.96428571
|
|
0.92857143 0.96428571 0.96428571 0.96428571]
|
|
|
|
mean value: 0.9433497536945813
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96428571 0.94704433 0.92980296 0.9476601 0.875 0.98214286
|
|
0.875 0.94642857 0.98214286 0.96428571]
|
|
|
|
mean value: 0.9413793103448276
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.92857143 0.89655172 0.87096774 0.9 0.78787879 0.96428571
|
|
0.78787879 0.9 0.96428571 0.93103448]
|
|
|
|
mean value: 0.8931454381732469
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.11
|
|
|
|
Accuracy on Blind test: 0.36
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.10496068 0.10379076 0.104743 0.10231304 0.1054554 0.10581684
|
|
0.10448885 0.10502958 0.10399008 0.10775542]
|
|
|
|
mean value: 0.10483436584472657
|
|
|
|
key: score_time
|
|
value: [0.01817036 0.01749301 0.01778865 0.01884627 0.01766968 0.01870561
|
|
0.01813245 0.01771808 0.01825023 0.01763487]
|
|
|
|
mean value: 0.018040919303894044
|
|
|
|
key: test_mcc
|
|
value: [0.8953202 0.86189955 0.85960591 0.82490815 0.75434227 0.96490128
|
|
0.82618439 0.82195294 0.68250015 0.92857143]
|
|
|
|
mean value: 0.8420186261363041
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.92982456 0.92982456 0.9122807 0.875 0.98214286
|
|
0.91071429 0.91071429 0.83928571 0.96428571]
|
|
|
|
mean value: 0.9201441102756892
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94736842 0.93103448 0.93103448 0.91525424 0.88135593 0.98245614
|
|
0.91525424 0.90909091 0.84745763 0.96428571]
|
|
|
|
mean value: 0.9224592184195679
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.93103448 0.9 0.93103448 0.9 0.83870968 0.96551724
|
|
0.87096774 0.92592593 0.80645161 0.96428571]
|
|
|
|
mean value: 0.9033926879366256
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.93103448 0.93103448 0.92857143 1.
|
|
0.96428571 0.89285714 0.89285714 0.96428571]
|
|
|
|
mean value: 0.9433497536945813
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9476601 0.93041872 0.92980296 0.91194581 0.875 0.98214286
|
|
0.91071429 0.91071429 0.83928571 0.96428571]
|
|
|
|
mean value: 0.9201970443349754
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.9 0.87096774 0.87096774 0.84375 0.78787879 0.96551724
|
|
0.84375 0.83333333 0.73529412 0.93103448]
|
|
|
|
mean value: 0.8582493446868079
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.33
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0083456 0.00800943 0.00787878 0.00753284 0.00806522 0.00796127
|
|
0.00782919 0.00870728 0.00797272 0.00807309]
|
|
|
|
mean value: 0.008037543296813965
|
|
|
|
key: score_time
|
|
value: [0.00834203 0.00855613 0.00791216 0.00868464 0.00868344 0.00816584
|
|
0.00837827 0.00838041 0.00823331 0.00819325]
|
|
|
|
mean value: 0.008352947235107423
|
|
|
|
key: test_mcc
|
|
value: [0.8951918 0.68850906 0.79110556 0.78940887 0.57142857 0.65814518
|
|
0.4330127 0.85714286 0.78772636 0.64450339]
|
|
|
|
mean value: 0.7116174353174761
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.84210526 0.89473684 0.89473684 0.78571429 0.82142857
|
|
0.71428571 0.92857143 0.89285714 0.82142857]
|
|
|
|
mean value: 0.8543233082706767
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94545455 0.84745763 0.9 0.89655172 0.78571429 0.8
|
|
0.73333333 0.92857143 0.88888889 0.81481481]
|
|
|
|
mean value: 0.8540786648033872
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96296296 0.80645161 0.87096774 0.89655172 0.78571429 0.90909091
|
|
0.6875 0.92857143 0.92307692 0.84615385]
|
|
|
|
mean value: 0.8617041434546996
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.89285714 0.93103448 0.89655172 0.78571429 0.71428571
|
|
0.78571429 0.92857143 0.85714286 0.78571429]
|
|
|
|
mean value: 0.850615763546798
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.94704433 0.8429803 0.89408867 0.89470443 0.78571429 0.82142857
|
|
0.71428571 0.92857143 0.89285714 0.82142857]
|
|
|
|
mean value: 0.8543103448275862
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.89655172 0.73529412 0.81818182 0.8125 0.64705882 0.66666667
|
|
0.57894737 0.86666667 0.8 0.6875 ]
|
|
|
|
mean value: 0.7509367185250606
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.23
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.3327291 1.3053844 1.30537295 1.2954855 1.29172778 1.30082202
|
|
1.30300689 1.31275725 1.33773541 1.36065793]
|
|
|
|
mean value: 1.3145679235458374
|
|
|
|
key: score_time
|
|
value: [0.09119868 0.0915029 0.14295626 0.09044981 0.09054136 0.09039283
|
|
0.09088302 0.09034443 0.09237862 0.09947395]
|
|
|
|
mean value: 0.09701218605041503
|
|
|
|
key: test_mcc
|
|
value: [0.96547546 0.8953202 0.92980296 0.8951918 0.85933785 1.
|
|
0.92857143 0.89342711 0.93094934 0.92857143]
|
|
|
|
mean value: 0.922664756643307
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.94736842 0.96491228 0.94736842 0.92857143 1.
|
|
0.96428571 0.94642857 0.96428571 0.96428571]
|
|
|
|
mean value: 0.9609962406015038
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98181818 0.94736842 0.96551724 0.94915254 0.93103448 1.
|
|
0.96428571 0.94736842 0.96296296 0.96428571]
|
|
|
|
mean value: 0.961379368196865
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.93103448 0.96551724 0.93333333 0.9 1.
|
|
0.96428571 0.93103448 1. 0.96428571]
|
|
|
|
mean value: 0.9589490968801314
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.96551724 0.96551724 0.96428571 1.
|
|
0.96428571 0.96428571 0.92857143 0.96428571]
|
|
|
|
mean value: 0.9645320197044335
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98214286 0.9476601 0.96490148 0.94704433 0.92857143 1.
|
|
0.96428571 0.94642857 0.96428571 0.96428571]
|
|
|
|
mean value: 0.960960591133005
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96428571 0.9 0.93333333 0.90322581 0.87096774 1.
|
|
0.93103448 0.9 0.92857143 0.93103448]
|
|
|
|
mean value: 0.9262452990094814
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.19
|
|
|
|
Accuracy on Blind test: 0.48
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.90509057 0.91448712 0.93358636 0.96710658 0.93032479 0.91889691
|
|
0.90740323 0.95062542 0.90149426 0.91975093]
|
|
|
|
mean value: 0.924876618385315
|
|
|
|
key: score_time
|
|
value: [0.17131925 0.23245525 0.21618485 0.23573542 0.27564526 0.17861819
|
|
0.19408727 0.25907493 0.20869422 0.24864411]
|
|
|
|
mean value: 0.2220458745956421
|
|
|
|
key: test_mcc
|
|
value: [0.96547546 0.8953202 0.92980296 0.8951918 0.85714286 1.
|
|
0.92857143 0.89342711 0.93094934 0.92857143]
|
|
|
|
mean value: 0.9224452574728608
|
|
|
|
key: train_mcc
|
|
value: [0.94890036 0.95277969 0.94878539 0.95278262 0.95687833 0.94112724
|
|
0.94499908 0.95278544 0.94101052 0.94900279]
|
|
|
|
mean value: 0.9489051458683196
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.94736842 0.96491228 0.94736842 0.92857143 1.
|
|
0.96428571 0.94642857 0.96428571 0.96428571]
|
|
|
|
mean value: 0.9609962406015038
|
|
|
|
key: train_accuracy
|
|
value: [0.97435897 0.97633136 0.97435897 0.97633136 0.97834646 0.97047244
|
|
0.97244094 0.97637795 0.97047244 0.97440945]
|
|
|
|
mean value: 0.974390035565081
|
|
|
|
key: test_fscore
|
|
value: [0.98181818 0.94736842 0.96551724 0.94915254 0.92857143 1.
|
|
0.96428571 0.94736842 0.96296296 0.96428571]
|
|
|
|
mean value: 0.9611330627781457
|
|
|
|
key: train_fscore
|
|
value: [0.97465887 0.9765625 0.97445972 0.97647059 0.9785575 0.97076023
|
|
0.97265625 0.97647059 0.97064579 0.97465887]
|
|
|
|
mean value: 0.9745900921567919
|
|
|
|
key: test_precision
|
|
value: [1. 0.93103448 0.96551724 0.93333333 0.92857143 1.
|
|
0.96428571 0.93103448 1. 0.96428571]
|
|
|
|
mean value: 0.9618062397372742
|
|
|
|
key: train_precision
|
|
value: [0.96525097 0.96899225 0.96875 0.9688716 0.96911197 0.96138996
|
|
0.96511628 0.97265625 0.96498054 0.96525097]
|
|
|
|
mean value: 0.9670370778213465
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.96551724 0.96551724 0.92857143 1.
|
|
0.96428571 0.96428571 0.92857143 0.96428571]
|
|
|
|
mean value: 0.9609605911330049
|
|
|
|
key: train_recall
|
|
value: [0.98425197 0.98425197 0.98023715 0.98418972 0.98818898 0.98031496
|
|
0.98031496 0.98031496 0.97637795 0.98425197]
|
|
|
|
mean value: 0.9822694594005789
|
|
|
|
key: test_roc_auc
|
|
value: [0.98214286 0.9476601 0.96490148 0.94704433 0.92857143 1.
|
|
0.96428571 0.94642857 0.96428571 0.96428571]
|
|
|
|
mean value: 0.960960591133005
|
|
|
|
key: train_roc_auc
|
|
value: [0.97433942 0.97631571 0.97437055 0.97634683 0.97834646 0.97047244
|
|
0.97244094 0.97637795 0.97047244 0.97440945]
|
|
|
|
mean value: 0.9743892191341695
|
|
|
|
key: test_jcc
|
|
value: [0.96428571 0.9 0.93333333 0.90322581 0.86666667 1.
|
|
0.93103448 0.9 0.92857143 0.93103448]
|
|
|
|
mean value: 0.9258151914825997
|
|
|
|
key: train_jcc
|
|
value: [0.95057034 0.95419847 0.95019157 0.95402299 0.95801527 0.94318182
|
|
0.94676806 0.95402299 0.94296578 0.95057034]
|
|
|
|
mean value: 0.9504507631247383
|
|
|
|
MCC on Blind test: 0.21
|
|
|
|
Accuracy on Blind test: 0.52
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01842809 0.00753212 0.00757003 0.00761199 0.0075736 0.00752425
|
|
0.00748992 0.00754261 0.00776935 0.00763273]
|
|
|
|
mean value: 0.008667469024658203
|
|
|
|
key: score_time
|
|
value: [0.01342535 0.00787449 0.00798821 0.007864 0.00845551 0.00779438
|
|
0.00783849 0.00777411 0.00842023 0.00789094]
|
|
|
|
mean value: 0.008532571792602538
|
|
|
|
key: test_mcc
|
|
value: [0.8953202 0.82512315 0.85960591 0.71921182 0.71611487 0.75047877
|
|
0.64285714 0.75047877 0.64450339 0.82195294]
|
|
|
|
mean value: 0.7625646979424463
|
|
|
|
key: train_mcc
|
|
value: [0.75941547 0.75148224 0.759525 0.75544282 0.77167747 0.77186893
|
|
0.77564465 0.77588525 0.78749923 0.76800824]
|
|
|
|
mean value: 0.7676449294755058
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.9122807 0.92982456 0.85964912 0.85714286 0.875
|
|
0.82142857 0.875 0.82142857 0.91071429]
|
|
|
|
mean value: 0.8809837092731829
|
|
|
|
key: train_accuracy
|
|
value: [0.87968442 0.87573964 0.87968442 0.87771203 0.88582677 0.88582677
|
|
0.88779528 0.88779528 0.89370079 0.88385827]
|
|
|
|
mean value: 0.8837623662426812
|
|
|
|
key: test_fscore
|
|
value: [0.94736842 0.9122807 0.93103448 0.86206897 0.86206897 0.87272727
|
|
0.82142857 0.87272727 0.82758621 0.90909091]
|
|
|
|
mean value: 0.8818381769470699
|
|
|
|
key: train_fscore
|
|
value: [0.88062622 0.8762279 0.88062622 0.87698413 0.88627451 0.88715953
|
|
0.88845401 0.88932039 0.89453125 0.88543689]
|
|
|
|
mean value: 0.8845641057179913
|
|
|
|
key: test_precision
|
|
value: [0.93103448 0.89655172 0.93103448 0.86206897 0.83333333 0.88888889
|
|
0.82142857 0.88888889 0.8 0.92592593]
|
|
|
|
mean value: 0.8779155263638022
|
|
|
|
key: train_precision
|
|
value: [0.87548638 0.8745098 0.87209302 0.88047809 0.8828125 0.87692308
|
|
0.88326848 0.87739464 0.8875969 0.87356322]
|
|
|
|
mean value: 0.8784126109194028
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.92857143 0.93103448 0.86206897 0.89285714 0.85714286
|
|
0.82142857 0.85714286 0.85714286 0.89285714]
|
|
|
|
mean value: 0.8864532019704433
|
|
|
|
key: train_recall
|
|
value: [0.88582677 0.87795276 0.88932806 0.87351779 0.88976378 0.8976378
|
|
0.89370079 0.9015748 0.9015748 0.8976378 ]
|
|
|
|
mean value: 0.8908515141140954
|
|
|
|
key: test_roc_auc
|
|
value: [0.9476601 0.91256158 0.92980296 0.85960591 0.85714286 0.875
|
|
0.82142857 0.875 0.82142857 0.91071429]
|
|
|
|
mean value: 0.8810344827586207
|
|
|
|
key: train_roc_auc
|
|
value: [0.87967228 0.87573527 0.8797034 0.87770378 0.88582677 0.88582677
|
|
0.88779528 0.88779528 0.89370079 0.88385827]
|
|
|
|
mean value: 0.8837617876816781
|
|
|
|
key: test_jcc
|
|
value: [0.9 0.83870968 0.87096774 0.75757576 0.75757576 0.77419355
|
|
0.6969697 0.77419355 0.70588235 0.83333333]
|
|
|
|
mean value: 0.7909401414524755
|
|
|
|
key: train_jcc
|
|
value: [0.78671329 0.77972028 0.78671329 0.78091873 0.79577465 0.7972028
|
|
0.79929577 0.8006993 0.80918728 0.79442509]
|
|
|
|
mean value: 0.7930650467759314
|
|
|
|
MCC on Blind test: 0.28
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.06567097 0.04970622 0.05028796 0.05296206 0.05908322 0.0577023
|
|
0.05627537 0.05472136 0.06450558 0.06165719]
|
|
|
|
mean value: 0.057257223129272464
|
|
|
|
key: score_time
|
|
value: [0.00984359 0.00965667 0.00961947 0.01044655 0.01020241 0.01003504
|
|
0.01031113 0.00977564 0.01015902 0.00963831]
|
|
|
|
mean value: 0.009968781471252441
|
|
|
|
key: test_mcc
|
|
value: [0.96547546 0.8951918 0.92980296 0.8951918 0.89342711 1.
|
|
0.96490128 0.89342711 0.96490128 0.92857143]
|
|
|
|
mean value: 0.9330890233388842
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.94736842 0.96491228 0.94736842 0.94642857 1.
|
|
0.98214286 0.94642857 0.98214286 0.96428571]
|
|
|
|
mean value: 0.9663533834586466
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98181818 0.94545455 0.96551724 0.94915254 0.94736842 1.
|
|
0.98245614 0.94736842 0.98181818 0.96428571]
|
|
|
|
mean value: 0.9665239389584955
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.96296296 0.96551724 0.93333333 0.93103448 1.
|
|
0.96551724 0.93103448 1. 0.96428571]
|
|
|
|
mean value: 0.9653685458857872
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.92857143 0.96551724 0.96551724 0.96428571 1.
|
|
1. 0.96428571 0.96428571 0.96428571]
|
|
|
|
mean value: 0.9681034482758621
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98214286 0.94704433 0.96490148 0.94704433 0.94642857 1.
|
|
0.98214286 0.94642857 0.98214286 0.96428571]
|
|
|
|
mean value: 0.966256157635468
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96428571 0.89655172 0.93333333 0.90322581 0.9 1.
|
|
0.96551724 0.9 0.96428571 0.93103448]
|
|
|
|
mean value: 0.9358234016632236
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.1
|
|
|
|
Accuracy on Blind test: 0.37
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.01330185 0.04045773 0.04038882 0.04065561 0.04066443 0.04291534
|
|
0.04127479 0.04550576 0.04016542 0.04041195]
|
|
|
|
mean value: 0.03857417106628418
|
|
|
|
key: score_time
|
|
value: [0.01009989 0.01929498 0.01896739 0.01059246 0.01052094 0.01064014
|
|
0.02125072 0.01962495 0.01934791 0.01900911]
|
|
|
|
mean value: 0.015934848785400392
|
|
|
|
key: test_mcc
|
|
value: [0.85960591 0.8953202 0.85960591 0.82490815 0.75434227 0.82195294
|
|
0.71611487 0.71611487 0.64450339 0.85933785]
|
|
|
|
mean value: 0.7951806363828539
|
|
|
|
key: train_mcc
|
|
value: [0.87014673 0.87036164 0.85437653 0.85842397 0.8746939 0.85134433
|
|
0.83910959 0.85465533 0.87089581 0.85513299]
|
|
|
|
mean value: 0.8599140831820147
|
|
|
|
key: test_accuracy
|
|
value: [0.92982456 0.94736842 0.92982456 0.9122807 0.875 0.91071429
|
|
0.85714286 0.85714286 0.82142857 0.92857143]
|
|
|
|
mean value: 0.8969298245614035
|
|
|
|
key: train_accuracy
|
|
value: [0.93491124 0.93491124 0.9270217 0.92899408 0.93700787 0.92519685
|
|
0.91929134 0.92716535 0.93503937 0.92716535]
|
|
|
|
mean value: 0.9296704406032086
|
|
|
|
key: test_fscore
|
|
value: [0.92857143 0.94736842 0.93103448 0.91525424 0.88135593 0.90909091
|
|
0.86206897 0.85185185 0.82758621 0.92592593]
|
|
|
|
mean value: 0.8980108361156687
|
|
|
|
key: train_fscore
|
|
value: [0.93592233 0.93617021 0.92787524 0.92996109 0.93822394 0.92692308
|
|
0.92069632 0.92815534 0.93641618 0.92870906]
|
|
|
|
mean value: 0.9309052796774193
|
|
|
|
key: test_precision
|
|
value: [0.92857143 0.93103448 0.93103448 0.9 0.83870968 0.92592593
|
|
0.83333333 0.88461538 0.8 0.96153846]
|
|
|
|
mean value: 0.893476317692113
|
|
|
|
key: train_precision
|
|
value: [0.92337165 0.92015209 0.91538462 0.91570881 0.92045455 0.90601504
|
|
0.90494297 0.91570881 0.91698113 0.90943396]
|
|
|
|
mean value: 0.914815362183764
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.96428571 0.93103448 0.93103448 0.92857143 0.89285714
|
|
0.89285714 0.82142857 0.85714286 0.89285714]
|
|
|
|
mean value: 0.904064039408867
|
|
|
|
key: train_recall
|
|
value: [0.9488189 0.95275591 0.94071146 0.94466403 0.95669291 0.9488189
|
|
0.93700787 0.94094488 0.95669291 0.9488189 ]
|
|
|
|
mean value: 0.9475926675173508
|
|
|
|
key: test_roc_auc
|
|
value: [0.92980296 0.9476601 0.92980296 0.91194581 0.875 0.91071429
|
|
0.85714286 0.85714286 0.82142857 0.92857143]
|
|
|
|
mean value: 0.8969211822660099
|
|
|
|
key: train_roc_auc
|
|
value: [0.93488376 0.93487598 0.92704864 0.92902493 0.93700787 0.92519685
|
|
0.91929134 0.92716535 0.93503937 0.92716535]
|
|
|
|
mean value: 0.9296699449130124
|
|
|
|
key: test_jcc
|
|
value: [0.86666667 0.9 0.87096774 0.84375 0.78787879 0.83333333
|
|
0.75757576 0.74193548 0.70588235 0.86206897]
|
|
|
|
mean value: 0.8170059089719415
|
|
|
|
key: train_jcc
|
|
value: [0.87956204 0.88 0.86545455 0.86909091 0.88363636 0.86379928
|
|
0.85304659 0.86594203 0.88043478 0.86690647]
|
|
|
|
mean value: 0.8707873026527986
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.7
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01948237 0.00829005 0.0076189 0.00788856 0.00796366 0.00804019
|
|
0.00793099 0.00874853 0.00803947 0.00794411]
|
|
|
|
mean value: 0.009194684028625489
|
|
|
|
key: score_time
|
|
value: [0.00859404 0.0083673 0.00849128 0.00827527 0.00781775 0.00821924
|
|
0.00822783 0.00834608 0.00824499 0.00831842]
|
|
|
|
mean value: 0.0082902193069458
|
|
|
|
key: test_mcc
|
|
value: [0.8953202 0.82512315 0.85960591 0.78940887 0.71611487 0.75047877
|
|
0.64285714 0.75047877 0.64450339 0.82195294]
|
|
|
|
mean value: 0.7695844023759438
|
|
|
|
key: train_mcc
|
|
value: [0.75941547 0.76333276 0.76341509 0.77515483 0.77955173 0.77186893
|
|
0.78749923 0.77962424 0.78742599 0.76786532]
|
|
|
|
mean value: 0.7735153587774711
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.9122807 0.92982456 0.89473684 0.85714286 0.875
|
|
0.82142857 0.875 0.82142857 0.91071429]
|
|
|
|
mean value: 0.8844924812030075
|
|
|
|
key: train_accuracy
|
|
value: [0.87968442 0.8816568 0.8816568 0.88757396 0.88976378 0.88582677
|
|
0.89370079 0.88976378 0.89370079 0.88385827]
|
|
|
|
mean value: 0.88671861653388
|
|
|
|
key: test_fscore
|
|
value: [0.94736842 0.9122807 0.93103448 0.89655172 0.86206897 0.87272727
|
|
0.82142857 0.87272727 0.82758621 0.90909091]
|
|
|
|
mean value: 0.8852864528091389
|
|
|
|
key: train_fscore
|
|
value: [0.88062622 0.88235294 0.88235294 0.88757396 0.89019608 0.88715953
|
|
0.89453125 0.890625 0.89411765 0.88499025]
|
|
|
|
mean value: 0.8874525831917391
|
|
|
|
key: test_precision
|
|
value: [0.93103448 0.89655172 0.93103448 0.89655172 0.83333333 0.88888889
|
|
0.82142857 0.88888889 0.8 0.92592593]
|
|
|
|
mean value: 0.8813638022258712
|
|
|
|
key: train_precision
|
|
value: [0.87548638 0.87890625 0.87548638 0.88582677 0.88671875 0.87692308
|
|
0.8875969 0.88372093 0.890625 0.87644788]
|
|
|
|
mean value: 0.8817738317127776
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.92857143 0.93103448 0.89655172 0.89285714 0.85714286
|
|
0.82142857 0.85714286 0.85714286 0.89285714]
|
|
|
|
mean value: 0.8899014778325123
|
|
|
|
key: train_recall
|
|
value: [0.88582677 0.88582677 0.88932806 0.88932806 0.89370079 0.8976378
|
|
0.9015748 0.8976378 0.8976378 0.89370079]
|
|
|
|
mean value: 0.8932199433568827
|
|
|
|
key: test_roc_auc
|
|
value: [0.9476601 0.91256158 0.92980296 0.89470443 0.85714286 0.875
|
|
0.82142857 0.875 0.82142857 0.91071429]
|
|
|
|
mean value: 0.8845443349753694
|
|
|
|
key: train_roc_auc
|
|
value: [0.87967228 0.88164856 0.88167191 0.88757742 0.88976378 0.88582677
|
|
0.89370079 0.88976378 0.89370079 0.88385827]
|
|
|
|
mean value: 0.8867184339111761
|
|
|
|
key: test_jcc
|
|
value: [0.9 0.83870968 0.87096774 0.8125 0.75757576 0.77419355
|
|
0.6969697 0.77419355 0.70588235 0.83333333]
|
|
|
|
mean value: 0.7964325656948996
|
|
|
|
key: train_jcc
|
|
value: [0.78671329 0.78947368 0.78947368 0.79787234 0.80212014 0.7972028
|
|
0.80918728 0.8028169 0.80851064 0.79370629]
|
|
|
|
mean value: 0.7977077046669985
|
|
|
|
MCC on Blind test: 0.28
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00967669 0.01218224 0.01342869 0.01291203 0.01177907 0.01280165
|
|
0.01166844 0.01290774 0.01253223 0.01339507]
|
|
|
|
mean value: 0.012328386306762695
|
|
|
|
key: score_time
|
|
value: [0.00771666 0.00980401 0.00986791 0.01040888 0.01046586 0.01049089
|
|
0.010355 0.01109457 0.01037741 0.01041889]
|
|
|
|
mean value: 0.010100007057189941
|
|
|
|
key: test_mcc
|
|
value: [0.93202124 0.8953202 0.82942474 0.86189955 0.75047877 1.
|
|
0.79385662 0.78571429 0.75047877 0.78571429]
|
|
|
|
mean value: 0.8384908463006829
|
|
|
|
key: train_mcc
|
|
value: [0.90172947 0.91347458 0.90633247 0.84245181 0.90979438 0.87444958
|
|
0.84046723 0.88616336 0.8819171 0.87366794]
|
|
|
|
mean value: 0.8830447927155709
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.94736842 0.9122807 0.92982456 0.875 1.
|
|
0.89285714 0.89285714 0.875 0.89285714]
|
|
|
|
mean value: 0.9182957393483709
|
|
|
|
key: train_accuracy
|
|
value: [0.95069034 0.9566075 0.95266272 0.92110454 0.95472441 0.93700787
|
|
0.91929134 0.94291339 0.94094488 0.93503937]
|
|
|
|
mean value: 0.9410986348599916
|
|
|
|
key: test_fscore
|
|
value: [0.96296296 0.94736842 0.90909091 0.92857143 0.87272727 1.
|
|
0.9 0.89285714 0.87719298 0.89285714]
|
|
|
|
mean value: 0.9183628262575632
|
|
|
|
key: train_fscore
|
|
value: [0.9500998 0.9561753 0.951417 0.921875 0.95409182 0.936
|
|
0.92190476 0.94368932 0.94117647 0.93785311]
|
|
|
|
mean value: 0.941428257984581
|
|
|
|
key: test_precision
|
|
value: [1. 0.93103448 0.96153846 0.96296296 0.88888889 1.
|
|
0.84375 0.89285714 0.86206897 0.89285714]
|
|
|
|
mean value: 0.9235958047380461
|
|
|
|
key: train_precision
|
|
value: [0.96356275 0.96774194 0.97510373 0.91119691 0.96761134 0.95121951
|
|
0.89298893 0.93103448 0.9375 0.89891697]
|
|
|
|
mean value: 0.9396876562541508
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.96428571 0.86206897 0.89655172 0.85714286 1.
|
|
0.96428571 0.89285714 0.89285714 0.89285714]
|
|
|
|
mean value: 0.9151477832512316
|
|
|
|
key: train_recall
|
|
value: [0.93700787 0.94488189 0.92885375 0.93280632 0.94094488 0.92125984
|
|
0.95275591 0.95669291 0.94488189 0.98031496]
|
|
|
|
mean value: 0.9440400236531699
|
|
|
|
key: test_roc_auc
|
|
value: [0.96428571 0.9476601 0.91317734 0.93041872 0.875 1.
|
|
0.89285714 0.89285714 0.875 0.89285714]
|
|
|
|
mean value: 0.9184113300492611
|
|
|
|
key: train_roc_auc
|
|
value: [0.95071738 0.95663067 0.95261585 0.92112757 0.95472441 0.93700787
|
|
0.91929134 0.94291339 0.94094488 0.93503937]
|
|
|
|
mean value: 0.9411012729140082
|
|
|
|
key: test_jcc
|
|
value: [0.92857143 0.9 0.83333333 0.86666667 0.77419355 1.
|
|
0.81818182 0.80645161 0.78125 0.80645161]
|
|
|
|
mean value: 0.8515100020946795
|
|
|
|
key: train_jcc
|
|
value: [0.90494297 0.91603053 0.90733591 0.85507246 0.91221374 0.87969925
|
|
0.85512367 0.89338235 0.88888889 0.88297872]
|
|
|
|
mean value: 0.8895668499958933
|
|
|
|
MCC on Blind test: 0.19
|
|
|
|
Accuracy on Blind test: 0.58
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01474547 0.01279259 0.01564717 0.01446366 0.01306367 0.01507711
|
|
0.01365781 0.01378751 0.01366973 0.01418114]
|
|
|
|
mean value: 0.014108586311340331
|
|
|
|
key: score_time
|
|
value: [0.01045132 0.01075387 0.0109086 0.01076341 0.01093888 0.01090598
|
|
0.01117682 0.01138139 0.01105213 0.01108289]
|
|
|
|
mean value: 0.010941529273986816
|
|
|
|
key: test_mcc
|
|
value: [0.8951918 0.92980296 0.8951918 0.8953202 0.64951905 0.8660254
|
|
0.70082556 0.82195294 0.89342711 0.92857143]
|
|
|
|
mean value: 0.8475828256723696
|
|
|
|
key: train_mcc
|
|
value: [0.91324443 0.8974355 0.9215681 0.93352251 0.878014 0.86150531
|
|
0.84768598 0.89200643 0.92554839 0.91732994]
|
|
|
|
mean value: 0.8987860594733551
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.96491228 0.94736842 0.94736842 0.82142857 0.92857143
|
|
0.83928571 0.91071429 0.94642857 0.96428571]
|
|
|
|
mean value: 0.9217731829573934
|
|
|
|
key: train_accuracy
|
|
value: [0.9566075 0.94871795 0.96055227 0.96646943 0.93897638 0.92913386
|
|
0.92125984 0.94488189 0.96259843 0.95866142]
|
|
|
|
mean value: 0.948785895106307
|
|
|
|
key: test_fscore
|
|
value: [0.94545455 0.96428571 0.94915254 0.94736842 0.83333333 0.93333333
|
|
0.85714286 0.9122807 0.94545455 0.96428571]
|
|
|
|
mean value: 0.9252091708469943
|
|
|
|
key: train_fscore
|
|
value: [0.95652174 0.9488189 0.96108949 0.96579477 0.93933464 0.93207547
|
|
0.92537313 0.94676806 0.96207585 0.95857988]
|
|
|
|
mean value: 0.9496431934331271
|
|
|
|
key: test_precision
|
|
value: [0.96296296 0.96428571 0.93333333 0.96428571 0.78125 0.875
|
|
0.77142857 0.89655172 0.96296296 0.96428571]
|
|
|
|
mean value: 0.9076346697682904
|
|
|
|
key: train_precision
|
|
value: [0.96031746 0.9488189 0.94636015 0.98360656 0.93385214 0.89492754
|
|
0.87943262 0.91544118 0.9757085 0.96047431]
|
|
|
|
mean value: 0.9398939355807465
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.96428571 0.96551724 0.93103448 0.89285714 1.
|
|
0.96428571 0.92857143 0.92857143 0.96428571]
|
|
|
|
mean value: 0.9467980295566503
|
|
|
|
key: train_recall
|
|
value: [0.95275591 0.9488189 0.97628458 0.9486166 0.94488189 0.97244094
|
|
0.97637795 0.98031496 0.9488189 0.95669291]
|
|
|
|
mean value: 0.9606003547975476
|
|
|
|
key: test_roc_auc
|
|
value: [0.94704433 0.96490148 0.94704433 0.9476601 0.82142857 0.92857143
|
|
0.83928571 0.91071429 0.94642857 0.96428571]
|
|
|
|
mean value: 0.9217364532019705
|
|
|
|
key: train_roc_auc
|
|
value: [0.95661511 0.94871775 0.96058324 0.96643428 0.93897638 0.92913386
|
|
0.92125984 0.94488189 0.96259843 0.95866142]
|
|
|
|
mean value: 0.9487862189163113
|
|
|
|
key: test_jcc
|
|
value: [0.89655172 0.93103448 0.90322581 0.9 0.71428571 0.875
|
|
0.75 0.83870968 0.89655172 0.93103448]
|
|
|
|
mean value: 0.8636393611949785
|
|
|
|
key: train_jcc
|
|
value: [0.91666667 0.90262172 0.92509363 0.93385214 0.88560886 0.87279152
|
|
0.86111111 0.89891697 0.92692308 0.92045455]
|
|
|
|
mean value: 0.904404023907068
|
|
|
|
MCC on Blind test: 0.32
|
|
|
|
Accuracy on Blind test: 0.78
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.1140337 0.10048437 0.10199213 0.10053515 0.10115218 0.10382915
|
|
0.10217881 0.09598231 0.0961473 0.09758639]
|
|
|
|
mean value: 0.10139214992523193
|
|
|
|
key: score_time
|
|
value: [0.01537442 0.01496673 0.01581311 0.01557422 0.01550794 0.01505017
|
|
0.01430726 0.01472378 0.01485848 0.01430631]
|
|
|
|
mean value: 0.01504824161529541
|
|
|
|
key: test_mcc
|
|
value: [0.96547546 0.92980296 0.92980296 0.93202124 0.82618439 1.
|
|
0.96490128 0.93094934 0.92857143 0.85933785]
|
|
|
|
mean value: 0.9267046893623845
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.96491228 0.96491228 0.96491228 0.91071429 1.
|
|
0.98214286 0.96428571 0.96428571 0.92857143]
|
|
|
|
mean value: 0.9627192982456141
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98181818 0.96428571 0.96551724 0.96666667 0.91525424 1.
|
|
0.98245614 0.96551724 0.96428571 0.93103448]
|
|
|
|
mean value: 0.9636835620212532
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.96428571 0.96551724 0.93548387 0.87096774 1.
|
|
0.96551724 0.93333333 0.96428571 0.9 ]
|
|
|
|
mean value: 0.9499390857566609
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.96551724 1. 0.96428571 1.
|
|
1. 1. 0.96428571 0.96428571]
|
|
|
|
mean value: 0.9786945812807882
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98214286 0.96490148 0.96490148 0.96428571 0.91071429 1.
|
|
0.98214286 0.96428571 0.96428571 0.92857143]
|
|
|
|
mean value: 0.9626231527093597
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96428571 0.93103448 0.93333333 0.93548387 0.84375 1.
|
|
0.96551724 0.93333333 0.93103448 0.87096774]
|
|
|
|
mean value: 0.9308740200752158
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.12
|
|
|
|
Accuracy on Blind test: 0.39
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03926826 0.03829074 0.04713559 0.05223989 0.04820871 0.04241133
|
|
0.03772259 0.03749561 0.03793573 0.04857802]
|
|
|
|
mean value: 0.042928647994995114
|
|
|
|
key: score_time
|
|
value: [0.0239563 0.02649641 0.02141261 0.03752351 0.02890968 0.03851295
|
|
0.02284622 0.0233736 0.02389741 0.0230751 ]
|
|
|
|
mean value: 0.02700037956237793
|
|
|
|
key: test_mcc
|
|
value: [0.96547546 0.92980296 0.8953202 0.93202124 0.82618439 1.
|
|
0.96490128 0.89342711 0.93094934 0.92857143]
|
|
|
|
mean value: 0.9266653398520664
|
|
|
|
key: train_mcc
|
|
value: [0.99214142 0.99211042 0.99214118 1. 0.98819663 0.98428248
|
|
0.98825791 1. 0.99212598 0.98819663]
|
|
|
|
mean value: 0.991745267193298
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.96491228 0.94736842 0.96491228 0.91071429 1.
|
|
0.98214286 0.94642857 0.96428571 0.96428571]
|
|
|
|
mean value: 0.9627506265664161
|
|
|
|
key: train_accuracy
|
|
value: [0.99605523 0.99605523 0.99605523 1. 0.99409449 0.99212598
|
|
0.99409449 1. 0.99606299 0.99409449]
|
|
|
|
mean value: 0.9958638121418255
|
|
|
|
key: test_fscore
|
|
value: [0.98181818 0.96428571 0.94736842 0.96666667 0.91525424 1.
|
|
0.98245614 0.94736842 0.96296296 0.96428571]
|
|
|
|
mean value: 0.9632466459763516
|
|
|
|
key: train_fscore
|
|
value: [0.99604743 0.99606299 0.99603175 1. 0.99408284 0.99209486
|
|
0.99405941 1. 0.99606299 0.99408284]
|
|
|
|
mean value: 0.99585251091878
|
|
|
|
key: test_precision
|
|
value: [1. 0.96428571 0.96428571 0.93548387 0.87096774 1.
|
|
0.96551724 0.93103448 1. 0.96428571]
|
|
|
|
mean value: 0.9595860479898299
|
|
|
|
key: train_precision
|
|
value: [1. 0.99606299 1. 1. 0.99604743 0.99603175
|
|
1. 1. 0.99606299 0.99604743]
|
|
|
|
mean value: 0.9980252591943793
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.93103448 1. 0.96428571 1.
|
|
1. 0.96428571 0.92857143 0.96428571]
|
|
|
|
mean value: 0.968103448275862
|
|
|
|
key: train_recall
|
|
value: [0.99212598 0.99606299 0.99209486 1. 0.99212598 0.98818898
|
|
0.98818898 1. 0.99606299 0.99212598]
|
|
|
|
mean value: 0.9936976751423858
|
|
|
|
key: test_roc_auc
|
|
value: [0.98214286 0.96490148 0.9476601 0.96428571 0.91071429 1.
|
|
0.98214286 0.94642857 0.96428571 0.96428571]
|
|
|
|
mean value: 0.9626847290640395
|
|
|
|
key: train_roc_auc
|
|
value: [0.99606299 0.99605521 0.99604743 1. 0.99409449 0.99212598
|
|
0.99409449 1. 0.99606299 0.99409449]
|
|
|
|
mean value: 0.9958638075378917
|
|
|
|
key: test_jcc
|
|
value: [0.96428571 0.93103448 0.9 0.93548387 0.84375 1.
|
|
0.96551724 0.9 0.92857143 0.93103448]
|
|
|
|
mean value: 0.9299677220721436
|
|
|
|
key: train_jcc
|
|
value: [0.99212598 0.99215686 0.99209486 1. 0.98823529 0.98431373
|
|
0.98818898 1. 0.99215686 0.98823529]
|
|
|
|
mean value: 0.9917507861505687
|
|
|
|
MCC on Blind test: 0.14
|
|
|
|
Accuracy on Blind test: 0.37
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.16959524 0.2451508 0.17405295 0.16930294 0.16698885 0.11223626
|
|
0.10615253 0.17977715 0.10680389 0.14715362]
|
|
|
|
mean value: 0.1577214241027832
|
|
|
|
key: score_time
|
|
value: [0.0270555 0.02043033 0.02078581 0.0202899 0.02019739 0.01269197
|
|
0.01305366 0.02091789 0.01312304 0.02039123]
|
|
|
|
mean value: 0.018893671035766602
|
|
|
|
key: test_mcc
|
|
value: [0.8953202 0.86189955 0.82512315 0.82490815 0.75047877 0.78571429
|
|
0.64450339 0.75047877 0.64951905 0.85714286]
|
|
|
|
mean value: 0.7845088175007775
|
|
|
|
key: train_mcc
|
|
value: [0.85051239 0.85019923 0.84231823 0.8428767 0.85465533 0.84293789
|
|
0.84677832 0.85513299 0.87062545 0.84677832]
|
|
|
|
mean value: 0.8502814833818734
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.92982456 0.9122807 0.9122807 0.875 0.89285714
|
|
0.82142857 0.875 0.82142857 0.92857143]
|
|
|
|
mean value: 0.8916040100250626
|
|
|
|
key: train_accuracy
|
|
value: [0.92504931 0.92504931 0.92110454 0.92110454 0.92716535 0.92125984
|
|
0.92322835 0.92716535 0.93503937 0.92322835]
|
|
|
|
mean value: 0.924939430648092
|
|
|
|
key: test_fscore
|
|
value: [0.94736842 0.93103448 0.9122807 0.91525424 0.87719298 0.89285714
|
|
0.82758621 0.87272727 0.83333333 0.92857143]
|
|
|
|
mean value: 0.8938206209695644
|
|
|
|
key: train_fscore
|
|
value: [0.92635659 0.92578125 0.92156863 0.92248062 0.92815534 0.92248062
|
|
0.92427184 0.92870906 0.93617021 0.92427184]
|
|
|
|
mean value: 0.9260246004677202
|
|
|
|
key: test_precision
|
|
value: [0.93103448 0.9 0.92857143 0.9 0.86206897 0.89285714
|
|
0.8 0.88888889 0.78125 0.92857143]
|
|
|
|
mean value: 0.8813242337164751
|
|
|
|
key: train_precision
|
|
value: [0.91221374 0.91860465 0.91439689 0.90494297 0.91570881 0.90839695
|
|
0.91187739 0.90943396 0.92015209 0.91187739]
|
|
|
|
mean value: 0.9127604846176163
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.89655172 0.93103448 0.89285714 0.89285714
|
|
0.85714286 0.85714286 0.89285714 0.92857143]
|
|
|
|
mean value: 0.9077586206896552
|
|
|
|
key: train_recall
|
|
value: [0.94094488 0.93307087 0.92885375 0.94071146 0.94094488 0.93700787
|
|
0.93700787 0.9488189 0.95275591 0.93700787]
|
|
|
|
mean value: 0.9397124272509414
|
|
|
|
key: test_roc_auc
|
|
value: [0.9476601 0.93041872 0.91256158 0.91194581 0.875 0.89285714
|
|
0.82142857 0.875 0.82142857 0.92857143]
|
|
|
|
mean value: 0.8916871921182267
|
|
|
|
key: train_roc_auc
|
|
value: [0.9250179 0.92503346 0.92111979 0.92114313 0.92716535 0.92125984
|
|
0.92322835 0.92716535 0.93503937 0.92322835]
|
|
|
|
mean value: 0.9249400890106128
|
|
|
|
key: test_jcc
|
|
value: [0.9 0.87096774 0.83870968 0.84375 0.78125 0.80645161
|
|
0.70588235 0.77419355 0.71428571 0.86666667]
|
|
|
|
mean value: 0.8102157314538718
|
|
|
|
key: train_jcc
|
|
value: [0.86281588 0.86181818 0.85454545 0.85611511 0.86594203 0.85611511
|
|
0.85920578 0.86690647 0.88 0.85920578]
|
|
|
|
mean value: 0.862266979281973
|
|
|
|
MCC on Blind test: 0.29
|
|
|
|
Accuracy on Blind test: 0.72
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.26614237 0.2468183 0.2485559 0.24616241 0.24786353 0.24828482
|
|
0.25354338 0.26733065 0.25087976 0.24782729]
|
|
|
|
mean value: 0.25234084129333495
|
|
|
|
key: score_time
|
|
value: [0.00884628 0.00858235 0.00861478 0.00854993 0.0087781 0.00885868
|
|
0.00969672 0.00946689 0.0085597 0.00855279]
|
|
|
|
mean value: 0.008850622177124023
|
|
|
|
key: test_mcc
|
|
value: [0.96547546 0.92980296 0.96547546 0.93202124 0.82195294 1.
|
|
0.96490128 0.89342711 0.96490128 0.92857143]
|
|
|
|
mean value: 0.9366529157151744
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.96491228 0.98245614 0.96491228 0.91071429 1.
|
|
0.98214286 0.94642857 0.98214286 0.96428571]
|
|
|
|
mean value: 0.9680451127819548
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98181818 0.96428571 0.98305085 0.96666667 0.9122807 1.
|
|
0.98245614 0.94736842 0.98181818 0.96428571]
|
|
|
|
mean value: 0.9684030569489981
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.96428571 0.96666667 0.93548387 0.89655172 1.
|
|
0.96551724 0.93103448 1. 0.96428571]
|
|
|
|
mean value: 0.9623825414481699
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 1. 1. 0.92857143 1.
|
|
1. 0.96428571 0.96428571 0.96428571]
|
|
|
|
mean value: 0.975
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98214286 0.96490148 0.98214286 0.96428571 0.91071429 1.
|
|
0.98214286 0.94642857 0.98214286 0.96428571]
|
|
|
|
mean value: 0.9679187192118227
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96428571 0.93103448 0.96666667 0.93548387 0.83870968 1.
|
|
0.96551724 0.9 0.96428571 0.93103448]
|
|
|
|
mean value: 0.9397017850521744
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.1
|
|
|
|
Accuracy on Blind test: 0.3
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.01340175 0.01410508 0.01441169 0.01409388 0.02937245 0.01492167
|
|
0.0153625 0.01417089 0.01423931 0.01514316]
|
|
|
|
mean value: 0.015922236442565917
|
|
|
|
key: score_time
|
|
value: [0.01146483 0.0109961 0.01089525 0.01093698 0.01172209 0.01099157
|
|
0.01097131 0.01087904 0.01158166 0.01099563]
|
|
|
|
mean value: 0.01114344596862793
|
|
|
|
key: test_mcc
|
|
value: [0.76550573 0.75462449 0.79161589 0.68850906 0.50518149 0.47187011
|
|
0.68250015 0.67900461 0.79385662 0.67900461]
|
|
|
|
mean value: 0.6811672742900674
|
|
|
|
key: train_mcc
|
|
value: [0.79484005 0.76863111 0.78816439 0.79111205 0.71433965 0.76123378
|
|
0.77349899 0.80474782 0.76277007 0.76987347]
|
|
|
|
mean value: 0.7729211371212047
|
|
|
|
key: test_accuracy
|
|
value: [0.87719298 0.87719298 0.89473684 0.84210526 0.75 0.73214286
|
|
0.83928571 0.83928571 0.89285714 0.83928571]
|
|
|
|
mean value: 0.8384085213032582
|
|
|
|
key: train_accuracy
|
|
value: [0.89546351 0.88362919 0.89151874 0.89349112 0.84251969 0.87795276
|
|
0.88385827 0.9015748 0.87795276 0.88385827]
|
|
|
|
mean value: 0.8831819099535635
|
|
|
|
key: test_fscore
|
|
value: [0.8627451 0.87272727 0.89285714 0.83636364 0.73076923 0.75409836
|
|
0.83018868 0.83636364 0.88461538 0.83636364]
|
|
|
|
mean value: 0.8337092078000177
|
|
|
|
key: train_fscore
|
|
value: [0.89026915 0.88032454 0.88469602 0.8875 0.81651376 0.88475836
|
|
0.87631027 0.89837398 0.86919831 0.8793456 ]
|
|
|
|
mean value: 0.8767290009085706
|
|
|
|
key: test_precision
|
|
value: [0.95652174 0.88888889 0.92592593 0.88461538 0.79166667 0.6969697
|
|
0.88 0.85185185 0.95833333 0.85185185]
|
|
|
|
mean value: 0.8686625339234035
|
|
|
|
key: train_precision
|
|
value: [0.93886463 0.90794979 0.94196429 0.93832599 0.97802198 0.83802817
|
|
0.93721973 0.92857143 0.93636364 0.91489362]
|
|
|
|
mean value: 0.9260203256453761
|
|
|
|
key: test_recall
|
|
value: [0.78571429 0.85714286 0.86206897 0.79310345 0.67857143 0.82142857
|
|
0.78571429 0.82142857 0.82142857 0.82142857]
|
|
|
|
mean value: 0.8048029556650246
|
|
|
|
key: train_recall
|
|
value: [0.84645669 0.85433071 0.83399209 0.84189723 0.7007874 0.93700787
|
|
0.82283465 0.87007874 0.81102362 0.84645669]
|
|
|
|
mean value: 0.8364865706015997
|
|
|
|
key: test_roc_auc
|
|
value: [0.87561576 0.87684729 0.8953202 0.8429803 0.75 0.73214286
|
|
0.83928571 0.83928571 0.89285714 0.83928571]
|
|
|
|
mean value: 0.8383620689655172
|
|
|
|
key: train_roc_auc
|
|
value: [0.89556036 0.88368709 0.8914055 0.89338956 0.84251969 0.87795276
|
|
0.88385827 0.9015748 0.87795276 0.88385827]
|
|
|
|
mean value: 0.8831759048893593
|
|
|
|
key: test_jcc
|
|
value: [0.75862069 0.77419355 0.80645161 0.71875 0.57575758 0.60526316
|
|
0.70967742 0.71875 0.79310345 0.71875 ]
|
|
|
|
mean value: 0.7179317452228509
|
|
|
|
key: train_jcc
|
|
value: [0.80223881 0.78623188 0.79323308 0.79775281 0.68992248 0.79333333
|
|
0.77985075 0.81549815 0.76865672 0.78467153]
|
|
|
|
mean value: 0.7811389546191971
|
|
|
|
MCC on Blind test: 0.3
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01172638 0.01136661 0.01153398 0.02457762 0.01147556 0.01133466
|
|
0.0113709 0.02615333 0.0302875 0.0303762 ]
|
|
|
|
mean value: 0.018020272254943848
|
|
|
|
key: score_time
|
|
value: [0.01069403 0.01067114 0.01076221 0.01973557 0.01065874 0.01060033
|
|
0.01062369 0.01272726 0.01075745 0.01385522]
|
|
|
|
mean value: 0.012108564376831055
|
|
|
|
key: test_mcc
|
|
value: [0.8953202 0.8953202 0.85960591 0.79110556 0.71611487 0.82195294
|
|
0.67900461 0.71611487 0.68250015 0.82195294]
|
|
|
|
mean value: 0.7878992256362354
|
|
|
|
key: train_mcc
|
|
value: [0.83472439 0.83904026 0.81877755 0.82280791 0.83123063 0.8154727
|
|
0.81142619 0.82718204 0.8431734 0.81527029]
|
|
|
|
mean value: 0.8259105358013283
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.94736842 0.92982456 0.89473684 0.85714286 0.91071429
|
|
0.83928571 0.85714286 0.83928571 0.91071429]
|
|
|
|
mean value: 0.8933583959899749
|
|
|
|
key: train_accuracy
|
|
value: [0.91715976 0.91913215 0.90927022 0.9112426 0.91535433 0.90748031
|
|
0.90551181 0.91338583 0.92125984 0.90748031]
|
|
|
|
mean value: 0.9127277174672692
|
|
|
|
key: test_fscore
|
|
value: [0.94736842 0.94736842 0.93103448 0.9 0.86206897 0.90909091
|
|
0.84210526 0.85185185 0.84745763 0.90909091]
|
|
|
|
mean value: 0.8947436850691334
|
|
|
|
key: train_fscore
|
|
value: [0.91860465 0.92100193 0.91015625 0.9122807 0.91682785 0.90909091
|
|
0.90697674 0.91472868 0.92277992 0.90873786]
|
|
|
|
mean value: 0.9141185505002607
|
|
|
|
key: test_precision
|
|
value: [0.93103448 0.93103448 0.93103448 0.87096774 0.83333333 0.92592593
|
|
0.82758621 0.88461538 0.80645161 0.92592593]
|
|
|
|
mean value: 0.8867909579811694
|
|
|
|
key: train_precision
|
|
value: [0.90458015 0.90188679 0.8996139 0.9 0.90114068 0.89353612
|
|
0.89312977 0.90076336 0.90530303 0.89655172]
|
|
|
|
mean value: 0.899650553503409
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.93103448 0.93103448 0.89285714 0.89285714
|
|
0.85714286 0.82142857 0.89285714 0.89285714]
|
|
|
|
mean value: 0.904064039408867
|
|
|
|
key: train_recall
|
|
value: [0.93307087 0.94094488 0.92094862 0.92490119 0.93307087 0.92519685
|
|
0.92125984 0.92913386 0.94094488 0.92125984]
|
|
|
|
mean value: 0.9290731692135321
|
|
|
|
key: test_roc_auc
|
|
value: [0.9476601 0.9476601 0.92980296 0.89408867 0.85714286 0.91071429
|
|
0.83928571 0.85714286 0.83928571 0.91071429]
|
|
|
|
mean value: 0.8933497536945814
|
|
|
|
key: train_roc_auc
|
|
value: [0.91712832 0.91908904 0.90929321 0.91126949 0.91535433 0.90748031
|
|
0.90551181 0.91338583 0.92125984 0.90748031]
|
|
|
|
mean value: 0.9127252497588
|
|
|
|
key: test_jcc
|
|
value: [0.9 0.9 0.87096774 0.81818182 0.75757576 0.83333333
|
|
0.72727273 0.74193548 0.73529412 0.83333333]
|
|
|
|
mean value: 0.811789431315048
|
|
|
|
key: train_jcc
|
|
value: [0.84946237 0.85357143 0.83512545 0.83870968 0.84642857 0.83333333
|
|
0.82978723 0.84285714 0.85663082 0.83274021]
|
|
|
|
mean value: 0.8418646239168347
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: /home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_config.py:163: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
ros_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_config.py:166: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
ros_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.11566854 0.13413811 0.27206016 0.20076489 0.19601989 0.19559526
|
|
0.2167592 0.19660378 0.19614172 0.1994133 ]
|
|
|
|
mean value: 0.19231648445129396
|
|
|
|
key: score_time
|
|
value: [0.01090479 0.02036548 0.0200057 0.02095532 0.0202384 0.01987505
|
|
0.0205245 0.01911926 0.01085591 0.01083922]
|
|
|
|
mean value: 0.017368364334106445
|
|
|
|
key: test_mcc
|
|
value: [0.85960591 0.8953202 0.85960591 0.82490815 0.75434227 0.82195294
|
|
0.71611487 0.71611487 0.68250015 0.85933785]
|
|
|
|
mean value: 0.7989803124894794
|
|
|
|
key: train_mcc
|
|
value: [0.86225372 0.86654135 0.85053095 0.85053095 0.87062545 0.85513299
|
|
0.83505996 0.85465533 0.86710997 0.8431734 ]
|
|
|
|
mean value: 0.8555614064171675
|
|
|
|
key: test_accuracy
|
|
value: [0.92982456 0.94736842 0.92982456 0.9122807 0.875 0.91071429
|
|
0.85714286 0.85714286 0.83928571 0.92857143]
|
|
|
|
mean value: 0.8987155388471177
|
|
|
|
key: train_accuracy
|
|
value: [0.93096647 0.93293886 0.92504931 0.92504931 0.93503937 0.92716535
|
|
0.91732283 0.92716535 0.93307087 0.92125984]
|
|
|
|
mean value: 0.927502756682042
|
|
|
|
key: test_fscore
|
|
value: [0.92857143 0.94736842 0.93103448 0.91525424 0.88135593 0.90909091
|
|
0.86206897 0.85185185 0.84745763 0.92592593]
|
|
|
|
mean value: 0.8999979781378779
|
|
|
|
key: train_fscore
|
|
value: [0.93203883 0.93436293 0.92607004 0.92607004 0.93617021 0.92870906
|
|
0.91860465 0.92815534 0.93461538 0.92277992]
|
|
|
|
mean value: 0.9287576414141969
|
|
|
|
key: test_precision
|
|
value: [0.92857143 0.93103448 0.93103448 0.9 0.83870968 0.92592593
|
|
0.83333333 0.88461538 0.80645161 0.96153846]
|
|
|
|
mean value: 0.8941214789824357
|
|
|
|
key: train_precision
|
|
value: [0.91954023 0.91666667 0.91187739 0.91187739 0.92015209 0.90943396
|
|
0.90458015 0.91570881 0.91353383 0.90530303]
|
|
|
|
mean value: 0.9128673569164447
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.96428571 0.93103448 0.93103448 0.92857143 0.89285714
|
|
0.89285714 0.82142857 0.89285714 0.89285714]
|
|
|
|
mean value: 0.9076354679802956
|
|
|
|
key: train_recall
|
|
value: [0.94488189 0.95275591 0.94071146 0.94071146 0.95275591 0.9488189
|
|
0.93307087 0.94094488 0.95669291 0.94094488]
|
|
|
|
mean value: 0.9452289066633469
|
|
|
|
key: test_roc_auc
|
|
value: [0.92980296 0.9476601 0.92980296 0.91194581 0.875 0.91071429
|
|
0.85714286 0.85714286 0.83928571 0.92857143]
|
|
|
|
mean value: 0.8987068965517242
|
|
|
|
key: train_roc_auc
|
|
value: [0.93093897 0.93289969 0.92508014 0.92508014 0.93503937 0.92716535
|
|
0.91732283 0.92716535 0.93307087 0.92125984]
|
|
|
|
mean value: 0.927502256387912
|
|
|
|
key: test_jcc
|
|
value: [0.86666667 0.9 0.87096774 0.84375 0.78787879 0.83333333
|
|
0.75757576 0.74193548 0.73529412 0.86206897]
|
|
|
|
mean value: 0.8199470854425297
|
|
|
|
key: train_jcc
|
|
value: [0.87272727 0.87681159 0.86231884 0.86231884 0.88 0.86690647
|
|
0.84946237 0.86594203 0.87725632 0.85663082]
|
|
|
|
mean value: 0.8670374559548931
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.7
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02059269 0.0403645 0.0256114 0.04780722 0.05848503 0.02307177
|
|
0.02302694 0.02297401 0.02418137 0.02429724]
|
|
|
|
mean value: 0.031041216850280762
|
|
|
|
key: score_time
|
|
value: [0.0107677 0.01078558 0.01102161 0.01087213 0.01087928 0.01069665
|
|
0.01075411 0.01068592 0.0107131 0.01078057]
|
|
|
|
mean value: 0.010795664787292481
|
|
|
|
key: test_mcc
|
|
value: [0.63745526 0.78410665 0.60000053 0.89139151 0.78410665 0.89139151
|
|
0.89153439 0.86334835 0.89139151 0.81854376]
|
|
|
|
mean value: 0.805327011589605
|
|
|
|
key: train_mcc
|
|
value: [0.83096715 0.83450632 0.8435716 0.82679606 0.83450632 0.8224719
|
|
0.83074746 0.83041633 0.82643766 0.82660248]
|
|
|
|
mean value: 0.830702327712044
|
|
|
|
key: test_accuracy
|
|
value: [0.81818182 0.89090909 0.8 0.94545455 0.89090909 0.94545455
|
|
0.94545455 0.92727273 0.94545455 0.90909091]
|
|
|
|
mean value: 0.9018181818181819
|
|
|
|
key: train_accuracy
|
|
value: [0.91515152 0.91717172 0.92121212 0.91313131 0.91717172 0.91111111
|
|
0.91515152 0.91515152 0.91313131 0.91313131]
|
|
|
|
mean value: 0.9151515151515152
|
|
|
|
key: test_fscore
|
|
value: [0.80769231 0.89285714 0.79245283 0.94339623 0.89285714 0.94736842
|
|
0.94545455 0.93333333 0.94736842 0.9122807 ]
|
|
|
|
mean value: 0.9015061072657895
|
|
|
|
key: train_fscore
|
|
value: [0.91699605 0.91816367 0.92337917 0.91485149 0.91816367 0.912
|
|
0.91633466 0.91566265 0.91382766 0.91417166]
|
|
|
|
mean value: 0.9163550676695618
|
|
|
|
key: test_precision
|
|
value: [0.84 0.86206897 0.80769231 0.96153846 0.86206897 0.93103448
|
|
0.96296296 0.875 0.93103448 0.89655172]
|
|
|
|
mean value: 0.8929952352883387
|
|
|
|
key: train_precision
|
|
value: [0.89922481 0.90909091 0.90038314 0.89883268 0.90909091 0.90118577
|
|
0.90196078 0.90836653 0.9047619 0.9015748 ]
|
|
|
|
mean value: 0.903447224781149
|
|
|
|
key: test_recall
|
|
value: [0.77777778 0.92592593 0.77777778 0.92592593 0.92592593 0.96428571
|
|
0.92857143 1. 0.96428571 0.92857143]
|
|
|
|
mean value: 0.9119047619047619
|
|
|
|
key: train_recall
|
|
value: [0.93548387 0.92741935 0.94758065 0.93145161 0.92741935 0.92307692
|
|
0.93117409 0.92307692 0.92307692 0.92712551]
|
|
|
|
mean value: 0.9296885203082147
|
|
|
|
key: test_roc_auc
|
|
value: [0.81746032 0.89153439 0.79960317 0.94510582 0.89153439 0.94510582
|
|
0.9457672 0.92592593 0.94510582 0.90873016]
|
|
|
|
mean value: 0.9015873015873016
|
|
|
|
key: train_roc_auc
|
|
value: [0.91511036 0.91715097 0.92115874 0.91309423 0.91715097 0.91113524
|
|
0.91518382 0.91516749 0.91315136 0.91315953]
|
|
|
|
mean value: 0.9151462713856602
|
|
|
|
key: test_jcc
|
|
value: [0.67741935 0.80645161 0.65625 0.89285714 0.80645161 0.9
|
|
0.89655172 0.875 0.9 0.83870968]
|
|
|
|
mean value: 0.8249691125059591
|
|
|
|
key: train_jcc
|
|
value: [0.84671533 0.84870849 0.85766423 0.84306569 0.84870849 0.83823529
|
|
0.84558824 0.84444444 0.84132841 0.84191176]
|
|
|
|
mean value: 0.8456370381490419
|
|
|
|
MCC on Blind test: 0.28
|
|
|
|
Accuracy on Blind test: 0.7
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.64936924 0.66852713 0.79406118 0.67012405 0.69154596 0.8323493
|
|
0.78064322 0.67432141 0.84689832 0.68367529]
|
|
|
|
mean value: 0.7291515111923218
|
|
|
|
key: score_time
|
|
value: [0.01181173 0.01206112 0.01192927 0.01092577 0.01228356 0.01233196
|
|
0.01223803 0.01151776 0.0125103 0.01225424]
|
|
|
|
mean value: 0.011986374855041504
|
|
|
|
key: test_mcc
|
|
value: [0.82269299 0.92980214 0.63745526 0.85449735 0.89642146 0.96423926
|
|
0.92980214 0.81878307 0.96423926 0.8565805 ]
|
|
|
|
mean value: 0.8674513428146936
|
|
|
|
key: train_mcc
|
|
value: [0.92727243 0.93132101 0.94355919 0.93535327 0.92730389 0.93131989
|
|
0.92324017 0.94355551 0.94346399 0.93538276]
|
|
|
|
mean value: 0.9341772124772538
|
|
|
|
key: test_accuracy
|
|
value: [0.90909091 0.96363636 0.81818182 0.92727273 0.94545455 0.98181818
|
|
0.96363636 0.90909091 0.98181818 0.92727273]
|
|
|
|
mean value: 0.9327272727272727
|
|
|
|
key: train_accuracy
|
|
value: [0.96363636 0.96565657 0.97171717 0.96767677 0.96363636 0.96565657
|
|
0.96161616 0.97171717 0.97171717 0.96767677]
|
|
|
|
mean value: 0.9670707070707071
|
|
|
|
key: test_fscore
|
|
value: [0.90196078 0.96428571 0.80769231 0.92592593 0.94736842 0.98245614
|
|
0.96296296 0.90909091 0.98245614 0.93103448]
|
|
|
|
mean value: 0.9315233788784553
|
|
|
|
key: train_fscore
|
|
value: [0.96370968 0.96565657 0.97154472 0.96774194 0.96356275 0.96551724
|
|
0.96161616 0.97142857 0.97154472 0.96747967]
|
|
|
|
mean value: 0.9669802011711329
|
|
|
|
key: test_precision
|
|
value: [0.95833333 0.93103448 0.84 0.92592593 0.9 0.96551724
|
|
1. 0.92592593 0.96551724 0.9 ]
|
|
|
|
mean value: 0.9312254150702427
|
|
|
|
key: train_precision
|
|
value: [0.96370968 0.96761134 0.9795082 0.96774194 0.96747967 0.96747967
|
|
0.95967742 0.97942387 0.9755102 0.97142857]
|
|
|
|
mean value: 0.9699570558428222
|
|
|
|
key: test_recall
|
|
value: [0.85185185 1. 0.77777778 0.92592593 1. 1.
|
|
0.92857143 0.89285714 1. 0.96428571]
|
|
|
|
mean value: 0.9341269841269841
|
|
|
|
key: train_recall
|
|
value: [0.96370968 0.96370968 0.96370968 0.96774194 0.95967742 0.96356275
|
|
0.96356275 0.96356275 0.96761134 0.96356275]
|
|
|
|
mean value: 0.9640410735274912
|
|
|
|
key: test_roc_auc
|
|
value: [0.90806878 0.96428571 0.81746032 0.92724868 0.94642857 0.98148148
|
|
0.96428571 0.90939153 0.98148148 0.9265873 ]
|
|
|
|
mean value: 0.9326719576719578
|
|
|
|
key: train_roc_auc
|
|
value: [0.96363622 0.96566051 0.97173338 0.96767664 0.96364438 0.96565234
|
|
0.96162009 0.97170073 0.97170889 0.96766847]
|
|
|
|
mean value: 0.9670701645553089
|
|
|
|
key: test_jcc
|
|
value: [0.82142857 0.93103448 0.67741935 0.86206897 0.9 0.96551724
|
|
0.92857143 0.83333333 0.96551724 0.87096774]
|
|
|
|
mean value: 0.8755858361142009
|
|
|
|
key: train_jcc
|
|
value: [0.92996109 0.93359375 0.94466403 0.9375 0.9296875 0.93333333
|
|
0.92607004 0.94444444 0.94466403 0.93700787]
|
|
|
|
mean value: 0.9360926093439301
|
|
|
|
MCC on Blind test: 0.23
|
|
|
|
Accuracy on Blind test: 0.65
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01078415 0.01023746 0.00817394 0.00776267 0.00750327 0.00742102
|
|
0.00754666 0.00760245 0.00785923 0.00745296]
|
|
|
|
mean value: 0.008234381675720215
|
|
|
|
key: score_time
|
|
value: [0.01067472 0.00926948 0.00836229 0.00813413 0.00799251 0.00794363
|
|
0.00806427 0.0079906 0.00824738 0.00798607]
|
|
|
|
mean value: 0.008466506004333496
|
|
|
|
key: test_mcc
|
|
value: [0.71588202 0.56841568 0.69419497 0.72546624 0.52715278 0.48393864
|
|
0.61131498 0.79069197 0.75878131 0.53758181]
|
|
|
|
mean value: 0.6413420391304201
|
|
|
|
key: train_mcc
|
|
value: [0.72945173 0.66755872 0.72778077 0.66639453 0.67908612 0.67326481
|
|
0.68618843 0.67555218 0.64604502 0.69582615]
|
|
|
|
mean value: 0.6847148445519045
|
|
|
|
key: test_accuracy
|
|
value: [0.85454545 0.78181818 0.83636364 0.85454545 0.76363636 0.72727273
|
|
0.8 0.89090909 0.87272727 0.76363636]
|
|
|
|
mean value: 0.8145454545454546
|
|
|
|
key: train_accuracy
|
|
value: [0.86464646 0.82626263 0.86060606 0.82626263 0.82828283 0.83030303
|
|
0.83636364 0.83030303 0.81616162 0.84242424]
|
|
|
|
mean value: 0.8361616161616161
|
|
|
|
key: test_fscore
|
|
value: [0.84 0.76 0.80851064 0.83333333 0.75471698 0.68085106
|
|
0.78431373 0.88461538 0.8627451 0.74509804]
|
|
|
|
mean value: 0.7954184263953551
|
|
|
|
key: train_fscore
|
|
value: [0.86354379 0.80630631 0.85097192 0.80717489 0.80369515 0.81165919
|
|
0.81797753 0.80995475 0.79458239 0.82666667]
|
|
|
|
mean value: 0.8192532586237161
|
|
|
|
key: test_precision
|
|
value: [0.91304348 0.82608696 0.95 0.95238095 0.76923077 0.84210526
|
|
0.86956522 0.95833333 0.95652174 0.82608696]
|
|
|
|
mean value: 0.8863354665929036
|
|
|
|
key: train_precision
|
|
value: [0.87242798 0.91326531 0.91627907 0.90909091 0.94054054 0.90954774
|
|
0.91919192 0.91794872 0.89795918 0.91625616]
|
|
|
|
mean value: 0.9112507526203477
|
|
|
|
key: test_recall
|
|
value: [0.77777778 0.7037037 0.7037037 0.74074074 0.74074074 0.57142857
|
|
0.71428571 0.82142857 0.78571429 0.67857143]
|
|
|
|
mean value: 0.7238095238095238
|
|
|
|
key: train_recall
|
|
value: [0.85483871 0.72177419 0.79435484 0.72580645 0.7016129 0.73279352
|
|
0.73684211 0.72469636 0.71255061 0.75303644]
|
|
|
|
mean value: 0.7458306125114275
|
|
|
|
key: test_roc_auc
|
|
value: [0.8531746 0.78042328 0.83399471 0.85251323 0.76322751 0.73015873
|
|
0.8015873 0.89219577 0.87433862 0.76521164]
|
|
|
|
mean value: 0.8146825396825397
|
|
|
|
key: train_roc_auc
|
|
value: [0.86466632 0.82647414 0.86074017 0.82646598 0.82853925 0.83010644
|
|
0.83616299 0.83009011 0.81595272 0.84224403]
|
|
|
|
mean value: 0.8361442144442993
|
|
|
|
key: test_jcc
|
|
value: [0.72413793 0.61290323 0.67857143 0.71428571 0.60606061 0.51612903
|
|
0.64516129 0.79310345 0.75862069 0.59375 ]
|
|
|
|
mean value: 0.6642723366270363
|
|
|
|
key: train_jcc
|
|
value: [0.75985663 0.6754717 0.7406015 0.67669173 0.67181467 0.68301887
|
|
0.69201521 0.68060837 0.65917603 0.70454545]
|
|
|
|
mean value: 0.6943800160411975
|
|
|
|
MCC on Blind test: 0.34
|
|
|
|
Accuracy on Blind test: 0.78
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00835395 0.00815177 0.00788021 0.00780869 0.00764942 0.00775027
|
|
0.00766587 0.0077424 0.00780225 0.00769472]
|
|
|
|
mean value: 0.007849955558776855
|
|
|
|
key: score_time
|
|
value: [0.00892568 0.00875401 0.00813055 0.00789833 0.00855637 0.00791192
|
|
0.0080204 0.00795603 0.00823951 0.00798535]
|
|
|
|
mean value: 0.008237814903259278
|
|
|
|
key: test_mcc
|
|
value: [0.53452248 0.78410665 0.63745526 0.85449735 0.68504815 0.7112589
|
|
0.85695439 0.78353876 0.85695439 0.63841116]
|
|
|
|
mean value: 0.7342747491731905
|
|
|
|
key: train_mcc
|
|
value: [0.75012681 0.77383014 0.7860094 0.72613214 0.77778141 0.74958366
|
|
0.76193358 0.74958366 0.72166787 0.74199798]
|
|
|
|
mean value: 0.7538646637900721
|
|
|
|
key: test_accuracy
|
|
value: [0.76363636 0.89090909 0.81818182 0.92727273 0.83636364 0.85454545
|
|
0.92727273 0.89090909 0.92727273 0.81818182]
|
|
|
|
mean value: 0.8654545454545455
|
|
|
|
key: train_accuracy
|
|
value: [0.87474747 0.88686869 0.89292929 0.86262626 0.88888889 0.87474747
|
|
0.88080808 0.87474747 0.86060606 0.87070707]
|
|
|
|
mean value: 0.8767676767676768
|
|
|
|
key: test_fscore
|
|
value: [0.73469388 0.89285714 0.80769231 0.92592593 0.84745763 0.85185185
|
|
0.92592593 0.89655172 0.92592593 0.81481481]
|
|
|
|
mean value: 0.862369712380149
|
|
|
|
key: train_fscore
|
|
value: [0.87242798 0.888 0.89421158 0.85950413 0.88933602 0.87346939
|
|
0.88223553 0.87346939 0.85773196 0.8677686 ]
|
|
|
|
mean value: 0.8758154566969916
|
|
|
|
key: test_precision
|
|
value: [0.81818182 0.86206897 0.84 0.92592593 0.78125 0.88461538
|
|
0.96153846 0.86666667 0.96153846 0.84615385]
|
|
|
|
mean value: 0.8747939530137806
|
|
|
|
key: train_precision
|
|
value: [0.8907563 0.88095238 0.88537549 0.88135593 0.8875502 0.88065844
|
|
0.87007874 0.88065844 0.87394958 0.88607595]
|
|
|
|
mean value: 0.8817411452335624
|
|
|
|
key: test_recall
|
|
value: [0.66666667 0.92592593 0.77777778 0.92592593 0.92592593 0.82142857
|
|
0.89285714 0.92857143 0.89285714 0.78571429]
|
|
|
|
mean value: 0.8543650793650793
|
|
|
|
key: train_recall
|
|
value: [0.85483871 0.89516129 0.90322581 0.83870968 0.89112903 0.86639676
|
|
0.89473684 0.86639676 0.84210526 0.85020243]
|
|
|
|
mean value: 0.8702902572809195
|
|
|
|
key: test_roc_auc
|
|
value: [0.76190476 0.89153439 0.81746032 0.92724868 0.83796296 0.85515873
|
|
0.92791005 0.89021164 0.92791005 0.81878307]
|
|
|
|
mean value: 0.8656084656084656
|
|
|
|
key: train_roc_auc
|
|
value: [0.87478778 0.8868519 0.89290845 0.86267468 0.88888435 0.87473064
|
|
0.88083616 0.87473064 0.86056876 0.87066573]
|
|
|
|
mean value: 0.8767639088415828
|
|
|
|
key: test_jcc
|
|
value: [0.58064516 0.80645161 0.67741935 0.86206897 0.73529412 0.74193548
|
|
0.86206897 0.8125 0.86206897 0.6875 ]
|
|
|
|
mean value: 0.7627952627102008
|
|
|
|
key: train_jcc
|
|
value: [0.77372263 0.79856115 0.80866426 0.75362319 0.80072464 0.77536232
|
|
0.78928571 0.77536232 0.75090253 0.76642336]
|
|
|
|
mean value: 0.7792632101538037
|
|
|
|
MCC on Blind test: 0.29
|
|
|
|
Accuracy on Blind test: 0.72
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.00789285 0.00730848 0.00801301 0.00789571 0.0079782 0.0079236
|
|
0.00803757 0.00805783 0.00799203 0.00808692]
|
|
|
|
mean value: 0.007918620109558105
|
|
|
|
key: score_time
|
|
value: [0.01133084 0.0164237 0.01200867 0.01205802 0.01202822 0.01195431
|
|
0.01304603 0.01204181 0.01285148 0.0128901 ]
|
|
|
|
mean value: 0.012663316726684571
|
|
|
|
key: test_mcc
|
|
value: [0.63745526 0.63745526 0.56441351 0.85449735 0.61131498 0.81854376
|
|
0.81878307 0.86334835 0.89139151 0.63624339]
|
|
|
|
mean value: 0.7333446438524339
|
|
|
|
key: train_mcc
|
|
value: [0.80310724 0.76975822 0.81041362 0.76604064 0.82627008 0.79394672
|
|
0.7820578 0.77375802 0.77376541 0.78192653]
|
|
|
|
mean value: 0.7881044273086666
|
|
|
|
key: test_accuracy
|
|
value: [0.81818182 0.81818182 0.78181818 0.92727273 0.8 0.90909091
|
|
0.90909091 0.92727273 0.94545455 0.81818182]
|
|
|
|
mean value: 0.8654545454545455
|
|
|
|
key: train_accuracy
|
|
value: [0.9010101 0.88484848 0.90505051 0.88282828 0.91313131 0.8969697
|
|
0.89090909 0.88686869 0.88686869 0.89090909]
|
|
|
|
mean value: 0.8939393939393939
|
|
|
|
key: test_fscore
|
|
value: [0.80769231 0.80769231 0.76923077 0.92592593 0.81355932 0.9122807
|
|
0.90909091 0.93333333 0.94736842 0.82142857]
|
|
|
|
mean value: 0.864760256923504
|
|
|
|
key: train_fscore
|
|
value: [0.90373281 0.88438134 0.90656064 0.88492063 0.91313131 0.8969697
|
|
0.892 0.88617886 0.88709677 0.89156627]
|
|
|
|
mean value: 0.8946538330419604
|
|
|
|
key: test_precision
|
|
value: [0.84 0.84 0.8 0.92592593 0.75 0.89655172
|
|
0.92592593 0.875 0.93103448 0.82142857]
|
|
|
|
mean value: 0.8605866630176975
|
|
|
|
key: train_precision
|
|
value: [0.88122605 0.88979592 0.89411765 0.87109375 0.91497976 0.89516129
|
|
0.88142292 0.88979592 0.88353414 0.88446215]
|
|
|
|
mean value: 0.8885589547682757
|
|
|
|
key: test_recall
|
|
value: [0.77777778 0.77777778 0.74074074 0.92592593 0.88888889 0.92857143
|
|
0.89285714 1. 0.96428571 0.82142857]
|
|
|
|
mean value: 0.8718253968253968
|
|
|
|
key: train_recall
|
|
value: [0.92741935 0.87903226 0.91935484 0.89919355 0.91129032 0.89878543
|
|
0.90283401 0.88259109 0.89068826 0.89878543]
|
|
|
|
mean value: 0.9009974533106961
|
|
|
|
key: test_roc_auc
|
|
value: [0.81746032 0.81746032 0.78108466 0.92724868 0.8015873 0.90873016
|
|
0.90939153 0.92592593 0.94510582 0.81812169]
|
|
|
|
mean value: 0.8652116402116402
|
|
|
|
key: train_roc_auc
|
|
value: [0.90095664 0.88486026 0.90502155 0.88279515 0.91313504 0.89697336
|
|
0.89093313 0.88686006 0.88687639 0.89092497]
|
|
|
|
mean value: 0.893933655478647
|
|
|
|
key: test_jcc
|
|
value: [0.67741935 0.67741935 0.625 0.86206897 0.68571429 0.83870968
|
|
0.83333333 0.875 0.9 0.6969697 ]
|
|
|
|
mean value: 0.7671634668631332
|
|
|
|
key: train_jcc
|
|
value: [0.82437276 0.79272727 0.82909091 0.79359431 0.8401487 0.81318681
|
|
0.80505415 0.79562044 0.79710145 0.80434783]
|
|
|
|
mean value: 0.8095244624739278
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.72
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01772094 0.01750326 0.01661658 0.01747656 0.01727486 0.0177145
|
|
0.01769352 0.01763797 0.01765108 0.01759934]
|
|
|
|
mean value: 0.017488861083984376
|
|
|
|
key: score_time
|
|
value: [0.01001716 0.00997353 0.0098772 0.00995278 0.01002908 0.01001978
|
|
0.01004839 0.0100143 0.00999403 0.01007152]
|
|
|
|
mean value: 0.009999775886535644
|
|
|
|
key: test_mcc
|
|
value: [0.56841568 0.78410665 0.63745526 0.89153439 0.78410665 0.81854376
|
|
0.85695439 0.82269299 0.89139151 0.70899471]
|
|
|
|
mean value: 0.7764195999656158
|
|
|
|
key: train_mcc
|
|
value: [0.80232908 0.78999446 0.80646861 0.77792658 0.78999446 0.78602685
|
|
0.78224023 0.78184638 0.77794469 0.79000817]
|
|
|
|
mean value: 0.7884779517517934
|
|
|
|
key: test_accuracy
|
|
value: [0.78181818 0.89090909 0.81818182 0.94545455 0.89090909 0.90909091
|
|
0.92727273 0.90909091 0.94545455 0.85454545]
|
|
|
|
mean value: 0.8872727272727272
|
|
|
|
key: train_accuracy
|
|
value: [0.9010101 0.89494949 0.9030303 0.88888889 0.89494949 0.89292929
|
|
0.89090909 0.89090909 0.88888889 0.89494949]
|
|
|
|
mean value: 0.8941414141414141
|
|
|
|
key: test_fscore
|
|
value: [0.76 0.89285714 0.80769231 0.94545455 0.89285714 0.9122807
|
|
0.92592593 0.91525424 0.94736842 0.85714286]
|
|
|
|
mean value: 0.8856833282025075
|
|
|
|
key: train_fscore
|
|
value: [0.90258449 0.896 0.9047619 0.89021956 0.896 0.89378758
|
|
0.89243028 0.89112903 0.88977956 0.89558233]
|
|
|
|
mean value: 0.895227473341023
|
|
|
|
key: test_precision
|
|
value: [0.82608696 0.86206897 0.84 0.92857143 0.86206897 0.89655172
|
|
0.96153846 0.87096774 0.93103448 0.85714286]
|
|
|
|
mean value: 0.8836031583641004
|
|
|
|
key: train_precision
|
|
value: [0.89019608 0.88888889 0.890625 0.88142292 0.88888889 0.88492063
|
|
0.87843137 0.8875502 0.88095238 0.88844622]
|
|
|
|
mean value: 0.8860322585475027
|
|
|
|
key: test_recall
|
|
value: [0.7037037 0.92592593 0.77777778 0.96296296 0.92592593 0.92857143
|
|
0.89285714 0.96428571 0.96428571 0.85714286]
|
|
|
|
mean value: 0.8903439153439153
|
|
|
|
key: train_recall
|
|
value: [0.91532258 0.90322581 0.91935484 0.89919355 0.90322581 0.90283401
|
|
0.90688259 0.89473684 0.89878543 0.90283401]
|
|
|
|
mean value: 0.9046395455139088
|
|
|
|
key: test_roc_auc
|
|
value: [0.78042328 0.89153439 0.81746032 0.9457672 0.89153439 0.90873016
|
|
0.92791005 0.90806878 0.94510582 0.85449735]
|
|
|
|
mean value: 0.8871031746031747
|
|
|
|
key: train_roc_auc
|
|
value: [0.90098113 0.89493274 0.90299726 0.88886803 0.89493274 0.89294926
|
|
0.8909413 0.89091681 0.88890884 0.89496539]
|
|
|
|
mean value: 0.8941393496147316
|
|
|
|
key: test_jcc
|
|
value: [0.61290323 0.80645161 0.67741935 0.89655172 0.80645161 0.83870968
|
|
0.86206897 0.84375 0.9 0.75 ]
|
|
|
|
mean value: 0.799430617352614
|
|
|
|
key: train_jcc
|
|
value: [0.82246377 0.8115942 0.82608696 0.80215827 0.8115942 0.80797101
|
|
0.8057554 0.80363636 0.80144404 0.81090909]
|
|
|
|
mean value: 0.8103613311859039
|
|
|
|
MCC on Blind test: 0.22
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.48249555 1.45221186 1.51775455 1.50475144 1.40030074 1.4671185
|
|
1.77495503 1.3907702 1.52468777 1.48590899]
|
|
|
|
mean value: 1.5000954627990724
|
|
|
|
key: score_time
|
|
value: [0.01191258 0.01389217 0.0141511 0.01366138 0.01346612 0.01382709
|
|
0.01353741 0.01376367 0.01363826 0.01388812]
|
|
|
|
mean value: 0.013573789596557617
|
|
|
|
key: test_mcc
|
|
value: [0.82269299 0.89153439 0.67602163 0.78353876 0.74603175 0.89153439
|
|
0.89642146 0.82269299 0.86334835 0.7112589 ]
|
|
|
|
mean value: 0.8105075610922889
|
|
|
|
key: train_mcc
|
|
value: [0.96364438 0.95154681 0.95962779 0.96767664 0.96780409 0.96780199
|
|
0.96364378 0.97575748 0.96770771 0.96770771]
|
|
|
|
mean value: 0.9652918361453778
|
|
|
|
key: test_accuracy
|
|
value: [0.90909091 0.94545455 0.83636364 0.89090909 0.87272727 0.94545455
|
|
0.94545455 0.90909091 0.92727273 0.85454545]
|
|
|
|
mean value: 0.9036363636363636
|
|
|
|
key: train_accuracy
|
|
value: [0.98181818 0.97575758 0.97979798 0.98383838 0.98383838 0.98383838
|
|
0.98181818 0.98787879 0.98383838 0.98383838]
|
|
|
|
mean value: 0.9826262626262626
|
|
|
|
key: test_fscore
|
|
value: [0.90196078 0.94545455 0.82352941 0.88461538 0.87272727 0.94545455
|
|
0.94339623 0.91525424 0.93333333 0.85185185]
|
|
|
|
mean value: 0.9017577593218595
|
|
|
|
key: train_fscore
|
|
value: [0.98181818 0.9757085 0.97975709 0.98387097 0.98373984 0.98367347
|
|
0.98174442 0.98785425 0.98373984 0.98373984]
|
|
|
|
mean value: 0.9825646391106369
|
|
|
|
key: test_precision
|
|
value: [0.95833333 0.92857143 0.875 0.92 0.85714286 0.96296296
|
|
1. 0.87096774 0.875 0.88461538]
|
|
|
|
mean value: 0.9132593708561451
|
|
|
|
key: train_precision
|
|
value: [0.98380567 0.9796748 0.98373984 0.98387097 0.99180328 0.99176955
|
|
0.98373984 0.98785425 0.9877551 0.9877551 ]
|
|
|
|
mean value: 0.9861768388410251
|
|
|
|
key: test_recall
|
|
value: [0.85185185 0.96296296 0.77777778 0.85185185 0.88888889 0.92857143
|
|
0.89285714 0.96428571 1. 0.82142857]
|
|
|
|
mean value: 0.8940476190476191
|
|
|
|
key: train_recall
|
|
value: [0.97983871 0.97177419 0.97580645 0.98387097 0.97580645 0.9757085
|
|
0.97975709 0.98785425 0.97975709 0.97975709]
|
|
|
|
mean value: 0.9789930782290714
|
|
|
|
key: test_roc_auc
|
|
value: [0.90806878 0.9457672 0.83531746 0.89021164 0.87301587 0.9457672
|
|
0.94642857 0.90806878 0.92592593 0.85515873]
|
|
|
|
mean value: 0.9033730158730159
|
|
|
|
key: train_roc_auc
|
|
value: [0.98182219 0.97576564 0.97980606 0.98383832 0.98385464 0.98382199
|
|
0.98181403 0.98787874 0.98383016 0.98383016]
|
|
|
|
mean value: 0.9826261917199948
|
|
|
|
key: test_jcc
|
|
value: [0.82142857 0.89655172 0.7 0.79310345 0.77419355 0.89655172
|
|
0.89285714 0.84375 0.875 0.74193548]
|
|
|
|
mean value: 0.8235371643095503
|
|
|
|
key: train_jcc
|
|
value: [0.96428571 0.95256917 0.96031746 0.96825397 0.968 0.96787149
|
|
0.96414343 0.976 0.968 0.968 ]
|
|
|
|
mean value: 0.9657441225056214
|
|
|
|
MCC on Blind test: 0.26
|
|
|
|
Accuracy on Blind test: 0.65
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01426578 0.01250982 0.01104569 0.00998855 0.01048017 0.01040983
|
|
0.01004791 0.00998306 0.00995708 0.01082873]
|
|
|
|
mean value: 0.010951662063598632
|
|
|
|
key: score_time
|
|
value: [0.01068878 0.00831199 0.00807023 0.00795078 0.007833 0.00783944
|
|
0.00795031 0.00782537 0.00784492 0.00789261]
|
|
|
|
mean value: 0.008220744132995606
|
|
|
|
key: test_mcc
|
|
value: [0.86334835 0.89153439 0.85449735 0.74569602 0.71735629 0.92724868
|
|
0.92724868 0.82269299 0.8565805 0.89153439]
|
|
|
|
mean value: 0.8497737644332478
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.92727273 0.94545455 0.92727273 0.87272727 0.85454545 0.96363636
|
|
0.96363636 0.90909091 0.92727273 0.94545455]
|
|
|
|
mean value: 0.9236363636363636
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.92 0.94545455 0.92592593 0.86792453 0.86206897 0.96428571
|
|
0.96428571 0.91525424 0.93103448 0.94545455]
|
|
|
|
mean value: 0.924168865927233
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.92857143 0.92592593 0.88461538 0.80645161 0.96428571
|
|
0.96428571 0.87096774 0.9 0.96296296]
|
|
|
|
mean value: 0.9208066485485841
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.85185185 0.96296296 0.92592593 0.85185185 0.92592593 0.96428571
|
|
0.96428571 0.96428571 0.96428571 0.92857143]
|
|
|
|
mean value: 0.9304232804232804
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.92592593 0.9457672 0.92724868 0.8723545 0.85582011 0.96362434
|
|
0.96362434 0.90806878 0.9265873 0.9457672 ]
|
|
|
|
mean value: 0.9234788359788361
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.85185185 0.89655172 0.86206897 0.76666667 0.75757576 0.93103448
|
|
0.93103448 0.84375 0.87096774 0.89655172]
|
|
|
|
mean value: 0.8608053397340105
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.11
|
|
|
|
Accuracy on Blind test: 0.36
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.10040808 0.10288572 0.1029706 0.10149503 0.09969401 0.10028243
|
|
0.10137939 0.10140562 0.10217285 0.10103154]
|
|
|
|
mean value: 0.10137252807617188
|
|
|
|
key: score_time
|
|
value: [0.01739192 0.01827312 0.01805353 0.01727128 0.01716781 0.01719832
|
|
0.01750255 0.01722026 0.01744318 0.01742435]
|
|
|
|
mean value: 0.017494630813598634
|
|
|
|
key: test_mcc
|
|
value: [0.78961518 0.82337971 0.71049701 0.92962225 0.72754449 0.8565805
|
|
0.96428571 0.86334835 0.89602867 0.78174603]
|
|
|
|
mean value: 0.8342647911704605
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.89090909 0.90909091 0.85454545 0.96363636 0.85454545 0.92727273
|
|
0.98181818 0.92727273 0.94545455 0.89090909]
|
|
|
|
mean value: 0.9145454545454546
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.88 0.9122807 0.84615385 0.96153846 0.86666667 0.93103448
|
|
0.98181818 0.93333333 0.94915254 0.89285714]
|
|
|
|
mean value: 0.915483535925352
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.95652174 0.86666667 0.88 1. 0.78787879 0.9
|
|
1. 0.875 0.90322581 0.89285714]
|
|
|
|
mean value: 0.9062150142984645
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.81481481 0.96296296 0.81481481 0.92592593 0.96296296 0.96428571
|
|
0.96428571 1. 1. 0.89285714]
|
|
|
|
mean value: 0.9302910052910053
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.88955026 0.91005291 0.85383598 0.96296296 0.85648148 0.9265873
|
|
0.98214286 0.92592593 0.94444444 0.89087302]
|
|
|
|
mean value: 0.9142857142857143
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.78571429 0.83870968 0.73333333 0.92592593 0.76470588 0.87096774
|
|
0.96428571 0.875 0.90322581 0.80645161]
|
|
|
|
mean value: 0.8468319980321878
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.32
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00785303 0.00793576 0.00822353 0.00800967 0.00797462 0.00781012
|
|
0.00777817 0.00785923 0.00833726 0.0080986 ]
|
|
|
|
mean value: 0.00798799991607666
|
|
|
|
key: score_time
|
|
value: [0.00809956 0.00805473 0.00806642 0.00806236 0.00844526 0.00803256
|
|
0.00801802 0.00814533 0.0084455 0.00861764]
|
|
|
|
mean value: 0.008198738098144531
|
|
|
|
key: test_mcc
|
|
value: [0.67602163 0.86402765 0.49468252 0.67284827 0.79069197 0.89139151
|
|
0.92724868 0.81854376 0.81854376 0.34721618]
|
|
|
|
mean value: 0.7301215940014808
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.83636364 0.92727273 0.74545455 0.83636364 0.89090909 0.94545455
|
|
0.96363636 0.90909091 0.90909091 0.67272727]
|
|
|
|
mean value: 0.8636363636363636
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.82352941 0.93103448 0.72 0.83018868 0.89655172 0.94736842
|
|
0.96428571 0.9122807 0.9122807 0.7 ]
|
|
|
|
mean value: 0.8637519836753659
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.875 0.87096774 0.7826087 0.84615385 0.83870968 0.93103448
|
|
0.96428571 0.89655172 0.89655172 0.65625 ]
|
|
|
|
mean value: 0.8558113606481056
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.77777778 1. 0.66666667 0.81481481 0.96296296 0.96428571
|
|
0.96428571 0.92857143 0.92857143 0.75 ]
|
|
|
|
mean value: 0.8757936507936508
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.83531746 0.92857143 0.74404762 0.83597884 0.89219577 0.94510582
|
|
0.96362434 0.90873016 0.90873016 0.6712963 ]
|
|
|
|
mean value: 0.8633597883597883
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.7 0.87096774 0.5625 0.70967742 0.8125 0.9
|
|
0.93103448 0.83870968 0.83870968 0.53846154]
|
|
|
|
mean value: 0.7702560537349191
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.19
|
|
|
|
Accuracy on Blind test: 0.65
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.31766438 1.37345815 1.27461958 1.28742099 1.28492332 1.28699183
|
|
1.29451942 1.28101182 1.29123402 1.28577328]
|
|
|
|
mean value: 1.2977616786956787
|
|
|
|
key: score_time
|
|
value: [0.09975529 0.15991688 0.09050679 0.09112549 0.0906496 0.09106612
|
|
0.09059644 0.09088278 0.09059429 0.09115982]
|
|
|
|
mean value: 0.09862534999847412
|
|
|
|
key: test_mcc
|
|
value: [0.89602867 0.92980214 0.82269299 0.89139151 0.92980214 0.96423926
|
|
0.96428571 0.89602867 0.96423926 0.92724868]
|
|
|
|
mean value: 0.9185759034850024
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.94545455 0.96363636 0.90909091 0.94545455 0.96363636 0.98181818
|
|
0.98181818 0.94545455 0.98181818 0.96363636]
|
|
|
|
mean value: 0.9581818181818181
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94117647 0.96428571 0.90196078 0.94339623 0.96428571 0.98245614
|
|
0.98181818 0.94915254 0.98245614 0.96428571]
|
|
|
|
mean value: 0.9575273629067016
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.93103448 0.95833333 0.96153846 0.93103448 0.96551724
|
|
1. 0.90322581 0.96551724 0.96428571]
|
|
|
|
mean value: 0.9580486763884984
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.88888889 1. 0.85185185 0.92592593 1. 1.
|
|
0.96428571 1. 1. 0.96428571]
|
|
|
|
mean value: 0.9595238095238096
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.94444444 0.96428571 0.90806878 0.94510582 0.96428571 0.98148148
|
|
0.98214286 0.94444444 0.98148148 0.96362434]
|
|
|
|
mean value: 0.957936507936508
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.88888889 0.93103448 0.82142857 0.89285714 0.93103448 0.96551724
|
|
0.96428571 0.90322581 0.96551724 0.93103448]
|
|
|
|
mean value: 0.9194824054946413
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.2
|
|
|
|
Accuracy on Blind test: 0.51
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.91016841 0.90267611 0.96385503 0.89476395 0.87512565 0.92300725
|
|
0.92452002 0.88380647 0.98288059 0.92339325]
|
|
|
|
mean value: 0.9184196710586547
|
|
|
|
key: score_time
|
|
value: [0.24338746 0.25723505 0.2364409 0.19419646 0.26264167 0.20185113
|
|
0.21379876 0.27131295 0.2565093 0.25926137]
|
|
|
|
mean value: 0.2396635055541992
|
|
|
|
key: test_mcc
|
|
value: [0.89602867 0.92980214 0.82269299 0.89139151 0.92980214 0.92724868
|
|
0.96428571 0.89602867 0.96423926 0.92724868]
|
|
|
|
mean value: 0.914876845571549
|
|
|
|
key: train_mcc
|
|
value: [0.94766581 0.95154523 0.95574863 0.94766581 0.95163767 0.94754543
|
|
0.94371421 0.95556354 0.9395879 0.94767006]
|
|
|
|
mean value: 0.9488344285838511
|
|
|
|
key: test_accuracy
|
|
value: [0.94545455 0.96363636 0.90909091 0.94545455 0.96363636 0.96363636
|
|
0.98181818 0.94545455 0.98181818 0.96363636]
|
|
|
|
mean value: 0.9563636363636363
|
|
|
|
key: train_accuracy
|
|
value: [0.97373737 0.97575758 0.97777778 0.97373737 0.97575758 0.97373737
|
|
0.97171717 0.97777778 0.96969697 0.97373737]
|
|
|
|
mean value: 0.9743434343434344
|
|
|
|
key: test_fscore
|
|
value: [0.94117647 0.96428571 0.90196078 0.94339623 0.96428571 0.96428571
|
|
0.98181818 0.94915254 0.98245614 0.96428571]
|
|
|
|
mean value: 0.9557103203001853
|
|
|
|
key: train_fscore
|
|
value: [0.9740519 0.97590361 0.97804391 0.9740519 0.976 0.97384306
|
|
0.972 0.97777778 0.96993988 0.9739479 ]
|
|
|
|
mean value: 0.974555993072763
|
|
|
|
key: test_precision
|
|
value: [1. 0.93103448 0.95833333 0.96153846 0.93103448 0.96428571
|
|
1. 0.90322581 0.96551724 0.96428571]
|
|
|
|
mean value: 0.9579255236791389
|
|
|
|
key: train_precision
|
|
value: [0.96442688 0.972 0.96837945 0.96442688 0.96825397 0.968
|
|
0.96047431 0.97580645 0.96031746 0.96428571]
|
|
|
|
mean value: 0.9666371104351469
|
|
|
|
key: test_recall
|
|
value: [0.88888889 1. 0.85185185 0.92592593 1. 0.96428571
|
|
0.96428571 1. 1. 0.96428571]
|
|
|
|
mean value: 0.955952380952381
|
|
|
|
key: train_recall
|
|
value: [0.98387097 0.97983871 0.98790323 0.98387097 0.98387097 0.97975709
|
|
0.98380567 0.97975709 0.97975709 0.98380567]
|
|
|
|
mean value: 0.9826237429802795
|
|
|
|
key: test_roc_auc
|
|
value: [0.94444444 0.96428571 0.90806878 0.94510582 0.96428571 0.96362434
|
|
0.98214286 0.94444444 0.98148148 0.96362434]
|
|
|
|
mean value: 0.9561507936507937
|
|
|
|
key: train_roc_auc
|
|
value: [0.97371686 0.97574931 0.97775728 0.97371686 0.97574115 0.97374951
|
|
0.97174154 0.97778177 0.96971725 0.97375767]
|
|
|
|
mean value: 0.9743429215097297
|
|
|
|
key: test_jcc
|
|
value: [0.88888889 0.93103448 0.82142857 0.89285714 0.93103448 0.93103448
|
|
0.96428571 0.90322581 0.96551724 0.93103448]
|
|
|
|
mean value: 0.9160341296325724
|
|
|
|
key: train_jcc
|
|
value: [0.94941634 0.95294118 0.95703125 0.94941634 0.953125 0.94901961
|
|
0.94552529 0.95652174 0.94163424 0.94921875]
|
|
|
|
mean value: 0.9503849741342992
|
|
|
|
MCC on Blind test: 0.2
|
|
|
|
Accuracy on Blind test: 0.52
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01822448 0.00774431 0.00775194 0.00768995 0.0077107 0.00761104
|
|
0.00783753 0.00774765 0.00785279 0.00792027]
|
|
|
|
mean value: 0.008809065818786621
|
|
|
|
key: score_time
|
|
value: [0.01004004 0.00803638 0.00816011 0.00804186 0.00801206 0.00807762
|
|
0.00851011 0.00811744 0.0079689 0.00807619]
|
|
|
|
mean value: 0.008304071426391602
|
|
|
|
key: test_mcc
|
|
value: [0.53452248 0.78410665 0.63745526 0.85449735 0.68504815 0.7112589
|
|
0.85695439 0.78353876 0.85695439 0.63841116]
|
|
|
|
mean value: 0.7342747491731905
|
|
|
|
key: train_mcc
|
|
value: [0.75012681 0.77383014 0.7860094 0.72613214 0.77778141 0.74958366
|
|
0.76193358 0.74958366 0.72166787 0.74199798]
|
|
|
|
mean value: 0.7538646637900721
|
|
|
|
key: test_accuracy
|
|
value: [0.76363636 0.89090909 0.81818182 0.92727273 0.83636364 0.85454545
|
|
0.92727273 0.89090909 0.92727273 0.81818182]
|
|
|
|
mean value: 0.8654545454545455
|
|
|
|
key: train_accuracy
|
|
value: [0.87474747 0.88686869 0.89292929 0.86262626 0.88888889 0.87474747
|
|
0.88080808 0.87474747 0.86060606 0.87070707]
|
|
|
|
mean value: 0.8767676767676768
|
|
|
|
key: test_fscore
|
|
value: [0.73469388 0.89285714 0.80769231 0.92592593 0.84745763 0.85185185
|
|
0.92592593 0.89655172 0.92592593 0.81481481]
|
|
|
|
mean value: 0.862369712380149
|
|
|
|
key: train_fscore
|
|
value: [0.87242798 0.888 0.89421158 0.85950413 0.88933602 0.87346939
|
|
0.88223553 0.87346939 0.85773196 0.8677686 ]
|
|
|
|
mean value: 0.8758154566969916
|
|
|
|
key: test_precision
|
|
value: [0.81818182 0.86206897 0.84 0.92592593 0.78125 0.88461538
|
|
0.96153846 0.86666667 0.96153846 0.84615385]
|
|
|
|
mean value: 0.8747939530137806
|
|
|
|
key: train_precision
|
|
value: [0.8907563 0.88095238 0.88537549 0.88135593 0.8875502 0.88065844
|
|
0.87007874 0.88065844 0.87394958 0.88607595]
|
|
|
|
mean value: 0.8817411452335624
|
|
|
|
key: test_recall
|
|
value: [0.66666667 0.92592593 0.77777778 0.92592593 0.92592593 0.82142857
|
|
0.89285714 0.92857143 0.89285714 0.78571429]
|
|
|
|
mean value: 0.8543650793650793
|
|
|
|
key: train_recall
|
|
value: [0.85483871 0.89516129 0.90322581 0.83870968 0.89112903 0.86639676
|
|
0.89473684 0.86639676 0.84210526 0.85020243]
|
|
|
|
mean value: 0.8702902572809195
|
|
|
|
key: test_roc_auc
|
|
value: [0.76190476 0.89153439 0.81746032 0.92724868 0.83796296 0.85515873
|
|
0.92791005 0.89021164 0.92791005 0.81878307]
|
|
|
|
mean value: 0.8656084656084656
|
|
|
|
key: train_roc_auc
|
|
value: [0.87478778 0.8868519 0.89290845 0.86267468 0.88888435 0.87473064
|
|
0.88083616 0.87473064 0.86056876 0.87066573]
|
|
|
|
mean value: 0.8767639088415828
|
|
|
|
key: test_jcc
|
|
value: [0.58064516 0.80645161 0.67741935 0.86206897 0.73529412 0.74193548
|
|
0.86206897 0.8125 0.86206897 0.6875 ]
|
|
|
|
mean value: 0.7627952627102008
|
|
|
|
key: train_jcc
|
|
value: [0.77372263 0.79856115 0.80866426 0.75362319 0.80072464 0.77536232
|
|
0.78928571 0.77536232 0.75090253 0.76642336]
|
|
|
|
mean value: 0.7792632101538037
|
|
|
|
MCC on Blind test: 0.29
|
|
|
|
Accuracy on Blind test: 0.72
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.24260426 0.04918981 0.05251241 0.05346513 0.05343127 0.05336189
|
|
0.05402541 0.05282021 0.05996943 0.0555234 ]
|
|
|
|
mean value: 0.07269032001495361
|
|
|
|
key: score_time
|
|
value: [0.01044059 0.0108521 0.01054335 0.01029539 0.0097692 0.01019621
|
|
0.01065063 0.01008081 0.00973535 0.01028442]
|
|
|
|
mean value: 0.010284805297851562
|
|
|
|
key: test_mcc
|
|
value: [0.89602867 0.96428571 0.89139151 0.89139151 0.92724868 0.96423926
|
|
0.96423926 0.89602867 0.92962225 0.92724868]
|
|
|
|
mean value: 0.9251724200737174
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.94545455 0.98181818 0.94545455 0.94545455 0.96363636 0.98181818
|
|
0.98181818 0.94545455 0.96363636 0.96363636]
|
|
|
|
mean value: 0.9618181818181818
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94117647 0.98181818 0.94339623 0.94339623 0.96296296 0.98245614
|
|
0.98245614 0.94915254 0.96551724 0.96428571]
|
|
|
|
mean value: 0.9616617846939229
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.96428571 0.96153846 0.96153846 0.96296296 0.96551724
|
|
0.96551724 0.90322581 0.93333333 0.96428571]
|
|
|
|
mean value: 0.9582204937154881
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.88888889 1. 0.92592593 0.92592593 0.96296296 1.
|
|
1. 1. 1. 0.96428571]
|
|
|
|
mean value: 0.9667989417989418
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.94444444 0.98214286 0.94510582 0.94510582 0.96362434 0.98148148
|
|
0.98148148 0.94444444 0.96296296 0.96362434]
|
|
|
|
mean value: 0.9614417989417989
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.88888889 0.96428571 0.89285714 0.89285714 0.92857143 0.96551724
|
|
0.96551724 0.90322581 0.93333333 0.93103448]
|
|
|
|
mean value: 0.9266088422762505
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.07
|
|
|
|
Accuracy on Blind test: 0.38
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.01531887 0.04107666 0.04202557 0.04105949 0.04171562 0.01796556
|
|
0.01756859 0.0427742 0.04265285 0.01785755]
|
|
|
|
mean value: 0.032001495361328125
|
|
|
|
key: score_time
|
|
value: [0.01036215 0.02154422 0.01961827 0.02058625 0.02116394 0.01096559
|
|
0.01085567 0.02103901 0.02116776 0.01119232]
|
|
|
|
mean value: 0.016849517822265625
|
|
|
|
key: test_mcc
|
|
value: [0.67284827 0.78410665 0.64214885 0.89153439 0.78410665 0.89139151
|
|
0.89153439 0.8565805 0.92724868 0.78353876]
|
|
|
|
mean value: 0.8125038646516862
|
|
|
|
key: train_mcc
|
|
value: [0.8435716 0.85067196 0.87981045 0.85892085 0.85478898 0.86365469
|
|
0.85916382 0.86702055 0.85107823 0.84299263]
|
|
|
|
mean value: 0.8571673756575152
|
|
|
|
key: test_accuracy
|
|
value: [0.83636364 0.89090909 0.81818182 0.94545455 0.89090909 0.94545455
|
|
0.94545455 0.92727273 0.96363636 0.89090909]
|
|
|
|
mean value: 0.9054545454545454
|
|
|
|
key: train_accuracy
|
|
value: [0.92121212 0.92525253 0.93939394 0.92929293 0.92727273 0.93131313
|
|
0.92929293 0.93333333 0.92525253 0.92121212]
|
|
|
|
mean value: 0.9282828282828283
|
|
|
|
key: test_fscore
|
|
value: [0.83018868 0.89285714 0.8 0.94545455 0.89285714 0.94736842
|
|
0.94545455 0.93103448 0.96428571 0.89655172]
|
|
|
|
mean value: 0.9046052398103557
|
|
|
|
key: train_fscore
|
|
value: [0.92337917 0.9261477 0.94094488 0.9304175 0.92828685 0.93280632
|
|
0.9304175 0.93413174 0.92644135 0.92246521]
|
|
|
|
mean value: 0.9295438225256318
|
|
|
|
key: test_precision
|
|
value: [0.84615385 0.86206897 0.86956522 0.92857143 0.86206897 0.93103448
|
|
0.96296296 0.9 0.96428571 0.86666667]
|
|
|
|
mean value: 0.8993378249825026
|
|
|
|
key: train_precision
|
|
value: [0.90038314 0.91699605 0.91923077 0.91764706 0.91732283 0.91119691
|
|
0.9140625 0.92125984 0.91015625 0.90625 ]
|
|
|
|
mean value: 0.9134505355609847
|
|
|
|
key: test_recall
|
|
value: [0.81481481 0.92592593 0.74074074 0.96296296 0.92592593 0.96428571
|
|
0.92857143 0.96428571 0.96428571 0.92857143]
|
|
|
|
mean value: 0.912037037037037
|
|
|
|
key: train_recall
|
|
value: [0.94758065 0.93548387 0.96370968 0.94354839 0.93951613 0.95546559
|
|
0.94736842 0.94736842 0.94331984 0.93927126]
|
|
|
|
mean value: 0.9462632231944625
|
|
|
|
key: test_roc_auc
|
|
value: [0.83597884 0.89153439 0.81679894 0.9457672 0.89153439 0.94510582
|
|
0.9457672 0.9265873 0.96362434 0.89021164]
|
|
|
|
mean value: 0.9052910052910054
|
|
|
|
key: train_roc_auc
|
|
value: [0.92115874 0.92523181 0.93934472 0.92926407 0.92724794 0.93136183
|
|
0.92932937 0.93336163 0.92528895 0.92124853]
|
|
|
|
mean value: 0.9282837599582081
|
|
|
|
key: test_jcc
|
|
value: [0.70967742 0.80645161 0.66666667 0.89655172 0.80645161 0.9
|
|
0.89655172 0.87096774 0.93103448 0.8125 ]
|
|
|
|
mean value: 0.8296852984797923
|
|
|
|
key: train_jcc
|
|
value: [0.85766423 0.86245353 0.88847584 0.86988848 0.866171 0.87407407
|
|
0.86988848 0.87640449 0.86296296 0.85608856]
|
|
|
|
mean value: 0.8684071649301385
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.7
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01084638 0.00925279 0.00854015 0.0082531 0.0082016 0.00828433
|
|
0.00811386 0.0079782 0.00824904 0.00820422]
|
|
|
|
mean value: 0.008592367172241211
|
|
|
|
key: score_time
|
|
value: [0.01097393 0.00891614 0.00880241 0.00852537 0.00860667 0.0081892
|
|
0.00847697 0.00831318 0.00843 0.00821924]
|
|
|
|
mean value: 0.008745312690734863
|
|
|
|
key: test_mcc
|
|
value: [0.60876172 0.78410665 0.63745526 0.89153439 0.71735629 0.81854376
|
|
0.85695439 0.82269299 0.89139151 0.70899471]
|
|
|
|
mean value: 0.7737791675722022
|
|
|
|
key: train_mcc
|
|
value: [0.79012008 0.77375802 0.79409222 0.76162335 0.77778141 0.77376541
|
|
0.76589215 0.76970043 0.76565561 0.78592069]
|
|
|
|
mean value: 0.7758309364223852
|
|
|
|
key: test_accuracy
|
|
value: [0.8 0.89090909 0.81818182 0.94545455 0.85454545 0.90909091
|
|
0.92727273 0.90909091 0.94545455 0.85454545]
|
|
|
|
mean value: 0.8854545454545455
|
|
|
|
key: train_accuracy
|
|
value: [0.89494949 0.88686869 0.8969697 0.88080808 0.88888889 0.88686869
|
|
0.88282828 0.88484848 0.88282828 0.89292929]
|
|
|
|
mean value: 0.8878787878787879
|
|
|
|
key: test_fscore
|
|
value: [0.7755102 0.89285714 0.80769231 0.94545455 0.86206897 0.9122807
|
|
0.92592593 0.91525424 0.94736842 0.85714286]
|
|
|
|
mean value: 0.8841555308766806
|
|
|
|
key: train_fscore
|
|
value: [0.89641434 0.8875502 0.89820359 0.88080808 0.88933602 0.88709677
|
|
0.884 0.88438134 0.88259109 0.89336016]
|
|
|
|
mean value: 0.8883741600170872
|
|
|
|
key: test_precision
|
|
value: [0.86363636 0.86206897 0.84 0.92857143 0.80645161 0.89655172
|
|
0.96153846 0.87096774 0.93103448 0.85714286]
|
|
|
|
mean value: 0.8817963638141614
|
|
|
|
key: train_precision
|
|
value: [0.88582677 0.884 0.88932806 0.88259109 0.8875502 0.88353414
|
|
0.87351779 0.88617886 0.88259109 0.888 ]
|
|
|
|
mean value: 0.8843118006828748
|
|
|
|
key: test_recall
|
|
value: [0.7037037 0.92592593 0.77777778 0.96296296 0.92592593 0.92857143
|
|
0.89285714 0.96428571 0.96428571 0.85714286]
|
|
|
|
mean value: 0.8903439153439153
|
|
|
|
key: train_recall
|
|
value: [0.90725806 0.89112903 0.90725806 0.87903226 0.89112903 0.89068826
|
|
0.89473684 0.88259109 0.88259109 0.89878543]
|
|
|
|
mean value: 0.8925199164163511
|
|
|
|
key: test_roc_auc
|
|
value: [0.79828042 0.89153439 0.81746032 0.9457672 0.85582011 0.90873016
|
|
0.92791005 0.90806878 0.94510582 0.85449735]
|
|
|
|
mean value: 0.8853174603174603
|
|
|
|
key: train_roc_auc
|
|
value: [0.89492458 0.88686006 0.89694887 0.88081168 0.88888435 0.88687639
|
|
0.88285229 0.88484393 0.8828278 0.8929411 ]
|
|
|
|
mean value: 0.8878771059161551
|
|
|
|
key: test_jcc
|
|
value: [0.63333333 0.80645161 0.67741935 0.89655172 0.75757576 0.83870968
|
|
0.86206897 0.84375 0.9 0.75 ]
|
|
|
|
mean value: 0.7965860425725554
|
|
|
|
key: train_jcc
|
|
value: [0.81227437 0.79783394 0.81521739 0.78700361 0.80072464 0.79710145
|
|
0.7921147 0.79272727 0.78985507 0.80727273]
|
|
|
|
mean value: 0.7992125159422541
|
|
|
|
MCC on Blind test: 0.29
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0109973 0.01174498 0.01200843 0.01177812 0.01327801 0.01348186
|
|
0.01264 0.01432228 0.02716994 0.01318049]
|
|
|
|
mean value: 0.014060139656066895
|
|
|
|
key: score_time
|
|
value: [0.00866151 0.01013803 0.01016569 0.01051307 0.01054215 0.01067781
|
|
0.01127958 0.01156712 0.02180862 0.01062226]
|
|
|
|
mean value: 0.011597585678100587
|
|
|
|
key: test_mcc
|
|
value: [0.75724019 0.78353876 0.67602163 0.75878131 0.71588202 0.89153439
|
|
0.96428571 0.89602867 0.92724868 0.92980214]
|
|
|
|
mean value: 0.830036349769513
|
|
|
|
key: train_mcc
|
|
value: [0.90767739 0.75673387 0.89988762 0.82550688 0.81837405 0.89599275
|
|
0.84921709 0.857966 0.89212884 0.86668482]
|
|
|
|
mean value: 0.857016930626917
|
|
|
|
key: test_accuracy
|
|
value: [0.87272727 0.89090909 0.83636364 0.87272727 0.85454545 0.94545455
|
|
0.98181818 0.94545455 0.96363636 0.96363636]
|
|
|
|
mean value: 0.9127272727272727
|
|
|
|
key: train_accuracy
|
|
value: [0.95353535 0.87474747 0.94949495 0.90909091 0.90505051 0.94747475
|
|
0.92323232 0.92727273 0.94545455 0.93131313]
|
|
|
|
mean value: 0.9266666666666666
|
|
|
|
key: test_fscore
|
|
value: [0.85714286 0.88461538 0.82352941 0.88135593 0.84 0.94545455
|
|
0.98181818 0.94915254 0.96428571 0.96296296]
|
|
|
|
mean value: 0.9090317532620623
|
|
|
|
key: train_fscore
|
|
value: [0.95277207 0.86580087 0.94845361 0.91493384 0.89804772 0.94605809
|
|
0.91983122 0.93023256 0.94386694 0.92765957]
|
|
|
|
mean value: 0.9247656499131667
|
|
|
|
key: test_precision
|
|
value: [0.95454545 0.92 0.875 0.8125 0.91304348 0.96296296
|
|
1. 0.90322581 0.96428571 1. ]
|
|
|
|
mean value: 0.9305563416506615
|
|
|
|
key: train_precision
|
|
value: [0.9707113 0.93457944 0.97046414 0.86120996 0.97183099 0.97021277
|
|
0.96035242 0.89219331 0.97008547 0.97757848]
|
|
|
|
mean value: 0.9479218264509782
|
|
|
|
key: test_recall
|
|
value: [0.77777778 0.85185185 0.77777778 0.96296296 0.77777778 0.92857143
|
|
0.96428571 1. 0.96428571 0.92857143]
|
|
|
|
mean value: 0.8933862433862434
|
|
|
|
key: train_recall
|
|
value: [0.93548387 0.80645161 0.92741935 0.97580645 0.83467742 0.92307692
|
|
0.88259109 0.97165992 0.91902834 0.88259109]
|
|
|
|
mean value: 0.9058786078098472
|
|
|
|
key: test_roc_auc
|
|
value: [0.87103175 0.89021164 0.83531746 0.87433862 0.8531746 0.9457672
|
|
0.98214286 0.94444444 0.96362434 0.96428571]
|
|
|
|
mean value: 0.9124338624338624
|
|
|
|
key: train_roc_auc
|
|
value: [0.95357189 0.87488573 0.94953964 0.90895586 0.90519296 0.94742556
|
|
0.92315039 0.92736222 0.94540127 0.9312149 ]
|
|
|
|
mean value: 0.92667004048583
|
|
|
|
key: test_jcc
|
|
value: [0.75 0.79310345 0.7 0.78787879 0.72413793 0.89655172
|
|
0.96428571 0.90322581 0.93103448 0.92857143]
|
|
|
|
mean value: 0.837878932339444
|
|
|
|
key: train_jcc
|
|
value: [0.90980392 0.76335878 0.90196078 0.84320557 0.81496063 0.8976378
|
|
0.8515625 0.86956522 0.89370079 0.86507937]
|
|
|
|
mean value: 0.8610835354490294
|
|
|
|
MCC on Blind test: 0.23
|
|
|
|
Accuracy on Blind test: 0.64
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0145483 0.01222491 0.01315165 0.01257706 0.01379681 0.01266575
|
|
0.0136075 0.01250958 0.014539 0.01288724]
|
|
|
|
mean value: 0.013250780105590821
|
|
|
|
key: score_time
|
|
value: [0.01066113 0.01051998 0.01067948 0.01066899 0.01064396 0.01054311
|
|
0.01067638 0.01064086 0.01070881 0.01063132]
|
|
|
|
mean value: 0.010637402534484863
|
|
|
|
key: test_mcc
|
|
value: [0.81854376 0.82269299 0.71049701 0.83147942 0.74935731 0.83147942
|
|
0.8565805 0.80032673 0.85695439 0.83251448]
|
|
|
|
mean value: 0.8110426022713871
|
|
|
|
key: train_mcc
|
|
value: [0.89986978 0.78532023 0.91930903 0.90350829 0.79835384 0.80158821
|
|
0.82922447 0.76325368 0.8535924 0.78649322]
|
|
|
|
mean value: 0.8340513160102464
|
|
|
|
key: test_accuracy
|
|
value: [0.90909091 0.90909091 0.85454545 0.90909091 0.87272727 0.90909091
|
|
0.92727273 0.89090909 0.92727273 0.90909091]
|
|
|
|
mean value: 0.9018181818181817
|
|
|
|
key: train_accuracy
|
|
value: [0.94949495 0.88484848 0.95959596 0.95151515 0.89090909 0.89494949
|
|
0.91111111 0.87272727 0.92323232 0.88484848]
|
|
|
|
mean value: 0.9123232323232323
|
|
|
|
key: test_fscore
|
|
value: [0.90566038 0.90196078 0.84615385 0.89795918 0.8627451 0.91803279
|
|
0.93103448 0.90322581 0.92592593 0.90196078]
|
|
|
|
mean value: 0.8994659075873878
|
|
|
|
key: train_fscore
|
|
value: [0.95069034 0.87248322 0.96 0.95081967 0.87892377 0.90298507
|
|
0.91634981 0.88482633 0.91774892 0.87133183]
|
|
|
|
mean value: 0.9106158951845009
|
|
|
|
key: test_precision
|
|
value: [0.92307692 0.95833333 0.88 1. 0.91666667 0.84848485
|
|
0.9 0.82352941 0.96153846 1. ]
|
|
|
|
mean value: 0.9211629644864939
|
|
|
|
key: train_precision
|
|
value: [0.93050193 0.9798995 0.95238095 0.96666667 0.98989899 0.83737024
|
|
0.86379928 0.80666667 0.98604651 0.98469388]
|
|
|
|
mean value: 0.9297924618150225
|
|
|
|
key: test_recall
|
|
value: [0.88888889 0.85185185 0.81481481 0.81481481 0.81481481 1.
|
|
0.96428571 1. 0.89285714 0.82142857]
|
|
|
|
mean value: 0.8863756613756614
|
|
|
|
key: train_recall
|
|
value: [0.97177419 0.78629032 0.96774194 0.93548387 0.79032258 0.97975709
|
|
0.9757085 0.97975709 0.8582996 0.78137652]
|
|
|
|
mean value: 0.9026511688650908
|
|
|
|
key: test_roc_auc
|
|
value: [0.90873016 0.90806878 0.85383598 0.90740741 0.87169312 0.90740741
|
|
0.9265873 0.88888889 0.92791005 0.91071429]
|
|
|
|
mean value: 0.9011243386243386
|
|
|
|
key: train_roc_auc
|
|
value: [0.94944985 0.885048 0.95957947 0.9515476 0.89111271 0.89512048
|
|
0.91124135 0.87294306 0.92310141 0.88463987]
|
|
|
|
mean value: 0.9123783792608071
|
|
|
|
key: test_jcc
|
|
value: [0.82758621 0.82142857 0.73333333 0.81481481 0.75862069 0.84848485
|
|
0.87096774 0.82352941 0.86206897 0.82142857]
|
|
|
|
mean value: 0.8182263155259295
|
|
|
|
key: train_jcc
|
|
value: [0.90601504 0.77380952 0.92307692 0.90625 0.784 0.82312925
|
|
0.84561404 0.79344262 0.848 0.772 ]
|
|
|
|
mean value: 0.8375337394219651
|
|
|
|
MCC on Blind test: 0.23
|
|
|
|
Accuracy on Blind test: 0.65
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.10882998 0.09430504 0.09354877 0.09708071 0.09502983 0.09874582
|
|
0.09948301 0.10101175 0.10146403 0.1005578 ]
|
|
|
|
mean value: 0.09900567531585694
|
|
|
|
key: score_time
|
|
value: [0.01448226 0.01464081 0.01445723 0.01509094 0.01563764 0.01575255
|
|
0.01563501 0.01564932 0.01564193 0.01550007]
|
|
|
|
mean value: 0.015248775482177734
|
|
|
|
key: test_mcc
|
|
value: [0.89602867 0.92980214 0.74569602 0.89139151 0.96428571 0.96423926
|
|
0.92724868 0.89602867 0.96423926 0.92724868]
|
|
|
|
mean value: 0.9106208597080531
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.94545455 0.96363636 0.87272727 0.94545455 0.98181818 0.98181818
|
|
0.96363636 0.94545455 0.98181818 0.96363636]
|
|
|
|
mean value: 0.9545454545454546
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94117647 0.96428571 0.86792453 0.94339623 0.98181818 0.98245614
|
|
0.96428571 0.94915254 0.98245614 0.96428571]
|
|
|
|
mean value: 0.9541237373055177
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.93103448 0.88461538 0.96153846 0.96428571 0.96551724
|
|
0.96428571 0.90322581 0.96551724 0.96428571]
|
|
|
|
mean value: 0.9504305760979843
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.88888889 1. 0.85185185 0.92592593 1. 1.
|
|
0.96428571 1. 1. 0.96428571]
|
|
|
|
mean value: 0.9595238095238096
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.94444444 0.96428571 0.8723545 0.94510582 0.98214286 0.98148148
|
|
0.96362434 0.94444444 0.98148148 0.96362434]
|
|
|
|
mean value: 0.9542989417989418
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.88888889 0.93103448 0.76666667 0.89285714 0.96428571 0.96551724
|
|
0.93103448 0.90322581 0.96551724 0.93103448]
|
|
|
|
mean value: 0.9140062150184508
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.13
|
|
|
|
Accuracy on Blind test: 0.4
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.04016852 0.0324285 0.02866578 0.0298512 0.03131413 0.04710269
|
|
0.03221202 0.03381968 0.03193164 0.03873563]
|
|
|
|
mean value: 0.03462297916412353
|
|
|
|
key: score_time
|
|
value: [0.02359104 0.02912378 0.0166347 0.03185606 0.01757717 0.01780367
|
|
0.02005625 0.01775956 0.02499628 0.03069973]
|
|
|
|
mean value: 0.023009824752807616
|
|
|
|
key: test_mcc
|
|
value: [0.96423926 0.92980214 0.8565805 0.85449735 0.92980214 0.96423926
|
|
1. 0.89602867 0.96423926 0.89153439]
|
|
|
|
mean value: 0.9250962972081643
|
|
|
|
key: train_mcc
|
|
value: [0.99195168 0.9838707 0.97980606 0.9878869 0.98383832 0.97980573
|
|
0.97980606 0.99596768 0.97172522 0.98795103]
|
|
|
|
mean value: 0.9842609370911118
|
|
|
|
key: test_accuracy
|
|
value: [0.98181818 0.96363636 0.92727273 0.92727273 0.96363636 0.98181818
|
|
1. 0.94545455 0.98181818 0.94545455]
|
|
|
|
mean value: 0.9618181818181818
|
|
|
|
key: train_accuracy
|
|
value: [0.9959596 0.99191919 0.98989899 0.99393939 0.99191919 0.98989899
|
|
0.98989899 0.9979798 0.98585859 0.99393939]
|
|
|
|
mean value: 0.9921212121212121
|
|
|
|
key: test_fscore
|
|
value: [0.98113208 0.96428571 0.92307692 0.92592593 0.96428571 0.98245614
|
|
1. 0.94915254 0.98245614 0.94545455]
|
|
|
|
mean value: 0.9618225721575157
|
|
|
|
key: train_fscore
|
|
value: [0.99595142 0.99190283 0.98989899 0.99393939 0.99193548 0.98985801
|
|
0.98989899 0.9979716 0.98585859 0.99389002]
|
|
|
|
mean value: 0.9921105329450134
|
|
|
|
key: test_precision
|
|
value: [1. 0.93103448 0.96 0.92592593 0.93103448 0.96551724
|
|
1. 0.90322581 0.96551724 0.96296296]
|
|
|
|
mean value: 0.9545218143616364
|
|
|
|
key: train_precision
|
|
value: [1. 0.99593496 0.99190283 0.99595142 0.99193548 0.99186992
|
|
0.98790323 1. 0.98387097 1. ]
|
|
|
|
mean value: 0.9939368806480281
|
|
|
|
key: test_recall
|
|
value: [0.96296296 1. 0.88888889 0.92592593 1. 1.
|
|
1. 1. 1. 0.92857143]
|
|
|
|
mean value: 0.9706349206349206
|
|
|
|
key: train_recall
|
|
value: [0.99193548 0.98790323 0.98790323 0.99193548 0.99193548 0.98785425
|
|
0.99190283 0.99595142 0.98785425 0.98785425]
|
|
|
|
mean value: 0.990302990727439
|
|
|
|
key: test_roc_auc
|
|
value: [0.98148148 0.96428571 0.9265873 0.92724868 0.96428571 0.98148148
|
|
1. 0.94444444 0.98148148 0.9457672 ]
|
|
|
|
mean value: 0.9617063492063492
|
|
|
|
key: train_roc_auc
|
|
value: [0.99596774 0.99192732 0.98990303 0.99394345 0.99191916 0.98989487
|
|
0.98990303 0.99797571 0.98586261 0.99392713]
|
|
|
|
mean value: 0.9921224043359018
|
|
|
|
key: test_jcc
|
|
value: [0.96296296 0.93103448 0.85714286 0.86206897 0.93103448 0.96551724
|
|
1. 0.90322581 0.96551724 0.89655172]
|
|
|
|
mean value: 0.9275055764488467
|
|
|
|
key: train_jcc
|
|
value: [0.99193548 0.98393574 0.98 0.98795181 0.984 0.97991968
|
|
0.98 0.99595142 0.97211155 0.98785425]
|
|
|
|
mean value: 0.9843659934587685
|
|
|
|
MCC on Blind test: 0.1
|
|
|
|
Accuracy on Blind test: 0.34
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.09521818 0.14583254 0.16052413 0.17061996 0.16926599 0.16544843
|
|
0.14057255 0.18005562 0.18195176 0.14595151]
|
|
|
|
mean value: 0.15554406642913818
|
|
|
|
key: score_time
|
|
value: [0.01243162 0.02033305 0.01982188 0.02640724 0.02747393 0.01361799
|
|
0.02945495 0.02985644 0.0272851 0.0134213 ]
|
|
|
|
mean value: 0.02201035022735596
|
|
|
|
key: test_mcc
|
|
value: [0.64214885 0.78410665 0.60000053 0.89139151 0.68504815 0.8565805
|
|
0.85695439 0.86334835 0.89139151 0.74569602]
|
|
|
|
mean value: 0.781666645397211
|
|
|
|
key: train_mcc
|
|
value: [0.86751154 0.84656958 0.86751154 0.85498218 0.85478898 0.83883199
|
|
0.84716822 0.85466123 0.85085332 0.85500107]
|
|
|
|
mean value: 0.8537879647323043
|
|
|
|
key: test_accuracy
|
|
value: [0.81818182 0.89090909 0.8 0.94545455 0.83636364 0.92727273
|
|
0.92727273 0.92727273 0.94545455 0.87272727]
|
|
|
|
mean value: 0.889090909090909
|
|
|
|
key: train_accuracy
|
|
value: [0.93333333 0.92323232 0.93333333 0.92727273 0.92727273 0.91919192
|
|
0.92323232 0.92727273 0.92525253 0.92727273]
|
|
|
|
mean value: 0.9266666666666666
|
|
|
|
key: test_fscore
|
|
value: [0.8 0.89285714 0.79245283 0.94339623 0.84745763 0.93103448
|
|
0.92592593 0.93333333 0.94736842 0.87719298]
|
|
|
|
mean value: 0.8891018972106213
|
|
|
|
key: train_fscore
|
|
value: [0.93491124 0.924 0.93491124 0.92857143 0.92828685 0.92031873
|
|
0.92460317 0.92771084 0.9261477 0.92828685]
|
|
|
|
mean value: 0.92777480666249
|
|
|
|
key: test_precision
|
|
value: [0.86956522 0.86206897 0.80769231 0.96153846 0.78125 0.9
|
|
0.96153846 0.875 0.93103448 0.86206897]
|
|
|
|
mean value: 0.8811756861953639
|
|
|
|
key: train_precision
|
|
value: [0.91505792 0.91666667 0.91505792 0.9140625 0.91732283 0.90588235
|
|
0.90661479 0.92031873 0.91338583 0.91372549]
|
|
|
|
mean value: 0.9138095012428894
|
|
|
|
key: test_recall
|
|
value: [0.74074074 0.92592593 0.77777778 0.92592593 0.92592593 0.96428571
|
|
0.89285714 1. 0.96428571 0.89285714]
|
|
|
|
mean value: 0.9010582010582011
|
|
|
|
key: train_recall
|
|
value: [0.95564516 0.93145161 0.95564516 0.94354839 0.93951613 0.93522267
|
|
0.94331984 0.93522267 0.93927126 0.94331984]
|
|
|
|
mean value: 0.9422162726916548
|
|
|
|
key: test_roc_auc
|
|
value: [0.81679894 0.89153439 0.79960317 0.94510582 0.83796296 0.9265873
|
|
0.92791005 0.92592593 0.94510582 0.8723545 ]
|
|
|
|
mean value: 0.8888888888888888
|
|
|
|
key: train_roc_auc
|
|
value: [0.93328817 0.92321568 0.93328817 0.92723978 0.92724794 0.91922424
|
|
0.92327282 0.92728876 0.92528079 0.92730508]
|
|
|
|
mean value: 0.9266651430063993
|
|
|
|
key: test_jcc
|
|
value: [0.66666667 0.80645161 0.65625 0.89285714 0.73529412 0.87096774
|
|
0.86206897 0.875 0.9 0.78125 ]
|
|
|
|
mean value: 0.804680624752682
|
|
|
|
key: train_jcc
|
|
value: [0.87777778 0.85873606 0.87777778 0.86666667 0.866171 0.85239852
|
|
0.8597786 0.86516854 0.86245353 0.866171 ]
|
|
|
|
mean value: 0.8653099481832294
|
|
|
|
MCC on Blind test: 0.29
|
|
|
|
Accuracy on Blind test: 0.72
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.25400209 0.25350809 0.25353813 0.25287437 0.24922609 0.24631763
|
|
0.25787902 0.25717902 0.25663543 0.25559354]
|
|
|
|
mean value: 0.25367534160614014
|
|
|
|
key: score_time
|
|
value: [0.00894332 0.00885177 0.00907326 0.00875664 0.00890183 0.00927663
|
|
0.00877619 0.00920916 0.00972295 0.00899959]
|
|
|
|
mean value: 0.009051132202148437
|
|
|
|
key: test_mcc
|
|
value: [0.92962225 0.96428571 0.89139151 0.89139151 0.89153439 0.96423926
|
|
0.96423926 0.89602867 0.96423926 0.92724868]
|
|
|
|
mean value: 0.9284220498029769
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.96363636 0.98181818 0.94545455 0.94545455 0.94545455 0.98181818
|
|
0.98181818 0.94545455 0.98181818 0.96363636]
|
|
|
|
mean value: 0.9636363636363636
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.96153846 0.98181818 0.94339623 0.94339623 0.94545455 0.98245614
|
|
0.98245614 0.94915254 0.98245614 0.96428571]
|
|
|
|
mean value: 0.9636410319352604
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.96428571 0.96153846 0.96153846 0.92857143 0.96551724
|
|
0.96551724 0.90322581 0.96551724 0.96428571]
|
|
|
|
mean value: 0.9579997310809324
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.92592593 1. 0.92592593 0.92592593 0.96296296 1.
|
|
1. 1. 1. 0.96428571]
|
|
|
|
mean value: 0.9705026455026455
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96296296 0.98214286 0.94510582 0.94510582 0.9457672 0.98148148
|
|
0.98148148 0.94444444 0.98148148 0.96362434]
|
|
|
|
mean value: 0.9633597883597884
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.92592593 0.96428571 0.89285714 0.89285714 0.89655172 0.96551724
|
|
0.96551724 0.90322581 0.96551724 0.93103448]
|
|
|
|
mean value: 0.9303289663412022
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.1
|
|
|
|
Accuracy on Blind test: 0.3
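The fold-level scores in the Gradient Boosting block above can be cross-checked by hand. The sketch below recomputes the first fold's test metrics from confusion-matrix counts. The counts (TP=25, FP=0, FN=2, TN=28) are not printed anywhere in this log; they are inferred from the logged fold-1 precision (1.0), recall (0.92592593 = 25/27) and accuracy (0.96363636 = 53/55), so treat them as an assumption. The function name `binary_metrics` is hypothetical, and the `roc_auc` line assumes the score was computed from hard class predictions, which is consistent with the logged values but not confirmed by the log.

```python
from math import sqrt, isclose


def binary_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "fscore": 2 * precision * recall / (precision + recall),
        "mcc": mcc,
        # Jaccard index of the positive class
        "jcc": tp / (tp + fp + fn),
        # For hard 0/1 predictions, ROC AUC equals balanced accuracy
        "roc_auc": (recall + specificity) / 2,
    }


# Inferred (not logged) counts for fold 1 of the Gradient Boosting block
fold1 = binary_metrics(tp=25, fp=0, fn=2, tn=28)

# First entry of each test_* array as logged above
logged_fold1 = {
    "mcc": 0.92962225,
    "accuracy": 0.96363636,
    "fscore": 0.96153846,
    "precision": 1.0,
    "recall": 0.92592593,
    "roc_auc": 0.96296296,
    "jcc": 0.92592593,
}

for name, logged in logged_fold1.items():
    print(name, round(fold1[name], 8), isclose(fold1[name], logged, abs_tol=1e-6))
```

If the inferred counts are right, every recomputed value should agree with the logged one to the printed precision.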
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.01195788 0.0138123 0.01438642 0.01398134 0.014189 0.01624131
|
|
0.01444888 0.0144527 0.01394033 0.0147655 ]
|
|
|
|
mean value: 0.014217567443847657
|
|
|
|
key: score_time
|
|
value: [0.01118398 0.01099157 0.01105928 0.01112914 0.01111221 0.01204634
|
|
0.01111531 0.01175618 0.01196051 0.01118875]
|
|
|
|
mean value: 0.011354327201843262
|
|
|
|
key: test_mcc
|
|
value: [0.35634832 0.71735629 0.65060574 0.68300095 0.6005291 0.47230166
|
|
0.70899471 0.71735629 0.69688314 0.49734925]
|
|
|
|
mean value: 0.6100725452940903
|
|
|
|
key: train_mcc
|
|
value: [0.68737636 0.74945491 0.78935739 0.77491061 0.8187082 0.79126011
|
|
0.79359843 0.78561297 0.78312126 0.78838114]
|
|
|
|
mean value: 0.7761781373175624
|
|
|
|
key: test_accuracy
|
|
value: [0.65454545 0.85454545 0.81818182 0.83636364 0.8 0.72727273
|
|
0.85454545 0.85454545 0.83636364 0.74545455]
|
|
|
|
mean value: 0.7981818181818182
|
|
|
|
key: train_accuracy
|
|
value: [0.82626263 0.87070707 0.89090909 0.88282828 0.90707071 0.89292929
|
|
0.89494949 0.88888889 0.88888889 0.89090909]
|
|
|
|
mean value: 0.8834343434343435
|
|
|
|
key: test_fscore
|
|
value: [0.71641791 0.86206897 0.79166667 0.81632653 0.8 0.69387755
|
|
0.85714286 0.84615385 0.81632653 0.73076923]
|
|
|
|
mean value: 0.7930750088942501
|
|
|
|
key: train_fscore
|
|
value: [0.85017422 0.86086957 0.88311688 0.87336245 0.90212766 0.88602151
|
|
0.8893617 0.88017429 0.88172043 0.88311688]
|
|
|
|
mean value: 0.8790045582018875
|
|
|
|
key: test_precision
|
|
value: [0.6 0.80645161 0.9047619 0.90909091 0.78571429 0.80952381
|
|
0.85714286 0.91666667 0.95238095 0.79166667]
|
|
|
|
mean value: 0.8333399664851278
|
|
|
|
key: train_precision
|
|
value: [0.74846626 0.93396226 0.95327103 0.95238095 0.95495495 0.94495413
|
|
0.93721973 0.95283019 0.94036697 0.94883721]
|
|
|
|
mean value: 0.9267243687033652
|
|
|
|
key: test_recall
|
|
value: [0.88888889 0.92592593 0.7037037 0.74074074 0.81481481 0.60714286
|
|
0.85714286 0.78571429 0.71428571 0.67857143]
|
|
|
|
mean value: 0.7716931216931217
|
|
|
|
key: train_recall
|
|
value: [0.98387097 0.7983871 0.82258065 0.80645161 0.85483871 0.8340081
|
|
0.84615385 0.81781377 0.82995951 0.82591093]
|
|
|
|
mean value: 0.8419975186104218
|
|
|
|
key: test_roc_auc
|
|
value: [0.65873016 0.85582011 0.81613757 0.83465608 0.80026455 0.72949735
|
|
0.85449735 0.85582011 0.83862434 0.74669312]
|
|
|
|
mean value: 0.799074074074074
|
|
|
|
key: train_roc_auc
|
|
value: [0.82594358 0.87085347 0.89104741 0.88298289 0.90717644 0.8928105
|
|
0.89485112 0.88874559 0.88877008 0.89077805]
|
|
|
|
mean value: 0.8833959122371686
|
|
|
|
key: test_jcc
|
|
value: [0.55813953 0.75757576 0.65517241 0.68965517 0.66666667 0.53125
|
|
0.75 0.73333333 0.68965517 0.57575758]
|
|
|
|
mean value: 0.6607205626837744
|
|
|
|
key: train_jcc
|
|
value: [0.73939394 0.75572519 0.79069767 0.7751938 0.82170543 0.7953668
|
|
0.80076628 0.78599222 0.78846154 0.79069767]
|
|
|
|
mean value: 0.7844000539129116
|
|
|
|
MCC on Blind test: 0.3
|
|
|
|
Accuracy on Blind test: 0.74
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02006578 0.03406477 0.0319221 0.02957749 0.03143239 0.0272727
|
|
0.02576208 0.02969098 0.03094196 0.0119288 ]
|
|
|
|
mean value: 0.02726590633392334
|
|
|
|
key: score_time
|
|
value: [0.01922798 0.01971698 0.03047991 0.01091385 0.0175004 0.01788068
|
|
0.02109575 0.01848054 0.02028871 0.0110805 ]
|
|
|
|
mean value: 0.018666529655456544
|
|
|
|
key: test_mcc
|
|
value: [0.63745526 0.78410665 0.56841568 0.89153439 0.78410665 0.8565805
|
|
0.85695439 0.89602867 0.89139151 0.74569602]
|
|
|
|
mean value: 0.7912269728112449
|
|
|
|
key: train_mcc
|
|
value: [0.82706373 0.81437091 0.83515329 0.8265827 0.84258914 0.82682144
|
|
0.81851887 0.81438908 0.81873585 0.81457838]
|
|
|
|
mean value: 0.8238803386905313
|
|
|
|
key: test_accuracy
|
|
value: [0.81818182 0.89090909 0.78181818 0.94545455 0.89090909 0.92727273
|
|
0.92727273 0.94545455 0.94545455 0.87272727]
|
|
|
|
mean value: 0.8945454545454545
|
|
|
|
key: train_accuracy
|
|
value: [0.91313131 0.90707071 0.91717172 0.91313131 0.92121212 0.91313131
|
|
0.90909091 0.90707071 0.90909091 0.90707071]
|
|
|
|
mean value: 0.9117171717171717
|
|
|
|
key: test_fscore
|
|
value: [0.80769231 0.89285714 0.76 0.94545455 0.89285714 0.93103448
|
|
0.92592593 0.94915254 0.94736842 0.87719298]
|
|
|
|
mean value: 0.8929535493427339
|
|
|
|
key: train_fscore
|
|
value: [0.91518738 0.90836653 0.91913215 0.91451292 0.92215569 0.91451292
|
|
0.91017964 0.908 0.91053678 0.90836653]
|
|
|
|
mean value: 0.9130950547952092
|
|
|
|
key: test_precision
|
|
value: [0.84 0.86206897 0.82608696 0.92857143 0.86206897 0.9
|
|
0.96153846 0.90322581 0.93103448 0.86206897]
|
|
|
|
mean value: 0.8876664032393586
|
|
|
|
key: train_precision
|
|
value: [0.8957529 0.8976378 0.8996139 0.90196078 0.91304348 0.8984375
|
|
0.8976378 0.8972332 0.89453125 0.89411765]
|
|
|
|
mean value: 0.8989966247132423
|
|
|
|
key: test_recall
|
|
value: [0.77777778 0.92592593 0.7037037 0.96296296 0.92592593 0.96428571
|
|
0.89285714 1. 0.96428571 0.89285714]
|
|
|
|
mean value: 0.9010582010582011
|
|
|
|
key: train_recall
|
|
value: [0.93548387 0.91935484 0.93951613 0.92741935 0.93145161 0.93117409
|
|
0.92307692 0.91902834 0.92712551 0.92307692]
|
|
|
|
mean value: 0.9276707587828131
|
|
|
|
key: test_roc_auc
|
|
value: [0.81746032 0.89153439 0.78042328 0.9457672 0.89153439 0.9265873
|
|
0.92791005 0.94444444 0.94510582 0.8723545 ]
|
|
|
|
mean value: 0.8943121693121694
|
|
|
|
key: train_roc_auc
|
|
value: [0.91308607 0.90704584 0.91712649 0.91310239 0.92119139 0.91316769
|
|
0.90911911 0.90709482 0.90912727 0.90710298]
|
|
|
|
mean value: 0.9117164032911061
|
|
|
|
key: test_jcc
|
|
value: [0.67741935 0.80645161 0.61290323 0.89655172 0.80645161 0.87096774
|
|
0.86206897 0.90322581 0.9 0.78125 ]
|
|
|
|
mean value: 0.8117290044493882
|
|
|
|
key: train_jcc
|
|
value: [0.84363636 0.83211679 0.85036496 0.84249084 0.85555556 0.84249084
|
|
0.83516484 0.83150183 0.83576642 0.83211679]
|
|
|
|
mean value: 0.840120523434392
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.71
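Each `key:` / `value:` / `mean value:` triple in this log can be read back programmatically, for example to confirm that a reported mean matches its fold values. The following is a minimal sketch under stated assumptions: `parse_cv_blocks` is a hypothetical helper (it is not part of the scripts that produced this log), and it assumes the layout seen here, where arrays wrap across lines and lines containing only `|` are separators. The sample text is the Ridge Classifier `test_mcc` block copied from above.

```python
import re
from statistics import fmean

NUMBER = re.compile(r"-?\d+\.?\d*(?:[eE][-+]?\d+)?")


def parse_cv_blocks(text):
    """Return {key: (fold_values, reported_mean)} for each 'key:' block."""
    results = {}
    key, buf, collecting = None, [], False
    for raw in text.splitlines():
        line = raw.strip()
        if line in ("", "|"):  # skip blank lines and separator residue
            continue
        if line.startswith("key:"):
            key, buf, collecting = line.split(":", 1)[1].strip(), [], False
        elif line.startswith("value:"):
            buf, collecting = [line.split(":", 1)[1]], True
        elif line.startswith("mean value:") and key is not None:
            values = [float(x) for x in NUMBER.findall(" ".join(buf))]
            results[key] = (values, float(line.split(":", 1)[1]))
            key, collecting = None, False
        elif collecting:  # continuation line of a wrapped array
            buf.append(line)
    return results


sample = """\
key: test_mcc
|
|
value: [0.63745526 0.78410665 0.56841568 0.89153439 0.78410665 0.8565805
|
|
0.85695439 0.89602867 0.89139151 0.74569602]
|
|
mean value: 0.7912269728112449
"""

parsed = parse_cv_blocks(sample)
values, reported = parsed["test_mcc"]
print(len(values), abs(fmean(values) - reported) < 1e-6)
```

The tolerance is loose because the fold values are printed to eight decimal places while the mean is printed at full precision.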
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_config.py:183: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rus_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_config.py:186: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rus_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.1858604 0.14565659 0.11904073 0.20053315 0.21871543 0.19877052
|
|
0.23351502 0.23374367 0.09664297 0.18243432]
|
|
|
|
mean value: 0.18149127960205078
|
|
|
|
key: score_time
|
|
value: [0.02734494 0.01139116 0.02203178 0.01164579 0.0204699 0.02237654
|
|
0.02139163 0.02069759 0.01105142 0.02118349]
|
|
|
|
mean value: 0.018958425521850585
|
|
|
|
key: test_mcc
|
|
value: [0.67284827 0.78410665 0.64214885 0.89153439 0.78410665 0.8565805
|
|
0.89153439 0.8565805 0.92724868 0.78353876]
|
|
|
|
mean value: 0.809022763950793
|
|
|
|
key: train_mcc
|
|
value: [0.8393547 0.81437091 0.86751154 0.85478898 0.85067196 0.86365469
|
|
0.85916382 0.86308561 0.84716822 0.83883199]
|
|
|
|
mean value: 0.8498602425342839
|
|
|
|
key: test_accuracy
|
|
value: [0.83636364 0.89090909 0.81818182 0.94545455 0.89090909 0.92727273
|
|
0.94545455 0.92727273 0.96363636 0.89090909]
|
|
|
|
mean value: 0.9036363636363636
|
|
|
|
key: train_accuracy
|
|
value: [0.91919192 0.90707071 0.93333333 0.92727273 0.92525253 0.93131313
|
|
0.92929293 0.93131313 0.92323232 0.91919192]
|
|
|
|
mean value: 0.9246464646464646
|
|
|
|
key: test_fscore
|
|
value: [0.83018868 0.89285714 0.8 0.94545455 0.89285714 0.93103448
|
|
0.94545455 0.93103448 0.96428571 0.89655172]
|
|
|
|
mean value: 0.9029718459809546
|
|
|
|
key: train_fscore
|
|
value: [0.92125984 0.90836653 0.93491124 0.92828685 0.9261477 0.93280632
|
|
0.9304175 0.93227092 0.92460317 0.92031873]
|
|
|
|
mean value: 0.9259388811346168
|
|
|
|
key: test_precision
|
|
value: [0.84615385 0.86206897 0.86956522 0.92857143 0.86206897 0.9
|
|
0.96296296 0.9 0.96428571 0.86666667]
|
|
|
|
mean value: 0.8962343767066405
|
|
|
|
key: train_precision
|
|
value: [0.9 0.8976378 0.91505792 0.91732283 0.91699605 0.91119691
|
|
0.9140625 0.91764706 0.90661479 0.90588235]
|
|
|
|
mean value: 0.910241820136384
|
|
|
|
key: test_recall
|
|
value: [0.81481481 0.92592593 0.74074074 0.96296296 0.92592593 0.96428571
|
|
0.92857143 0.96428571 0.96428571 0.92857143]
|
|
|
|
mean value: 0.912037037037037
|
|
|
|
key: train_recall
|
|
value: [0.94354839 0.91935484 0.95564516 0.93951613 0.93548387 0.95546559
|
|
0.94736842 0.94736842 0.94331984 0.93522267]
|
|
|
|
mean value: 0.942229332636803
|
|
|
|
key: test_roc_auc
|
|
value: [0.83597884 0.89153439 0.81679894 0.9457672 0.89153439 0.9265873
|
|
0.9457672 0.9265873 0.96362434 0.89021164]
|
|
|
|
mean value: 0.9034391534391535
|
|
|
|
key: train_roc_auc
|
|
value: [0.91914261 0.90704584 0.93328817 0.92724794 0.92523181 0.93136183
|
|
0.92932937 0.9313455 0.92327282 0.91922424]
|
|
|
|
mean value: 0.9246490139741413
|
|
|
|
key: test_jcc
|
|
value: [0.70967742 0.80645161 0.66666667 0.89655172 0.80645161 0.87096774
|
|
0.89655172 0.87096774 0.93103448 0.8125 ]
|
|
|
|
mean value: 0.8267820726733407
|
|
|
|
key: train_jcc
|
|
value: [0.8540146 0.83211679 0.87777778 0.866171 0.86245353 0.87407407
|
|
0.86988848 0.87313433 0.8597786 0.85239852]
|
|
|
|
mean value: 0.8621807699995009
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.7
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03266501 0.02544355 0.02698326 0.03438282 0.02959514 0.02234364
|
|
0.02767587 0.02475405 0.0223968 0.02460122]
|
|
|
|
mean value: 0.02708413600921631
|
|
|
|
key: score_time
|
|
value: [0.01088929 0.01079488 0.01119494 0.01080465 0.01083374 0.01082158
|
|
0.01083565 0.01080179 0.01084828 0.01076126]
|
|
|
|
mean value: 0.010858607292175294
|
|
|
|
key: test_mcc
|
|
value: [0.86189955 0.76689254 0.75462449 0.9321832 0.75047877 0.89342711
|
|
0.85933785 0.82195294 0.71611487 0.82195294]
|
|
|
|
mean value: 0.8178864271069979
|
|
|
|
key: train_mcc
|
|
value: [0.83842049 0.85032927 0.83456039 0.8314851 0.84698856 0.80724303
|
|
0.8154727 0.81912621 0.84698856 0.82718204]
|
|
|
|
mean value: 0.8317796354275596
|
|
|
|
key: test_accuracy
|
|
value: [0.92982456 0.87719298 0.87719298 0.96491228 0.875 0.94642857
|
|
0.92857143 0.91071429 0.85714286 0.91071429]
|
|
|
|
mean value: 0.9077694235588972
|
|
|
|
key: train_accuracy
|
|
value: [0.91913215 0.92504931 0.91715976 0.91518738 0.92322835 0.90354331
|
|
0.90748031 0.90944882 0.92322835 0.91338583]
|
|
|
|
mean value: 0.9156843560235444
|
|
|
|
key: test_fscore
|
|
value: [0.93103448 0.8852459 0.88135593 0.96428571 0.87719298 0.94545455
|
|
0.92592593 0.90909091 0.86206897 0.9122807 ]
|
|
|
|
mean value: 0.9093936061086217
|
|
|
|
key: train_fscore
|
|
value: [0.92007797 0.92607004 0.91796875 0.91714836 0.9245648 0.90448343
|
|
0.90909091 0.91050584 0.9245648 0.91472868]
|
|
|
|
mean value: 0.9169203576302117
|
|
|
|
key: test_precision
|
|
value: [0.9 0.81818182 0.86666667 1. 0.86206897 0.96296296
|
|
0.96153846 0.92592593 0.83333333 0.89655172]
|
|
|
|
mean value: 0.9027229858264341
|
|
|
|
key: train_precision
|
|
value: [0.91119691 0.91538462 0.90733591 0.89473684 0.90874525 0.8957529
|
|
0.89353612 0.9 0.90874525 0.90076336]
|
|
|
|
mean value: 0.90361971465238
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.89655172 0.93103448 0.89285714 0.92857143
|
|
0.89285714 0.89285714 0.89285714 0.92857143]
|
|
|
|
mean value: 0.918472906403941
|
|
|
|
key: train_recall
|
|
value: [0.92913386 0.93700787 0.92885375 0.94071146 0.94094488 0.91338583
|
|
0.92519685 0.92125984 0.94094488 0.92913386]
|
|
|
|
mean value: 0.9306573091407052
|
|
|
|
key: test_roc_auc
|
|
value: [0.93041872 0.87869458 0.87684729 0.96551724 0.875 0.94642857
|
|
0.92857143 0.91071429 0.85714286 0.91071429]
|
|
|
|
mean value: 0.9080049261083745
|
|
|
|
key: train_roc_auc
|
|
value: [0.91911238 0.92502568 0.91718278 0.91523762 0.92322835 0.90354331
|
|
0.90748031 0.90944882 0.92322835 0.91338583]
|
|
|
|
mean value: 0.9156873424418785
|
|
|
|
key: test_jcc
|
|
value: [0.87096774 0.79411765 0.78787879 0.93103448 0.78125 0.89655172
|
|
0.86206897 0.83333333 0.75757576 0.83870968]
|
|
|
|
mean value: 0.8353488117615334
|
|
|
|
key: train_jcc
|
|
value: [0.85198556 0.86231884 0.84837545 0.84697509 0.85971223 0.82562278
|
|
0.83333333 0.83571429 0.85971223 0.84285714]
|
|
|
|
mean value: 0.8466606938515135
|
|
|
|
MCC on Blind test: 0.28
|
|
|
|
Accuracy on Blind test: 0.71
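The "mean value" lines above are the averages of the ten per-fold scores. The sketch below recomputes the mean for test_mcc and train_mcc and compares the cross-validated MCC with the blind-test MCC. The fold values are copied from this log. The variable names are illustrative, not taken from the original script, and the blind-test figure is only available rounded to two decimals.

```python
from statistics import fmean

# Per-fold MCC values copied from the Logistic Regression output above.
test_mcc = [0.86189955, 0.76689254, 0.75462449, 0.9321832, 0.75047877,
            0.89342711, 0.85933785, 0.82195294, 0.71611487, 0.82195294]
train_mcc = [0.83842049, 0.85032927, 0.83456039, 0.8314851, 0.84698856,
             0.80724303, 0.8154727, 0.81912621, 0.84698856, 0.82718204]

# Blind-test MCC as printed in the log (rounded to two decimals there).
blind_mcc = 0.28

cv_test_mean = fmean(test_mcc)
cv_train_mean = fmean(train_mcc)

# A small train/test gap inside CV but a large CV/blind gap suggests the
# blind set differs from the training distribution, not classic overfitting.
train_to_cv_gap = cv_train_mean - cv_test_mean
cv_to_blind_gap = cv_test_mean - blind_mcc

print(f"mean test_mcc:  {cv_test_mean:.4f}")
print(f"mean train_mcc: {cv_train_mean:.4f}")
print(f"train - CV test gap: {train_to_cv_gap:.4f}")
print(f"CV test - blind gap: {cv_to_blind_gap:.4f}")
```

The same calculation applies to any of the other keys (accuracy, fscore, precision, recall, roc_auc, jcc) by swapping in the corresponding fold values.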
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
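The warning above offers two remedies: raise max_iter or scale the data. The pipeline in this log already applies MinMaxScaler to the numerical columns, so raising max_iter is the more likely fix. The following is a minimal sketch of that change, assuming synthetic data from make_classification in place of the real feature table. The value 5000 is an arbitrary illustration, not a setting taken from the original script.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the real feature table (assumption for illustration).
X, y = make_classification(n_samples=300, n_features=10, random_state=42)

# Same model as in the log, but with a larger iteration budget for lbfgs.
# max_iter defaults to 100 in LogisticRegressionCV; 5000 is an arbitrary choice.
model = Pipeline(steps=[
    ("prep", MinMaxScaler()),
    ("model", LogisticRegressionCV(max_iter=5000, random_state=42)),
])

model.fit(X, y)
acc = model.score(X, y)
print(f"training accuracy: {acc:.3f}")
```

If the warning persists with a larger budget, the scikit-learn documentation linked in the warning lists alternative solvers that can be passed through the solver parameter.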
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.76443863 0.82875395 0.6898253 0.68586373 0.77263117 0.67486453
|
|
0.67990541 0.83567691 0.67201447 0.71364689]
|
|
|
|
mean value: 0.7317620992660523
|
|
|
|
key: score_time
|
|
value: [0.01203632 0.01211286 0.0111289 0.01215601 0.01238728 0.02110219
|
|
0.01246333 0.01241684 0.01234293 0.01244378]
|
|
|
|
mean value: 0.013059043884277343
|
|
|
|
key: test_mcc
|
|
value: [0.85960591 0.86189955 0.85960591 0.9321832 0.85933785 0.96490128
|
|
0.96490128 0.85714286 0.82195294 0.89342711]
|
|
|
|
mean value: 0.8874957891973448
|
|
|
|
key: train_mcc
|
|
value: [0.94480322 0.92902382 0.93691156 0.93691156 0.95287407 0.93712408
|
|
0.94491118 0.94095217 0.93703692 0.93703692]
|
|
|
|
mean value: 0.9397585529859461
|
|
|
|
key: test_accuracy
|
|
value: [0.92982456 0.92982456 0.92982456 0.96491228 0.92857143 0.98214286
|
|
0.98214286 0.92857143 0.91071429 0.94642857]
|
|
|
|
mean value: 0.943295739348371
|
|
|
|
key: train_accuracy
|
|
value: [0.97238659 0.96449704 0.96844181 0.96844181 0.97637795 0.96850394
|
|
0.97244094 0.97047244 0.96850394 0.96850394]
|
|
|
|
mean value: 0.9698570407988942
|
|
|
|
key: test_fscore
|
|
value: [0.92857143 0.93103448 0.93103448 0.96428571 0.93103448 0.98245614
|
|
0.98181818 0.92857143 0.9122807 0.94545455]
|
|
|
|
mean value: 0.9436541589082423
|
|
|
|
key: train_fscore
|
|
value: [0.97233202 0.96442688 0.96825397 0.96825397 0.97619048 0.96825397
|
|
0.97233202 0.9704142 0.96837945 0.96837945]
|
|
|
|
mean value: 0.9697216384507354
|
|
|
|
key: test_precision
|
|
value: [0.92857143 0.9 0.93103448 1. 0.9 0.96551724
|
|
1. 0.92857143 0.89655172 0.96296296]
|
|
|
|
mean value: 0.9413209268381683
|
|
|
|
key: train_precision
|
|
value: [0.97619048 0.96825397 0.97211155 0.97211155 0.984 0.976
|
|
0.97619048 0.97233202 0.97222222 0.97222222]
|
|
|
|
mean value: 0.9741634488459363
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.96428571 0.93103448 0.93103448 0.96428571 1.
|
|
0.96428571 0.92857143 0.92857143 0.92857143]
|
|
|
|
mean value: 0.9469211822660099
|
|
|
|
key: train_recall
|
|
value: [0.96850394 0.96062992 0.96442688 0.96442688 0.96850394 0.96062992
|
|
0.96850394 0.96850394 0.96456693 0.96456693]
|
|
|
|
mean value: 0.9653263203759609
|
|
|
|
key: test_roc_auc
|
|
value: [0.92980296 0.93041872 0.92980296 0.96551724 0.92857143 0.98214286
|
|
0.98214286 0.92857143 0.91071429 0.94642857]
|
|
|
|
mean value: 0.9434113300492611
|
|
|
|
key: train_roc_auc
|
|
value: [0.97239426 0.96450468 0.96843391 0.96843391 0.97637795 0.96850394
|
|
0.97244094 0.97047244 0.96850394 0.96850394]
|
|
|
|
mean value: 0.9698569916902681
|
|
|
|
key: test_jcc
|
|
value: [0.86666667 0.87096774 0.87096774 0.93103448 0.87096774 0.96551724
|
|
0.96428571 0.86666667 0.83870968 0.89655172]
|
|
|
|
mean value: 0.8942335399120717
|
|
|
|
key: train_jcc
|
|
value: [0.94615385 0.93129771 0.93846154 0.93846154 0.95348837 0.93846154
|
|
0.94615385 0.94252874 0.93869732 0.93869732]
|
|
|
|
mean value: 0.9412401761356505
|
|
|
|
MCC on Blind test: 0.2
|
|
|
|
Accuracy on Blind test: 0.59
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01082301 0.01014185 0.00848889 0.00828099 0.00836778 0.00830722
|
|
0.00840569 0.00863934 0.00839567 0.00852156]
|
|
|
|
mean value: 0.008837199211120606
|
|
|
|
key: score_time
|
|
value: [0.01088572 0.00903606 0.0087676 0.00838327 0.0087471 0.00870657
|
|
0.00868726 0.00890255 0.00878382 0.00867343]
|
|
|
|
mean value: 0.008957338333129884
|
|
|
|
key: test_mcc
|
|
value: [0.50927421 0.65018988 0.7366424 0.64889453 0.65814518 0.58501794
|
|
0.80439967 0.61706091 0.64951905 0.61706091]
|
|
|
|
mean value: 0.6476204667782233
|
|
|
|
key: train_mcc
|
|
value: [0.67420459 0.6683308 0.66925612 0.67734922 0.69555499 0.6527166
|
|
0.65044798 0.70356186 0.67461719 0.68157216]
|
|
|
|
mean value: 0.6747611503976669
|
|
|
|
key: test_accuracy
|
|
value: [0.75438596 0.8245614 0.85964912 0.80701754 0.82142857 0.78571429
|
|
0.89285714 0.80357143 0.82142857 0.80357143]
|
|
|
|
mean value: 0.8174185463659148
|
|
|
|
key: train_accuracy
|
|
value: [0.82840237 0.82642998 0.82642998 0.83234714 0.84251969 0.81889764
|
|
0.81692913 0.8484252 0.83070866 0.83464567]
|
|
|
|
mean value: 0.830573545170759
|
|
|
|
key: test_fscore
|
|
value: [0.74074074 0.81481481 0.84615385 0.7755102 0.8 0.76
|
|
0.88 0.78431373 0.80769231 0.78431373]
|
|
|
|
mean value: 0.7993539364463734
|
|
|
|
key: train_fscore
|
|
value: [0.80709534 0.8061674 0.80444444 0.81400438 0.82758621 0.79735683
|
|
0.79379157 0.8372093 0.81222707 0.8173913 ]
|
|
|
|
mean value: 0.8117273855652805
|
|
|
|
key: test_precision
|
|
value: [0.76923077 0.84615385 0.95652174 0.95 0.90909091 0.86363636
|
|
1. 0.86956522 0.875 0.86956522]
|
|
|
|
mean value: 0.8908764062024932
|
|
|
|
key: train_precision
|
|
value: [0.92385787 0.915 0.91878173 0.91176471 0.91428571 0.905
|
|
0.90862944 0.90410959 0.91176471 0.91262136]
|
|
|
|
mean value: 0.9125815109847812
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.78571429 0.75862069 0.65517241 0.71428571 0.67857143
|
|
0.78571429 0.71428571 0.75 0.71428571]
|
|
|
|
mean value: 0.7270935960591133
|
|
|
|
key: train_recall
|
|
value: [0.71653543 0.72047244 0.71541502 0.73517787 0.75590551 0.71259843
|
|
0.70472441 0.77952756 0.73228346 0.74015748]
|
|
|
|
mean value: 0.7312797609784942
|
|
|
|
key: test_roc_auc
|
|
value: [0.75369458 0.82389163 0.8614532 0.80972906 0.82142857 0.78571429
|
|
0.89285714 0.80357143 0.82142857 0.80357143]
|
|
|
|
mean value: 0.8177339901477833
|
|
|
|
key: train_roc_auc
|
|
value: [0.82862345 0.82663938 0.82621145 0.83215586 0.84251969 0.81889764
|
|
0.81692913 0.8484252 0.83070866 0.83464567]
|
|
|
|
mean value: 0.8305756123369954
|
|
|
|
key: test_jcc
|
|
value: [0.58823529 0.6875 0.73333333 0.63333333 0.66666667 0.61290323
|
|
0.78571429 0.64516129 0.67741935 0.64516129]
|
|
|
|
mean value: 0.6675428074455588
|
|
|
|
key: train_jcc
|
|
value: [0.67657993 0.67527675 0.67286245 0.68634686 0.70588235 0.66300366
|
|
0.65808824 0.72 0.68382353 0.69117647]
|
|
|
|
mean value: 0.6833040246657276
|
|
|
|
MCC on Blind test: 0.34
|
|
|
|
Accuracy on Blind test: 0.78
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00908518 0.00873065 0.00850201 0.00850201 0.00844526 0.00834846
|
|
0.00829315 0.00879455 0.00849795 0.00861597]
|
|
|
|
mean value: 0.00858151912689209
|
|
|
|
key: score_time
|
|
value: [0.00892878 0.0087173 0.00866246 0.00867057 0.00865889 0.00869465
|
|
0.0087862 0.00887084 0.00837755 0.00877047]
|
|
|
|
mean value: 0.008713769912719726
|
|
|
|
key: test_mcc
|
|
value: [0.79778885 0.72706729 0.79110556 0.66755025 0.71611487 0.78772636
|
|
0.79385662 0.75047877 0.67900461 0.75047877]
|
|
|
|
mean value: 0.7461171974035183
|
|
|
|
key: train_mcc
|
|
value: [0.77122271 0.76334013 0.76731664 0.68276748 0.78361641 0.76800824
|
|
0.76819892 0.77588525 0.78361641 0.77574087]
|
|
|
|
mean value: 0.763971305717051
|
|
|
|
key: test_accuracy
|
|
value: [0.89473684 0.85964912 0.89473684 0.80701754 0.85714286 0.89285714
|
|
0.89285714 0.875 0.83928571 0.875 ]
|
|
|
|
mean value: 0.868828320802005
|
|
|
|
key: train_accuracy
|
|
value: [0.88560158 0.8816568 0.88362919 0.84023669 0.89173228 0.88385827
|
|
0.88385827 0.88779528 0.89173228 0.88779528]
|
|
|
|
mean value: 0.8817895913898336
|
|
|
|
key: test_fscore
|
|
value: [0.9 0.86666667 0.9 0.76595745 0.86206897 0.88888889
|
|
0.88461538 0.87272727 0.84210526 0.87719298]
|
|
|
|
mean value: 0.8660222870838
|
|
|
|
key: train_fscore
|
|
value: [0.88627451 0.88142292 0.88408644 0.83298969 0.89278752 0.88543689
|
|
0.88588008 0.88932039 0.89278752 0.88888889]
|
|
|
|
mean value: 0.8819874865979285
|
|
|
|
key: test_precision
|
|
value: [0.84375 0.8125 0.87096774 1. 0.83333333 0.92307692
|
|
0.95833333 0.88888889 0.82758621 0.86206897]
|
|
|
|
mean value: 0.8820505392981756
|
|
|
|
key: train_precision
|
|
value: [0.8828125 0.88492063 0.87890625 0.87068966 0.88416988 0.87356322
|
|
0.87072243 0.87739464 0.88416988 0.88030888]
|
|
|
|
mean value: 0.8787657976607903
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.92857143 0.93103448 0.62068966 0.89285714 0.85714286
|
|
0.82142857 0.85714286 0.85714286 0.89285714]
|
|
|
|
mean value: 0.8623152709359606
|
|
|
|
key: train_recall
|
|
value: [0.88976378 0.87795276 0.88932806 0.79841897 0.9015748 0.8976378
|
|
0.9015748 0.9015748 0.9015748 0.8976378 ]
|
|
|
|
mean value: 0.8857038374155799
|
|
|
|
key: test_roc_auc
|
|
value: [0.89593596 0.86083744 0.89408867 0.81034483 0.85714286 0.89285714
|
|
0.89285714 0.875 0.83928571 0.875 ]
|
|
|
|
mean value: 0.8693349753694581
|
|
|
|
key: train_roc_auc
|
|
value: [0.88559335 0.88166412 0.88364041 0.84015437 0.89173228 0.88385827
|
|
0.88385827 0.88779528 0.89173228 0.88779528]
|
|
|
|
mean value: 0.881782390837509
|
|
|
|
key: test_jcc
|
|
value: [0.81818182 0.76470588 0.81818182 0.62068966 0.75757576 0.8
|
|
0.79310345 0.77419355 0.72727273 0.78125 ]
|
|
|
|
mean value: 0.7655154655400436
|
|
|
|
key: train_jcc
|
|
value: [0.79577465 0.78798587 0.79225352 0.71378092 0.80633803 0.79442509
|
|
0.79513889 0.8006993 0.80633803 0.8 ]
|
|
|
|
mean value: 0.7892734286500613
|
|
|
|
MCC on Blind test: 0.28
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.0078218 0.00819182 0.00788093 0.00713396 0.00716805 0.00795126
|
|
0.00721955 0.00793576 0.00724411 0.00745344]
|
|
|
|
mean value: 0.007600069046020508
|
|
|
|
key: score_time
|
|
value: [0.01292777 0.01269674 0.01303506 0.01127744 0.01387811 0.01285839
|
|
0.01185989 0.0118773 0.0108037 0.01251173]
|
|
|
|
mean value: 0.012372612953186035
|
|
|
|
key: test_mcc
|
|
value: [0.72706729 0.68850906 0.71921182 0.8953202 0.71611487 0.68250015
|
|
0.79385662 0.75047877 0.67900461 0.75047877]
|
|
|
|
mean value: 0.7402542168196266
|
|
|
|
key: train_mcc
|
|
value: [0.79496359 0.79887642 0.78334713 0.77932046 0.79951627 0.76777009
|
|
0.79936749 0.79163927 0.80324922 0.79530025]
|
|
|
|
mean value: 0.7913350175140176
|
|
|
|
key: test_accuracy
|
|
value: [0.85964912 0.84210526 0.85964912 0.94736842 0.85714286 0.83928571
|
|
0.89285714 0.875 0.83928571 0.875 ]
|
|
|
|
mean value: 0.868734335839599
|
|
|
|
key: train_accuracy
|
|
value: [0.8974359 0.89940828 0.89151874 0.88954635 0.8996063 0.88385827
|
|
0.8996063 0.89566929 0.9015748 0.8976378 ]
|
|
|
|
mean value: 0.8955862026122474
|
|
|
|
key: test_fscore
|
|
value: [0.86666667 0.84745763 0.86206897 0.94736842 0.86206897 0.83018868
|
|
0.88461538 0.87272727 0.84210526 0.87719298]
|
|
|
|
mean value: 0.86924602280744
|
|
|
|
key: train_fscore
|
|
value: [0.8984375 0.8990099 0.89278752 0.890625 0.90097087 0.88454012
|
|
0.9005848 0.89708738 0.90234375 0.89803922]
|
|
|
|
mean value: 0.8964426056208497
|
|
|
|
key: test_precision
|
|
value: [0.8125 0.80645161 0.86206897 0.96428571 0.83333333 0.88
|
|
0.95833333 0.88888889 0.82758621 0.86206897]
|
|
|
|
mean value: 0.869551702067553
|
|
|
|
key: train_precision
|
|
value: [0.89147287 0.90438247 0.88076923 0.88030888 0.88888889 0.87937743
|
|
0.89189189 0.88505747 0.89534884 0.89453125]
|
|
|
|
mean value: 0.8892029220575753
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.89285714 0.86206897 0.93103448 0.89285714 0.78571429
|
|
0.82142857 0.85714286 0.85714286 0.89285714]
|
|
|
|
mean value: 0.8721674876847291
|
|
|
|
key: train_recall
|
|
value: [0.90551181 0.89370079 0.90513834 0.90118577 0.91338583 0.88976378
|
|
0.90944882 0.90944882 0.90944882 0.9015748 ]
|
|
|
|
mean value: 0.9038607575238866
|
|
|
|
key: test_roc_auc
|
|
value: [0.86083744 0.8429803 0.85960591 0.9476601 0.85714286 0.83928571
|
|
0.89285714 0.875 0.83928571 0.875 ]
|
|
|
|
mean value: 0.8689655172413794
|
|
|
|
key: train_roc_auc
|
|
value: [0.89741994 0.89941956 0.89154555 0.88956926 0.8996063 0.88385827
|
|
0.8996063 0.89566929 0.9015748 0.8976378 ]
|
|
|
|
mean value: 0.8955907067940618
|
|
|
|
key: test_jcc
|
|
value: [0.76470588 0.73529412 0.75757576 0.9 0.75757576 0.70967742
|
|
0.79310345 0.77419355 0.72727273 0.78125 ]
|
|
|
|
mean value: 0.770064865844204
|
|
|
|
key: train_jcc
|
|
value: [0.81560284 0.81654676 0.80633803 0.8028169 0.81978799 0.79298246
|
|
0.81914894 0.81338028 0.82206406 0.81494662]
|
|
|
|
mean value: 0.8123614865069838
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.72
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01474333 0.01426578 0.01444244 0.01453424 0.01460671 0.01486588
|
|
0.01472378 0.01466084 0.01448703 0.01444364]
|
|
|
|
mean value: 0.014577364921569825
|
|
|
|
key: score_time
|
|
value: [0.00919628 0.00896859 0.00908899 0.00891948 0.00902748 0.00907803
|
|
0.00912905 0.00929427 0.00912786 0.00900006]
|
|
|
|
mean value: 0.009083008766174317
|
|
|
|
key: test_mcc
|
|
value: [0.82942474 0.76689254 0.79110556 0.89988258 0.71611487 0.78772636
|
|
0.79385662 0.78772636 0.67900461 0.71428571]
|
|
|
|
mean value: 0.776601995146589
|
|
|
|
key: train_mcc
|
|
value: [0.78308641 0.79093074 0.78708603 0.77160078 0.79537422 0.77974514
|
|
0.78395685 0.78779242 0.79537422 0.78351922]
|
|
|
|
mean value: 0.7858466034660538
|
|
|
|
key: test_accuracy
|
|
value: [0.9122807 0.87719298 0.89473684 0.94736842 0.85714286 0.89285714
|
|
0.89285714 0.89285714 0.83928571 0.85714286]
|
|
|
|
mean value: 0.8863721804511278
|
|
|
|
key: train_accuracy
|
|
value: [0.89151874 0.89546351 0.89349112 0.88560158 0.8976378 0.88976378
|
|
0.89173228 0.89370079 0.8976378 0.89173228]
|
|
|
|
mean value: 0.8928279675099784
|
|
|
|
key: test_fscore
|
|
value: [0.91525424 0.8852459 0.9 0.94545455 0.86206897 0.88888889
|
|
0.88461538 0.88888889 0.84210526 0.85714286]
|
|
|
|
mean value: 0.8869664932593181
|
|
|
|
key: train_fscore
|
|
value: [0.89236791 0.89587426 0.89411765 0.88715953 0.8984375 0.89105058
|
|
0.89361702 0.89534884 0.8984375 0.89236791]
|
|
|
|
mean value: 0.8938778697670609
|
|
|
|
key: test_precision
|
|
value: [0.87096774 0.81818182 0.87096774 1. 0.83333333 0.92307692
|
|
0.95833333 0.92307692 0.82758621 0.85714286]
|
|
|
|
mean value: 0.8882666878912708
|
|
|
|
key: train_precision
|
|
value: [0.88715953 0.89411765 0.88715953 0.87356322 0.89147287 0.88076923
|
|
0.878327 0.88167939 0.89147287 0.88715953]
|
|
|
|
mean value: 0.8852880817385453
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.93103448 0.89655172 0.89285714 0.85714286
|
|
0.82142857 0.85714286 0.85714286 0.85714286]
|
|
|
|
mean value: 0.8899014778325123
|
|
|
|
key: train_recall
|
|
value: [0.8976378 0.8976378 0.90118577 0.90118577 0.90551181 0.9015748
|
|
0.90944882 0.90944882 0.90551181 0.8976378 ]
|
|
|
|
mean value: 0.9026780990320874
|
|
|
|
key: test_roc_auc
|
|
value: [0.91317734 0.87869458 0.89408867 0.94827586 0.85714286 0.89285714
|
|
0.89285714 0.89285714 0.83928571 0.85714286]
|
|
|
|
mean value: 0.8866379310344827
|
|
|
|
key: train_roc_auc
|
|
value: [0.89150664 0.89545921 0.89350627 0.88563226 0.8976378 0.88976378
|
|
0.89173228 0.89370079 0.8976378 0.89173228]
|
|
|
|
mean value: 0.8928309109582646
|
|
|
|
key: test_jcc
|
|
value: [0.84375 0.79411765 0.81818182 0.89655172 0.75757576 0.8
|
|
0.79310345 0.8 0.72727273 0.75 ]
|
|
|
|
mean value: 0.798055312250292
|
|
|
|
key: train_jcc
|
|
value: [0.80565371 0.8113879 0.80851064 0.7972028 0.81560284 0.80350877
|
|
0.80769231 0.81052632 0.81560284 0.80565371]
|
|
|
|
mean value: 0.8081341825521712
|
|
|
|
MCC on Blind test: 0.22
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.39773488 1.62088108 1.47775269 1.50360155 1.53373218 1.47553325
|
|
1.55788469 1.53873181 1.53706622 1.49211836]
|
|
|
|
mean value: 1.5135036706924438
|
|
|
|
key: score_time
|
|
value: [0.01128125 0.01326585 0.01378894 0.01334476 0.01340508 0.01640296
|
|
0.01380563 0.01374269 0.0163722 0.01363611]
|
|
|
|
mean value: 0.013904547691345215
|
|
|
|
key: test_mcc
|
|
value: [0.8951918 0.86189955 0.82880708 0.82490815 0.78772636 0.85933785
|
|
0.96490128 0.85714286 0.82195294 0.85714286]
|
|
|
|
mean value: 0.8559010729919259
|
|
|
|
key: train_mcc
|
|
value: [0.96067294 0.96450468 0.96847232 0.96844169 0.9645744 0.9645744
|
|
0.9645744 0.97244848 0.9606597 0.98032256]
|
|
|
|
mean value: 0.9669245592847295
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.92982456 0.9122807 0.9122807 0.89285714 0.92857143
|
|
0.98214286 0.92857143 0.91071429 0.92857143]
|
|
|
|
mean value: 0.9273182957393483
|
|
|
|
key: train_accuracy
|
|
value: [0.98027613 0.98224852 0.98422091 0.98422091 0.98228346 0.98228346
|
|
0.98228346 0.98622047 0.98031496 0.99015748]
|
|
|
|
mean value: 0.9834509776514623
|
|
|
|
key: test_fscore
|
|
value: [0.94545455 0.93103448 0.91803279 0.91525424 0.89655172 0.92592593
|
|
0.98181818 0.92857143 0.9122807 0.92857143]
|
|
|
|
mean value: 0.928349544316583
|
|
|
|
key: train_fscore
|
|
value: [0.98015873 0.98224852 0.98425197 0.98418972 0.98224852 0.98224852
|
|
0.98224852 0.98619329 0.98023715 0.99017682]
|
|
|
|
mean value: 0.9834201770147663
|
|
|
|
key: test_precision
|
|
value: [0.96296296 0.9 0.875 0.9 0.86666667 0.96153846
|
|
1. 0.92857143 0.89655172 0.92857143]
|
|
|
|
mean value: 0.921986267244888
|
|
|
|
key: train_precision
|
|
value: [0.988 0.98418972 0.98039216 0.98418972 0.98418972 0.98418972
|
|
0.98418972 0.98814229 0.98412698 0.98823529]
|
|
|
|
mean value: 0.9849845344198285
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.96428571 0.96551724 0.93103448 0.92857143 0.89285714
|
|
0.96428571 0.92857143 0.92857143 0.92857143]
|
|
|
|
mean value: 0.9360837438423646
|
|
|
|
key: train_recall
|
|
value: [0.97244094 0.98031496 0.98814229 0.98418972 0.98031496 0.98031496
|
|
0.98031496 0.98425197 0.97637795 0.99212598]
|
|
|
|
mean value: 0.9818788708723662
|
|
|
|
key: test_roc_auc
|
|
value: [0.94704433 0.93041872 0.91133005 0.91194581 0.89285714 0.92857143
|
|
0.98214286 0.92857143 0.91071429 0.92857143]
|
|
|
|
mean value: 0.9272167487684729
|
|
|
|
key: train_roc_auc
|
|
value: [0.98029162 0.98225234 0.98422863 0.98422085 0.98228346 0.98228346
|
|
0.98228346 0.98622047 0.98031496 0.99015748]
|
|
|
|
mean value: 0.9834536740219726
|
|
|
|
key: test_jcc
|
|
value: [0.89655172 0.87096774 0.84848485 0.84375 0.8125 0.86206897
|
|
0.96428571 0.86666667 0.83870968 0.86666667]
|
|
|
|
mean value: 0.8670652005113907
|
|
|
|
key: train_jcc
|
|
value: [0.96108949 0.96511628 0.96899225 0.9688716 0.96511628 0.96511628
|
|
0.96511628 0.97276265 0.96124031 0.98054475]
|
|
|
|
mean value: 0.9673966156908878
|
|
|
|
MCC on Blind test: 0.26
|
|
|
|
Accuracy on Blind test: 0.66
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0136857 0.01279187 0.01128888 0.01076555 0.01062655 0.01041436
|
|
0.01054716 0.01079988 0.0110817 0.01173425]
|
|
|
|
mean value: 0.011373591423034669
|
|
|
|
key: score_time
|
|
value: [0.01080513 0.00837135 0.00845194 0.00823951 0.0084095 0.0082202
|
|
0.00809073 0.00809741 0.00841331 0.0083375 ]
|
|
|
|
mean value: 0.008543658256530761
|
|
|
|
key: test_mcc
|
|
value: [0.92980296 0.8953202 0.82942474 0.96551724 0.75047877 0.89342711
|
|
0.89342711 0.85933785 0.96490128 0.92857143]
|
|
|
|
mean value: 0.891020869070053
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.94736842 0.9122807 0.98245614 0.875 0.94642857
|
|
0.94642857 0.92857143 0.98214286 0.96428571]
|
|
|
|
mean value: 0.9449874686716792
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.96428571 0.94736842 0.90909091 0.98245614 0.87719298 0.94736842
|
|
0.94545455 0.92592593 0.98181818 0.96428571]
|
|
|
|
mean value: 0.9445246955773272
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96428571 0.93103448 0.96153846 1. 0.86206897 0.93103448
|
|
0.96296296 0.96153846 1. 0.96428571]
|
|
|
|
mean value: 0.9538749245645797
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.86206897 0.96551724 0.89285714 0.96428571
|
|
0.92857143 0.89285714 0.96428571 0.96428571]
|
|
|
|
mean value: 0.9363300492610838
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96490148 0.9476601 0.91317734 0.98275862 0.875 0.94642857
|
|
0.94642857 0.92857143 0.98214286 0.96428571]
|
|
|
|
mean value: 0.9451354679802957
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.93103448 0.9 0.83333333 0.96551724 0.78125 0.9
|
|
0.89655172 0.86206897 0.96428571 0.93103448]
|
|
|
|
mean value: 0.8965075944170772
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.11
|
|
|
|
Accuracy on Blind test: 0.36
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.10608721 0.10524035 0.10447693 0.1020844 0.10217381 0.10169768
|
|
0.10213351 0.1026423 0.10438013 0.10096812]
|
|
|
|
mean value: 0.10318844318389893
|
|
|
|
key: score_time
|
|
value: [0.01834702 0.01700187 0.01766229 0.0172255 0.01845121 0.0170753
|
|
0.0172298 0.01727653 0.01692057 0.01812077]
|
|
|
|
mean value: 0.01753108501434326
|
|
|
|
key: test_mcc
|
|
value: [0.82942474 0.86189955 0.8615634 0.8953202 0.78772636 0.93094934
|
|
0.89802651 0.78772636 0.75047877 0.85933785]
|
|
|
|
mean value: 0.84624530759693
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9122807 0.92982456 0.92982456 0.94736842 0.89285714 0.96428571
|
|
0.94642857 0.89285714 0.875 0.92857143]
|
|
|
|
mean value: 0.9219298245614035
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.91525424 0.93103448 0.93333333 0.94736842 0.89655172 0.96296296
|
|
0.94339623 0.88888889 0.87719298 0.93103448]
|
|
|
|
mean value: 0.9227017742052359
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.87096774 0.9 0.90322581 0.96428571 0.86666667 1.
|
|
1. 0.92307692 0.86206897 0.9 ]
|
|
|
|
mean value: 0.9190291817933642
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.96551724 0.93103448 0.92857143 0.92857143
|
|
0.89285714 0.85714286 0.89285714 0.96428571]
|
|
|
|
mean value: 0.9289408866995074
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.91317734 0.93041872 0.92918719 0.9476601 0.89285714 0.96428571
|
|
0.94642857 0.89285714 0.875 0.92857143]
|
|
|
|
mean value: 0.9220443349753695
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.84375 0.87096774 0.875 0.9 0.8125 0.92857143
|
|
0.89285714 0.8 0.78125 0.87096774]
|
|
|
|
mean value: 0.8575864055299539
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.33
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00839448 0.00779438 0.00789499 0.00782299 0.00828099 0.00764513
|
|
0.0078218 0.00805545 0.00872326 0.00792527]
|
|
|
|
mean value: 0.008035874366760254
|
|
|
|
key: score_time
|
|
value: [0.0083375 0.00801182 0.00785446 0.0080626 0.0083189 0.00805974
|
|
0.00803876 0.00792432 0.00809288 0.00801706]
|
|
|
|
mean value: 0.008071804046630859
|
|
|
|
key: test_mcc
|
|
value: [0.79161589 0.68850906 0.72133224 0.54592083 0.4645821 0.61065803
|
|
0.79385662 0.68250015 0.64285714 0.62705445]
|
|
|
|
mean value: 0.65688865057222
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.89473684 0.84210526 0.85964912 0.77192982 0.73214286 0.80357143
|
|
0.89285714 0.83928571 0.82142857 0.80357143]
|
|
|
|
mean value: 0.8261278195488722
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.89655172 0.84745763 0.85714286 0.78688525 0.72727273 0.79245283
|
|
0.88461538 0.83018868 0.82142857 0.7755102 ]
|
|
|
|
mean value: 0.821950585113335
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.86666667 0.80645161 0.88888889 0.75 0.74074074 0.84
|
|
0.95833333 0.88 0.82142857 0.9047619 ]
|
|
|
|
mean value: 0.8457271718723331
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.89285714 0.82758621 0.82758621 0.71428571 0.75
|
|
0.82142857 0.78571429 0.82142857 0.67857143]
|
|
|
|
mean value: 0.8048029556650247
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.8953202 0.8429803 0.86022167 0.77093596 0.73214286 0.80357143
|
|
0.89285714 0.83928571 0.82142857 0.80357143]
|
|
|
|
mean value: 0.826231527093596
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.8125 0.73529412 0.75 0.64864865 0.57142857 0.65625
|
|
0.79310345 0.70967742 0.6969697 0.63333333]
|
|
|
|
mean value: 0.7007205235658011
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.23
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.3137598 1.31105161 1.31397271 1.34022093 1.33495617 1.32320976
|
|
1.32298803 1.31964326 1.33588552 1.33314967]
|
|
|
|
mean value: 1.3248837471008301
|
|
|
|
key: score_time
|
|
value: [0.09023738 0.0960989 0.0929544 0.09687686 0.09749436 0.092448
|
|
0.09450769 0.09734035 0.09717226 0.09090662]
|
|
|
|
mean value: 0.09460368156433105
|
|
|
|
key: test_mcc
|
|
value: [0.92980296 0.92980296 0.8951918 0.9321832 0.85933785 0.96490128
|
|
0.96490128 0.92857143 0.89342711 0.89342711]
|
|
|
|
mean value: 0.9191546978182543
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.96491228 0.94736842 0.96491228 0.92857143 0.98214286
|
|
0.98214286 0.96428571 0.94642857 0.94642857]
|
|
|
|
mean value: 0.9592105263157895
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.96428571 0.96428571 0.94915254 0.96428571 0.93103448 0.98245614
|
|
0.98181818 0.96428571 0.94545455 0.94736842]
|
|
|
|
mean value: 0.9594427170950596
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96428571 0.96428571 0.93333333 1. 0.9 0.96551724
|
|
1. 0.96428571 0.96296296 0.93103448]
|
|
|
|
mean value: 0.958570516329137
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.96551724 0.93103448 0.96428571 1.
|
|
0.96428571 0.96428571 0.92857143 0.96428571]
|
|
|
|
mean value: 0.9610837438423645
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96490148 0.96490148 0.94704433 0.96551724 0.92857143 0.98214286
|
|
0.98214286 0.96428571 0.94642857 0.94642857]
|
|
|
|
mean value: 0.9592364532019705
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.93103448 0.93103448 0.90322581 0.93103448 0.87096774 0.96551724
|
|
0.96428571 0.93103448 0.89655172 0.9 ]
|
|
|
|
mean value: 0.9224686159224535
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.2
|
|
|
|
Accuracy on Blind test: 0.5
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.87169266 0.93242407 0.9023416 1.00182915 0.90367198 0.91859269
|
|
0.90991735 0.90380979 0.88591456 0.89217138]
|
|
|
|
mean value: 0.9122365236282348
|
|
|
|
key: score_time
|
|
value: [0.22929215 0.26355243 0.24770474 0.25745416 0.18990588 0.25744367
|
|
0.27588391 0.26097345 0.2340591 0.21919918]
|
|
|
|
mean value: 0.24354686737060546
|
|
|
|
key: test_mcc
|
|
value: [0.8953202 0.92980296 0.8951918 0.9321832 0.85933785 0.96490128
|
|
0.96490128 0.96490128 0.89342711 0.89342711]
|
|
|
|
mean value: 0.919339407234444
|
|
|
|
key: train_mcc
|
|
value: [0.95679178 0.94890036 0.94878539 0.94089544 0.9606597 0.94900279
|
|
0.94112724 0.94499908 0.94888508 0.95687833]
|
|
|
|
mean value: 0.9496925191019135
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.96491228 0.94736842 0.96491228 0.92857143 0.98214286
|
|
0.98214286 0.98214286 0.94642857 0.94642857]
|
|
|
|
mean value: 0.9592418546365915
|
|
|
|
key: train_accuracy
|
|
value: [0.97830375 0.97435897 0.97435897 0.9704142 0.98031496 0.97440945
|
|
0.97047244 0.97244094 0.97440945 0.97834646]
|
|
|
|
mean value: 0.9747829598223299
|
|
|
|
key: test_fscore
|
|
value: [0.94736842 0.96428571 0.94915254 0.96428571 0.93103448 0.98245614
|
|
0.98181818 0.98181818 0.94545455 0.94736842]
|
|
|
|
mean value: 0.959504234524998
|
|
|
|
key: train_fscore
|
|
value: [0.9785575 0.97465887 0.97445972 0.97053045 0.98039216 0.97465887
|
|
0.97076023 0.97265625 0.97455969 0.9785575 ]
|
|
|
|
mean value: 0.9749791253024628
|
|
|
|
key: test_precision
|
|
value: [0.93103448 0.96428571 0.93333333 1. 0.9 0.96551724
|
|
1. 1. 0.96296296 0.93103448]
|
|
|
|
mean value: 0.9588168217478562
|
|
|
|
key: train_precision
|
|
value: [0.96911197 0.96525097 0.96875 0.96484375 0.9765625 0.96525097
|
|
0.96138996 0.96511628 0.9688716 0.96911197]
|
|
|
|
mean value: 0.9674259954516337
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.96551724 0.93103448 0.96428571 1.
|
|
0.96428571 0.96428571 0.92857143 0.96428571]
|
|
|
|
mean value: 0.9610837438423645
|
|
|
|
key: train_recall
|
|
value: [0.98818898 0.98425197 0.98023715 0.97628458 0.98425197 0.98425197
|
|
0.98031496 0.98031496 0.98031496 0.98818898]
|
|
|
|
mean value: 0.9826600479287916
|
|
|
|
key: test_roc_auc
|
|
value: [0.9476601 0.96490148 0.94704433 0.96551724 0.92857143 0.98214286
|
|
0.98214286 0.98214286 0.94642857 0.94642857]
|
|
|
|
mean value: 0.9592980295566503
|
|
|
|
key: train_roc_auc
|
|
value: [0.97828421 0.97433942 0.97437055 0.97042576 0.98031496 0.97440945
|
|
0.97047244 0.97244094 0.97440945 0.97834646]
|
|
|
|
mean value: 0.9747813637919767
|
|
|
|
key: test_jcc
|
|
value: [0.9 0.93103448 0.90322581 0.93103448 0.87096774 0.96551724
|
|
0.96428571 0.96428571 0.89655172 0.9 ]
|
|
|
|
mean value: 0.9226902907993009
|
|
|
|
key: train_jcc
|
|
value: [0.95801527 0.95057034 0.95019157 0.94274809 0.96153846 0.95057034
|
|
0.94318182 0.94676806 0.95038168 0.95801527]
|
|
|
|
mean value: 0.9511980901192165
|
|
|
|
MCC on Blind test: 0.2
|
|
|
|
Accuracy on Blind test: 0.5
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02086926 0.00807834 0.00846243 0.00790358 0.00789189 0.0084095
|
|
0.00794506 0.00821877 0.00814724 0.00783157]
|
|
|
|
mean value: 0.009375762939453126
|
|
|
|
key: score_time
|
|
value: [0.01121163 0.00862956 0.00874233 0.00838256 0.00837135 0.00809836
|
|
0.00858903 0.00843906 0.00876284 0.00867867]
|
|
|
|
mean value: 0.008790540695190429
|
|
|
|
key: test_mcc
|
|
value: [0.79778885 0.72706729 0.79110556 0.66755025 0.71611487 0.78772636
|
|
0.79385662 0.75047877 0.67900461 0.75047877]
|
|
|
|
mean value: 0.7461171974035183
|
|
|
|
key: train_mcc
|
|
value: [0.77122271 0.76334013 0.76731664 0.68276748 0.78361641 0.76800824
|
|
0.76819892 0.77588525 0.78361641 0.77574087]
|
|
|
|
mean value: 0.763971305717051
|
|
|
|
key: test_accuracy
|
|
value: [0.89473684 0.85964912 0.89473684 0.80701754 0.85714286 0.89285714
|
|
0.89285714 0.875 0.83928571 0.875 ]
|
|
|
|
mean value: 0.868828320802005
|
|
|
|
key: train_accuracy
|
|
value: [0.88560158 0.8816568 0.88362919 0.84023669 0.89173228 0.88385827
|
|
0.88385827 0.88779528 0.89173228 0.88779528]
|
|
|
|
mean value: 0.8817895913898336
|
|
|
|
key: test_fscore
|
|
value: [0.9 0.86666667 0.9 0.76595745 0.86206897 0.88888889
|
|
0.88461538 0.87272727 0.84210526 0.87719298]
|
|
|
|
mean value: 0.8660222870838
|
|
|
|
key: train_fscore
|
|
value: [0.88627451 0.88142292 0.88408644 0.83298969 0.89278752 0.88543689
|
|
0.88588008 0.88932039 0.89278752 0.88888889]
|
|
|
|
mean value: 0.8819874865979285
|
|
|
|
key: test_precision
|
|
value: [0.84375 0.8125 0.87096774 1. 0.83333333 0.92307692
|
|
0.95833333 0.88888889 0.82758621 0.86206897]
|
|
|
|
mean value: 0.8820505392981756
|
|
|
|
key: train_precision
|
|
value: [0.8828125 0.88492063 0.87890625 0.87068966 0.88416988 0.87356322
|
|
0.87072243 0.87739464 0.88416988 0.88030888]
|
|
|
|
mean value: 0.8787657976607903
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.92857143 0.93103448 0.62068966 0.89285714 0.85714286
|
|
0.82142857 0.85714286 0.85714286 0.89285714]
|
|
|
|
mean value: 0.8623152709359606
|
|
|
|
key: train_recall
|
|
value: [0.88976378 0.87795276 0.88932806 0.79841897 0.9015748 0.8976378
|
|
0.9015748 0.9015748 0.9015748 0.8976378 ]
|
|
|
|
mean value: 0.8857038374155799
|
|
|
|
key: test_roc_auc
|
|
value: [0.89593596 0.86083744 0.89408867 0.81034483 0.85714286 0.89285714
|
|
0.89285714 0.875 0.83928571 0.875 ]
|
|
|
|
mean value: 0.8693349753694581
|
|
|
|
key: train_roc_auc
|
|
value: [0.88559335 0.88166412 0.88364041 0.84015437 0.89173228 0.88385827
|
|
0.88385827 0.88779528 0.89173228 0.88779528]
|
|
|
|
mean value: 0.881782390837509
|
|
|
|
key: test_jcc
|
|
value: [0.81818182 0.76470588 0.81818182 0.62068966 0.75757576 0.8
|
|
0.79310345 0.77419355 0.72727273 0.78125 ]
|
|
|
|
mean value: 0.7655154655400436
|
|
|
|
key: train_jcc
|
|
value: [0.79577465 0.78798587 0.79225352 0.71378092 0.80633803 0.79442509
|
|
0.79513889 0.8006993 0.80633803 0.8 ]
|
|
|
|
mean value: 0.7892734286500613
|
|
|
|
MCC on Blind test: 0.28
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.06513762 0.05543566 0.05926275 0.05869985 0.05539632 0.05809283
|
|
0.05878782 0.06239796 0.0591898 0.21499252]
|
|
|
|
mean value: 0.07473931312561036
|
|
|
|
key: score_time
|
|
value: [0.01001787 0.00966692 0.00963521 0.00965786 0.0098114 0.0097878
|
|
0.00974226 0.00981474 0.00976157 0.01011968]
|
|
|
|
mean value: 0.009801530838012695
|
|
|
|
key: test_mcc
|
|
value: [0.92980296 0.92980296 0.92980296 0.96551724 0.82618439 0.93094934
|
|
1. 0.92857143 0.96490128 0.89342711]
|
|
|
|
mean value: 0.9298959656239084
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.96491228 0.96491228 0.98245614 0.91071429 0.96428571
|
|
1. 0.96428571 0.98214286 0.94642857]
|
|
|
|
mean value: 0.9645050125313284
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.96428571 0.96428571 0.96551724 0.98245614 0.91525424 0.96551724
|
|
1. 0.96428571 0.98181818 0.94736842]
|
|
|
|
mean value: 0.965078860612559
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96428571 0.96428571 0.96551724 1. 0.87096774 0.93333333
|
|
1. 0.96428571 1. 0.93103448]
|
|
|
|
mean value: 0.9593709942263892
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.96551724 0.96551724 0.96428571 1.
|
|
1. 0.96428571 0.96428571 0.96428571]
|
|
|
|
mean value: 0.9716748768472907
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96490148 0.96490148 0.96490148 0.98275862 0.91071429 0.96428571
|
|
1. 0.96428571 0.98214286 0.94642857]
|
|
|
|
mean value: 0.9645320197044336
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.93103448 0.93103448 0.93333333 0.96551724 0.84375 0.93333333
|
|
1. 0.93103448 0.96428571 0.9 ]
|
|
|
|
mean value: 0.9333323070607553
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.1
|
|
|
|
Accuracy on Blind test: 0.37
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.0158906 0.04133749 0.04191256 0.04180241 0.04169393 0.04146218
|
|
0.04147553 0.03954792 0.04155207 0.04123378]
|
|
|
|
mean value: 0.03879084587097168
|
|
|
|
key: score_time
|
|
value: [0.010324 0.01102901 0.0110786 0.01924562 0.02003503 0.02148247
|
|
0.01090479 0.02042723 0.02186942 0.01970553]
|
|
|
|
mean value: 0.016610169410705568
|
|
|
|
key: test_mcc
|
|
value: [0.82512315 0.76689254 0.79110556 0.9321832 0.75434227 0.82195294
|
|
0.89802651 0.85933785 0.67900461 0.82195294]
|
|
|
|
mean value: 0.8149921569819407
|
|
|
|
key: train_mcc
|
|
value: [0.87014673 0.87419439 0.85823465 0.85931426 0.87499279 0.85486752
|
|
0.83910959 0.86274648 0.87089581 0.85105352]
|
|
|
|
mean value: 0.8615555753216068
|
|
|
|
key: test_accuracy
|
|
value: [0.9122807 0.87719298 0.89473684 0.96491228 0.875 0.91071429
|
|
0.94642857 0.92857143 0.83928571 0.91071429]
|
|
|
|
mean value: 0.905983709273183
|
|
|
|
key: train_accuracy
|
|
value: [0.93491124 0.93688363 0.92899408 0.92899408 0.93700787 0.92716535
|
|
0.91929134 0.93110236 0.93503937 0.92519685]
|
|
|
|
mean value: 0.9304586187081645
|
|
|
|
key: test_fscore
|
|
value: [0.9122807 0.8852459 0.9 0.96428571 0.88135593 0.90909091
|
|
0.94339623 0.92592593 0.84210526 0.90909091]
|
|
|
|
mean value: 0.9072777483563568
|
|
|
|
key: train_fscore
|
|
value: [0.93592233 0.9379845 0.9296875 0.93076923 0.93846154 0.92843327
|
|
0.92069632 0.93230174 0.93641618 0.92664093]
|
|
|
|
mean value: 0.9317313541686736
|
|
|
|
key: test_precision
|
|
value: [0.89655172 0.81818182 0.87096774 1. 0.83870968 0.92592593
|
|
1. 0.96153846 0.82758621 0.92592593]
|
|
|
|
mean value: 0.9065387481961453
|
|
|
|
key: train_precision
|
|
value: [0.92337165 0.92366412 0.91891892 0.90636704 0.91729323 0.91254753
|
|
0.90494297 0.91634981 0.91698113 0.90909091]
|
|
|
|
mean value: 0.9149527308196
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.96428571 0.93103448 0.93103448 0.92857143 0.89285714
|
|
0.89285714 0.89285714 0.85714286 0.89285714]
|
|
|
|
mean value: 0.9112068965517242
|
|
|
|
key: train_recall
|
|
value: [0.9488189 0.95275591 0.94071146 0.95652174 0.96062992 0.94488189
|
|
0.93700787 0.9488189 0.95669291 0.94488189]
|
|
|
|
mean value: 0.9491721390557406
|
|
|
|
key: test_roc_auc
|
|
value: [0.91256158 0.87869458 0.89408867 0.96551724 0.875 0.91071429
|
|
0.94642857 0.92857143 0.83928571 0.91071429]
|
|
|
|
mean value: 0.9061576354679803
|
|
|
|
key: train_roc_auc
|
|
value: [0.93488376 0.93685226 0.92901715 0.92904827 0.93700787 0.92716535
|
|
0.91929134 0.93110236 0.93503937 0.92519685]
|
|
|
|
mean value: 0.9304604587470044
|
|
|
|
key: test_jcc
|
|
value: [0.83870968 0.79411765 0.81818182 0.93103448 0.78787879 0.83333333
|
|
0.89285714 0.86206897 0.72727273 0.83333333]
|
|
|
|
mean value: 0.8318787915611183
|
|
|
|
key: train_jcc
|
|
value: [0.87956204 0.88321168 0.86861314 0.8705036 0.88405797 0.86642599
|
|
0.85304659 0.87318841 0.88043478 0.86330935]
|
|
|
|
mean value: 0.8722353558136309
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.7
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01021671 0.01008439 0.00848007 0.00817442 0.00810814 0.00807619
|
|
0.00805545 0.00815272 0.00819159 0.00817704]
|
|
|
|
mean value: 0.008571672439575195
|
|
|
|
key: score_time
|
|
value: [0.01060987 0.00964761 0.0087111 0.00849438 0.00838518 0.00844193
|
|
0.00843549 0.00848222 0.0084908 0.00846648]
|
|
|
|
mean value: 0.00881650447845459
|
|
|
|
key: test_mcc
|
|
value: [0.79778885 0.72706729 0.79110556 0.89988258 0.71611487 0.78772636
|
|
0.79385662 0.75047877 0.67900461 0.75047877]
|
|
|
|
mean value: 0.7693504297544995
|
|
|
|
key: train_mcc
|
|
value: [0.76726164 0.78700923 0.77122983 0.7514861 0.77955173 0.77186893
|
|
0.77203657 0.77574087 0.78351922 0.77174925]
|
|
|
|
mean value: 0.7731453348144388
|
|
|
|
key: test_accuracy
|
|
value: [0.89473684 0.85964912 0.89473684 0.94736842 0.85714286 0.89285714
|
|
0.89285714 0.875 0.83928571 0.875 ]
|
|
|
|
mean value: 0.8828634085213033
|
|
|
|
key: train_accuracy
|
|
value: [0.88362919 0.89349112 0.88560158 0.87573964 0.88976378 0.88582677
|
|
0.88582677 0.88779528 0.89173228 0.88582677]
|
|
|
|
mean value: 0.8865233192004845
|
|
|
|
key: test_fscore
|
|
value: [0.9 0.86666667 0.9 0.94545455 0.86206897 0.88888889
|
|
0.88461538 0.87272727 0.84210526 0.87719298]
|
|
|
|
mean value: 0.8839719969484034
|
|
|
|
key: train_fscore
|
|
value: [0.88408644 0.89328063 0.88582677 0.87573964 0.89019608 0.88715953
|
|
0.8875969 0.88888889 0.89236791 0.88671875]
|
|
|
|
mean value: 0.8871861548728417
|
|
|
|
key: test_precision
|
|
value: [0.84375 0.8125 0.87096774 1. 0.83333333 0.92307692
|
|
0.95833333 0.88888889 0.82758621 0.86206897]
|
|
|
|
mean value: 0.8820505392981756
|
|
|
|
key: train_precision
|
|
value: [0.88235294 0.8968254 0.88235294 0.87401575 0.88671875 0.87692308
|
|
0.8740458 0.88030888 0.88715953 0.87984496]
|
|
|
|
mean value: 0.8820548030282749
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.92857143 0.93103448 0.89655172 0.89285714 0.85714286
|
|
0.82142857 0.85714286 0.85714286 0.89285714]
|
|
|
|
mean value: 0.8899014778325123
|
|
|
|
key: train_recall
|
|
value: [0.88582677 0.88976378 0.88932806 0.87747036 0.89370079 0.8976378
|
|
0.9015748 0.8976378 0.8976378 0.89370079]
|
|
|
|
mean value: 0.8924278733932962
|
|
|
|
key: test_roc_auc
|
|
value: [0.89593596 0.86083744 0.89408867 0.94827586 0.85714286 0.89285714
|
|
0.89285714 0.875 0.83928571 0.875 ]
|
|
|
|
mean value: 0.883128078817734
|
|
|
|
key: train_roc_auc
|
|
value: [0.88362485 0.89349849 0.88560891 0.87574305 0.88976378 0.88582677
|
|
0.88582677 0.88779528 0.89173228 0.88582677]
|
|
|
|
mean value: 0.8865246957766643
|
|
|
|
key: test_jcc
|
|
value: [0.81818182 0.76470588 0.81818182 0.89655172 0.75757576 0.8
|
|
0.79310345 0.77419355 0.72727273 0.78125 ]
|
|
|
|
mean value: 0.7931016724365952
|
|
|
|
key: train_jcc
|
|
value: [0.79225352 0.80714286 0.795053 0.77894737 0.80212014 0.7972028
|
|
0.79790941 0.8 0.80565371 0.79649123]
|
|
|
|
mean value: 0.7972774034752823
|
|
|
|
MCC on Blind test: 0.28
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01202488 0.01277637 0.0122776 0.01319098 0.013026 0.01311898
|
|
0.01343989 0.01440072 0.01275349 0.01293349]
|
|
|
|
mean value: 0.01299424171447754
|
|
|
|
key: score_time
|
|
value: [0.00864363 0.00991464 0.0099988 0.01055336 0.01052094 0.01076031
|
|
0.01056218 0.01061869 0.01053238 0.0105021 ]
|
|
|
|
mean value: 0.010260701179504395
|
|
|
|
key: test_mcc
|
|
value: [0.7589669 0.82942474 0.30469361 0.9321832 0.26997462 0.6882472
|
|
0.26997462 0.76225171 0.82195294 0.85933785]
|
|
|
|
mean value: 0.649700739821353
|
|
|
|
key: train_mcc
|
|
value: [0.88439556 0.87825675 0.35307124 0.8935508 0.46259784 0.65176051
|
|
0.33210739 0.86516672 0.88616336 0.86094079]
|
|
|
|
mean value: 0.7068010966004633
|
|
|
|
key: test_accuracy
|
|
value: [0.87719298 0.9122807 0.57894737 0.96491228 0.58928571 0.82142857
|
|
0.58928571 0.875 0.91071429 0.92857143]
|
|
|
|
mean value: 0.8047619047619048
|
|
|
|
key: train_accuracy
|
|
value: [0.9408284 0.93885602 0.61143984 0.94674556 0.68110236 0.8011811
|
|
0.6023622 0.93110236 0.94291339 0.92913386]
|
|
|
|
mean value: 0.8325665098075758
|
|
|
|
key: test_fscore
|
|
value: [0.88135593 0.91525424 0.29411765 0.96428571 0.7012987 0.84848485
|
|
0.7012987 0.8852459 0.9122807 0.93103448]
|
|
|
|
mean value: 0.8034656868070665
|
|
|
|
key: train_fscore
|
|
value: [0.94318182 0.94003868 0.36245955 0.94632207 0.75675676 0.83305785
|
|
0.71468927 0.93383743 0.94211577 0.93181818]
|
|
|
|
mean value: 0.8304277370347289
|
|
|
|
key: test_precision
|
|
value: [0.83870968 0.87096774 1. 1. 0.55102041 0.73684211
|
|
0.55102041 0.81818182 0.89655172 0.9 ]
|
|
|
|
mean value: 0.8163293883264277
|
|
|
|
key: train_precision
|
|
value: [0.90875912 0.92395437 1. 0.952 0.61165049 0.71794872
|
|
0.55726872 0.89818182 0.95546559 0.89781022]
|
|
|
|
mean value: 0.8423039046768191
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.96428571 0.17241379 0.93103448 0.96428571 1.
|
|
0.96428571 0.96428571 0.92857143 0.96428571]
|
|
|
|
mean value: 0.8782019704433498
|
|
|
|
key: train_recall
|
|
value: [0.98031496 0.95669291 0.22134387 0.94071146 0.99212598 0.99212598
|
|
0.99606299 0.97244094 0.92913386 0.96850394]
|
|
|
|
mean value: 0.8949456910771529
|
|
|
|
key: test_roc_auc
|
|
value: [0.87807882 0.91317734 0.5862069 0.96551724 0.58928571 0.82142857
|
|
0.58928571 0.875 0.91071429 0.92857143]
|
|
|
|
mean value: 0.8057266009852218
|
|
|
|
key: train_roc_auc
|
|
value: [0.94075037 0.93882076 0.61067194 0.94673368 0.68110236 0.8011811
|
|
0.6023622 0.93110236 0.94291339 0.92913386]
|
|
|
|
mean value: 0.832477202701441
|
|
|
|
key: test_jcc
|
|
value: [0.78787879 0.84375 0.17241379 0.93103448 0.54 0.73684211
|
|
0.54 0.79411765 0.83870968 0.87096774]
|
|
|
|
mean value: 0.7055714235417677
|
|
|
|
key: train_jcc
|
|
value: [0.89247312 0.88686131 0.22134387 0.89811321 0.60869565 0.71388102
|
|
0.55604396 0.87588652 0.89056604 0.87234043]
|
|
|
|
mean value: 0.7416205129351496
|
|
|
|
MCC on Blind test: 0.17
|
|
|
|
Accuracy on Blind test: 0.53
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01449275 0.01378703 0.01423025 0.01384568 0.01323628 0.01458144
|
|
0.01555753 0.01341605 0.01425099 0.01479554]
|
|
|
|
mean value: 0.014219355583190919
|
|
|
|
key: score_time
|
|
value: [0.01101327 0.01102638 0.01097178 0.01104283 0.01102424 0.01096249
|
|
0.01104045 0.01096034 0.01094913 0.01095819]
|
|
|
|
mean value: 0.010994911193847656
|
|
|
|
key: test_mcc
|
|
value: [0.85960591 0.92980296 0.8615634 0.9321832 0.76225171 0.96490128
|
|
0.93094934 0.89342711 0.78772636 0.82618439]
|
|
|
|
mean value: 0.8748595658423624
|
|
|
|
key: train_mcc
|
|
value: [0.90933143 0.9215681 0.86053354 0.89231105 0.86150531 0.91030286
|
|
0.90951226 0.87252327 0.91349911 0.86883933]
|
|
|
|
mean value: 0.8919926257747095
|
|
|
|
key: test_accuracy
|
|
value: [0.92982456 0.96491228 0.92982456 0.96491228 0.875 0.98214286
|
|
0.96428571 0.94642857 0.89285714 0.91071429]
|
|
|
|
mean value: 0.9360902255639098
|
|
|
|
key: train_accuracy
|
|
value: [0.95463511 0.96055227 0.9270217 0.94477318 0.92913386 0.95472441
|
|
0.95472441 0.93503937 0.95669291 0.93307087]
|
|
|
|
mean value: 0.9450368075292364
|
|
|
|
key: test_fscore
|
|
value: [0.92857143 0.96428571 0.93333333 0.96428571 0.8852459 0.98181818
|
|
0.96296296 0.94736842 0.89655172 0.91525424]
|
|
|
|
mean value: 0.9379677619375378
|
|
|
|
key: train_fscore
|
|
value: [0.95499022 0.96 0.9310987 0.94238683 0.93207547 0.95372233
|
|
0.95445545 0.9373814 0.95703125 0.93560606]
|
|
|
|
mean value: 0.9458747709029058
|
|
|
|
key: test_precision
|
|
value: [0.92857143 0.96428571 0.90322581 1. 0.81818182 1.
|
|
1. 0.93103448 0.86666667 0.87096774]
|
|
|
|
mean value: 0.9282933658851346
|
|
|
|
key: train_precision
|
|
value: [0.94941634 0.97560976 0.88028169 0.98283262 0.89492754 0.97530864
|
|
0.96015936 0.9047619 0.9496124 0.90145985]
|
|
|
|
mean value: 0.937437010931088
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.96428571 0.96551724 0.93103448 0.96428571 0.96428571
|
|
0.92857143 0.96428571 0.92857143 0.96428571]
|
|
|
|
mean value: 0.9503694581280788
|
|
|
|
key: train_recall
|
|
value: [0.96062992 0.94488189 0.98814229 0.90513834 0.97244094 0.93307087
|
|
0.9488189 0.97244094 0.96456693 0.97244094]
|
|
|
|
mean value: 0.9562571970993744
|
|
|
|
key: test_roc_auc
|
|
value: [0.92980296 0.96490148 0.92918719 0.96551724 0.875 0.98214286
|
|
0.96428571 0.94642857 0.89285714 0.91071429]
|
|
|
|
mean value: 0.9360837438423646
|
|
|
|
key: train_roc_auc
|
|
value: [0.95462326 0.96058324 0.92714201 0.94469515 0.92913386 0.95472441
|
|
0.95472441 0.93503937 0.95669291 0.93307087]
|
|
|
|
mean value: 0.9450429491768074
|
|
|
|
key: test_jcc
|
|
value: [0.86666667 0.93103448 0.875 0.93103448 0.79411765 0.96428571
|
|
0.92857143 0.9 0.8125 0.84375 ]
|
|
|
|
mean value: 0.8846960422099874
|
|
|
|
key: train_jcc
|
|
value: [0.91385768 0.92307692 0.87108014 0.89105058 0.87279152 0.91153846
|
|
0.91287879 0.88214286 0.917603 0.87900356]
|
|
|
|
mean value: 0.8975023504978233
|
|
|
|
MCC on Blind test: 0.12
|
|
|
|
Accuracy on Blind test: 0.42
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.11413288 0.1020143 0.10187316 0.10205579 0.10220885 0.10228324
|
|
0.10212898 0.10206866 0.10222936 0.10228419]
|
|
|
|
mean value: 0.10332794189453125
|
|
|
|
key: score_time
|
|
value: [0.01537633 0.01542163 0.01549554 0.01547527 0.01547194 0.01569366
|
|
0.01554465 0.01544762 0.01546764 0.01553702]
|
|
|
|
mean value: 0.015493130683898926
|
|
|
|
key: test_mcc
|
|
value: [0.92980296 0.8951918 0.96547546 0.96551724 0.82618439 1.
|
|
1. 0.92857143 0.92857143 0.89342711]
|
|
|
|
mean value: 0.9332741814992628
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.94736842 0.98245614 0.98245614 0.91071429 1.
|
|
1. 0.96428571 0.96428571 0.94642857]
|
|
|
|
mean value: 0.9662907268170426
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.96428571 0.94545455 0.98305085 0.98245614 0.91525424 1.
|
|
1. 0.96428571 0.96428571 0.94736842]
|
|
|
|
mean value: 0.966644133446096
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96428571 0.96296296 0.96666667 1. 0.87096774 1.
|
|
1. 0.96428571 0.96428571 0.93103448]
|
|
|
|
mean value: 0.9624488997180877
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.92857143 1. 0.96551724 0.96428571 1.
|
|
1. 0.96428571 0.96428571 0.96428571]
|
|
|
|
mean value: 0.971551724137931
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96490148 0.94704433 0.98214286 0.98275862 0.91071429 1.
|
|
1. 0.96428571 0.96428571 0.94642857]
|
|
|
|
mean value: 0.966256157635468
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.93103448 0.89655172 0.96666667 0.96551724 0.84375 1.
|
|
1. 0.93103448 0.93103448 0.9 ]
|
|
|
|
mean value: 0.936558908045977
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.12
|
|
|
|
Accuracy on Blind test: 0.39
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03648138 0.03345275 0.03663468 0.03281045 0.03574181 0.03963566
|
|
0.04988575 0.04006696 0.03826404 0.04667592]
|
|
|
|
mean value: 0.03896493911743164
|
|
|
|
key: score_time
|
|
value: [0.0171628 0.0221951 0.02012706 0.02416539 0.02962255 0.03259301
|
|
0.0243876 0.07890892 0.02576041 0.01878405]
|
|
|
|
mean value: 0.029370689392089845
|
|
|
|
key: test_mcc
|
|
value: [0.92980296 0.8951918 0.8951918 1. 0.82195294 0.89802651
|
|
0.96490128 0.92857143 0.92857143 0.89342711]
|
|
|
|
mean value: 0.915563726523076
|
|
|
|
key: train_mcc
|
|
value: [0.99606293 0.98425123 0.97636129 0.99606299 0.98819663 0.99607071
|
|
0.99212598 0.98819663 0.98819663 0.99607071]
|
|
|
|
mean value: 0.9901595758977872
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.94736842 0.94736842 1. 0.91071429 0.94642857
|
|
0.98214286 0.96428571 0.96428571 0.94642857]
|
|
|
|
mean value: 0.9573934837092731
|
|
|
|
key: train_accuracy
|
|
value: [0.99802761 0.99211045 0.98816568 0.99802761 0.99409449 0.9980315
|
|
0.99606299 0.99409449 0.99409449 0.9980315 ]
|
|
|
|
mean value: 0.9950740809765644
|
|
|
|
key: test_fscore
|
|
value: [0.96428571 0.94545455 0.94915254 1. 0.9122807 0.94915254
|
|
0.98181818 0.96428571 0.96428571 0.94736842]
|
|
|
|
mean value: 0.957808407768265
|
|
|
|
key: train_fscore
|
|
value: [0.99803536 0.99215686 0.98809524 0.99802761 0.99410609 0.99803536
|
|
0.99606299 0.99408284 0.99410609 0.99802761]
|
|
|
|
mean value: 0.9950736067689547
|
|
|
|
key: test_precision
|
|
value: [0.96428571 0.96296296 0.93333333 1. 0.89655172 0.90322581
|
|
1. 0.96428571 0.96428571 0.93103448]
|
|
|
|
mean value: 0.9519965452501604
|
|
|
|
key: train_precision
|
|
value: [0.99607843 0.98828125 0.99203187 0.99606299 0.99215686 0.99607843
|
|
0.99606299 0.99604743 0.99215686 1. ]
|
|
|
|
mean value: 0.9944957125827263
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.92857143 0.96551724 1. 0.92857143 1.
|
|
0.96428571 0.96428571 0.96428571 0.96428571]
|
|
|
|
mean value: 0.9644088669950739
|
|
|
|
key: train_recall
|
|
value: [1. 0.99606299 0.98418972 1. 0.99606299 1.
|
|
0.99606299 0.99212598 0.99606299 0.99606299]
|
|
|
|
mean value: 0.9956630668202048
|
|
|
|
key: test_roc_auc
|
|
value: [0.96490148 0.94704433 0.94704433 1. 0.91071429 0.94642857
|
|
0.98214286 0.96428571 0.96428571 0.94642857]
|
|
|
|
mean value: 0.9573275862068966
|
|
|
|
key: train_roc_auc
|
|
value: [0.99802372 0.99210264 0.98815785 0.9980315 0.99409449 0.9980315
|
|
0.99606299 0.99409449 0.99409449 0.9980315 ]
|
|
|
|
mean value: 0.9950725156391025
|
|
|
|
key: test_jcc
|
|
value: [0.93103448 0.89655172 0.90322581 1. 0.83870968 0.90322581
|
|
0.96428571 0.93103448 0.93103448 0.9 ]
|
|
|
|
mean value: 0.9199102177022088
|
|
|
|
key: train_jcc
|
|
value: [0.99607843 0.9844358 0.97647059 0.99606299 0.98828125 0.99607843
|
|
0.99215686 0.98823529 0.98828125 0.99606299]
|
|
|
|
mean value: 0.9902143889760475
|
|
|
|
MCC on Blind test: 0.09
|
|
|
|
Accuracy on Blind test: 0.38
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.16053271 0.1920464 0.18025184 0.14875555 0.09875679 0.1012013
|
|
0.16610289 0.17586374 0.18154573 0.15119648]
|
|
|
|
mean value: 0.1556253433227539
|
|
|
|
key: score_time
|
|
value: [0.02007532 0.02150774 0.02181888 0.01334596 0.02305841 0.01333928
|
|
0.02570271 0.02890587 0.01332211 0.02642632]
|
|
|
|
mean value: 0.020750260353088378
|
|
|
|
key: test_mcc
|
|
value: [0.76689254 0.79778885 0.75462449 0.8953202 0.71611487 0.78772636
|
|
0.82618439 0.75047877 0.71611487 0.78772636]
|
|
|
|
mean value: 0.7798971716204695
|
|
|
|
key: train_mcc
|
|
value: [0.84667632 0.85019923 0.85012683 0.84728344 0.85850727 0.84698856
|
|
0.8231473 0.84725158 0.8742597 0.86237183]
|
|
|
|
mean value: 0.8506812043667749
|
|
|
|
key: test_accuracy
|
|
value: [0.87719298 0.89473684 0.87719298 0.94736842 0.85714286 0.89285714
|
|
0.91071429 0.875 0.85714286 0.89285714]
|
|
|
|
mean value: 0.8882205513784461
|
|
|
|
key: train_accuracy
|
|
value: [0.92307692 0.92504931 0.92504931 0.92307692 0.92913386 0.92322835
|
|
0.91141732 0.92322835 0.93700787 0.93110236]
|
|
|
|
mean value: 0.9251370575719455
|
|
|
|
key: test_fscore
|
|
value: [0.8852459 0.9 0.88135593 0.94736842 0.86206897 0.88888889
|
|
0.90566038 0.87272727 0.86206897 0.89655172]
|
|
|
|
mean value: 0.8901936449042431
|
|
|
|
key: train_fscore
|
|
value: [0.9245648 0.92578125 0.92519685 0.92485549 0.92996109 0.9245648
|
|
0.91262136 0.92485549 0.93774319 0.93177388]
|
|
|
|
mean value: 0.9261918195384349
|
|
|
|
key: test_precision
|
|
value: [0.81818182 0.84375 0.86666667 0.96428571 0.83333333 0.92307692
|
|
0.96 0.88888889 0.83333333 0.86666667]
|
|
|
|
mean value: 0.8798183344433345
|
|
|
|
key: train_precision
|
|
value: [0.90874525 0.91860465 0.92156863 0.90225564 0.91923077 0.90874525
|
|
0.90038314 0.90566038 0.92692308 0.92277992]
|
|
|
|
mean value: 0.9134896700062805
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.89655172 0.93103448 0.89285714 0.85714286
|
|
0.85714286 0.85714286 0.89285714 0.92857143]
|
|
|
|
mean value: 0.9041871921182266
|
|
|
|
key: train_recall
|
|
value: [0.94094488 0.93307087 0.92885375 0.9486166 0.94094488 0.94094488
|
|
0.92519685 0.94488189 0.9488189 0.94094488]
|
|
|
|
mean value: 0.9393218387227288
|
|
|
|
key: test_roc_auc
|
|
value: [0.87869458 0.89593596 0.87684729 0.9476601 0.85714286 0.89285714
|
|
0.91071429 0.875 0.85714286 0.89285714]
|
|
|
|
mean value: 0.8884852216748769
|
|
|
|
key: train_roc_auc
|
|
value: [0.92304161 0.92503346 0.9250568 0.9231272 0.92913386 0.92322835
|
|
0.91141732 0.92322835 0.93700787 0.93110236]
|
|
|
|
mean value: 0.9251377174691109
|
|
|
|
key: test_jcc
|
|
value: [0.79411765 0.81818182 0.78787879 0.9 0.75757576 0.8
|
|
0.82758621 0.77419355 0.75757576 0.8125 ]
|
|
|
|
mean value: 0.8029609523554593
|
|
|
|
key: train_jcc
|
|
value: [0.85971223 0.86181818 0.86080586 0.86021505 0.86909091 0.85971223
|
|
0.83928571 0.86021505 0.88278388 0.87226277]
|
|
|
|
mean value: 0.8625901890465713
|
|
|
|
MCC on Blind test: 0.29
|
|
|
|
Accuracy on Blind test: 0.72
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.27303624 0.25770164 0.24925447 0.25132322 0.24955368 0.25238132
|
|
0.25456405 0.25274968 0.25270486 0.25955772]
|
|
|
|
mean value: 0.25528268814086913
|
|
|
|
key: score_time
|
|
value: [0.00924158 0.0084126 0.00870824 0.00866604 0.0086503 0.0085566
|
|
0.00878334 0.00932741 0.00889683 0.00852871]
|
|
|
|
mean value: 0.008777165412902832
|
|
|
|
key: test_mcc
|
|
value: [0.92980296 0.92980296 0.8951918 1. 0.82195294 0.93094934
|
|
1. 0.89342711 0.96490128 0.92857143]
|
|
|
|
mean value: 0.9294599815844486
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.96491228 0.94736842 1. 0.91071429 0.96428571
|
|
1. 0.94642857 0.98214286 0.96428571]
|
|
|
|
mean value: 0.9645050125313284
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.96428571 0.96428571 0.94915254 1. 0.9122807 0.96551724
|
|
1. 0.94545455 0.98181818 0.96428571]
|
|
|
|
mean value: 0.9647080355636448
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96428571 0.96428571 0.93333333 1. 0.89655172 0.93333333
|
|
1. 0.96296296 1. 0.96428571]
|
|
|
|
mean value: 0.9619038496624703
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.96551724 1. 0.92857143 1.
|
|
1. 0.92857143 0.96428571 0.96428571]
|
|
|
|
mean value: 0.9679802955665024
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96490148 0.96490148 0.94704433 1. 0.91071429 0.96428571
|
|
1. 0.94642857 0.98214286 0.96428571]
|
|
|
|
mean value: 0.9644704433497537
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.93103448 0.93103448 0.90322581 1. 0.83870968 0.93333333
|
|
1. 0.89655172 0.96428571 0.93103448]
|
|
|
|
mean value: 0.9329209703903808
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.1
|
|
|
|
Accuracy on Blind test: 0.3
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.01269507 0.0144012 0.01376104 0.01396441 0.01427913 0.01633024
|
|
0.01445484 0.01412106 0.01710892 0.01414371]
|
|
|
|
mean value: 0.014525961875915528
|
|
|
|
key: score_time
|
|
value: [0.01109529 0.01073503 0.01095819 0.01101351 0.01113129 0.01230645
|
|
0.01111698 0.01105785 0.01199865 0.01175356]
|
|
|
|
mean value: 0.011316680908203125
|
|
|
|
key: test_mcc
|
|
value: [0.5149026 0.65634573 0.65634573 0.76689254 0.67900461 0.57735027
|
|
0.83484711 0.64285714 0.56573571 0.71611487]
|
|
|
|
mean value: 0.6610396314952282
|
|
|
|
key: train_mcc
|
|
value: [0.76157807 0.80278863 0.76582615 0.80208917 0.81501748 0.8019582
|
|
0.81112421 0.82360735 0.73708689 0.76803489]
|
|
|
|
mean value: 0.78891110313645
|
|
|
|
key: test_accuracy
|
|
value: [0.75438596 0.8245614 0.8245614 0.87719298 0.83928571 0.78571429
|
|
0.91071429 0.82142857 0.76785714 0.85714286]
|
|
|
|
mean value: 0.8262844611528822
|
|
|
|
key: train_accuracy
|
|
value: [0.87573964 0.90138067 0.87968442 0.89940828 0.90748031 0.8996063
|
|
0.90551181 0.91141732 0.86417323 0.87992126]
|
|
|
|
mean value: 0.8924323253971952
|
|
|
|
key: test_fscore
|
|
value: [0.76666667 0.83333333 0.81481481 0.86792453 0.84210526 0.76923077
|
|
0.90196078 0.82142857 0.8 0.85185185]
|
|
|
|
mean value: 0.8269316583099514
|
|
|
|
key: train_fscore
|
|
value: [0.86509636 0.90118577 0.87103594 0.89440994 0.90693069 0.89527721
|
|
0.9047619 0.90945674 0.8738574 0.87048832]
|
|
|
|
mean value: 0.8892500281591235
|
|
|
|
key: test_precision
|
|
value: [0.71875 0.78125 0.88 0.95833333 0.82758621 0.83333333
|
|
1. 0.82142857 0.7027027 0.88461538]
|
|
|
|
mean value: 0.8407999532309878
|
|
|
|
key: train_precision
|
|
value: [0.94835681 0.9047619 0.93636364 0.93913043 0.9123506 0.93562232
|
|
0.912 0.93004115 0.81569966 0.94470046]
|
|
|
|
mean value: 0.9179026970421955
|
|
|
|
key: test_recall
|
|
value: [0.82142857 0.89285714 0.75862069 0.79310345 0.85714286 0.71428571
|
|
0.82142857 0.82142857 0.92857143 0.82142857]
|
|
|
|
mean value: 0.8230295566502464
|
|
|
|
key: train_recall
|
|
value: [0.79527559 0.8976378 0.81422925 0.85375494 0.9015748 0.85826772
|
|
0.8976378 0.88976378 0.94094488 0.80708661]
|
|
|
|
mean value: 0.8656173166101273
|
|
|
|
key: test_roc_auc
|
|
value: [0.75554187 0.82573892 0.82573892 0.87869458 0.83928571 0.78571429
|
|
0.91071429 0.82142857 0.76785714 0.85714286]
|
|
|
|
mean value: 0.8267857142857143
|
|
|
|
key: train_roc_auc
|
|
value: [0.87589866 0.90138807 0.87955557 0.89931842 0.90748031 0.8996063
|
|
0.90551181 0.91141732 0.86417323 0.87992126]
|
|
|
|
mean value: 0.892427095328499
|
|
|
|
key: test_jcc
|
|
value: [0.62162162 0.71428571 0.6875 0.76666667 0.72727273 0.625
|
|
0.82142857 0.6969697 0.66666667 0.74193548]
|
|
|
|
mean value: 0.7069347148782633
|
|
|
|
key: train_jcc
|
|
value: [0.76226415 0.82014388 0.77153558 0.80898876 0.82971014 0.81040892
|
|
0.82608696 0.83394834 0.77597403 0.77067669]
|
|
|
|
mean value: 0.8009737460973876
|
|
|
|
MCC on Blind test: 0.31
|
|
|
|
Accuracy on Blind test: 0.68
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01170778 0.01146626 0.03049469 0.03104663 0.02517843 0.0244348
|
|
0.03004813 0.03791165 0.03372884 0.02444196]
|
|
|
|
mean value: 0.026045918464660645
|
|
|
|
key: score_time
|
|
value: [0.01076579 0.01079035 0.01894236 0.01367116 0.02186847 0.02198577
|
|
0.01771045 0.02187347 0.02229476 0.02252173]
|
|
|
|
mean value: 0.018242430686950684
|
|
|
|
key: test_mcc
|
|
value: [0.82942474 0.76689254 0.79110556 0.9321832 0.71611487 0.82195294
|
|
0.85933785 0.78772636 0.67900461 0.75047877]
|
|
|
|
mean value: 0.7934221443680324
|
|
|
|
key: train_mcc
|
|
value: [0.8266528 0.81876065 0.82265144 0.82358593 0.83910959 0.81142619
|
|
0.80377277 0.81930411 0.83123063 0.80759374]
|
|
|
|
mean value: 0.8204087868380765
|
|
|
|
key: test_accuracy
|
|
value: [0.9122807 0.87719298 0.89473684 0.96491228 0.85714286 0.91071429
|
|
0.92857143 0.89285714 0.83928571 0.875 ]
|
|
|
|
mean value: 0.8952694235588973
|
|
|
|
key: train_accuracy
|
|
value: [0.91321499 0.90927022 0.9112426 0.9112426 0.91929134 0.90551181
|
|
0.9015748 0.90944882 0.91535433 0.90354331]
|
|
|
|
mean value: 0.9099694823650002
|
|
|
|
key: test_fscore
|
|
value: [0.91525424 0.8852459 0.9 0.96428571 0.86206897 0.90909091
|
|
0.92592593 0.88888889 0.84210526 0.87719298]
|
|
|
|
mean value: 0.8970058788250195
|
|
|
|
key: train_fscore
|
|
value: [0.91439689 0.91050584 0.91193738 0.9132948 0.92069632 0.90697674
|
|
0.9034749 0.91085271 0.91682785 0.90522244]
|
|
|
|
mean value: 0.9114185875040357
|
|
|
|
key: test_precision
|
|
value: [0.87096774 0.81818182 0.87096774 1. 0.83333333 0.92592593
|
|
0.96153846 0.92307692 0.82758621 0.86206897]
|
|
|
|
mean value: 0.8893647118341224
|
|
|
|
key: train_precision
|
|
value: [0.90384615 0.9 0.90310078 0.89097744 0.90494297 0.89312977
|
|
0.88636364 0.89694656 0.90114068 0.88973384]
|
|
|
|
mean value: 0.8970181835384771
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.93103448 0.93103448 0.89285714 0.89285714
|
|
0.89285714 0.85714286 0.85714286 0.89285714]
|
|
|
|
mean value: 0.9076354679802956
|
|
|
|
key: train_recall
|
|
value: [0.92519685 0.92125984 0.92094862 0.93675889 0.93700787 0.92125984
|
|
0.92125984 0.92519685 0.93307087 0.92125984]
|
|
|
|
mean value: 0.9263219320905045
|
|
|
|
key: test_roc_auc
|
|
value: [0.91317734 0.87869458 0.89408867 0.96551724 0.85714286 0.91071429
|
|
0.92857143 0.89285714 0.83928571 0.875 ]
|
|
|
|
mean value: 0.8955049261083744
|
|
|
|
key: train_roc_auc
|
|
value: [0.91319131 0.90924652 0.91126171 0.91129283 0.91929134 0.90551181
|
|
0.9015748 0.90944882 0.91535433 0.90354331]
|
|
|
|
mean value: 0.9099716784413806
|
|
|
|
key: test_jcc
|
|
value: [0.84375 0.79411765 0.81818182 0.93103448 0.75757576 0.83333333
|
|
0.86206897 0.8 0.72727273 0.78125 ]
|
|
|
|
mean value: 0.8148584731698322
|
|
|
|
key: train_jcc
|
|
value: [0.84229391 0.83571429 0.8381295 0.84042553 0.85304659 0.82978723
|
|
0.82394366 0.83629893 0.84642857 0.82685512]
|
|
|
|
mean value: 0.837292333932638
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.71
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_config.py:203: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rouC_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_config.py:206: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rouC_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist', 'rsa',
|
|
'kd_values', 'rd_values', 'electro_rr', 'electro_mm', '...
|
|
'volumetric_mm', 'volumetric_ss', 'consurf_score', 'snap2_score',
|
|
'provean_score', 'maf', 'logorI', 'lineage_proportion',
|
|
'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.18650627 0.19624949 0.19074392 0.19366646 0.20337486 0.19410229
|
|
0.19437337 0.18854046 0.24514914 0.20517421]
|
|
|
|
mean value: 0.19978804588317872
|
|
|
|
key: score_time
|
|
value: [0.01944613 0.01078582 0.01317906 0.01077127 0.01899004 0.02023435
|
|
0.02145123 0.01528096 0.01077437 0.02091956]
|
|
|
|
mean value: 0.01618328094482422
|
|
|
|
key: test_mcc
|
|
value: [0.82942474 0.76689254 0.79110556 0.9321832 0.75434227 0.82195294
|
|
0.85933785 0.85933785 0.67900461 0.78571429]
|
|
|
|
mean value: 0.8079295836885118
|
|
|
|
key: train_mcc
|
|
value: [0.8266528 0.86654135 0.82265144 0.85931426 0.86681377 0.85105352
|
|
0.80377277 0.85922715 0.86681377 0.85105352]
|
|
|
|
mean value: 0.847389438313628
|
|
|
|
key: test_accuracy
|
|
value: [0.9122807 0.87719298 0.89473684 0.96491228 0.875 0.91071429
|
|
0.92857143 0.92857143 0.83928571 0.89285714]
|
|
|
|
mean value: 0.9024122807017544
|
|
|
|
key: train_accuracy
|
|
value: [0.91321499 0.93293886 0.9112426 0.92899408 0.93307087 0.92519685
|
|
0.9015748 0.92913386 0.93307087 0.92519685]
|
|
|
|
mean value: 0.9233634627032568
|
|
|
|
key: test_fscore
|
|
value: [0.91525424 0.8852459 0.9 0.96428571 0.88135593 0.90909091
|
|
0.92592593 0.92592593 0.84210526 0.89285714]
|
|
|
|
mean value: 0.9042046952374383
|
|
|
|
key: train_fscore
|
|
value: [0.91439689 0.93436293 0.91193738 0.93076923 0.93436293 0.92664093
|
|
0.9034749 0.93076923 0.93436293 0.92664093]
|
|
|
|
mean value: 0.9247718286234357
|
|
|
|
key: test_precision
|
|
value: [0.87096774 0.81818182 0.87096774 1. 0.83870968 0.92592593
|
|
0.96153846 0.96153846 0.82758621 0.89285714]
|
|
|
|
mean value: 0.8968273178228685
|
|
|
|
key: train_precision
|
|
value: [0.90384615 0.91666667 0.90310078 0.90636704 0.91666667 0.90909091
|
|
0.88636364 0.90977444 0.91666667 0.90909091]
|
|
|
|
mean value: 0.9077633860874135
|
|
|
|
key: test_recall
|
|
value: [0.96428571 0.96428571 0.93103448 0.93103448 0.92857143 0.89285714
|
|
0.89285714 0.89285714 0.85714286 0.89285714]
|
|
|
|
mean value: 0.9147783251231527
|
|
|
|
key: train_recall
|
|
value: [0.92519685 0.95275591 0.92094862 0.95652174 0.95275591 0.94488189
|
|
0.92125984 0.95275591 0.95275591 0.94488189]
|
|
|
|
mean value: 0.9424714450219415
|
|
|
|
key: test_roc_auc
|
|
value: [0.91317734 0.87869458 0.89408867 0.96551724 0.875 0.91071429
|
|
0.92857143 0.92857143 0.83928571 0.89285714]
|
|
|
|
mean value: 0.9026477832512315
|
|
|
|
key: train_roc_auc
|
|
value: [0.91319131 0.93289969 0.91126171 0.92904827 0.93307087 0.92519685
|
|
0.9015748 0.92913386 0.93307087 0.92519685]
|
|
|
|
mean value: 0.9233645077962094
|
|
|
|
key: test_jcc
|
|
value: [0.84375 0.79411765 0.81818182 0.93103448 0.78787879 0.83333333
|
|
0.86206897 0.86206897 0.72727273 0.80645161]
|
|
|
|
mean value: 0.826615834042182
|
|
|
|
key: train_jcc
|
|
value: [0.84229391 0.87681159 0.8381295 0.8705036 0.87681159 0.86330935
|
|
0.82394366 0.8705036 0.87681159 0.86330935]
|
|
|
|
mean value: 0.8602427747074015
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.7
|