19445 lines
960 KiB
Text
/home/tanu/git/LSHTM_analysis/scripts/ml/ml_data.py:550: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
mask_check.sort_values(by = ['ligand_distance'], ascending = True, inplace = True)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/xgboost/compat.py:36: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
from pandas import MultiIndex, Int64Index
1.22.4
1.4.1

aaindex_df contains non-numerical data

Total no. of non-numerical columns: 2

Selecting numerical data only

PASS: successfully selected numerical columns only for aaindex_df

Now checking for NA in the remaining aaindex_cols

Counting aaindex_df cols with NA
ncols with NA: 4 columns
Dropping these...
Original ncols: 127

Revised df ncols: 123

Checking NA in revised df...

PASS: cols with NA successfully dropped from aaindex_df
Proceeding with combining aa_df with other features_df

PASS: ncols match
Expected ncols: 123
Got: 123

Total no. of columns in clean aa_df: 123

Proceeding to merge, expected nrows in merged_df: 858

PASS: my_features_df and aa_df successfully combined
nrows: 858
ncols: 269
count of NULL values before imputation

or_mychisq 244
log10_or_mychisq 244
dtype: int64
count of NULL values AFTER imputation

mutationinformation 0
or_rawI 0
logorI 0
dtype: int64

PASS: OR values imputed, data ready for ML

No. of numerical features: 45
No. of categorical features: 7

index: 0
ind: 1

Mask count check: True

index: 1
ind: 2

Mask count check: False
Original Data
Counter({0: 353, 1: 95}) Data dim: (448, 52)

-------------------------------------------------------------
Successfully split data: UQ [no aa_index but active site included] training
actual values: training set
imputed values: blind test set
Train data size: (448, 52)
Test data size: (410, 52)
y_train numbers: Counter({0: 353, 1: 95})
y_train ratio: 3.7157894736842105

y_test_numbers: Counter({0: 385, 1: 25})
y_test ratio: 15.4
-------------------------------------------------------------
Simple Random OverSampling
Counter({1: 353, 0: 353})
(706, 52)
Simple Random UnderSampling
Counter({0: 95, 1: 95})
(190, 52)
Simple Combined Over and UnderSampling
Counter({0: 353, 1: 353})
(706, 52)
SMOTE_NC OverSampling
Counter({1: 353, 0: 353})
(706, 52)

#####################################################################

Running ML analysis: UQ [without AA index but with active site annotations]
Gene name: embB
Drug name: ethambutol

Output directory: /home/tanu/git/Data/ethambutol/output/ml/uq_v1/

Sanity checks:
Total input features: 52

Training data size: (448, 52)
Test data size: (410, 52)

Target feature numbers (training data): Counter({0: 353, 1: 95})
Target features ratio (training data): 3.7157894736842105

Target feature numbers (test data): Counter({0: 385, 1: 25})
Target features ratio (test data): 15.4

#####################################################################


================================================================

Structural features (n): 36
These are:
Common stability features: ['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist']
FoldX columns: ['electro_rr', 'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr', 'disulfide_mm', 'disulfide_sm', 'disulfide_ss', 'hbonds_rr', 'hbonds_mm', 'hbonds_sm', 'hbonds_ss', 'partcov_rr', 'partcov_mm', 'partcov_sm', 'partcov_ss', 'vdwclashes_rr', 'vdwclashes_mm', 'vdwclashes_sm', 'vdwclashes_ss', 'volumetric_rr', 'volumetric_mm', 'volumetric_ss']
Other struc columns: ['rsa', 'kd_values', 'rd_values']
================================================================

Evolutionary features (n): 3
These are:
['consurf_score', 'snap2_score', 'provean_score']
================================================================

Genomic features (n): 6
These are:
['maf', 'logorI']
['lineage_proportion', 'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique']
================================================================

Categorical features (n): 7
These are:
['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site']
================================================================


Pass: No. of features match

#####################################################################


Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegression(random_state=42))])

key: fit_time
value: [0.01869178 0.03414488 0.02512193 0.02518272 0.02411675 0.02404666
0.02404714 0.02150297 0.02328777 0.0233357 ]

mean value: 0.024347829818725585

key: score_time
value: [0.01090193 0.01089644 0.0107584 0.01090932 0.0108211 0.01080489
0.0108552 0.01080036 0.01080394 0.01080608]

mean value: 0.010835766792297363

key: test_mcc
value: [0.56660974 0.66143783 0.74285714 0.80295507 0.80295507 0.63936201
0.78446454 0.70511024 0.78360391 0.70370542]

mean value: 0.7193060978385479

key: train_mcc
value: [0.82306415 0.783378 0.78225437 0.78270798 0.80735444 0.83325019
0.77718904 0.80913415 0.81084447 0.80068593]

mean value: 0.8009862707554779

key: test_accuracy
value: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[0.84444444 0.88888889 0.91111111 0.93333333 0.93333333 0.88888889
|
|
0.93333333 0.91111111 0.93181818 0.90909091]
|
|
|
|
mean value: 0.9085353535353535
|
|
|
|
key: train_accuracy
|
|
value: [0.94292804 0.93052109 0.93052109 0.93052109 0.93796526 0.94540943
|
|
0.9280397 0.93796526 0.93811881 0.93564356]
|
|
|
|
mean value: 0.9357633343979559
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.66666667 0.8 0.82352941 0.82352941 0.70588235
|
|
0.8 0.75 0.8 0.75 ]
|
|
|
|
mean value: 0.7586274509803922
|
|
|
|
key: train_fscore
|
|
value: [0.85534591 0.82278481 0.81818182 0.82051282 0.8427673 0.86585366
|
|
0.81761006 0.8447205 0.84848485 0.83544304]
|
|
|
|
mean value: 0.8371704761151999
|
|
|
|
key: test_precision
|
|
value: [0.63636364 1. 0.8 1. 1. 0.75
|
|
1. 0.85714286 1. 0.85714286]
|
|
|
|
mean value: 0.890064935064935
|
|
|
|
key: train_precision
|
|
value: [0.91891892 0.89041096 0.91304348 0.90140845 0.90540541 0.91025641
|
|
0.89041096 0.90666667 0.88607595 0.91666667]
|
|
|
|
mean value: 0.9039263864054471
|
|
|
|
key: test_recall
|
|
value: [0.7 0.5 0.8 0.7 0.7 0.66666667
|
|
0.66666667 0.66666667 0.66666667 0.66666667]
|
|
|
|
mean value: 0.6733333333333333
|
|
|
|
key: train_recall
|
|
value: [0.8 0.76470588 0.74117647 0.75294118 0.78823529 0.8255814
|
|
0.75581395 0.79069767 0.81395349 0.76744186]
|
|
|
|
mean value: 0.7800547195622435
|
|
|
|
key: test_roc_auc
|
|
value: [0.79285714 0.75 0.87142857 0.85 0.85 0.80555556
|
|
0.83333333 0.81944444 0.83333333 0.81904762]
|
|
|
|
mean value: 0.8225
|
|
|
|
key: train_roc_auc
|
|
value: [0.89056604 0.86977432 0.86115427 0.8654643 0.88311136 0.90174969
|
|
0.86528868 0.88430783 0.8928258 0.87428697]
|
|
|
|
mean value: 0.8788529257196572
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.5 0.66666667 0.7 0.7 0.54545455
|
|
0.66666667 0.6 0.66666667 0.6 ]
|
|
|
|
mean value: 0.6145454545454545
|
|
|
|
key: train_jcc
|
|
value: [0.74725275 0.69892473 0.69230769 0.69565217 0.72826087 0.76344086
|
|
0.69148936 0.7311828 0.73684211 0.7173913 ]
|
|
|
|
mean value: 0.7202744641448586
|
|
|
|
MCC on Blind test: 0.31
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.70116019 0.80249429 0.71138048 0.69454575 0.80476427 0.65118718
|
|
0.66926408 0.86205721 0.67393637 0.69592953]
|
|
|
|
mean value: 0.7266719341278076
|
|
|
|
key: score_time
|
|
value: [0.01424193 0.0140202 0.01410961 0.01113176 0.01438403 0.01496172
|
|
0.01431608 0.01441383 0.01441216 0.01464701]
|
|
|
|
mean value: 0.014063835144042969
|
|
|
|
key: test_mcc
|
|
value: [0.64465837 0.73010948 0.76553182 0.93541435 0.86991767 0.55182541
|
|
0.92998111 0.87904907 0.78360391 0.86031746]
|
|
|
|
mean value: 0.7950408639956386
|
|
|
|
key: train_mcc
|
|
value: [0.91766928 0.89482822 0.90269496 0.91054384 0.91837573 0.94087008
|
|
0.89652263 0.89748849 0.91136463 0.89659207]
|
|
|
|
mean value: 0.9086949939856763
|
|
|
|
key: test_accuracy
|
|
value: [0.86666667 0.91111111 0.91111111 0.97777778 0.95555556 0.86666667
|
|
0.97777778 0.95555556 0.93181818 0.95454545]
|
|
|
|
mean value: 0.9308585858585858
|
|
|
|
key: train_accuracy
|
|
value: [0.97270471 0.96526055 0.96774194 0.97022333 0.97270471 0.98014888
|
|
0.96526055 0.96526055 0.97029703 0.96534653]
|
|
|
|
mean value: 0.969494877527455
|
|
|
|
key: test_fscore
|
|
value: [0.72727273 0.77777778 0.81818182 0.94736842 0.88888889 0.625
|
|
0.94117647 0.9 0.8 0.88888889]
|
|
|
|
mean value: 0.8314554992650968
|
|
|
|
key: train_fscore
|
|
value: [0.93491124 0.91666667 0.92307692 0.92941176 0.93567251 0.95348837
|
|
0.91860465 0.91954023 0.93023256 0.91860465]
|
|
|
|
mean value: 0.9280209574116103
|
|
|
|
key: test_precision
|
|
value: [0.66666667 0.875 0.75 1. 1. 0.71428571
|
|
1. 0.81818182 1. 0.88888889]
|
|
|
|
mean value: 0.8713023088023089
|
|
|
|
key: train_precision
|
|
value: [0.94047619 0.92771084 0.92857143 0.92941176 0.93023256 0.95348837
|
|
0.91860465 0.90909091 0.93023256 0.91860465]
|
|
|
|
mean value: 0.9286423926915579
|
|
|
|
key: test_recall
|
|
value: [0.8 0.7 0.9 0.9 0.8 0.55555556
|
|
0.88888889 1. 0.66666667 0.88888889]
|
|
|
|
mean value: 0.81
|
|
|
|
key: train_recall
|
|
value: [0.92941176 0.90588235 0.91764706 0.92941176 0.94117647 0.95348837
|
|
0.91860465 0.93023256 0.93023256 0.91860465]
|
|
|
|
mean value: 0.927469220246238
|
|
|
|
key: test_roc_auc
|
|
value: [0.84285714 0.83571429 0.90714286 0.95 0.9 0.75
|
|
0.94444444 0.97222222 0.83333333 0.93015873]
|
|
|
|
mean value: 0.8865873015873016
|
|
|
|
key: train_roc_auc
|
|
value: [0.95684425 0.94350721 0.94938957 0.95527192 0.96115427 0.97043504
|
|
0.94826132 0.95249798 0.95568232 0.94829604]
|
|
|
|
mean value: 0.9541339911123459
|
|
|
|
key: test_jcc
|
|
value: [0.57142857 0.63636364 0.69230769 0.9 0.8 0.45454545
|
|
0.88888889 0.81818182 0.66666667 0.8 ]
|
|
|
|
mean value: 0.7228382728382728
|
|
|
|
key: train_jcc
|
|
value: [0.87777778 0.84615385 0.85714286 0.86813187 0.87912088 0.91111111
|
|
0.84946237 0.85106383 0.86956522 0.84946237]
|
|
|
|
mean value: 0.8658992117799673
|
|
|
|
MCC on Blind test: 0.26
|
|
|
|
Accuracy on Blind test: 0.85
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01066494 0.01008058 0.0076077 0.00746608 0.00735378 0.0078814
|
|
0.00793862 0.00814748 0.00798297 0.0080893 ]
|
|
|
|
mean value: 0.008321285247802734
|
|
|
|
key: score_time
|
|
value: [0.01098418 0.00850654 0.00823236 0.00799155 0.00801158 0.00867128
|
|
0.00872326 0.00870562 0.00858879 0.008641 ]
|
|
|
|
mean value: 0.008705615997314453
|
|
|
|
key: test_mcc
|
|
value: [0.48483174 0.59030128 0.52378493 0.79539491 0.56447381 0.49897013
|
|
0.74977715 0.58333333 0.74230749 0.35783003]
|
|
|
|
mean value: 0.5891004788968268
|
|
|
|
key: train_mcc
|
|
value: [0.70574597 0.60320861 0.66220727 0.65220882 0.63952782 0.60127303
|
|
0.65498926 0.65973727 0.65999897 0.63678872]
|
|
|
|
mean value: 0.6475685731489803
|
|
|
|
key: test_accuracy
|
|
value: [0.8 0.86666667 0.82222222 0.91111111 0.82222222 0.75555556
|
|
0.91111111 0.86666667 0.88636364 0.77272727]
|
|
|
|
mean value: 0.8414646464646465
|
|
|
|
key: train_accuracy
|
|
value: [0.88337469 0.86848635 0.87344913 0.86600496 0.86600496 0.81141439
|
|
0.86848635 0.87096774 0.87128713 0.85891089]
|
|
|
|
mean value: 0.863838660540992
|
|
|
|
key: test_fscore
|
|
value: [0.60869565 0.66666667 0.63636364 0.83333333 0.66666667 0.59259259
|
|
0.8 0.66666667 0.7826087 0.5 ]
|
|
|
|
mean value: 0.675359391011565
|
|
|
|
key: train_fscore
|
|
value: [0.76616915 0.68639053 0.7357513 0.72727273 0.71875 0.6779661
|
|
0.73096447 0.73469388 0.73469388 0.71641791]
|
|
|
|
mean value: 0.7229069943632542
|
|
|
|
key: test_precision
|
|
value: [0.53846154 0.75 0.58333333 0.71428571 0.57142857 0.44444444
|
|
0.72727273 0.66666667 0.64285714 0.45454545]
|
|
|
|
mean value: 0.6093295593295593
|
|
|
|
key: train_precision
|
|
value: [0.6637931 0.69047619 0.65740741 0.63716814 0.64485981 0.53333333
|
|
0.64864865 0.65454545 0.65454545 0.62608696]
|
|
|
|
mean value: 0.6410864503603536
|
|
|
|
key: test_recall
|
|
value: [0.7 0.6 0.7 1. 0.8 0.88888889
|
|
0.88888889 0.66666667 1. 0.55555556]
|
|
|
|
mean value: 0.78
|
|
|
|
key: train_recall
|
|
value: [0.90588235 0.68235294 0.83529412 0.84705882 0.81176471 0.93023256
|
|
0.8372093 0.8372093 0.8372093 0.8372093 ]
|
|
|
|
mean value: 0.8361422708618331
|
|
|
|
key: test_roc_auc
|
|
value: [0.76428571 0.77142857 0.77857143 0.94285714 0.81428571 0.80555556
|
|
0.90277778 0.79166667 0.92857143 0.69206349]
|
|
|
|
mean value: 0.8192063492063492
|
|
|
|
key: train_roc_auc
|
|
value: [0.89162042 0.80029597 0.85947096 0.859064 0.84613393 0.85470618
|
|
0.85709046 0.85866774 0.85885622 0.85099459]
|
|
|
|
mean value: 0.8536900470036405
|
|
|
|
key: test_jcc
|
|
value: [0.4375 0.5 0.46666667 0.71428571 0.5 0.42105263
|
|
0.66666667 0.5 0.64285714 0.33333333]
|
|
|
|
mean value: 0.5182362155388471
|
|
|
|
key: train_jcc
|
|
value: [0.62096774 0.52252252 0.58196721 0.57142857 0.56097561 0.51282051
|
|
0.576 0.58064516 0.58064516 0.55813953]
|
|
|
|
mean value: 0.5666112029042308
|
|
|
|
MCC on Blind test: 0.31
|
|
|
|
Accuracy on Blind test: 0.79
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00779533 0.00809622 0.00823021 0.0081985 0.00811529 0.00841022
|
|
0.00828648 0.00819182 0.00816894 0.00820422]
|
|
|
|
mean value: 0.008169722557067872
|
|
|
|
key: score_time
|
|
value: [0.00871468 0.00865245 0.00868225 0.00860405 0.00801325 0.00864792
|
|
0.0086062 0.00858259 0.00805998 0.0086143 ]
|
|
|
|
mean value: 0.008517765998840332
|
|
|
|
key: test_mcc
|
|
value: [0.24896765 0.50799198 0.26726124 0.65547353 0.48483174 0.0919709
|
|
0.53033009 0.34874292 0.35783003 0.2627869 ]
|
|
|
|
mean value: 0.37561869673775167
|
|
|
|
key: train_mcc
|
|
value: [0.46269974 0.47030687 0.49545247 0.43032136 0.4560545 0.49551509
|
|
0.4750932 0.46293038 0.45477034 0.4913158 ]
|
|
|
|
mean value: 0.4694459747127293
|
|
|
|
key: test_accuracy
|
|
value: [0.71111111 0.84444444 0.75555556 0.88888889 0.8 0.73333333
|
|
0.86666667 0.8 0.77272727 0.75 ]
|
|
|
|
mean value: 0.7922727272727272
|
|
|
|
key: train_accuracy
|
|
value: [0.81637717 0.82878412 0.83622829 0.81885856 0.82630273 0.82382134
|
|
0.82878412 0.80645161 0.82178218 0.82425743]
|
|
|
|
mean value: 0.8231647544407046
|
|
|
|
key: test_fscore
|
|
value: [0.43478261 0.58823529 0.42105263 0.70588235 0.60869565 0.25
|
|
0.57142857 0.47058824 0.5 0.42105263]
|
|
|
|
mean value: 0.49717179778089726
|
|
|
|
key: train_fscore
|
|
value: [0.57954545 0.57668712 0.59756098 0.5408805 0.5625 0.60773481
|
|
0.58181818 0.58510638 0.56626506 0.60335196]
|
|
|
|
mean value: 0.5801450436839247
|
|
|
|
key: test_precision
|
|
value: [0.38461538 0.71428571 0.44444444 0.85714286 0.53846154 0.28571429
|
|
0.8 0.5 0.45454545 0.4 ]
|
|
|
|
mean value: 0.5379209679209679
|
|
|
|
key: train_precision
|
|
value: [0.56043956 0.6025641 0.62025316 0.58108108 0.6 0.57894737
|
|
0.60759494 0.53921569 0.5875 0.58064516]
|
|
|
|
mean value: 0.5858241061336452
|
|
|
|
key: test_recall
|
|
value: [0.5 0.5 0.4 0.6 0.7 0.22222222
|
|
0.44444444 0.44444444 0.55555556 0.44444444]
|
|
|
|
mean value: 0.4811111111111111
|
|
|
|
key: train_recall
|
|
value: [0.6 0.55294118 0.57647059 0.50588235 0.52941176 0.63953488
|
|
0.55813953 0.63953488 0.54651163 0.62790698]
|
|
|
|
mean value: 0.5776333789329685
|
|
|
|
key: test_roc_auc
|
|
value: [0.63571429 0.72142857 0.62857143 0.78571429 0.76428571 0.54166667
|
|
0.70833333 0.66666667 0.69206349 0.63650794]
|
|
|
|
mean value: 0.6780952380952381
|
|
|
|
key: train_roc_auc
|
|
value: [0.73710692 0.72772845 0.74106548 0.70419904 0.71753607 0.75667596
|
|
0.73017387 0.74563495 0.72136902 0.75263273]
|
|
|
|
mean value: 0.7334122492545921
|
|
|
|
key: test_jcc
|
|
value: [0.27777778 0.41666667 0.26666667 0.54545455 0.4375 0.14285714
|
|
0.4 0.30769231 0.33333333 0.26666667]
|
|
|
|
mean value: 0.3394615107115107
|
|
|
|
key: train_jcc
|
|
value: [0.408 0.40517241 0.42608696 0.37068966 0.39130435 0.43650794
|
|
0.41025641 0.41353383 0.39495798 0.432 ]
|
|
|
|
mean value: 0.4088509537857433
|
|
|
|
MCC on Blind test: 0.27
|
|
|
|
Accuracy on Blind test: 0.79
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.00812173 0.00822234 0.00809884 0.00798321 0.00790763 0.00786543
|
|
0.00798416 0.00782347 0.00781441 0.00777555]
|
|
|
|
mean value: 0.007959675788879395
|
|
|
|
key: score_time
|
|
value: [0.05056334 0.01157475 0.01144004 0.01309729 0.01108718 0.01101208
|
|
0.01113534 0.01105309 0.01099586 0.01101422]
|
|
|
|
mean value: 0.015297317504882812
|
|
|
|
key: test_mcc
|
|
value: [0.15118579 0.5 0.41931393 0.65547353 0.28571429 0.53452248
|
|
0.42947785 0.6681531 0.78360391 0.15983741]
|
|
|
|
mean value: 0.458728228665888
|
|
|
|
key: train_mcc
|
|
value: [0.60794624 0.62646908 0.63584249 0.56988091 0.6086966 0.60327572
|
|
0.62166491 0.60374425 0.59538331 0.60392999]
|
|
|
|
mean value: 0.6076833509798295
|
|
|
|
key: test_accuracy
|
|
value: [0.75555556 0.84444444 0.82222222 0.88888889 0.8 0.86666667
|
|
0.84444444 0.88888889 0.93181818 0.79545455]
|
|
|
|
mean value: 0.8438383838383838
|
|
|
|
key: train_accuracy
|
|
value: [0.8808933 0.88585608 0.88833747 0.87096774 0.8808933 0.87841191
|
|
0.88337469 0.87841191 0.87623762 0.87871287]
|
|
|
|
mean value: 0.8802096897034617
|
|
|
|
key: test_fscore
|
|
value: [0.26666667 0.46153846 0.5 0.70588235 0.30769231 0.5
|
|
0.46153846 0.73684211 0.8 0.18181818]
|
|
|
|
mean value: 0.49219785374584135
|
|
|
|
key: train_fscore
|
|
value: [0.64179104 0.66176471 0.67625899 0.6 0.65217391 0.64233577
|
|
0.6618705 0.64748201 0.64285714 0.64748201]
|
|
|
|
mean value: 0.6474016098162307
|
|
|
|
key: test_precision
|
|
value: [0.4 1. 0.66666667 0.85714286 0.66666667 1.
|
|
0.75 0.7 1. 0.5 ]
|
|
|
|
mean value: 0.7540476190476191
|
|
|
|
key: train_precision
|
|
value: [0.87755102 0.88235294 0.87037037 0.86666667 0.8490566 0.8627451
|
|
0.86792453 0.8490566 0.83333333 0.8490566 ]
|
|
|
|
mean value: 0.8608113769616862
|
|
|
|
key: test_recall
|
|
value: [0.2 0.3 0.4 0.6 0.2 0.33333333
|
|
0.33333333 0.77777778 0.66666667 0.11111111]
|
|
|
|
mean value: 0.3922222222222222
|
|
|
|
key: train_recall
|
|
value: [0.50588235 0.52941176 0.55294118 0.45882353 0.52941176 0.51162791
|
|
0.53488372 0.52325581 0.52325581 0.52325581]
|
|
|
|
mean value: 0.5192749658002737
|
|
|
|
key: test_roc_auc
|
|
value: [0.55714286 0.65 0.67142857 0.78571429 0.58571429 0.66666667
|
|
0.65277778 0.84722222 0.83333333 0.54126984]
|
|
|
|
mean value: 0.6791269841269841
|
|
|
|
key: train_roc_auc
|
|
value: [0.74350721 0.75527192 0.7654643 0.7199778 0.75212727 0.74477294
|
|
0.75640085 0.74900961 0.74747696 0.74904929]
|
|
|
|
mean value: 0.7483058161342697
|
|
|
|
key: test_jcc
|
|
value: [0.15384615 0.3 0.33333333 0.54545455 0.18181818 0.33333333
|
|
0.3 0.58333333 0.66666667 0.1 ]
|
|
|
|
mean value: 0.3497785547785548
|
|
|
|
key: train_jcc
|
|
value: [0.47252747 0.49450549 0.51086957 0.42857143 0.48387097 0.47311828
|
|
0.49462366 0.4787234 0.47368421 0.4787234 ]
|
|
|
|
mean value: 0.4789217883084548
|
|
|
|
MCC on Blind test: 0.34
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01309419 0.01144147 0.01072574 0.0107615 0.01076436 0.0107367
|
|
0.01085353 0.01092339 0.01091146 0.01101494]
|
|
|
|
mean value: 0.011122727394104004
|
|
|
|
key: score_time
|
|
value: [0.00937915 0.00849342 0.00914073 0.00845909 0.00852418 0.00841475
|
|
0.00845885 0.00858498 0.00842404 0.0084095 ]
|
|
|
|
mean value: 0.00862886905670166
|
|
|
|
key: test_mcc
|
|
value: [0.44223199 0.66143783 0.53452248 0.73010948 0.57655666 0.62103443
|
|
0.78446454 0.70511024 0.70370542 0.49137176]
|
|
|
|
mean value: 0.6250544835615041
|
|
|
|
key: train_mcc
|
|
value: [0.78203228 0.75720841 0.76610765 0.71535862 0.74092334 0.76796395
|
|
0.73477752 0.76796395 0.76847981 0.74415611]
|
|
|
|
mean value: 0.7544971629935804
|
|
|
|
key: test_accuracy
|
|
value: [0.8 0.88888889 0.84444444 0.91111111 0.86666667 0.88888889
|
|
0.93333333 0.91111111 0.90909091 0.84090909]
|
|
|
|
mean value: 0.8794444444444445
|
|
|
|
key: train_accuracy
|
|
value: [0.93052109 0.92307692 0.92555831 0.91066998 0.91811414 0.92555831
|
|
0.91563275 0.92555831 0.92574257 0.91831683]
|
|
|
|
mean value: 0.9218749232243324
|
|
|
|
key: test_fscore
|
|
value: [0.57142857 0.66666667 0.63157895 0.77777778 0.625 0.66666667
|
|
0.8 0.75 0.75 0.58823529]
|
|
|
|
mean value: 0.6827353924025751
|
|
|
|
key: train_fscore
|
|
value: [0.81578947 0.79194631 0.80519481 0.75675676 0.78145695 0.80519481
|
|
0.77333333 0.80519481 0.80769231 0.78709677]
|
|
|
|
mean value: 0.7929656323611789
|
|
|
|
key: test_precision
|
|
value: [0.54545455 1. 0.66666667 0.875 0.83333333 0.83333333
|
|
1. 0.85714286 0.85714286 0.625 ]
|
|
|
|
mean value: 0.8093073593073593
|
|
|
|
key: train_precision
|
|
value: [0.92537313 0.921875 0.89855072 0.88888889 0.89393939 0.91176471
|
|
0.90625 0.91176471 0.9 0.88405797]
|
|
|
|
mean value: 0.9042464524573521
|
|
|
|
key: test_recall
|
|
value: [0.6 0.5 0.6 0.7 0.5 0.55555556
|
|
0.66666667 0.66666667 0.66666667 0.55555556]
|
|
|
|
mean value: 0.6011111111111112
|
|
|
|
key: train_recall
|
|
value: [0.72941176 0.69411765 0.72941176 0.65882353 0.69411765 0.72093023
|
|
0.6744186 0.72093023 0.73255814 0.70930233]
|
|
|
|
mean value: 0.7064021887824897
|
|
|
|
key: test_roc_auc
|
|
value: [0.72857143 0.75 0.75714286 0.83571429 0.73571429 0.76388889
|
|
0.83333333 0.81944444 0.81904762 0.73492063]
|
|
|
|
mean value: 0.7777777777777778
|
|
|
|
key: train_roc_auc
|
|
value: [0.85684425 0.83919719 0.85369959 0.81840548 0.83605253 0.85100139
|
|
0.82774558 0.85100139 0.85527278 0.84207255]
|
|
|
|
mean value: 0.8431292732694862
|
|
|
|
key: test_jcc
|
|
value: [0.4 0.5 0.46153846 0.63636364 0.45454545 0.5
|
|
0.66666667 0.6 0.6 0.41666667]
|
|
|
|
mean value: 0.5235780885780885
|
|
|
|
key: train_jcc
|
|
value: [0.68888889 0.65555556 0.67391304 0.60869565 0.64130435 0.67391304
|
|
0.63043478 0.67391304 0.67741935 0.64893617]
|
|
|
|
mean value: 0.6572973882539398
|
|
|
|
MCC on Blind test: 0.38
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
  warnings.warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.72929549 0.56049705 0.5628233 0.42927814 0.6755352 0.6567452
|
|
0.94534802 0.34498525 1.1619606 0.58351111]
|
|
|
|
mean value: 0.664997935295105
|
|
|
|
key: score_time
|
|
value: [0.0111177 0.01106381 0.01113963 0.01541114 0.01135039 0.01112366
|
|
0.01115203 0.02141356 0.01108098 0.01109648]
|
|
|
|
mean value: 0.012594938278198242
|
|
|
|
key: test_mcc
|
|
value: [0.56660974 0.49135381 0.41931393 0.76553182 0.53452248 0.55182541
|
|
0.78446454 0.42947785 0.78353876 0.70156665]
|
|
|
|
mean value: 0.6028204990182109
|
|
|
|
key: train_mcc
|
|
value: [0.70728394 0.72527216 0.59866291 0.72938054 0.58118493 0.75134611
|
|
0.79357239 0.57559177 0.83228784 0.78243871]
|
|
|
|
mean value: 0.7077021299353916
|
|
|
|
key: test_accuracy
|
|
value: [0.84444444 0.84444444 0.82222222 0.91111111 0.84444444 0.86666667
|
|
0.93333333 0.84444444 0.93181818 0.88636364]
|
|
|
|
mean value: 0.8729292929292929
|
|
|
|
key: train_accuracy
|
|
value: [0.90818859 0.91066998 0.87841191 0.91066998 0.87344913 0.91811414
|
|
0.93300248 0.87096774 0.94554455 0.92574257]
|
|
|
|
mean value: 0.9074761074122301
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.53333333 0.5 0.81818182 0.63157895 0.625
|
|
0.8 0.46153846 0.82352941 0.76190476]
|
|
|
|
mean value: 0.6621733400758169
|
|
|
|
key: train_fscore
|
|
value: [0.75167785 0.7804878 0.6259542 0.78571429 0.62773723 0.80239521
|
|
0.83229814 0.61764706 0.86075949 0.82954545]
|
|
|
|
mean value: 0.7514216720958653
|
|
|
|
key: test_precision
|
|
value: [0.63636364 0.8 0.66666667 0.75 0.66666667 0.71428571
|
|
1. 0.75 0.875 0.66666667]
|
|
|
|
mean value: 0.752564935064935
|
|
|
|
key: train_precision
|
|
value: [0.875 0.81012658 0.89130435 0.79518072 0.82692308 0.82716049
|
|
0.89333333 0.84 0.94444444 0.81111111]
|
|
|
|
mean value: 0.8514584112635261
|
|
|
|
key: test_recall
|
|
value: [0.7 0.4 0.4 0.9 0.6 0.55555556
|
|
0.66666667 0.33333333 0.77777778 0.88888889]
|
|
|
|
mean value: 0.6222222222222222
|
|
|
|
key: train_recall
|
|
value: [0.65882353 0.75294118 0.48235294 0.77647059 0.50588235 0.77906977
|
|
0.77906977 0.48837209 0.79069767 0.84883721]
|
|
|
|
mean value: 0.6862517099863201
|
|
|
|
key: test_roc_auc
|
|
value: [0.79285714 0.68571429 0.67142857 0.90714286 0.75714286 0.75
|
|
0.83333333 0.65277778 0.87460317 0.88730159]
|
|
|
|
mean value: 0.7812301587301587
|
|
|
|
key: train_roc_auc
|
|
value: [0.81683315 0.85288568 0.73331484 0.86150573 0.73879023 0.86745286
|
|
0.87691659 0.73156775 0.88905953 0.89768904]
|
|
|
|
mean value: 0.8266015409642332
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.36363636 0.33333333 0.69230769 0.46153846 0.45454545
|
|
0.66666667 0.3 0.7 0.61538462]
|
|
|
|
mean value: 0.5087412587412588
|
|
|
|
key: train_jcc
|
|
value: [0.60215054 0.64 0.45555556 0.64705882 0.45744681 0.67
|
|
0.71276596 0.44680851 0.75555556 0.70873786]
|
|
|
|
mean value: 0.6096079612948346
|
|
|
|
MCC on Blind test: 0.27
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01367283 0.01149011 0.01038933 0.00990009 0.00930309 0.00920916
|
|
0.01029682 0.00981116 0.00988531 0.00956535]
|
|
|
|
mean value: 0.010352325439453126
|
|
|
|
key: score_time
|
|
value: [0.01084828 0.00840163 0.00803304 0.00791907 0.00791812 0.0079155
|
|
0.00788355 0.00793147 0.00789642 0.00787997]
|
|
|
|
mean value: 0.008262705802917481
|
|
|
|
key: test_mcc
|
|
value: [0.72069583 0.80178373 0.88640526 0.80295507 0.53452248 0.72222222
|
|
0.86111111 0.86111111 0.85775039 0.87831007]
|
|
|
|
mean value: 0.7926867266324226
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.88888889 0.93333333 0.95555556 0.93333333 0.84444444 0.91111111
|
|
0.95555556 0.95555556 0.95454545 0.95454545]
|
|
|
|
mean value: 0.9286868686868687
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.7826087 0.84210526 0.90909091 0.82352941 0.63157895 0.77777778
|
|
0.88888889 0.88888889 0.875 0.9 ]
|
|
|
|
mean value: 0.8319468782589661
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.69230769 0.88888889 0.83333333 1. 0.66666667 0.77777778
|
|
0.88888889 0.88888889 1. 0.81818182]
|
|
|
|
mean value: 0.8454933954933955
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.9 0.8 1. 0.7 0.6 0.77777778
|
|
0.88888889 0.88888889 0.77777778 1. ]
|
|
|
|
mean value: 0.8333333333333334
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.89285714 0.88571429 0.97142857 0.85 0.75714286 0.86111111
|
|
0.93055556 0.93055556 0.88888889 0.97142857]
|
|
|
|
mean value: 0.893968253968254
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.64285714 0.72727273 0.83333333 0.7 0.46153846 0.63636364
|
|
0.8 0.8 0.77777778 0.81818182]
|
|
|
|
mean value: 0.7197324897324897
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.24
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.09645939 0.09477639 0.09699512 0.09707952 0.10095119 0.10206556
|
|
0.1050036 0.09934473 0.09767556 0.09797168]
|
|
|
|
mean value: 0.09883227348327636
|
|
|
|
key: score_time
|
|
value: [0.01664639 0.01817846 0.01748824 0.01789355 0.0176363 0.01849771
|
|
0.01823759 0.01819301 0.01714015 0.01752734]
|
|
|
|
mean value: 0.017743873596191406
|
|
|
|
key: test_mcc
|
|
value: [0.64465837 0.50799198 0.59030128 0.80295507 0.73010948 0.63936201
|
|
0.78446454 0.78446454 0.92962225 0.70370542]
|
|
|
|
mean value: 0.7117634944566859
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.86666667 0.84444444 0.86666667 0.93333333 0.91111111 0.88888889
|
|
0.93333333 0.93333333 0.97727273 0.90909091]
|
|
|
|
mean value: 0.9064141414141414
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.72727273 0.58823529 0.66666667 0.82352941 0.77777778 0.70588235
|
|
0.8 0.8 0.94117647 0.75 ]
|
|
|
|
mean value: 0.7580540701128936
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.66666667 0.71428571 0.75 1. 0.875 0.75
|
|
1. 1. 1. 0.85714286]
|
|
|
|
mean value: 0.8613095238095239
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.8 0.5 0.6 0.7 0.7 0.66666667
|
|
0.66666667 0.66666667 0.88888889 0.66666667]
|
|
|
|
mean value: 0.6855555555555555
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.84285714 0.72142857 0.77142857 0.85 0.83571429 0.80555556
|
|
0.83333333 0.83333333 0.94444444 0.81904762]
|
|
|
|
mean value: 0.8257142857142856
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.57142857 0.41666667 0.5 0.7 0.63636364 0.54545455
|
|
0.66666667 0.66666667 0.88888889 0.6 ]
|
|
|
|
mean value: 0.6192135642135642
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.34
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00831676 0.00815201 0.00814176 0.00760484 0.00792241 0.00806022
|
|
0.00744224 0.00811696 0.00831771 0.007792 ]
|
|
|
|
mean value: 0.007986688613891601
|
|
|
|
key: score_time
|
|
value: [0.0079205 0.00823855 0.0081501 0.00784802 0.00789833 0.00781298
|
|
0.00785351 0.00774097 0.00806856 0.00786448]
|
|
|
|
mean value: 0.007939600944519043
|
|
|
|
key: test_mcc
|
|
value: [0.54554473 0.6681531 0.49629167 0.76553182 0.45049308 0.24525574
|
|
0.72222222 0.53452248 0.53168513 0.57505463]
|
|
|
|
mean value: 0.5534754596434204
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.77777778 0.88888889 0.77777778 0.91111111 0.82222222 0.77777778
|
|
0.91111111 0.84444444 0.84090909 0.84090909]
|
|
|
|
mean value: 0.8392929292929293
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.64285714 0.73684211 0.61538462 0.81818182 0.55555556 0.375
|
|
0.77777778 0.63157895 0.63157895 0.66666667]
|
|
|
|
mean value: 0.6451423576423576
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.5 0.77777778 0.5 0.75 0.625 0.42857143
|
|
0.77777778 0.6 0.6 0.58333333]
|
|
|
|
mean value: 0.6142460317460318
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.9 0.7 0.8 0.9 0.5 0.33333333
|
|
0.77777778 0.66666667 0.66666667 0.77777778]
|
|
|
|
mean value: 0.7022222222222222
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.82142857 0.82142857 0.78571429 0.90714286 0.70714286 0.61111111
|
|
0.86111111 0.77777778 0.77619048 0.81746032]
|
|
|
|
mean value: 0.7886507936507936
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.47368421 0.58333333 0.44444444 0.69230769 0.38461538 0.23076923
|
|
0.63636364 0.46153846 0.46153846 0.5 ]
|
|
|
|
mean value: 0.48685948554369607
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.15
|
|
|
|
Accuracy on Blind test: 0.72
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.25313497 1.24195075 1.2457087 1.27323222 1.23807859 1.32632828
|
|
1.27514148 1.23370337 1.23768973 1.23494816]
|
|
|
|
mean value: 1.2559916257858277
|
|
|
|
key: score_time
|
|
value: [0.09639502 0.09044957 0.09679174 0.0888958 0.0955162 0.096071
|
|
0.09548783 0.09246659 0.09599972 0.08963633]
|
|
|
|
mean value: 0.09377098083496094
|
|
|
|
key: test_mcc
|
|
value: [0.79539491 0.93974299 0.88640526 0.86991767 0.86991767 0.72222222
|
|
0.92998111 0.93541435 0.92962225 0.93503247]
|
|
|
|
mean value: 0.8813650899545291
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.91111111 0.97777778 0.95555556 0.95555556 0.95555556 0.91111111
|
|
0.97777778 0.97777778 0.97727273 0.97727273]
|
|
|
|
mean value: 0.9576767676767677
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.83333333 0.95238095 0.90909091 0.88888889 0.88888889 0.77777778
|
|
0.94117647 0.94736842 0.94117647 0.94736842]
|
|
|
|
mean value: 0.9027450533642484
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.71428571 0.90909091 0.83333333 1. 1. 0.77777778
|
|
1. 0.9 1. 0.9 ]
|
|
|
|
mean value: 0.9034487734487735
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 0.8 0.8 0.77777778
|
|
0.88888889 1. 0.88888889 1. ]
|
|
|
|
mean value: 0.9155555555555556
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.94285714 0.98571429 0.97142857 0.9 0.9 0.86111111
|
|
0.94444444 0.98611111 0.94444444 0.98571429]
|
|
|
|
mean value: 0.9421825396825396
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.71428571 0.90909091 0.83333333 0.8 0.8 0.63636364
|
|
0.88888889 0.9 0.88888889 0.9 ]
|
|
|
|
mean value: 0.8270851370851371
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.27
|
|
|
|
Accuracy on Blind test: 0.87
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
|
|
key: fit_time
|
|
value: [1.7415638 0.88944888 0.92626333 0.89779186 0.89715624 0.98574638
|
|
0.93079758 0.91791654 0.86064243 0.91723776]
|
|
|
|
mean value: 0.9964564800262451
|
|
|
|
key: score_time
|
|
value: [0.22609925 0.25738049 0.15809059 0.23059368 0.22971487 0.24633598
|
|
0.2073884 0.26125479 0.13984299 0.23984051]
|
|
|
|
mean value: 0.21965415477752687
|
|
|
|
key: test_mcc
|
|
value: [0.83862787 0.93541435 0.74285714 0.86991767 0.86991767 0.72222222
|
|
0.92998111 0.93541435 0.92962225 0.93503247]
|
|
|
|
mean value: 0.8709007101271089
|
|
|
|
key: train_mcc
|
|
value: [0.93322152 0.93322152 0.93322152 0.93322152 0.94097505 0.95612789
|
|
0.93378477 0.93378477 0.94150783 0.94090976]
|
|
|
|
mean value: 0.9379976137162074
|
|
|
|
key: test_accuracy
|
|
value: [0.93333333 0.97777778 0.91111111 0.95555556 0.95555556 0.91111111
|
|
0.97777778 0.97777778 0.97727273 0.97727273]
|
|
|
|
mean value: 0.9554545454545454
|
|
|
|
key: train_accuracy
|
|
value: [0.97766749 0.97766749 0.97766749 0.97766749 0.98014888 0.98511166
|
|
0.97766749 0.97766749 0.98019802 0.98019802]
|
|
|
|
mean value: 0.9791661548288824
|
|
|
|
key: test_fscore
|
|
value: [0.86956522 0.94736842 0.8 0.88888889 0.88888889 0.77777778
|
|
0.94117647 0.94736842 0.94117647 0.94736842]
|
|
|
|
mean value: 0.8949578977281226
|
|
|
|
key: train_fscore
|
|
value: [0.94736842 0.94736842 0.94736842 0.94736842 0.95348837 0.96551724
|
|
0.94797688 0.94797688 0.95402299 0.95348837]
|
|
|
|
mean value: 0.9511944415507063
|
|
|
|
key: test_precision
|
|
value: [0.76923077 1. 0.8 1. 1. 0.77777778
|
|
1. 0.9 1. 0.9 ]
|
|
|
|
mean value: 0.9147008547008547
|
|
|
|
key: train_precision
|
|
value: [0.94186047 0.94186047 0.94186047 0.94186047 0.94252874 0.95454545
|
|
0.94252874 0.94252874 0.94318182 0.95348837]
|
|
|
|
mean value: 0.9446243712181964
|
|
|
|
key: test_recall
|
|
value: [1. 0.9 0.8 0.8 0.8 0.77777778
|
|
0.88888889 1. 0.88888889 1. ]
|
|
|
|
mean value: 0.8855555555555555
|
|
|
|
key: train_recall
|
|
value: [0.95294118 0.95294118 0.95294118 0.95294118 0.96470588 0.97674419
|
|
0.95348837 0.95348837 0.96511628 0.95348837]
|
|
|
|
mean value: 0.9578796169630643
|
|
|
|
key: test_roc_auc
|
|
value: [0.95714286 0.95 0.87142857 0.9 0.9 0.86111111
|
|
0.94444444 0.98611111 0.94444444 0.98571429]
|
|
|
|
mean value: 0.9300396825396825
|
|
|
|
key: train_roc_auc
|
|
value: [0.96860895 0.96860895 0.96860895 0.96860895 0.97449131 0.98206294
|
|
0.96885775 0.96885775 0.9746965 0.97045488]
|
|
|
|
mean value: 0.9713856946391021
|
|
|
|
key: test_jcc
|
|
value: [0.76923077 0.9 0.66666667 0.8 0.8 0.63636364
|
|
0.88888889 0.9 0.88888889 0.9 ]
|
|
|
|
mean value: 0.815003885003885
|
|
|
|
key: train_jcc
|
|
value: [0.9 0.9 0.9 0.9 0.91111111 0.93333333
|
|
0.9010989 0.9010989 0.91208791 0.91111111]
|
|
|
|
mean value: 0.906984126984127
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.87
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01952696 0.00753379 0.00747681 0.00779676 0.00800824 0.00813532
|
|
0.00763774 0.0079987 0.00786948 0.00797772]
|
|
|
|
mean value: 0.008996152877807617
|
|
|
|
key: score_time
|
|
value: [0.01012135 0.00860143 0.00816488 0.00859714 0.0084126 0.00811219
|
|
0.00836134 0.00859356 0.00860572 0.00817752]
|
|
|
|
mean value: 0.008574771881103515
|
|
|
|
key: test_mcc
|
|
value: [0.24896765 0.50799198 0.26726124 0.65547353 0.48483174 0.0919709
|
|
0.53033009 0.34874292 0.35783003 0.2627869 ]
|
|
|
|
mean value: 0.37561869673775167
|
|
|
|
key: train_mcc
|
|
value: [0.46269974 0.47030687 0.49545247 0.43032136 0.4560545 0.49551509
|
|
0.4750932 0.46293038 0.45477034 0.4913158 ]
|
|
|
|
mean value: 0.4694459747127293
|
|
|
|
key: test_accuracy
|
|
value: [0.71111111 0.84444444 0.75555556 0.88888889 0.8 0.73333333
|
|
0.86666667 0.8 0.77272727 0.75 ]
|
|
|
|
mean value: 0.7922727272727272
|
|
|
|
key: train_accuracy
|
|
value: [0.81637717 0.82878412 0.83622829 0.81885856 0.82630273 0.82382134
|
|
0.82878412 0.80645161 0.82178218 0.82425743]
|
|
|
|
mean value: 0.8231647544407046
|
|
|
|
key: test_fscore
|
|
value: [0.43478261 0.58823529 0.42105263 0.70588235 0.60869565 0.25
|
|
0.57142857 0.47058824 0.5 0.42105263]
|
|
|
|
mean value: 0.49717179778089726
|
|
|
|
key: train_fscore
|
|
value: [0.57954545 0.57668712 0.59756098 0.5408805 0.5625 0.60773481
|
|
0.58181818 0.58510638 0.56626506 0.60335196]
|
|
|
|
mean value: 0.5801450436839247
|
|
|
|
key: test_precision
|
|
value: [0.38461538 0.71428571 0.44444444 0.85714286 0.53846154 0.28571429
|
|
0.8 0.5 0.45454545 0.4 ]
|
|
|
|
mean value: 0.5379209679209679
|
|
|
|
key: train_precision
|
|
value: [0.56043956 0.6025641 0.62025316 0.58108108 0.6 0.57894737
|
|
0.60759494 0.53921569 0.5875 0.58064516]
|
|
|
|
mean value: 0.5858241061336452
|
|
|
|
key: test_recall
|
|
value: [0.5 0.5 0.4 0.6 0.7 0.22222222
|
|
0.44444444 0.44444444 0.55555556 0.44444444]
|
|
|
|
mean value: 0.4811111111111111
|
|
|
|
key: train_recall
|
|
value: [0.6 0.55294118 0.57647059 0.50588235 0.52941176 0.63953488
|
|
0.55813953 0.63953488 0.54651163 0.62790698]
|
|
|
|
mean value: 0.5776333789329685
|
|
|
|
key: test_roc_auc
|
|
value: [0.63571429 0.72142857 0.62857143 0.78571429 0.76428571 0.54166667
|
|
0.70833333 0.66666667 0.69206349 0.63650794]
|
|
|
|
mean value: 0.6780952380952381
|
|
|
|
key: train_roc_auc
|
|
value: [0.73710692 0.72772845 0.74106548 0.70419904 0.71753607 0.75667596
|
|
0.73017387 0.74563495 0.72136902 0.75263273]
|
|
|
|
mean value: 0.7334122492545921
|
|
|
|
key: test_jcc
|
|
value: [0.27777778 0.41666667 0.26666667 0.54545455 0.4375 0.14285714
|
|
0.4 0.30769231 0.33333333 0.26666667]
|
|
|
|
mean value: 0.3394615107115107
|
|
|
|
key: train_jcc
|
|
value: [0.408 0.40517241 0.42608696 0.37068966 0.39130435 0.43650794
|
|
0.41025641 0.41353383 0.39495798 0.432 ]
|
|
|
|
mean value: 0.4088509537857433
|
|
|
|
MCC on Blind test: 0.27
|
|
|
|
Accuracy on Blind test: 0.79
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.09265685 0.05299425 0.05342174 0.05338025 0.05346775 0.10898733
|
|
0.04733276 0.04669976 0.04698706 0.05033135]
|
|
|
|
mean value: 0.06062591075897217
|
|
|
|
key: score_time
|
|
value: [0.01007628 0.00975966 0.00967383 0.01012087 0.01011181 0.01093411
|
|
0.01036572 0.01043487 0.01195836 0.01052809]
|
|
|
|
mean value: 0.010396361351013184
|
|
|
|
key: test_mcc
|
|
value: [0.79539491 0.93541435 0.88640526 0.86991767 0.93541435 0.63936201
|
|
0.92998111 0.93541435 0.92962225 0.93503247]
|
|
|
|
mean value: 0.8791958723600636
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.91111111 0.97777778 0.95555556 0.95555556 0.97777778 0.88888889
|
|
0.97777778 0.97777778 0.97727273 0.97727273]
|
|
|
|
mean value: 0.9576767676767677
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.83333333 0.94736842 0.90909091 0.88888889 0.94736842 0.70588235
|
|
0.94117647 0.94736842 0.94117647 0.94736842]
|
|
|
|
mean value: 0.9009022109641305
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.71428571 1. 0.83333333 1. 1. 0.75
|
|
1. 0.9 1. 0.9 ]
|
|
|
|
mean value: 0.9097619047619048
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.9 1. 0.8 0.9 0.66666667
|
|
0.88888889 1. 0.88888889 1. ]
|
|
|
|
mean value: 0.9044444444444444
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.94285714 0.95 0.97142857 0.9 0.95 0.80555556
|
|
0.94444444 0.98611111 0.94444444 0.98571429]
|
|
|
|
mean value: 0.9380555555555555
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.71428571 0.9 0.83333333 0.8 0.9 0.54545455
|
|
0.88888889 0.9 0.88888889 0.9 ]
|
|
|
|
mean value: 0.8270851370851371
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.27
|
|
|
|
Accuracy on Blind test: 0.87
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.019593 0.02771401 0.03854823 0.0388329 0.03496313 0.0385735
|
|
0.05273533 0.03399873 0.03737974 0.04336548]
|
|
|
|
mean value: 0.036570405960083006
|
|
|
|
key: score_time
|
|
value: [0.01057959 0.0109818 0.01963091 0.02015376 0.02013206 0.02053571
|
|
0.01113605 0.01271248 0.02074575 0.02328992]
|
|
|
|
mean value: 0.016989803314208983
|
|
|
|
key: test_mcc
|
|
value: [0.56660974 0.80178373 0.76553182 0.93541435 0.93541435 0.63936201
|
|
0.86111111 0.87904907 0.78360391 0.86031746]
|
|
|
|
mean value: 0.8028197545072195
|
|
|
|
key: train_mcc
|
|
value: [0.8869427 0.87274633 0.88072512 0.87383838 0.86705728 0.90434373
|
|
0.88174015 0.8749027 0.87498726 0.86704695]
|
|
|
|
mean value: 0.8784330603510744
|
|
|
|
key: test_accuracy
|
|
value: [0.84444444 0.93333333 0.91111111 0.97777778 0.97777778 0.88888889
|
|
0.95555556 0.95555556 0.93181818 0.95454545]
|
|
|
|
mean value: 0.9330808080808081
|
|
|
|
key: train_accuracy
|
|
value: [0.96277916 0.95781638 0.96029777 0.95781638 0.95533499 0.96774194
|
|
0.96029777 0.95781638 0.95792079 0.95544554]
|
|
|
|
mean value: 0.9593267081050537
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.84210526 0.81818182 0.94736842 0.94736842 0.70588235
|
|
0.88888889 0.9 0.8 0.88888889]
|
|
|
|
mean value: 0.8405350720830598
|
|
|
|
key: train_fscore
|
|
value: [0.91017964 0.89940828 0.90588235 0.9005848 0.89534884 0.92485549
|
|
0.90697674 0.9017341 0.9017341 0.89534884]
|
|
|
|
mean value: 0.9042053191031663
|
|
|
|
key: test_precision
|
|
value: [0.63636364 0.88888889 0.75 1. 1. 0.75
|
|
0.88888889 0.81818182 1. 0.88888889]
|
|
|
|
mean value: 0.8621212121212121
|
|
|
|
key: train_precision
|
|
value: [0.92682927 0.9047619 0.90588235 0.89534884 0.88505747 0.91954023
|
|
0.90697674 0.89655172 0.89655172 0.89534884]
|
|
|
|
mean value: 0.9032849094025703
|
|
|
|
key: test_recall
|
|
value: [0.7 0.8 0.9 0.9 0.9 0.66666667
|
|
0.88888889 1. 0.66666667 0.88888889]
|
|
|
|
mean value: 0.8311111111111111
|
|
|
|
key: train_recall
|
|
value: [0.89411765 0.89411765 0.90588235 0.90588235 0.90588235 0.93023256
|
|
0.90697674 0.90697674 0.90697674 0.89534884]
|
|
|
|
mean value: 0.9052393980848153
|
|
|
|
key: test_roc_auc
|
|
value: [0.79285714 0.88571429 0.90714286 0.95 0.95 0.80555556
|
|
0.93055556 0.97222222 0.83333333 0.93015873]
|
|
|
|
mean value: 0.8957539682539682
|
|
|
|
key: train_roc_auc
|
|
value: [0.93762486 0.93448021 0.94036256 0.93879023 0.93721791 0.95407527
|
|
0.94087008 0.93929279 0.93933743 0.93352348]
|
|
|
|
mean value: 0.9395574805236686
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.72727273 0.69230769 0.9 0.9 0.54545455
|
|
0.8 0.81818182 0.66666667 0.8 ]
|
|
|
|
mean value: 0.734988344988345
|
|
|
|
key: train_jcc
|
|
value: [0.83516484 0.8172043 0.82795699 0.81914894 0.81052632 0.86021505
|
|
0.82978723 0.82105263 0.82105263 0.81052632]
|
|
|
|
mean value: 0.8252635244200465
|
|
|
|
MCC on Blind test: 0.29
|
|
|
|
Accuracy on Blind test: 0.84
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00993943 0.00762343 0.00771737 0.00769854 0.00768757 0.00771499
|
|
0.00805521 0.00784445 0.00804305 0.0080626 ]
|
|
|
|
mean value: 0.008038663864135742
|
|
|
|
key: score_time
|
|
value: [0.00817943 0.00850177 0.00837326 0.00846052 0.0085566 0.00857735
|
|
0.00863457 0.00853038 0.00856733 0.00824285]
|
|
|
|
mean value: 0.00846240520477295
|
|
|
|
key: test_mcc
|
|
value: [0.44223199 0.57655666 0.15118579 0.59030128 0.26207121 0.45760432
|
|
0.62469505 0.45760432 0.63745526 0.45523656]
|
|
|
|
mean value: 0.4654942424271857
|
|
|
|
key: train_mcc
|
|
value: [0.55751053 0.53112666 0.56707247 0.50578969 0.51581016 0.49290093
|
|
0.50957787 0.48621959 0.48320568 0.56367163]
|
|
|
|
mean value: 0.5212885208195438
|
|
|
|
key: test_accuracy
|
|
value: [0.8 0.86666667 0.75555556 0.86666667 0.77777778 0.84444444
|
|
0.88888889 0.84444444 0.88636364 0.84090909]
|
|
|
|
mean value: 0.8371717171717172
|
|
|
|
key: train_accuracy
|
|
value: [0.86352357 0.8560794 0.86600496 0.85111663 0.85359801 0.84367246
|
|
0.84863524 0.84119107 0.84158416 0.86138614]
|
|
|
|
mean value: 0.8526791636980076
|
|
|
|
key: test_fscore
|
|
value: [0.57142857 0.625 0.26666667 0.66666667 0.375 0.53333333
|
|
0.61538462 0.53333333 0.70588235 0.53333333]
|
|
|
|
mean value: 0.5426028873087696
|
|
|
|
key: train_fscore
|
|
value: [0.63087248 0.60810811 0.64 0.57746479 0.58741259 0.57718121
|
|
0.59060403 0.57333333 0.56756757 0.64556962]
|
|
|
|
mean value: 0.5998113723527961
|
|
|
|
key: test_precision
|
|
value: [0.54545455 0.83333333 0.4 0.75 0.5 0.66666667
|
|
1. 0.66666667 0.75 0.66666667]
|
|
|
|
mean value: 0.6778787878787879
|
|
|
|
key: train_precision
|
|
value: [0.734375 0.71428571 0.73846154 0.71929825 0.72413793 0.68253968
|
|
0.6984127 0.671875 0.67741935 0.70833333]
|
|
|
|
mean value: 0.7069138498520194
|
|
|
|
key: test_recall
|
|
value: [0.6 0.5 0.2 0.6 0.3 0.44444444
|
|
0.44444444 0.44444444 0.66666667 0.44444444]
|
|
|
|
mean value: 0.46444444444444444
|
|
|
|
key: train_recall
|
|
value: [0.55294118 0.52941176 0.56470588 0.48235294 0.49411765 0.5
|
|
0.51162791 0.5 0.48837209 0.59302326]
|
|
|
|
mean value: 0.521655266757866
|
|
|
|
key: test_roc_auc
|
|
value: [0.72857143 0.73571429 0.55714286 0.77142857 0.60714286 0.69444444
|
|
0.72222222 0.69444444 0.8047619 0.69365079]
|
|
|
|
mean value: 0.700952380952381
|
|
|
|
key: train_roc_auc
|
|
value: [0.74974103 0.736404 0.75562338 0.71601924 0.72190159 0.71845426
|
|
0.7258455 0.71687697 0.71273951 0.76349276]
|
|
|
|
mean value: 0.7317098229311422
|
|
|
|
key: test_jcc
|
|
value: [0.4 0.45454545 0.15384615 0.5 0.23076923 0.36363636
|
|
0.44444444 0.36363636 0.54545455 0.36363636]
|
|
|
|
mean value: 0.381996891996892
|
|
|
|
key: train_jcc
|
|
value: [0.46078431 0.4368932 0.47058824 0.40594059 0.41584158 0.40566038
|
|
0.41904762 0.40186916 0.39622642 0.47663551]
|
|
|
|
mean value: 0.42894870155185705
|
|
|
|
MCC on Blind test: 0.34
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.008708 0.01181459 0.01200914 0.01360393 0.01108837 0.01205063
|
|
0.01200819 0.01373696 0.01307154 0.01215887]
|
|
|
|
mean value: 0.012025022506713867
|
|
|
|
key: score_time
|
|
value: [0.00815749 0.01002741 0.00992107 0.0105021 0.01042247 0.0105381
|
|
0.01050901 0.01046824 0.0104754 0.01043797]
|
|
|
|
mean value: 0.010145926475524902
|
|
|
|
key: test_mcc
|
|
value: [0.64465837 0.58434871 0.72069583 0.93541435 0.87142857 0.63936201
|
|
0.85839508 0.80178373 0.70609879 0.86031746]
|
|
|
|
mean value: 0.7622502886917752
|
|
|
|
key: train_mcc
|
|
value: [0.92034122 0.7429756 0.87903746 0.90352995 0.86545712 0.87195445
|
|
0.79224384 0.89514372 0.87964331 0.84825809]
|
|
|
|
mean value: 0.8598584753360916
|
|
|
|
key: test_accuracy
|
|
value: [0.86666667 0.86666667 0.88888889 0.97777778 0.95555556 0.88888889
|
|
0.95555556 0.93333333 0.90909091 0.95454545]
|
|
|
|
mean value: 0.9196969696969697
|
|
|
|
key: train_accuracy
|
|
value: [0.97270471 0.91811414 0.96029777 0.96774194 0.9528536 0.95781638
|
|
0.93300248 0.96526055 0.96039604 0.95049505]
|
|
|
|
mean value: 0.9538682652384345
|
|
|
|
key: test_fscore
|
|
value: [0.72727273 0.57142857 0.7826087 0.94736842 0.9 0.70588235
|
|
0.875 0.84210526 0.71428571 0.88888889]
|
|
|
|
mean value: 0.7954840634679778
|
|
|
|
key: train_fscore
|
|
value: [0.93714286 0.76258993 0.90361446 0.92397661 0.89385475 0.89171975
|
|
0.82580645 0.91666667 0.90361446 0.87654321]
|
|
|
|
mean value: 0.8835529131032591
|
|
|
|
key: test_precision
|
|
value: [0.66666667 1. 0.69230769 1. 0.9 0.75
|
|
1. 0.8 1. 0.88888889]
|
|
|
|
mean value: 0.8697863247863248
|
|
|
|
key: train_precision
|
|
value: [0.91111111 0.98148148 0.92592593 0.91860465 0.85106383 0.98591549
|
|
0.92753623 0.93902439 0.9375 0.93421053]
|
|
|
|
mean value: 0.931237364087004
|
|
|
|
key: test_recall
|
|
value: [0.8 0.4 0.9 0.9 0.9 0.66666667
|
|
0.77777778 0.88888889 0.55555556 0.88888889]
|
|
|
|
mean value: 0.7677777777777778
|
|
|
|
key: train_recall
|
|
value: [0.96470588 0.62352941 0.88235294 0.92941176 0.94117647 0.81395349
|
|
0.74418605 0.89534884 0.87209302 0.8255814 ]
|
|
|
|
mean value: 0.8492339261285909
|
|
|
|
key: test_roc_auc
|
|
value: [0.84285714 0.7 0.89285714 0.95 0.93571429 0.80555556
|
|
0.88888889 0.91666667 0.77777778 0.93015873]
|
|
|
|
mean value: 0.864047619047619
|
|
|
|
key: train_roc_auc
|
|
value: [0.96977432 0.81019238 0.93174251 0.95369959 0.94857566 0.90539946
|
|
0.86420659 0.93978798 0.92818488 0.90492906]
|
|
|
|
mean value: 0.9156492428889091
|
|
|
|
key: test_jcc
|
|
value: [0.57142857 0.4 0.64285714 0.9 0.81818182 0.54545455
|
|
0.77777778 0.72727273 0.55555556 0.8 ]
|
|
|
|
mean value: 0.6738528138528139
|
|
|
|
key: train_jcc
|
|
value: [0.88172043 0.61627907 0.82417582 0.85869565 0.80808081 0.8045977
|
|
0.7032967 0.84615385 0.82417582 0.78021978]
|
|
|
|
mean value: 0.7947395639301094
|
|
|
|
MCC on Blind test: 0.24
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01214361 0.01189399 0.01183558 0.0117414 0.01215601 0.01234984
|
|
0.01391053 0.01211047 0.01208448 0.01138926]
|
|
|
|
mean value: 0.012161517143249511
|
|
|
|
key: score_time
|
|
value: [0.01047754 0.01044488 0.01046324 0.01056027 0.01050949 0.01049018
|
|
0.01044488 0.01046228 0.01046324 0.0104301 ]
|
|
|
|
mean value: 0.010474610328674316
|
|
|
|
key: test_mcc
|
|
value: [0.56660974 0.66143783 0.75592895 0.87142857 0.86991767 0.55182541
|
|
0.87904907 0.93541435 0.85775039 0.70370542]
|
|
|
|
mean value: 0.7653067401926541
|
|
|
|
key: train_mcc
|
|
value: [0.86239285 0.83069452 0.77202883 0.88457302 0.91366773 0.91009599
|
|
0.90782821 0.91435935 0.83242511 0.80863263]
|
|
|
|
mean value: 0.8636698252023903
|
|
|
|
key: test_accuracy
|
|
value: [0.84444444 0.88888889 0.88888889 0.95555556 0.95555556 0.86666667
|
|
0.95555556 0.97777778 0.95454545 0.90909091]
|
|
|
|
mean value: 0.9196969696969697
|
|
|
|
key: train_accuracy
|
|
value: [0.95533499 0.94540943 0.9057072 0.96029777 0.97022333 0.97022333
|
|
0.96774194 0.97022333 0.94554455 0.93811881]
|
|
|
|
mean value: 0.9528824656659214
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.66666667 0.8 0.9 0.88888889 0.625
|
|
0.9 0.94736842 0.875 0.75 ]
|
|
|
|
mean value: 0.8019590643274854
|
|
|
|
key: train_fscore
|
|
value: [0.8875 0.85714286 0.81372549 0.90909091 0.93181818 0.92592593
|
|
0.9273743 0.93258427 0.8625 0.83660131]
|
|
|
|
mean value: 0.8884263242702394
|
|
|
|
key: test_precision
|
|
value: [0.63636364 1. 0.66666667 0.9 1. 0.71428571
|
|
0.81818182 0.9 1. 0.85714286]
|
|
|
|
mean value: 0.8492640692640693
|
|
|
|
key: train_precision
|
|
value: [0.94666667 0.95652174 0.69747899 0.87912088 0.9010989 0.98684211
|
|
0.89247312 0.90217391 0.93243243 0.95522388]
|
|
|
|
mean value: 0.9050032627229174
|
|
|
|
key: test_recall
|
|
value: [0.7 0.5 1. 0.9 0.8 0.55555556
|
|
1. 1. 0.77777778 0.66666667]
|
|
|
|
mean value: 0.79
|
|
|
|
key: train_recall
|
|
value: [0.83529412 0.77647059 0.97647059 0.94117647 0.96470588 0.87209302
|
|
0.96511628 0.96511628 0.80232558 0.74418605]
|
|
|
|
mean value: 0.8842954856361149
|
|
|
|
key: test_roc_auc
|
|
value: [0.79285714 0.75 0.92857143 0.93571429 0.9 0.75
|
|
0.97222222 0.98611111 0.88888889 0.81904762]
|
|
|
|
mean value: 0.8723412698412698
|
|
|
|
key: train_roc_auc
|
|
value: [0.91135775 0.88351831 0.93163152 0.95329264 0.968202 0.93446922
|
|
0.96678527 0.96836256 0.89330116 0.86737604]
|
|
|
|
mean value: 0.9278296466729867
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.5 0.66666667 0.81818182 0.8 0.45454545
|
|
0.81818182 0.9 0.77777778 0.6 ]
|
|
|
|
mean value: 0.6835353535353536
|
|
|
|
key: train_jcc
|
|
value: [0.79775281 0.75 0.68595041 0.83333333 0.87234043 0.86206897
|
|
0.86458333 0.87368421 0.75824176 0.71910112]
|
|
|
|
mean value: 0.8017056372291307
|
|
|
|
MCC on Blind test: 0.28
|
|
|
|
Accuracy on Blind test: 0.85
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.10343218 0.0909369 0.08587861 0.08887339 0.0882163 0.08921981
|
|
0.08564425 0.08561397 0.08643341 0.08570504]
|
|
|
|
mean value: 0.08899538516998291
|
|
|
|
key: score_time
|
|
value: [0.01581144 0.01447177 0.01518631 0.01536417 0.01544499 0.01527143
|
|
0.01452279 0.01482272 0.01428127 0.01465106]
|
|
|
|
mean value: 0.014982795715332032
|
|
|
|
key: test_mcc
|
|
value: [0.79539491 0.93541435 0.88640526 0.86991767 0.80295507 0.63936201
|
|
0.92998111 0.93541435 0.92962225 0.87831007]
|
|
|
|
mean value: 0.8602777044139022
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.91111111 0.97777778 0.95555556 0.95555556 0.93333333 0.88888889
|
|
0.97777778 0.97777778 0.97727273 0.95454545]
|
|
|
|
mean value: 0.950959595959596
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.83333333 0.94736842 0.90909091 0.88888889 0.82352941 0.70588235
|
|
0.94117647 0.94736842 0.94117647 0.9 ]
|
|
|
|
mean value: 0.8837814679300747
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.71428571 1. 0.83333333 1. 1. 0.75
|
|
1. 0.9 1. 0.81818182]
|
|
|
|
mean value: 0.9015800865800866
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.9 1. 0.8 0.7 0.66666667
|
|
0.88888889 1. 0.88888889 1. ]
|
|
|
|
mean value: 0.8844444444444445
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.94285714 0.95 0.97142857 0.9 0.85 0.80555556
|
|
0.94444444 0.98611111 0.94444444 0.97142857]
|
|
|
|
mean value: 0.9266269841269841
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.71428571 0.9 0.83333333 0.8 0.7 0.54545455
|
|
0.88888889 0.9 0.88888889 0.81818182]
|
|
|
|
mean value: 0.7989033189033189
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.3
|
|
|
|
Accuracy on Blind test: 0.87
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0306735 0.03133774 0.04422259 0.031358 0.03073716 0.03635859
|
|
0.03435659 0.03395271 0.03568673 0.04747963]
|
|
|
|
mean value: 0.03561632633209229
|
|
|
|
key: score_time
|
|
value: [0.01692772 0.02699375 0.02704811 0.02176785 0.02640796 0.01848412
|
|
0.02651548 0.02117252 0.03250337 0.03121567]
|
|
|
|
mean value: 0.02490365505218506
|
|
|
|
key: test_mcc
|
|
value: [0.79539491 0.87142857 0.88640526 0.86991767 0.93541435 0.72222222
|
|
1. 0.63936201 0.92962225 0.86031746]
|
|
|
|
mean value: 0.851008470726259
|
|
|
|
key: train_mcc
|
|
value: [1. 0.98532572 0.97018128 0.98509064 1. 0.99266683
|
|
0.9854476 0.97029022 0.96293777 0.97045488]
|
|
|
|
mean value: 0.9822394937971208
|
|
|
|
key: test_accuracy
|
|
value: [0.91111111 0.95555556 0.95555556 0.95555556 0.97777778 0.91111111
|
|
1. 0.88888889 0.97727273 0.95454545]
|
|
|
|
mean value: 0.9487373737373738
|
|
|
|
key: train_accuracy
|
|
value: [1. 0.99503722 0.99007444 0.99503722 1. 0.99751861
|
|
0.99503722 0.99007444 0.98762376 0.99009901]
|
|
|
|
mean value: 0.9940501928604771
|
|
|
|
key: test_fscore
|
|
value: [0.83333333 0.9 0.90909091 0.88888889 0.94736842 0.77777778
|
|
1. 0.70588235 0.94117647 0.88888889]
|
|
|
|
mean value: 0.8792407042561842
|
|
|
|
key: train_fscore
|
|
value: [1. 0.98837209 0.97647059 0.98823529 1. 0.99421965
|
|
0.98850575 0.97647059 0.97076023 0.97674419]
|
|
|
|
mean value: 0.9859778383881759
|
|
|
|
key: test_precision
|
|
value: [0.71428571 0.9 0.83333333 1. 1. 0.77777778
|
|
1. 0.75 1. 0.88888889]
|
|
|
|
mean value: 0.8864285714285715
|
|
|
|
key: train_precision
|
|
value: [1. 0.97701149 0.97647059 0.98823529 1. 0.98850575
|
|
0.97727273 0.98809524 0.97647059 0.97674419]
|
|
|
|
mean value: 0.9848805863382023
|
|
|
|
key: test_recall
|
|
value: [1. 0.9 1. 0.8 0.9 0.77777778
|
|
1. 0.66666667 0.88888889 0.88888889]
|
|
|
|
mean value: 0.8822222222222222
|
|
|
|
key: train_recall
|
|
value: [1. 1. 0.97647059 0.98823529 1. 1.
|
|
1. 0.96511628 0.96511628 0.97674419]
|
|
|
|
mean value: 0.9871682626538988
|
|
|
|
key: test_roc_auc
|
|
value: [0.94285714 0.93571429 0.97142857 0.9 0.95 0.86111111
|
|
1. 0.80555556 0.94444444 0.93015873]
|
|
|
|
mean value: 0.9241269841269841
|
|
|
|
key: train_roc_auc
|
|
value: [1. 0.99685535 0.98509064 0.99254532 1. 0.99842271
|
|
0.99684543 0.98098085 0.97941349 0.98522744]
|
|
|
|
mean value: 0.9915381221608284
|
|
|
|
key: test_jcc
|
|
value: [0.71428571 0.81818182 0.83333333 0.8 0.9 0.63636364
|
|
1. 0.54545455 0.88888889 0.8 ]
|
|
|
|
mean value: 0.7936507936507937
|
|
|
|
key: train_jcc
|
|
value: [1. 0.97701149 0.95402299 0.97674419 1. 0.98850575
|
|
0.97727273 0.95402299 0.94318182 0.95454545]
|
|
|
|
mean value: 0.9725307404437317
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.87
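Note: the block above reports a mean cross-validated test MCC of about 0.85, but a blind-test MCC of 0.25 next to a blind-test accuracy of 0.87. One common way accuracy and MCC diverge like this is class imbalance. Whether that is the cause here is not shown in this log. The sketch below uses hypothetical labels (not the data from this run) to show how the two metrics can disagree.

```python
from sklearn.metrics import accuracy_score, matthews_corrcoef

# Hypothetical imbalanced labels: 90 negatives and 10 positives (illustrative only).
y_true = [0] * 90 + [1] * 10

# Hypothetical predictions: 85 true negatives, 5 false positives,
# 3 true positives, 7 false negatives.
y_pred = [0] * 85 + [1] * 5 + [1] * 3 + [0] * 7

acc = accuracy_score(y_true, y_pred)      # dominated by the majority class
mcc = matthews_corrcoef(y_true, y_pred)   # uses all four confusion-matrix cells

print("Accuracy:", round(acc, 2))
print("MCC:", round(mcc, 2))
```

Accuracy stays high because most negatives are classified correctly, while MCC drops because most positives are missed.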
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
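Note: the pipeline printed above scales numerical columns with MinMaxScaler, one-hot encodes categorical columns, and passes the result to the model. The key/value/mean entries that follow look like the output of a 10-fold cross-validation with train scores enabled. A minimal sketch of such a setup is given below. The synthetic data, the column subset, the category levels, the scoring dictionary and the use of StratifiedKFold are all assumptions for illustration; the script that produced this log is not shown here.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.metrics import make_scorer, matthews_corrcoef
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Synthetic stand-in data (hypothetical values; only the column names echo the log).
rng = np.random.default_rng(42)
n = 200
X = pd.DataFrame({
    "ligand_distance": rng.normal(size=n),
    "rsa": rng.uniform(size=n),
    "ss_class": rng.choice(["helix", "sheet", "coil"], size=n),
})
y = (X["ligand_distance"] + rng.normal(scale=0.5, size=n) > 0).astype(int)

num_cols = ["ligand_distance", "rsa"]
cat_cols = ["ss_class"]

# Same shape as the logged pipeline: scale numbers, encode categories, then fit.
prep = ColumnTransformer(
    transformers=[("num", MinMaxScaler(), num_cols),
                  ("cat", OneHotEncoder(), cat_cols)],
    remainder="passthrough",
)
pipe = Pipeline(steps=[("prep", prep),
                       ("model", GaussianProcessClassifier(random_state=42))])

# Assumed scoring names, chosen so the result keys resemble those in the log.
scoring = {
    "mcc": make_scorer(matthews_corrcoef),
    "accuracy": "accuracy",
    "fscore": "f1",
    "jcc": "jaccard",
}
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_validate(pipe, X, y, cv=cv, scoring=scoring,
                        return_train_score=True)

for key, value in scores.items():
    print("key:", key)
    print("value:", value)
    print("mean value:", np.mean(value))
```

With return_train_score=True, cross_validate returns fit_time, score_time and a test_/train_ pair for each scorer, which matches the layout of the entries in this log.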
|
|
|
|
key: fit_time
|
|
value: [0.0795815 0.11382031 0.06929111 0.11211514 0.07584715 0.13550496
|
|
0.15576649 0.17540956 0.1215744 0.10611081]
|
|
|
|
mean value: 0.11450214385986328
|
|
|
|
key: score_time
|
|
value: [0.0121398 0.01215172 0.0121305 0.01206064 0.01252985 0.02870059
|
|
0.02875829 0.01892948 0.01877236 0.01204371]
|
|
|
|
mean value: 0.016821694374084473
|
|
|
|
key: test_mcc
|
|
value: [0.15118579 0.28571429 0.39652234 0.39652234 0.58434871 0.53452248
|
|
0.53452248 0.42947785 0.62360956 0.3099003 ]
|
|
|
|
mean value: 0.4246326143681106
|
|
|
|
key: train_mcc
|
|
value: [0.75122811 0.7429756 0.76894596 0.72633485 0.77573617 0.77127576
|
|
0.74561704 0.77797744 0.7553134 0.77005492]
|
|
|
|
mean value: 0.7585459249101077
|
|
|
|
key: test_accuracy
|
|
value: [0.75555556 0.8 0.82222222 0.82222222 0.86666667 0.86666667
|
|
0.86666667 0.84444444 0.88636364 0.81818182]
|
|
|
|
mean value: 0.8348989898989899
|
|
|
|
key: train_accuracy
|
|
value: [0.92059553 0.91811414 0.92555831 0.91315136 0.9280397 0.92555831
|
|
0.91811414 0.9280397 0.92079208 0.92574257]
|
|
|
|
mean value: 0.9223705869346239
|
|
|
|
key: test_fscore
|
|
value: [0.26666667 0.30769231 0.42857143 0.42857143 0.57142857 0.5
|
|
0.5 0.46153846 0.61538462 0.33333333]
|
|
|
|
mean value: 0.4413186813186813
|
|
|
|
key: train_fscore
|
|
value: [0.77142857 0.76258993 0.78571429 0.74452555 0.7972028 0.78873239
|
|
0.76595745 0.8 0.77142857 0.79166667]
|
|
|
|
mean value: 0.777924620911841
|
|
|
|
key: test_precision
|
|
value: [0.4 0.66666667 0.75 0.75 1. 1.
|
|
1. 0.75 1. 0.66666667]
|
|
|
|
mean value: 0.7983333333333333
|
|
|
|
key: train_precision
|
|
value: [0.98181818 0.98148148 1. 0.98076923 0.98275862 1.
|
|
0.98181818 0.98305085 1. 0.98275862]
|
|
|
|
mean value: 0.9874455164724013
|
|
|
|
key: test_recall
|
|
value: [0.2 0.2 0.3 0.3 0.4 0.33333333
|
|
0.33333333 0.33333333 0.44444444 0.22222222]
|
|
|
|
mean value: 0.30666666666666664
|
|
|
|
key: train_recall
|
|
value: [0.63529412 0.62352941 0.64705882 0.6 0.67058824 0.65116279
|
|
0.62790698 0.6744186 0.62790698 0.6627907 ]
|
|
|
|
mean value: 0.6420656634746922
|
|
|
|
key: test_roc_auc
|
|
value: [0.55714286 0.58571429 0.63571429 0.63571429 0.7 0.66666667
|
|
0.66666667 0.65277778 0.72222222 0.5968254 ]
|
|
|
|
mean value: 0.6419444444444444
|
|
|
|
key: train_roc_auc
|
|
value: [0.81607473 0.81019238 0.82352941 0.79842767 0.83372179 0.8255814
|
|
0.8123762 0.83563202 0.81395349 0.82982302]
|
|
|
|
mean value: 0.8199312108020843
|
|
|
|
key: test_jcc
|
|
value: [0.15384615 0.18181818 0.27272727 0.27272727 0.4 0.33333333
|
|
0.33333333 0.3 0.44444444 0.2 ]
|
|
|
|
mean value: 0.2892229992229992
|
|
|
|
key: train_jcc
|
|
value: [0.62790698 0.61627907 0.64705882 0.59302326 0.6627907 0.65116279
|
|
0.62068966 0.66666667 0.62790698 0.65517241]
|
|
|
|
mean value: 0.6368657326603456
|
|
|
|
MCC on Blind test: 0.53
|
|
|
|
Accuracy on Blind test: 0.96
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.20419002 0.19745517 0.19840646 0.2032311 0.19804025 0.19976878
|
|
0.19917703 0.19644904 0.19782686 0.20109701]
|
|
|
|
mean value: 0.19956417083740235
|
|
|
|
key: score_time
|
|
value: [0.0086832 0.00846052 0.00847912 0.00850511 0.00864482 0.00882816
|
|
0.00845933 0.00858998 0.00887918 0.00864172]
|
|
|
|
mean value: 0.008617115020751954
|
|
|
|
key: test_mcc
|
|
value: [0.79539491 0.87142857 0.88640526 0.86991767 0.80178373 0.63936201
|
|
0.92998111 0.80178373 0.92962225 0.87831007]
|
|
|
|
mean value: 0.8403989305108202
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.91111111 0.95555556 0.95555556 0.95555556 0.93333333 0.88888889
|
|
0.97777778 0.93333333 0.97727273 0.95454545]
|
|
|
|
mean value: 0.9442929292929293
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.83333333 0.9 0.90909091 0.88888889 0.84210526 0.70588235
|
|
0.94117647 0.84210526 0.94117647 0.9 ]
|
|
|
|
mean value: 0.8703758951746567
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.71428571 0.9 0.83333333 1. 0.88888889 0.75
|
|
1. 0.8 1. 0.81818182]
|
|
|
|
mean value: 0.8704689754689755
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.9 1. 0.8 0.8 0.66666667
|
|
0.88888889 0.88888889 0.88888889 1. ]
|
|
|
|
mean value: 0.8833333333333333
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.94285714 0.93571429 0.97142857 0.9 0.88571429 0.80555556
|
|
0.94444444 0.91666667 0.94444444 0.97142857]
|
|
|
|
mean value: 0.9218253968253968
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.71428571 0.81818182 0.83333333 0.8 0.72727273 0.54545455
|
|
0.88888889 0.72727273 0.88888889 0.81818182]
|
|
|
|
mean value: 0.7761760461760462
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.24
|
|
|
|
Accuracy on Blind test: 0.86
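Note: for binary classification the Jaccard score J and the F1 score F computed on the same predictions are related by J = F / (2 - F), so the test_jcc values carry the same information as test_fscore on a different scale. The check below uses the Gradient Boosting per-fold values copied from the block above. It assumes that "fscore" in this log is the binary F1 score, which the log itself does not state.

```python
import numpy as np

# Per-fold values copied from the Gradient Boosting block above.
test_fscore = np.array([0.83333333, 0.9, 0.90909091, 0.88888889, 0.84210526,
                        0.70588235, 0.94117647, 0.84210526, 0.94117647, 0.9])
test_jcc = np.array([0.71428571, 0.81818182, 0.83333333, 0.8, 0.72727273,
                     0.54545455, 0.88888889, 0.72727273, 0.88888889, 0.81818182])

# Jaccard recomputed from F1 via J = F / (2 - F).
jcc_from_f = test_fscore / (2.0 - test_fscore)

max_abs_diff = float(np.max(np.abs(jcc_from_f - test_jcc)))
print("largest difference between logged and recomputed jcc:", max_abs_diff)
```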
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
[the warning above was emitted 11 times in total]
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.01316524 0.01303029 0.01386619 0.01346922 0.01556659 0.01349592
|
|
0.01358891 0.01323795 0.01327229 0.01385522]
|
|
|
|
mean value: 0.013654780387878419
|
|
|
|
key: score_time
|
|
value: [0.01129103 0.01097775 0.01097012 0.01095533 0.01109576 0.01304388
|
|
0.01086783 0.01356649 0.01312065 0.01090169]
|
|
|
|
mean value: 0.01167905330657959
|
|
|
|
key: test_mcc
|
|
value: [0.41931393 0.39652234 0.22857143 0.58434871 0.28571429 0.35355339
|
|
0.2941742 0.16174916 0.63745526 0.3099003 ]
|
|
|
|
mean value: 0.36713030130627083
|
|
|
|
key: train_mcc
|
|
value: [0.83864579 0.87027877 0.84093096 0.65974353 0.67151674 0.74002844
|
|
0.72151646 0.77656964 0.87964331 0.73577 ]
|
|
|
|
mean value: 0.7734643657272753
|
|
|
|
key: test_accuracy
|
|
value: [0.82222222 0.82222222 0.73333333 0.86666667 0.8 0.82222222
|
|
0.8 0.8 0.88636364 0.81818182]
|
|
|
|
mean value: 0.8171212121212121
|
|
|
|
key: train_accuracy
|
|
value: [0.94789082 0.95781638 0.94540943 0.89330025 0.89826303 0.89826303
|
|
0.91066998 0.9280397 0.96039604 0.91584158]
|
|
|
|
mean value: 0.9255890229221433
|
|
|
|
key: test_fscore
|
|
value: [0.5 0.42857143 0.4 0.57142857 0.30769231 0.42857143
|
|
0.4 0.18181818 0.70588235 0.33333333]
|
|
|
|
mean value: 0.4257297604356428
|
|
|
|
key: train_fscore
|
|
value: [0.86451613 0.89440994 0.875 0.66141732 0.70921986 0.79396985
|
|
0.7721519 0.81528662 0.90361446 0.77922078]
|
|
|
|
mean value: 0.8068806857147465
|
|
|
|
key: test_precision
|
|
value: [0.66666667 0.75 0.4 1. 0.66666667 0.6
|
|
0.5 0.5 0.75 0.66666667]
|
|
|
|
mean value: 0.65
|
|
|
|
key: train_precision
|
|
value: [0.95714286 0.94736842 0.84615385 1. 0.89285714 0.69911504
|
|
0.84722222 0.90140845 0.9375 0.88235294]
|
|
|
|
mean value: 0.8911120925557183
|
|
|
|
key: test_recall
|
|
value: [0.4 0.3 0.4 0.4 0.2 0.33333333
|
|
0.33333333 0.11111111 0.66666667 0.22222222]
|
|
|
|
mean value: 0.33666666666666667
|
|
|
|
key: train_recall
|
|
value: [0.78823529 0.84705882 0.90588235 0.49411765 0.58823529 0.91860465
|
|
0.70930233 0.74418605 0.87209302 0.69767442]
|
|
|
|
mean value: 0.7565389876880985
|
|
|
|
key: test_roc_auc
|
|
value: [0.67142857 0.63571429 0.61428571 0.7 0.58571429 0.63888889
|
|
0.625 0.54166667 0.8047619 0.5968254 ]
|
|
|
|
mean value: 0.6414285714285715
|
|
|
|
key: train_roc_auc
|
|
value: [0.88940067 0.9172401 0.9309286 0.74705882 0.78468368 0.90567457
|
|
0.83730101 0.86105201 0.92818488 0.83625859]
|
|
|
|
mean value: 0.8637782929234691
|
|
|
|
key: test_jcc
|
|
value: [0.33333333 0.27272727 0.25 0.4 0.18181818 0.27272727
|
|
0.25 0.1 0.54545455 0.2 ]
|
|
|
|
mean value: 0.2806060606060606
|
|
|
|
key: train_jcc
|
|
value: [0.76136364 0.80898876 0.77777778 0.49411765 0.54945055 0.65833333
|
|
0.62886598 0.68817204 0.82417582 0.63829787]
|
|
|
|
mean value: 0.682954342693751
|
|
|
|
MCC on Blind test: 0.34
|
|
|
|
Accuracy on Blind test: 0.85
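Note: the QDA run above raised "Variables are collinear" repeatedly. One possible contributor, offered here as an assumption rather than a diagnosis, is that OneHotEncoder() keeps every level of each categorical feature, so the columns of each encoded feature always sum to 1 and become linearly dependent once the data are centred. The sketch below uses two hypothetical categorical columns (the names echo the log, the levels are invented) and compares the rank of the centred matrix with and without drop="first".

```python
import itertools

import numpy as np
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical balanced data: every combination of levels, repeated 10 times.
levels = list(itertools.product(["helix", "sheet", "coil"], ["yes", "no"]))
df = pd.DataFrame(levels * 10, columns=["ss_class", "active_site"])


def centered_rank(encoder):
    """Return (number of encoded columns, rank after mean-centring)."""
    X = encoder.fit_transform(df).toarray()
    Xc = X - X.mean(axis=0)
    return X.shape[1], int(np.linalg.matrix_rank(Xc))


full_cols, full_rank = centered_rank(OneHotEncoder())
drop_cols, drop_rank = centered_rank(OneHotEncoder(drop="first"))

print("all levels kept :", full_cols, "columns, rank", full_rank)
print("first level dropped:", drop_cols, "columns, rank", drop_rank)
```

Dropping one level per feature removes this particular redundancy. Correlated numerical features could still trigger the same warning.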
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01960707 0.02915788 0.02913284 0.02917004 0.02329302 0.02906346
|
|
0.03248668 0.02917218 0.02937102 0.02931643]
|
|
|
|
mean value: 0.02797706127166748
|
|
|
|
key: score_time
|
|
value: [0.01943707 0.01073003 0.01065588 0.02089858 0.02081704 0.01065969
|
|
0.01883626 0.01075506 0.02131557 0.02014804]
|
|
|
|
mean value: 0.016425323486328126
|
|
|
|
key: test_mcc
|
|
value: [0.56660974 0.73379939 0.74285714 0.93541435 0.86991767 0.63936201
|
|
0.85839508 0.86111111 0.78360391 0.86031746]
|
|
|
|
mean value: 0.7851387860068911
|
|
|
|
key: train_mcc
|
|
value: [0.83918085 0.83252135 0.83979823 0.83252135 0.81720003 0.84123675
|
|
0.83401533 0.83401533 0.84212687 0.85010878]
|
|
|
|
mean value: 0.8362724873808268
|
|
|
|
key: test_accuracy
|
|
value: [0.84444444 0.91111111 0.91111111 0.97777778 0.95555556 0.88888889
|
|
0.95555556 0.95555556 0.93181818 0.95454545]
|
|
|
|
mean value: 0.9286363636363636
|
|
|
|
key: train_accuracy
|
|
value: [0.94789082 0.94540943 0.94789082 0.94540943 0.94044665 0.94789082
|
|
0.94540943 0.94540943 0.9480198 0.95049505]
|
|
|
|
mean value: 0.9464271675306488
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.75 0.8 0.94736842 0.88888889 0.70588235
|
|
0.875 0.88888889 0.8 0.88888889]
|
|
|
|
mean value: 0.8211584107327141
|
|
|
|
key: train_fscore
|
|
value: [0.86956522 0.86585366 0.87116564 0.86585366 0.85365854 0.87272727
|
|
0.86746988 0.86746988 0.8742515 0.88095238]
|
|
|
|
mean value: 0.8688967624943407
|
|
|
|
key: test_precision
|
|
value: [0.63636364 1. 0.8 1. 1. 0.75
|
|
1. 0.88888889 1. 0.88888889]
|
|
|
|
mean value: 0.8964141414141414
|
|
|
|
key: train_precision
|
|
value: [0.92105263 0.89873418 0.91025641 0.89873418 0.88607595 0.91139241
|
|
0.9 0.9 0.90123457 0.90243902]
|
|
|
|
mean value: 0.9029919342987596
|
|
|
|
key: test_recall
|
|
value: [0.7 0.6 0.8 0.9 0.8 0.66666667
|
|
0.77777778 0.88888889 0.66666667 0.88888889]
|
|
|
|
mean value: 0.7688888888888888
|
|
|
|
key: train_recall
|
|
value: [0.82352941 0.83529412 0.83529412 0.83529412 0.82352941 0.8372093
|
|
0.8372093 0.8372093 0.84883721 0.86046512]
|
|
|
|
mean value: 0.8373871409028728
|
|
|
|
key: test_roc_auc
|
|
value: [0.79285714 0.8 0.87142857 0.95 0.9 0.80555556
|
|
0.88888889 0.93055556 0.83333333 0.93015873]
|
|
|
|
mean value: 0.8702777777777777
|
|
|
|
key: train_roc_auc
|
|
value: [0.90233074 0.90506844 0.90664077 0.90506844 0.89761376 0.90756364
|
|
0.90598635 0.90598635 0.91183999 0.91765394]
|
|
|
|
mean value: 0.9065752441613346
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.6 0.66666667 0.9 0.8 0.54545455
|
|
0.77777778 0.8 0.66666667 0.8 ]
|
|
|
|
mean value: 0.7056565656565656
|
|
|
|
key: train_jcc
|
|
value: [0.76923077 0.76344086 0.77173913 0.76344086 0.74468085 0.77419355
|
|
0.76595745 0.76595745 0.77659574 0.78723404]
|
|
|
|
mean value: 0.768247070039765
|
|
|
|
MCC on Blind test: 0.29
|
|
|
|
Accuracy on Blind test: 0.85
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_config.py:122: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
baseline_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_config.py:125: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
baseline_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.19244957 0.19756484 0.1905818 0.18681002 0.18788815 0.18729663
|
|
0.18735313 0.18713069 0.18840814 0.19011474]
|
|
|
|
mean value: 0.18955976963043214
|
|
|
|
key: score_time
|
|
value: [0.01917291 0.0214653 0.01095438 0.01946831 0.0198698 0.01965523
0.02010727 0.02099466 0.0204978 0.02179575]

mean value: 0.019398140907287597

key: test_mcc
value: [0.56660974 0.80295507 0.81536524 0.93541435 0.86991767 0.63936201
0.85839508 0.93541435 0.78360391 0.86031746]

mean value: 0.806735487504937

key: train_mcc
value: [0.86316397 0.87183415 0.87183415 0.83252135 0.81720003 0.88082743
0.84202517 0.85797496 0.84212687 0.86600555]

mean value: 0.8545513633312516

key: test_accuracy
value: [0.84444444 0.93333333 0.93333333 0.97777778 0.95555556 0.88888889
0.95555556 0.97777778 0.93181818 0.95454545]

mean value: 0.9353030303030303

key: train_accuracy
value: [0.95533499 0.95781638 0.95781638 0.94540943 0.94044665 0.96029777
0.94789082 0.9528536 0.9480198 0.95544554]

mean value: 0.9521331351497433

key: test_fscore
value: [0.66666667 0.82352941 0.85714286 0.94736842 0.88888889 0.70588235
0.875 0.94736842 0.8 0.88888889]

mean value: 0.8400735908398448

key: train_fscore
value: [0.8902439 0.89820359 0.89820359 0.86585366 0.85365854 0.90588235
0.8742515 0.88757396 0.8742515 0.89411765]

mean value: 0.8842240241698736

key: test_precision
value: [0.63636364 1. 0.81818182 1. 1. 0.75
1. 0.9 1. 0.88888889]

mean value: 0.8993434343434343

key: train_precision
value: [0.92405063 0.91463415 0.91463415 0.89873418 0.88607595 0.91666667
0.90123457 0.90361446 0.90123457 0.9047619 ]

mean value: 0.9065641217238963

key: test_recall
value: [0.7 0.7 0.9 0.9 0.8 0.66666667
0.77777778 1. 0.66666667 0.88888889]

mean value: 0.8

key: train_recall
value: [0.85882353 0.88235294 0.88235294 0.83529412 0.82352941 0.89534884
0.84883721 0.87209302 0.84883721 0.88372093]

mean value: 0.8631190150478796

key: test_roc_auc
value: [0.79285714 0.85 0.92142857 0.95 0.9 0.80555556
0.88888889 0.98611111 0.83333333 0.93015873]

mean value: 0.8858333333333333

key: train_roc_auc
value: [0.9199778 0.93017018 0.93017018 0.90506844 0.89761376 0.93663341
0.91180031 0.92342822 0.91183999 0.92928185]

mean value: 0.9195984139382406

key: test_jcc
value: [0.5 0.7 0.75 0.9 0.8 0.54545455
0.77777778 0.9 0.66666667 0.8 ]

mean value: 0.733989898989899

key: train_jcc
value: [0.8021978 0.81521739 0.81521739 0.76344086 0.74468085 0.82795699
0.77659574 0.79787234 0.77659574 0.80851064]

mean value: 0.79282857534178

MCC on Blind test: 0.28

Accuracy on Blind test: 0.85

Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegression(random_state=42))])

key: fit_time
value: [0.04934955 0.02539492 0.02788877 0.02606297 0.02528405 0.02800322
0.02623415 0.02650237 0.02881813 0.02588701]

mean value: 0.028942513465881347

key: score_time
value: [0.01096201 0.01063871 0.01081228 0.01083779 0.01075006 0.01077032
0.01072264 0.01072216 0.01076746 0.01077724]

mean value: 0.010776066780090332

key: test_mcc
value: [0.83214239 0.88862624 0.94365079 0.91587302 0.9451949 0.83095238
0.85749293 0.91766294 0.82992752 0.74560114]

mean value: 0.8707124227917115

key: train_mcc
value: [0.90236595 0.88663261 0.88350545 0.90558532 0.90567611 0.90236595
0.90573203 0.89315242 0.90567829 0.89644363]

mean value: 0.8987137772379575

key: test_accuracy
value: [0.91549296 0.94366197 0.97183099 0.95774648 0.97183099 0.91549296
0.92857143 0.95714286 0.91428571 0.87142857]

mean value: 0.934748490945674

key: train_accuracy
value: [0.9511811 0.94330709 0.94173228 0.95275591 0.95275591 0.9511811
0.95283019 0.94654088 0.95283019 0.94811321]

mean value: 0.9493227851235577

key: test_fscore
value: [0.91891892 0.94594595 0.97222222 0.95774648 0.97222222 0.91428571
0.92753623 0.95890411 0.91666667 0.86567164]

mean value: 0.9350120152399073

key: train_fscore
value: [0.95102686 0.94339623 0.94191523 0.95253165 0.95238095 0.95133438
0.953125 0.94620253 0.95268139 0.94867807]

mean value: 0.9493272279338961

key: test_precision
value: [0.89473684 0.92105263 0.97222222 0.94444444 0.94594595 0.91428571
0.94117647 0.92105263 0.89189189 0.90625 ]

mean value: 0.9253058794641612

key: train_precision
value: [0.95253165 0.94043887 0.9375 0.95859873 0.96153846 0.94984326
0.94720497 0.9522293 0.9556962 0.93846154]

mean value: 0.9494042974184514

key: test_recall
value: [0.94444444 0.97222222 0.97222222 0.97142857 1. 0.91428571
0.91428571 1. 0.94285714 0.82857143]

mean value: 0.946031746031746

key: train_recall
value: [0.94952681 0.94637224 0.94637224 0.94654088 0.94339623 0.95283019
0.9591195 0.94025157 0.94968553 0.9591195 ]

mean value: 0.949321468960181

key: test_roc_auc
value: [0.91507937 0.94325397 0.9718254 0.95793651 0.97222222 0.91547619
0.92857143 0.95714286 0.91428571 0.87142857]

mean value: 0.9347222222222222

key: train_roc_auc
value: [0.9511785 0.94331191 0.94173958 0.95276571 0.95277067 0.9511785
0.95283019 0.94654088 0.95283019 0.94811321]

mean value: 0.9493259329801798

key: test_jcc
value: [0.85 0.8974359 0.94594595 0.91891892 0.94594595 0.84210526
0.86486486 0.92105263 0.84615385 0.76315789]

mean value: 0.8795581208739104

key: train_jcc
value: [0.90662651 0.89285714 0.89020772 0.90936556 0.90909091 0.90718563
0.91044776 0.8978979 0.90963855 0.90236686]

mean value: 0.9035684537974702

MCC on Blind test: 0.27

Accuracy on Blind test: 0.81

Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(random_state=42)
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegressionCV(random_state=42))])

key: fit_time
value: [0.76378393 0.85567832 0.79737496 0.71720099 0.86780381 0.74783587
0.76583719 0.92471051 0.7560215 0.84938383]

mean value: 0.8045630931854248

key: score_time
value: [0.01349831 0.01428533 0.01490283 0.01481152 0.01480985 0.01473713
0.01456404 0.01478338 0.01481318 0.01483154]

mean value: 0.014603710174560547

key: test_mcc
value: [0.88730159 0.88862624 0.97222222 0.88730159 0.91885703 0.88862624
0.94440028 1. 0.88571429 0.860309 ]

mean value: 0.9133358474476411

key: train_mcc
value: [0.94962551 0.94962551 0.94646152 0.95276028 0.94962452 0.94649802
0.94968553 0.94025622 0.95912424 0.9528349 ]

mean value: 0.9496496257416195

key: test_accuracy
value: [0.94366197 0.94366197 0.98591549 0.94366197 0.95774648 0.94366197
0.97142857 1. 0.94285714 0.92857143]

mean value: 0.9561167002012072

key: train_accuracy
value: [0.97480315 0.97480315 0.97322835 0.97637795 0.97480315 0.97322835
0.97484277 0.97012579 0.97955975 0.97641509]

mean value: 0.9748187490714604

key: test_fscore
value: [0.94444444 0.94594595 0.98591549 0.94285714 0.95890411 0.94117647
0.97222222 1. 0.94285714 0.92537313]

mean value: 0.955969610579028

key: train_fscore
value: [0.97484277 0.97484277 0.97322835 0.97645212 0.97492163 0.97339593
0.97484277 0.97017268 0.97959184 0.97645212]

mean value: 0.9748742969391556

key: test_precision
value: [0.94444444 0.92105263 1. 0.94285714 0.92105263 0.96969697
0.94594595 1. 0.94285714 0.96875 ]

mean value: 0.955665690895954

key: train_precision
value: [0.97178683 0.97178683 0.97169811 0.97492163 0.971875 0.96884735
0.97484277 0.96865204 0.97805643 0.97492163]

mean value: 0.9727388624377596

key: test_recall
value: [0.94444444 0.97222222 0.97222222 0.94285714 1. 0.91428571
1. 1. 0.94285714 0.88571429]

mean value: 0.9574603174603175

key: train_recall
value: [0.97791798 0.97791798 0.97476341 0.97798742 0.97798742 0.97798742
0.97484277 0.97169811 0.98113208 0.97798742]

mean value: 0.9770222010594608

key: test_roc_auc
value: [0.94365079 0.94325397 0.98611111 0.94365079 0.95833333 0.94325397
0.97142857 1. 0.94285714 0.92857143]

mean value: 0.9561111111111111

key: train_roc_auc
value: [0.97480805 0.97480805 0.97323076 0.97637541 0.97479813 0.97322084
0.97484277 0.97012579 0.97955975 0.97641509]

mean value: 0.9748184631867151

key: test_jcc
value: [0.89473684 0.8974359 0.97222222 0.89189189 0.92105263 0.88888889
0.94594595 1. 0.89189189 0.86111111]

mean value: 0.916517732307206

key: train_jcc
value: [0.95092025 0.95092025 0.94785276 0.95398773 0.95107034 0.94817073
0.95092025 0.94207317 0.96 0.95398773]

mean value: 0.9509903195885676

MCC on Blind test: 0.24

Accuracy on Blind test: 0.81

Model_name: Gaussian NB
Model func: GaussianNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GaussianNB())])

key: fit_time
value: [0.02257991 0.00823998 0.0080502 0.00808787 0.00771427 0.00770688
0.00799084 0.00788474 0.00776696 0.00769186]

mean value: 0.009371352195739747

key: score_time
value: [0.01607466 0.00837326 0.00837708 0.00820518 0.00806141 0.00812292
0.00803041 0.00804925 0.00794697 0.008039 ]

mean value: 0.008928012847900391

key: test_mcc
value: [0.76074845 0.63412698 0.80301852 0.72329377 0.75442414 0.7468254
0.77651637 0.54374562 0.80295507 0.66701701]

mean value: 0.7212671319076379

key: train_mcc
value: [0.77922929 0.78378281 0.74804396 0.75865264 0.72015797 0.76695184
0.75429609 0.75178056 0.73914559 0.71365299]

mean value: 0.7515693753369681

key: test_accuracy
value: [0.87323944 0.81690141 0.90140845 0.85915493 0.87323944 0.87323944
0.88571429 0.77142857 0.9 0.82857143]

mean value: 0.8582897384305835

key: train_accuracy
value: [0.88818898 0.89133858 0.87244094 0.87716535 0.85826772 0.88188976
0.87578616 0.87421384 0.86792453 0.8490566 ]

mean value: 0.8736272470658148

key: test_fscore
value: [0.88607595 0.81690141 0.90410959 0.86486486 0.88 0.87323944
0.89189189 0.77777778 0.90410959 0.84210526]

mean value: 0.8641075770212132

key: train_fscore
value: [0.89258699 0.88816856 0.87782805 0.88358209 0.86526946 0.88721805
0.88084465 0.87987988 0.87387387 0.86324786]

mean value: 0.8792499459540104

key: test_precision
value: [0.81395349 0.82857143 0.89189189 0.82051282 0.825 0.86111111
0.84615385 0.75675676 0.86842105 0.7804878 ]

mean value: 0.8292860200879576

key: train_precision
value: [0.85755814 0.91333333 0.84104046 0.84090909 0.82571429 0.85014409
0.84637681 0.84195402 0.8362069 0.7890625 ]

mean value: 0.8442299635272792

key: test_recall
value: [0.97222222 0.80555556 0.91666667 0.91428571 0.94285714 0.88571429
0.94285714 0.8 0.94285714 0.91428571]

mean value: 0.9037301587301587

key: train_recall
value: [0.93059937 0.86435331 0.91798107 0.93081761 0.90880503 0.92767296
0.91823899 0.92138365 0.91509434 0.95283019]

mean value: 0.9187776521238815

key: test_roc_auc
value: [0.8718254 0.81706349 0.90119048 0.85992063 0.87420635 0.8734127
0.88571429 0.77142857 0.9 0.82857143]

mean value: 0.8583333333333333

key: train_roc_auc
value: [0.88825566 0.89129615 0.87251255 0.87708073 0.858188 0.88181755
0.87578616 0.87421384 0.86792453 0.8490566 ]

mean value: 0.8736131777870365

key: test_jcc
value: [0.79545455 0.69047619 0.825 0.76190476 0.78571429 0.775
0.80487805 0.63636364 0.825 0.72727273]

mean value: 0.7627064195966635

key: train_jcc
value: [0.80601093 0.79883382 0.78225806 0.79144385 0.76253298 0.7972973
0.78706199 0.78552279 0.776 0.7593985 ]

mean value: 0.7846360220868399

MCC on Blind test: 0.23

Accuracy on Blind test: 0.73

Model_name: Naive Bayes
Model func: BernoulliNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', BernoulliNB())])

key: fit_time
value: [0.00863671 0.00801086 0.00807619 0.00798845 0.0079236 0.007936
0.00849605 0.00788641 0.00797105 0.0078764 ]

mean value: 0.008080172538757324

key: score_time
value: [0.00854659 0.00803828 0.00822711 0.00798297 0.00817943 0.00802064
0.00861883 0.00801945 0.00802946 0.00808167]

mean value: 0.008174443244934082

key: test_mcc
value: [0.67079854 0.71825397 0.54972312 0.70470171 0.6153057 0.6473892
0.6350853 0.66701701 0.66701701 0.57166195]

mean value: 0.6446953514858611

key: train_mcc
value: [0.66188316 0.6530534 0.67906111 0.66043489 0.67030115 0.66562282
0.66739685 0.66332496 0.66391373 0.68010917]

mean value: 0.6665101243381613

key: test_accuracy
value: [0.83098592 0.85915493 0.77464789 0.84507042 0.8028169 0.81690141
0.81428571 0.82857143 0.82857143 0.78571429]

mean value: 0.818672032193159

key: train_accuracy
value: [0.82677165 0.82362205 0.83622047 0.82677165 0.83149606 0.82992126
0.83018868 0.82861635 0.82861635 0.83647799]

mean value: 0.8298702520675482

key: test_fscore
value: [0.84615385 0.86111111 0.78378378 0.85714286 0.81578947 0.83116883
0.82666667 0.84210526 0.84210526 0.78873239]

mean value: 0.8294759490393293

key: train_fscore
value: [0.83918129 0.83431953 0.84660767 0.83870968 0.84333821 0.84070796
0.84164223 0.83946981 0.83994126 0.84750733]

mean value: 0.8411424970085409

key: test_precision
value: [0.78571429 0.86111111 0.76315789 0.78571429 0.75609756 0.76190476
0.775 0.7804878 0.7804878 0.77777778]

mean value: 0.7827453287690772

key: train_precision
value: [0.78201635 0.78551532 0.79501385 0.78571429 0.7890411 0.79166667
0.78846154 0.78947368 0.78787879 0.79395604]

mean value: 0.7888737622301876

key: test_recall
value: [0.91666667 0.86111111 0.80555556 0.94285714 0.88571429 0.91428571
0.88571429 0.91428571 0.91428571 0.8 ]

mean value: 0.8840476190476191

key: train_recall
value: [0.90536278 0.88958991 0.90536278 0.89937107 0.90566038 0.89622642
0.90251572 0.89622642 0.89937107 0.90880503]

mean value: 0.900849155804218

key: test_roc_auc
value: [0.8297619 0.85912698 0.77420635 0.84642857 0.80396825 0.81825397
0.81428571 0.82857143 0.82857143 0.78571429]

mean value: 0.8188888888888889

key: train_roc_auc
value: [0.82689522 0.82372577 0.83632919 0.82665714 0.83137908 0.82981668
0.83018868 0.82861635 0.82861635 0.83647799]

mean value: 0.8298702458187013

key: test_jcc
value: [0.73333333 0.75609756 0.64444444 0.75 0.68888889 0.71111111
0.70454545 0.72727273 0.72727273 0.65116279]

mean value: 0.7094129038541971

key: train_jcc
value: [0.72292191 0.71573604 0.73401535 0.72222222 0.72911392 0.72519084
0.72658228 0.72335025 0.72405063 0.73536896]

mean value: 0.7258552408145388

MCC on Blind test: 0.25

Accuracy on Blind test: 0.7

Model_name: K-Nearest Neighbors
Model func: KNeighborsClassifier()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', KNeighborsClassifier())])

key: fit_time
value: [0.00882816 0.00808144 0.008111 0.00824952 0.00814581 0.00827622
0.00815296 0.00838614 0.00819111 0.00840569]

mean value: 0.008282804489135742

key: score_time
value: [0.0142138 0.01573038 0.01161218 0.01225567 0.01171994 0.01176977
0.0117259 0.01164389 0.01177073 0.01233864]

mean value: 0.012478089332580567

key: test_mcc
value: [0.71917468 0.64082051 0.6656213 0.80588933 0.64082051 0.71825397
0.6350853 0.69985421 0.71545476 0.71545476]

mean value: 0.6956429316837078

key: train_mcc
value: [0.83074055 0.84991022 0.80531087 0.81373251 0.8080633 0.81862293
0.83183549 0.81660412 0.81537301 0.80994411]

mean value: 0.8200137110997675

key: test_accuracy
value: [0.85915493 0.81690141 0.83098592 0.90140845 0.81690141 0.85915493
0.81428571 0.84285714 0.85714286 0.85714286]

mean value: 0.8455935613682093

key: train_accuracy
value: [0.91496063 0.92440945 0.9023622 0.90551181 0.90393701 0.90866142
0.91509434 0.9072327 0.9072327 0.90408805]

mean value: 0.9093490318427178

key: test_fscore
value: [0.86486486 0.80597015 0.84210526 0.90410959 0.82666667 0.85714286
0.82666667 0.85714286 0.86111111 0.85294118]

mean value: 0.8498721201518333

key: train_fscore
value: [0.91666667 0.92615385 0.90402477 0.90936556 0.90513219 0.91131498
0.91768293 0.91047041 0.9093702 0.90715373]

mean value: 0.9117335282395541

key: test_precision
value: [0.84210526 0.87096774 0.8 0.86842105 0.775 0.85714286
0.775 0.78571429 0.83783784 0.87878788]

mean value: 0.8290976917207817

key: train_precision
|
|
value: [0.89728097 0.9039039 0.88753799 0.875 0.89538462 0.88690476
|
|
0.89053254 0.8797654 0.88888889 0.87905605]
|
|
|
|
mean value: 0.8884255118241281
|
|
|
|
key: test_recall
|
|
value: [0.88888889 0.75 0.88888889 0.94285714 0.88571429 0.85714286
|
|
0.88571429 0.94285714 0.88571429 0.82857143]
|
|
|
|
mean value: 0.8756349206349207
|
|
|
|
key: train_recall
|
|
value: [0.93690852 0.94952681 0.92113565 0.94654088 0.91509434 0.93710692
|
|
0.94654088 0.94339623 0.93081761 0.93710692]
|
|
|
|
mean value: 0.9364174751502887
|
|
|
|
key: test_roc_auc
|
|
value: [0.85873016 0.81785714 0.83015873 0.90198413 0.81785714 0.85912698
|
|
0.81428571 0.84285714 0.85714286 0.85714286]
|
|
|
|
mean value: 0.8457142857142856
|
|
|
|
key: train_roc_auc
|
|
value: [0.91499514 0.92444894 0.90239172 0.9054471 0.90391941 0.90861655
|
|
0.91509434 0.9072327 0.9072327 0.90408805]
|
|
|
|
mean value: 0.9093466658730631
|
|
|
|
key: test_jcc
|
|
value: [0.76190476 0.675 0.72727273 0.825 0.70454545 0.75
|
|
0.70454545 0.75 0.75609756 0.74358974]
|
|
|
|
mean value: 0.7397955702833752
|
|
|
|
key: train_jcc
|
|
value: [0.84615385 0.86246418 0.82485876 0.83379501 0.82670455 0.83707865
|
|
0.84788732 0.8356546 0.83380282 0.83008357]
|
|
|
|
mean value: 0.8378483299992395
|
|
|
|
MCC on Blind test: 0.26
|
|
|
|
Accuracy on Blind test: 0.76
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01908469 0.01650476 0.01644874 0.01647878 0.01608276 0.01604009
|
|
0.0166142 0.01631188 0.01670742 0.01612663]
|
|
|
|
mean value: 0.01663999557495117
|
|
|
|
key: score_time
|
|
value: [0.00963759 0.00929117 0.00938487 0.00942397 0.00930023 0.00936937
|
|
0.00929046 0.00934219 0.00984812 0.00929856]
|
|
|
|
mean value: 0.009418654441833495
|
|
|
|
key: test_mcc
|
|
value: [0.77565853 0.83095238 0.83214239 0.88880092 0.89315217 0.88730159
|
|
0.80032673 0.80829038 0.8340361 0.74316054]
|
|
|
|
mean value: 0.8293821713525972
|
|
|
|
key: train_mcc
|
|
value: [0.88047545 0.88998365 0.88047545 0.88357096 0.88033094 0.88381426
|
|
0.88065992 0.88680999 0.88368712 0.89658557]
|
|
|
|
mean value: 0.8846393313783368
|
|
|
|
key: test_accuracy
|
|
value: [0.88732394 0.91549296 0.91549296 0.94366197 0.94366197 0.94366197
|
|
0.9 0.9 0.91428571 0.87142857]
|
|
|
|
mean value: 0.9135010060362173
|
|
|
|
key: train_accuracy
|
|
value: [0.94015748 0.94488189 0.94015748 0.94173228 0.94015748 0.94173228
|
|
0.94025157 0.94339623 0.9418239 0.94811321]
|
|
|
|
mean value: 0.9422403803298173
|
|
|
|
key: test_fscore
|
|
value: [0.89189189 0.91666667 0.91891892 0.94444444 0.94594595 0.94285714
|
|
0.90140845 0.90666667 0.91891892 0.86956522]
|
|
|
|
mean value: 0.9157284264406126
|
|
|
|
key: train_fscore
|
|
value: [0.940625 0.94539782 0.940625 0.94227769 0.94043887 0.94263566
|
|
0.94080997 0.94357367 0.94209703 0.94883721]
|
|
|
|
mean value: 0.9427317909873709
|
|
|
|
key: test_precision
|
|
value: [0.86842105 0.91666667 0.89473684 0.91891892 0.8974359 0.94285714
|
|
0.88888889 0.85 0.87179487 0.88235294]
|
|
|
|
mean value: 0.8932073222475699
|
|
|
|
key: train_precision
|
|
value: [0.93188854 0.93518519 0.93188854 0.93498452 0.9375 0.92966361
|
|
0.93209877 0.940625 0.9376947 0.93577982]
|
|
|
|
mean value: 0.9347308689650702
|
|
|
|
key: test_recall
|
|
value: [0.91666667 0.91666667 0.94444444 0.97142857 1. 0.94285714
|
|
0.91428571 0.97142857 0.97142857 0.85714286]
|
|
|
|
mean value: 0.9406349206349206
|
|
|
|
key: train_recall
|
|
value: [0.94952681 0.95583596 0.94952681 0.94968553 0.94339623 0.95597484
|
|
0.94968553 0.94654088 0.94654088 0.96226415]
|
|
|
|
mean value: 0.9508977640219828
|
|
|
|
key: test_roc_auc
|
|
value: [0.88690476 0.91547619 0.91507937 0.94404762 0.94444444 0.94365079
|
|
0.9 0.9 0.91428571 0.87142857]
|
|
|
|
mean value: 0.913531746031746
|
|
|
|
key: train_roc_auc
|
|
value: [0.94017221 0.94489911 0.94017221 0.94171974 0.94015237 0.94170982
|
|
0.94025157 0.94339623 0.9418239 0.94811321]
|
|
|
|
mean value: 0.9422410372398469
|
|
|
|
key: test_jcc
|
|
value: [0.80487805 0.84615385 0.85 0.89473684 0.8974359 0.89189189
|
|
0.82051282 0.82926829 0.85 0.76923077]
|
|
|
|
mean value: 0.8454108408793903
|
|
|
|
key: train_jcc
|
|
value: [0.8879056 0.8964497 0.8879056 0.89085546 0.88757396 0.8914956
|
|
0.88823529 0.89317507 0.89053254 0.90265487]
|
|
|
|
mean value: 0.8916783716415699
|
|
|
|
MCC on Blind test: 0.32
|
|
|
|
Accuracy on Blind test: 0.84
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
  warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
  warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
  warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
  warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
  warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
  warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
  warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
  warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
  warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
  warnings.warn(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.91041517 1.81590009 1.9913187 1.93741298 1.92132735 1.84885836
|
|
1.98798418 2.0122838 1.8185308 1.97723055]
|
|
|
|
mean value: 1.9221261978149413
|
|
|
|
key: score_time
|
|
value: [0.01374125 0.01442313 0.01120591 0.01360011 0.01358199 0.01433253
|
|
0.01412654 0.01376438 0.02158213 0.01124787]
|
|
|
|
mean value: 0.014160585403442384
|
|
|
|
key: test_mcc
|
|
value: [0.94365079 0.86205133 0.9451949 0.91587302 0.91885703 0.91580648
|
|
0.91465912 0.94285714 0.91465912 0.860309 ]
|
|
|
|
mean value: 0.9133917944367449
|
|
|
|
key: train_mcc
|
|
value: [0.99372055 0.99372055 0.99372055 1. 0.99372043 0.99372043
|
|
0.99371069 0.99686027 0.99373035 0.99373035]
|
|
|
|
mean value: 0.9946634157813034
|
|
|
|
key: test_accuracy
|
|
value: [0.97183099 0.92957746 0.97183099 0.95774648 0.95774648 0.95774648
|
|
0.95714286 0.97142857 0.95714286 0.92857143]
|
|
|
|
mean value: 0.9560764587525151
|
|
|
|
key: train_accuracy
|
|
value: [0.99685039 0.99685039 0.99685039 1. 0.99685039 0.99685039
|
|
0.99685535 0.99842767 0.99685535 0.99685535]
|
|
|
|
mean value: 0.9973245679195761
|
|
|
|
key: test_fscore
|
|
value: [0.97222222 0.93333333 0.97142857 0.95774648 0.95890411 0.95652174
|
|
0.95774648 0.97142857 0.95774648 0.92537313]
|
|
|
|
mean value: 0.9562451118080251
|
|
|
|
key: train_fscore
|
|
value: [0.99685535 0.99685535 0.99685535 1. 0.9968652 0.9968652
|
|
0.99685535 0.99843014 0.9968652 0.9968652 ]
|
|
|
|
mean value: 0.9973312339982106
|
|
|
|
key: test_precision
|
|
value: [0.97222222 0.8974359 1. 0.94444444 0.92105263 0.97058824
|
|
0.94444444 0.97142857 0.94444444 0.96875 ]
|
|
|
|
mean value: 0.953481089129309
|
|
|
|
key: train_precision
|
|
value: [0.99373041 0.99373041 0.99373041 1. 0.99375 0.99375
|
|
0.99685535 0.9968652 0.99375 0.99375 ]
|
|
|
|
mean value: 0.9949911772244238
|
|
|
|
key: test_recall
|
|
value: [0.97222222 0.97222222 0.94444444 0.97142857 1. 0.94285714
|
|
0.97142857 0.97142857 0.97142857 0.88571429]
|
|
|
|
mean value: 0.9603174603174603
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
0.99685535 1. 1. 1. ]
|
|
|
|
mean value: 0.999685534591195
|
|
|
|
key: test_roc_auc
|
|
value: [0.9718254 0.92896825 0.97222222 0.95793651 0.95833333 0.95753968
|
|
0.95714286 0.97142857 0.95714286 0.92857143]
|
|
|
|
mean value: 0.9561111111111111
|
|
|
|
key: train_roc_auc
|
|
value: [0.99685535 0.99685535 0.99685535 1. 0.99684543 0.99684543
|
|
0.99685535 0.99842767 0.99685535 0.99685535]
|
|
|
|
mean value: 0.9973250600162689
|
|
|
|
key: test_jcc
|
|
value: [0.94594595 0.875 0.94444444 0.91891892 0.92105263 0.91666667
|
|
0.91891892 0.94444444 0.91891892 0.86111111]
|
|
|
|
mean value: 0.9165422000948317
|
|
|
|
key: train_jcc
|
|
value: [0.99373041 0.99373041 0.99373041 1. 0.99375 0.99375
|
|
0.99373041 0.9968652 0.99375 0.99375 ]
|
|
|
|
mean value: 0.99467868338558
|
|
|
|
MCC on Blind test: 0.28
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01854157 0.01431727 0.01315069 0.01308203 0.01279688 0.01326585
|
|
0.01374698 0.01271224 0.01384306 0.01365948]
|
|
|
|
mean value: 0.013911604881286621
|
|
|
|
key: score_time
|
|
value: [0.01067591 0.00834036 0.00812936 0.00806141 0.00822115 0.00790977
|
|
0.00792432 0.00795054 0.00802064 0.00805449]
|
|
|
|
mean value: 0.008328795433044434
|
|
|
|
key: test_mcc
|
|
value: [0.8594125 0.97220047 0.88880092 0.86237318 0.91587302 0.83095238
|
|
0.85749293 1. 0.94440028 0.8871639 ]
|
|
|
|
mean value: 0.9018669569382461
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.92957746 0.98591549 0.94366197 0.92957746 0.95774648 0.91549296
|
|
0.92857143 1. 0.97142857 0.94285714]
|
|
|
|
mean value: 0.9504828973843058
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.93150685 0.98630137 0.94285714 0.93150685 0.95774648 0.91428571
|
|
0.92957746 1. 0.97222222 0.94117647]
|
|
|
|
mean value: 0.9507180562108437
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.91891892 0.97297297 0.97058824 0.89473684 0.94444444 0.91428571
|
|
0.91666667 1. 0.94594595 0.96969697]
|
|
|
|
mean value: 0.9448256710331013
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.94444444 1. 0.91666667 0.97142857 0.97142857 0.91428571
|
|
0.94285714 1. 1. 0.91428571]
|
|
|
|
mean value: 0.9575396825396825
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.92936508 0.98571429 0.94404762 0.93015873 0.95793651 0.91547619
|
|
0.92857143 1. 0.97142857 0.94285714]
|
|
|
|
mean value: 0.9505555555555555
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.87179487 0.97297297 0.89189189 0.87179487 0.91891892 0.84210526
|
|
0.86842105 1. 0.94594595 0.88888889]
|
|
|
|
mean value: 0.9072734677997836
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: -0.08
|
|
|
|
Accuracy on Blind test: 0.3
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.10807514 0.10659599 0.1065073 0.10630512 0.10647535 0.10710096
|
|
0.10619712 0.10621285 0.10633135 0.10575533]
|
|
|
|
mean value: 0.1065556526184082
|
|
|
|
key: score_time
|
|
value: [0.01735711 0.01735687 0.01724958 0.01735258 0.01745963 0.01735044
|
|
0.01732707 0.01728344 0.01737738 0.0173161 ]
|
|
|
|
mean value: 0.017343020439147948
|
|
|
|
key: test_mcc
|
|
value: [0.89282857 0.94365079 0.97222222 0.91587302 0.91885703 0.88880092
|
|
0.85749293 1. 0.91465912 0.82992752]
|
|
|
|
mean value: 0.9134312117371046
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.94366197 0.97183099 0.98591549 0.95774648 0.95774648 0.94366197
|
|
0.92857143 1. 0.95714286 0.91428571]
|
|
|
|
mean value: 0.956056338028169
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94736842 0.97222222 0.98591549 0.95774648 0.95890411 0.94444444
|
|
0.92957746 1. 0.95774648 0.91176471]
|
|
|
|
mean value: 0.956568981868365
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.9 0.97222222 1. 0.94444444 0.92105263 0.91891892
|
|
0.91666667 1. 0.94444444 0.93939394]
|
|
|
|
mean value: 0.9457143267669583
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.97222222 0.97222222 0.97142857 1. 0.97142857
|
|
0.94285714 1. 0.97142857 0.88571429]
|
|
|
|
mean value: 0.9687301587301587
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.94285714 0.9718254 0.98611111 0.95793651 0.95833333 0.94404762
|
|
0.92857143 1. 0.95714286 0.91428571]
|
|
|
|
mean value: 0.9561111111111111
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.9 0.94594595 0.97222222 0.91891892 0.92105263 0.89473684
|
|
0.86842105 1. 0.91891892 0.83783784]
|
|
|
|
mean value: 0.9178054370159634
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.32
|
|
|
|
Accuracy on Blind test: 0.85
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00805402 0.00804996 0.00802207 0.00799775 0.00799274 0.00810194
|
|
0.00802922 0.00801492 0.0081358 0.00799394]
|
|
|
|
mean value: 0.008039236068725586
|
|
|
|
key: score_time
|
|
value: [0.00802302 0.00803089 0.00806451 0.00799036 0.0080173 0.00841832
|
|
0.0080297 0.00801516 0.00801277 0.00795412]
|
|
|
|
mean value: 0.008055615425109863
|
|
|
|
key: test_mcc
|
|
value: [0.72329377 0.91580648 0.69292162 0.63412698 0.77460317 0.7468254
|
|
0.77142857 0.68599434 0.77651637 0.63089327]
|
|
|
|
mean value: 0.7352409975683333
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.85915493 0.95774648 0.84507042 0.81690141 0.88732394 0.87323944
|
|
0.88571429 0.84285714 0.88571429 0.81428571]
|
|
|
|
mean value: 0.8668008048289738
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.85294118 0.95890411 0.84057971 0.81690141 0.88571429 0.87323944
|
|
0.88571429 0.84507042 0.89189189 0.80597015]
|
|
|
|
mean value: 0.8656926876384385
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.90625 0.94594595 0.87878788 0.80555556 0.88571429 0.86111111
|
|
0.88571429 0.83333333 0.84615385 0.84375 ]
|
|
|
|
mean value: 0.8692316242316243
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.80555556 0.97222222 0.80555556 0.82857143 0.88571429 0.88571429
|
|
0.88571429 0.85714286 0.94285714 0.77142857]
|
|
|
|
mean value: 0.8640476190476191
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.85992063 0.95753968 0.84563492 0.81706349 0.88730159 0.8734127
|
|
0.88571429 0.84285714 0.88571429 0.81428571]
|
|
|
|
mean value: 0.8669444444444444
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.74358974 0.92105263 0.725 0.69047619 0.79487179 0.775
|
|
0.79487179 0.73170732 0.80487805 0.675 ]
|
|
|
|
mean value: 0.765644752124213
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.23
|
|
|
|
Accuracy on Blind test: 0.77
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.43684602 1.44837141 1.44293976 1.43462992 1.43130136 1.44044256
|
|
1.4325366 1.44482112 1.4325006 1.43315077]
|
|
|
|
mean value: 1.437754011154175
|
|
|
|
key: score_time
|
|
value: [0.09354448 0.09333181 0.09289074 0.09419799 0.09253526 0.09277344
|
|
0.09239817 0.09314775 0.0924108 0.09247732]
|
|
|
|
mean value: 0.09297077655792237
|
|
|
|
key: test_mcc
|
|
value: [0.94511009 1. 0.97222222 0.9451949 0.9451949 0.97220047
|
|
0.94440028 1. 0.94440028 0.8871639 ]
|
|
|
|
mean value: 0.9555887036197296
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.97183099 1. 0.98591549 0.97183099 0.97183099 0.98591549
|
|
0.97142857 1. 0.97142857 0.94285714]
|
|
|
|
mean value: 0.9773038229376257
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.97297297 1. 0.98591549 0.97222222 0.97222222 0.98550725
|
|
0.97222222 1. 0.97222222 0.94117647]
|
|
|
|
mean value: 0.9774461071784655
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.94736842 1. 1. 0.94594595 0.94594595 1.
|
|
0.94594595 1. 0.94594595 0.96969697]
|
|
|
|
mean value: 0.9700849174533385
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 0.97222222 1. 1. 0.97142857
|
|
1. 1. 1. 0.91428571]
|
|
|
|
mean value: 0.9857936507936508
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.97142857 1. 0.98611111 0.97222222 0.97222222 0.98571429
|
|
0.97142857 1. 0.97142857 0.94285714]
|
|
|
|
mean value: 0.9773412698412698
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.94736842 1. 0.97222222 0.94594595 0.94594595 0.97142857
|
|
0.94594595 1. 0.94594595 0.88888889]
|
|
|
|
mean value: 0.9563691887376098
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.24
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.87997389 0.93496108 0.96271634 0.90993977 0.95138836 0.95243716
|
|
0.92486715 0.92792821 0.93744898 0.93452334]
|
|
|
|
mean value: 0.9316184282302856
|
|
|
|
key: score_time
|
|
value: [0.26987338 0.27504158 0.23175645 0.26720405 0.27712655 0.27116489
|
|
0.22560811 0.20714259 0.17259336 0.24573827]
|
|
|
|
mean value: 0.2443249225616455
|
|
|
|
key: test_mcc
|
|
value: [0.94511009 1. 0.97222222 0.91587302 0.9451949 0.94511009
|
|
0.94440028 1. 0.94440028 0.8871639 ]
|
|
|
|
mean value: 0.9499474779757923
|
|
|
|
key: train_mcc
|
|
value: [0.96559014 0.9625117 0.9625117 0.96867592 0.96867592 0.96250874
|
|
0.96872591 0.9625688 0.96872591 0.96579568]
|
|
|
|
mean value: 0.9656290400102543
|
|
|
|
key: test_accuracy
|
|
value: [0.97183099 1. 0.98591549 0.95774648 0.97183099 0.97183099
|
|
0.97142857 1. 0.97142857 0.94285714]
|
|
|
|
mean value: 0.974486921529175
|
|
|
|
key: train_accuracy
|
|
value: [0.98267717 0.98110236 0.98110236 0.98425197 0.98425197 0.98110236
|
|
0.98427673 0.98113208 0.98427673 0.9827044 ]
|
|
|
|
mean value: 0.9826878126083296
|
|
|
|
key: test_fscore
|
|
value: [0.97297297 1. 0.98591549 0.95774648 0.97222222 0.97058824
|
|
0.97222222 1. 0.97222222 0.94117647]
|
|
|
|
mean value: 0.9745066317352978
|
|
|
|
key: train_fscore
|
|
value: [0.98283931 0.98130841 0.98130841 0.98442368 0.98442368 0.98136646
|
|
0.98442368 0.98136646 0.98442368 0.98294574]
|
|
|
|
mean value: 0.9828829495741059
|
|
|
|
key: test_precision
|
|
value: [0.94736842 1. 1. 0.94444444 0.94594595 1.
|
|
0.94594595 1. 0.94594595 0.96969697]
|
|
|
|
mean value: 0.9699347673031884
|
|
|
|
key: train_precision
|
|
value: [0.97222222 0.96923077 0.96923077 0.97530864 0.97530864 0.96932515
|
|
0.97530864 0.96932515 0.97530864 0.96941896]
|
|
|
|
mean value: 0.971998759557811
|
|
|
|
key: test_recall
|
|
value: [1. 1. 0.97222222 0.97142857 1. 0.94285714
|
|
1. 1. 1. 0.91428571]
|
|
|
|
mean value: 0.9800793650793651
|
|
|
|
key: train_recall
|
|
value: [0.99369085 0.99369085 0.99369085 0.99371069 0.99371069 0.99371069
|
|
0.99371069 0.99371069 0.99371069 0.99685535]
|
|
|
|
mean value: 0.9940192052060394
|
|
|
|
key: test_roc_auc
|
|
value: [0.97142857 1. 0.98611111 0.95793651 0.97222222 0.97142857
|
|
0.97142857 1. 0.97142857 0.94285714]
|
|
|
|
mean value: 0.974484126984127
|
|
|
|
key: train_roc_auc
|
|
value: [0.98269448 0.98112216 0.98112216 0.98423705 0.98423705 0.98108248
|
|
0.98427673 0.98113208 0.98427673 0.9827044 ]
|
|
|
|
mean value: 0.9826885304446165
|
|
|
|
key: test_jcc
|
|
value: [0.94736842 1. 0.97222222 0.91891892 0.94594595 0.94285714
|
|
0.94594595 1. 0.94594595 0.88888889]
|
|
|
|
mean value: 0.9508093431777642
|
|
|
|
key: train_jcc
|
|
value: [0.96625767 0.96330275 0.96330275 0.96932515 0.96932515 0.96341463
|
|
0.96932515 0.96341463 0.96932515 0.96646341]
|
|
|
|
mean value: 0.9663456469722573
|
|
|
|
MCC on Blind test: 0.24
|
|
|
|
Accuracy on Blind test: 0.76
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02038693 0.0088098 0.00817513 0.00887132 0.0087924 0.0088861
|
|
0.00894165 0.00874758 0.00866318 0.0089376 ]
|
|
|
|
mean value: 0.00992116928100586
|
|
|
|
key: score_time
|
|
value: [0.01118541 0.00879335 0.00893068 0.00872803 0.00877929 0.00873256
|
|
0.00876904 0.00859785 0.00878286 0.00883555]
|
|
|
|
mean value: 0.00901346206665039
|
|
|
|
key: test_mcc
|
|
value: [0.67079854 0.71825397 0.54972312 0.70470171 0.6153057 0.6473892
|
|
0.6350853 0.66701701 0.66701701 0.57166195]
|
|
|
|
mean value: 0.6446953514858611
|
|
|
|
key: train_mcc
|
|
value: [0.66188316 0.6530534 0.67906111 0.66043489 0.67030115 0.66562282
|
|
0.66739685 0.66332496 0.66391373 0.68010917]
|
|
|
|
mean value: 0.6665101243381613
|
|
|
|
key: test_accuracy
|
|
value: [0.83098592 0.85915493 0.77464789 0.84507042 0.8028169 0.81690141
|
|
0.81428571 0.82857143 0.82857143 0.78571429]
|
|
|
|
mean value: 0.818672032193159
|
|
|
|
key: train_accuracy
|
|
value: [0.82677165 0.82362205 0.83622047 0.82677165 0.83149606 0.82992126
|
|
0.83018868 0.82861635 0.82861635 0.83647799]
|
|
|
|
mean value: 0.8298702520675482
|
|
|
|
key: test_fscore
|
|
value: [0.84615385 0.86111111 0.78378378 0.85714286 0.81578947 0.83116883
|
|
0.82666667 0.84210526 0.84210526 0.78873239]
|
|
|
|
mean value: 0.8294759490393293
|
|
|
|
key: train_fscore
|
|
value: [0.83918129 0.83431953 0.84660767 0.83870968 0.84333821 0.84070796
|
|
0.84164223 0.83946981 0.83994126 0.84750733]
|
|
|
|
mean value: 0.8411424970085409
|
|
|
|
key: test_precision
|
|
value: [0.78571429 0.86111111 0.76315789 0.78571429 0.75609756 0.76190476
|
|
0.775 0.7804878 0.7804878 0.77777778]
|
|
|
|
mean value: 0.7827453287690772
|
|
|
|
key: train_precision
|
|
value: [0.78201635 0.78551532 0.79501385 0.78571429 0.7890411 0.79166667
|
|
0.78846154 0.78947368 0.78787879 0.79395604]
|
|
|
|
mean value: 0.7888737622301876
|
|
|
|
key: test_recall
|
|
value: [0.91666667 0.86111111 0.80555556 0.94285714 0.88571429 0.91428571
|
|
0.88571429 0.91428571 0.91428571 0.8 ]
|
|
|
|
mean value: 0.8840476190476191
|
|
|
|
key: train_recall
|
|
value: [0.90536278 0.88958991 0.90536278 0.89937107 0.90566038 0.89622642
|
|
0.90251572 0.89622642 0.89937107 0.90880503]
|
|
|
|
mean value: 0.900849155804218
|
|
|
|
key: test_roc_auc
|
|
value: [0.8297619 0.85912698 0.77420635 0.84642857 0.80396825 0.81825397
|
|
0.81428571 0.82857143 0.82857143 0.78571429]
|
|
|
|
mean value: 0.8188888888888889
|
|
|
|
key: train_roc_auc
|
|
value: [0.82689522 0.82372577 0.83632919 0.82665714 0.83137908 0.82981668
|
|
0.83018868 0.82861635 0.82861635 0.83647799]
|
|
|
|
mean value: 0.8298702458187013
|
|
|
|
key: test_jcc
|
|
value: [0.73333333 0.75609756 0.64444444 0.75 0.68888889 0.71111111
|
|
0.70454545 0.72727273 0.72727273 0.65116279]
|
|
|
|
mean value: 0.7094129038541971
|
|
|
|
key: train_jcc
|
|
value: [0.72292191 0.71573604 0.73401535 0.72222222 0.72911392 0.72519084
|
|
0.72658228 0.72335025 0.72405063 0.73536896]
|
|
|
|
mean value: 0.7258552408145388
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.7
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.08077264 0.05656648 0.05679369 0.05571246 0.06284809 0.0810225
|
|
0.06032395 0.06238079 0.06300926 0.22280788]
|
|
|
|
mean value: 0.08022377490997315
|
|
|
|
key: score_time
|
|
value: [0.01013517 0.00978065 0.00984907 0.01045036 0.00996995 0.01013207
|
|
0.00967622 0.00966024 0.00964713 0.01016784]
|
|
|
|
mean value: 0.009946870803833007
|
|
|
|
key: test_mcc
|
|
value: [0.94511009 1. 0.9451949 0.9451949 0.91885703 0.94511009
|
|
0.94440028 1. 0.94440028 0.8871639 ]
|
|
|
|
mean value: 0.9475431470899147
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.97183099 1. 0.97183099 0.97183099 0.95774648 0.97183099
|
|
0.97142857 1. 0.97142857 0.94285714]
|
|
|
|
mean value: 0.9730784708249497
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.97297297 1. 0.97142857 0.97222222 0.95890411 0.97058824
|
|
0.97222222 1. 0.97222222 0.94117647]
|
|
|
|
mean value: 0.9731737026539605
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.94736842 1. 1. 0.94594595 0.92105263 1.
|
|
0.94594595 1. 0.94594595 0.96969697]
|
|
|
|
mean value: 0.9675955860166386
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 0.94444444 1. 1. 0.94285714
|
|
1. 1. 1. 0.91428571]
|
|
|
|
mean value: 0.9801587301587301
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.97142857 1. 0.97222222 0.97222222 0.95833333 0.97142857
|
|
0.97142857 1. 0.97142857 0.94285714]
|
|
|
|
mean value: 0.9731349206349206
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.94736842 1. 0.94444444 0.94594595 0.92105263 0.94285714
|
|
0.94594595 1. 0.94594595 0.88888889]
|
|
|
|
mean value: 0.9482449366659893
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.01
|
|
|
|
Accuracy on Blind test: 0.32
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LinearDiscriminantAnalysis())])

key: fit_time
value: [0.01711631 0.05170107 0.04450679 0.04490185 0.04405975 0.04858232
 0.04450536 0.03757882 0.04440331 0.04462457]

mean value: 0.042198014259338376

key: score_time
value: [0.01043272 0.01453161 0.01896262 0.01934981 0.01965714 0.01737046
 0.01697755 0.01947474 0.02058792 0.01093698]

mean value: 0.016828155517578124

key: test_mcc
value: [0.94365079 0.89282857 1. 0.88730159 0.91587302 0.9186708
 0.91465912 0.97182532 0.88571429 0.82857143]

mean value: 0.9159094921471266

key: train_mcc
value: [0.93700772 0.93070849 0.92759921 0.94962452 0.94016229 0.9433251
 0.94341489 0.92771424 0.94025622 0.94341489]

mean value: 0.93832275544477

key: test_accuracy
value: [0.97183099 0.94366197 1. 0.94366197 0.95774648 0.95774648
 0.95714286 0.98571429 0.94285714 0.91428571]

mean value: 0.9574647887323944

key: train_accuracy
value: [0.96850394 0.96535433 0.96377953 0.97480315 0.97007874 0.97165354
 0.97169811 0.96383648 0.97012579 0.97169811]

mean value: 0.9691531718912494

key: test_fscore
value: [0.97222222 0.94736842 1. 0.94285714 0.95774648 0.95522388
 0.95774648 0.98591549 0.94285714 0.91428571]

mean value: 0.9576222974576094

key: train_fscore
value: [0.96845426 0.96529968 0.96354992 0.97492163 0.97007874 0.97178683
 0.97160883 0.96366509 0.97007874 0.97178683]

mean value: 0.9691230561794373

key: test_precision
value: [0.97222222 0.9 1. 0.94285714 0.94444444 1.
 0.94444444 0.97222222 0.94285714 0.91428571]

mean value: 0.9533333333333334

key: train_precision
value: [0.96845426 0.96529968 0.96815287 0.971875 0.97160883 0.96875
 0.97468354 0.96825397 0.97160883 0.96875 ]

mean value: 0.9697436987632612

key: test_recall
value: [0.97222222 1. 1. 0.94285714 0.97142857 0.91428571
 0.97142857 1. 0.94285714 0.91428571]

mean value: 0.962936507936508

key: train_recall
value: [0.96845426 0.96529968 0.95899054 0.97798742 0.96855346 0.97484277
 0.96855346 0.9591195 0.96855346 0.97484277]

mean value: 0.9685197309683947

key: test_roc_auc
value: [0.9718254 0.94285714 1. 0.94365079 0.95793651 0.95714286
 0.95714286 0.98571429 0.94285714 0.91428571]

mean value: 0.9573412698412698

key: train_roc_auc
value: [0.96850386 0.96535424 0.963772 0.97479813 0.97008115 0.97164851
 0.97169811 0.96383648 0.97012579 0.97169811]

mean value: 0.9691516377993373

key: test_jcc
value: [0.94594595 0.9 1. 0.89189189 0.91891892 0.91428571
 0.91891892 0.97222222 0.89189189 0.84210526]

mean value: 0.9196180767233398

key: train_jcc
value: [0.93883792 0.93292683 0.92966361 0.95107034 0.94189602 0.94512195
 0.94478528 0.92987805 0.94189602 0.94512195]

mean value: 0.9401197970934513

MCC on Blind test: 0.3

Accuracy on Blind test: 0.83

Model_name: Multinomial
Model func: MultinomialNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MultinomialNB())])

key: fit_time
value: [0.01068687 0.0080061 0.00796509 0.00772619 0.00784731 0.00785708
 0.00791907 0.00809073 0.00793958 0.00783396]

mean value: 0.008187198638916015

key: score_time
value: [0.00917411 0.00820422 0.00817823 0.00810766 0.00799704 0.00803733
 0.00800204 0.00832915 0.0080092 0.00808311]

mean value: 0.00821220874786377

key: test_mcc
value: [0.72811105 0.6656213 0.66269083 0.69762232 0.69762232 0.64082051
 0.57735027 0.6614769 0.71899664 0.68599434]

mean value: 0.6736306482828506

key: train_mcc
value: [0.68565341 0.6963999 0.68905264 0.68994047 0.69528523 0.69577133
 0.69581242 0.68997016 0.69048219 0.69506299]

mean value: 0.692343074334257

key: test_accuracy
value: [0.85915493 0.83098592 0.83098592 0.84507042 0.84507042 0.81690141
 0.78571429 0.82857143 0.85714286 0.84285714]

mean value: 0.8342454728370221

key: train_accuracy
value: [0.84094488 0.84566929 0.84251969 0.84251969 0.84566929 0.84566929
 0.84591195 0.8427673 0.8427673 0.84433962]

mean value: 0.8438778289506265

key: test_fscore
value: [0.87179487 0.84210526 0.83783784 0.85333333 0.85333333 0.82666667
 0.8 0.83783784 0.86486486 0.84057971]

mean value: 0.8428353718971567

key: train_fscore
value: [0.84857571 0.85416667 0.8502994 0.85163205 0.85373134 0.85416667
 0.85373134 0.85119048 0.85163205 0.85419735]

mean value: 0.8523323053430707

key: test_precision
value: [0.80952381 0.8 0.81578947 0.8 0.8 0.775
 0.75 0.79487179 0.82051282 0.85294118]

mean value: 0.8018639075063224

key: train_precision
value: [0.80857143 0.8084507 0.80911681 0.80617978 0.8125 0.81073446
 0.8125 0.8079096 0.80617978 0.8033241 ]

mean value: 0.808546665999499

key: test_recall
value: [0.94444444 0.88888889 0.86111111 0.91428571 0.91428571 0.88571429
 0.85714286 0.88571429 0.91428571 0.82857143]

mean value: 0.8894444444444445

key: train_recall
value: [0.89274448 0.90536278 0.89589905 0.90251572 0.89937107 0.90251572
 0.89937107 0.89937107 0.90251572 0.91194969]

mean value: 0.9011616372041347

key: test_roc_auc
value: [0.85793651 0.83015873 0.83055556 0.84603175 0.84603175 0.81785714
 0.78571429 0.82857143 0.85714286 0.84285714]

mean value: 0.8342857142857143

key: train_roc_auc
value: [0.84102633 0.84576315 0.84260361 0.84242505 0.84558459 0.84557963
 0.84591195 0.8427673 0.8427673 0.84433962]

mean value: 0.8438768525682995

key: test_jcc
value: [0.77272727 0.72727273 0.72093023 0.74418605 0.74418605 0.70454545
 0.66666667 0.72093023 0.76190476 0.725 ]

mean value: 0.7288349441256418

key: train_jcc
value: [0.73697917 0.74545455 0.73958333 0.74160207 0.74479167 0.74545455
 0.74479167 0.74093264 0.74160207 0.74550129]

mean value: 0.742669298644344

MCC on Blind test: 0.32

Accuracy on Blind test: 0.8

Model_name: Passive Aggresive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.00993681 0.01531029 0.01399684 0.01523709 0.01411676 0.01322913
 0.01689434 0.0154171 0.01349378 0.01362562]

mean value: 0.014125776290893555

key: score_time
value: [0.00818229 0.0100224 0.01002216 0.01056504 0.0105257 0.01053262
 0.01064253 0.01050687 0.0105536 0.01051474]

mean value: 0.010206794738769532

key: test_mcc
value: [0.8365327 0.91580648 0.89315217 0.91885703 0.83214239 0.9186708
 0.91766294 1. 0.80032673 0.80829038]

mean value: 0.8841441619108454

key: train_mcc
value: [0.87618527 0.93389881 0.91532447 0.91736146 0.75092152 0.93072764
 0.92900139 0.92149756 0.93418862 0.94654556]

mean value: 0.9055652313449004

key: test_accuracy
value: [0.91549296 0.95774648 0.94366197 0.95774648 0.91549296 0.95774648
 0.95714286 1. 0.9 0.9 ]

mean value: 0.9405030181086519

key: train_accuracy
value: [0.93543307 0.96692913 0.95748031 0.95748031 0.86299213 0.96535433
 0.96383648 0.96069182 0.96698113 0.97327044]

mean value: 0.9510449165552419

key: test_fscore
value: [0.91176471 0.95890411 0.94117647 0.95890411 0.91176471 0.95522388
 0.95890411 1. 0.89855072 0.89230769]

mean value: 0.9387500508662453

key: train_fscore
value: [0.93155259 0.96671949 0.9568 0.95902883 0.84324324 0.96529968
 0.96477795 0.96099844 0.96661367 0.9733124 ]

mean value: 0.9488346302113415

key: test_precision
value: [0.96875 0.94594595 1. 0.92105263 0.93939394 1.
 0.92105263 1. 0.91176471 0.96666667]

mean value: 0.95746265210468

key: train_precision
value: [0.9893617 0.97133758 0.97077922 0.92668622 0.98734177 0.96835443
 0.94029851 0.95356037 0.97749196 0.97178683]

mean value: 0.9656998596315463

key: test_recall
value: [0.86111111 0.97222222 0.88888889 1. 0.88571429 0.91428571
 1. 1. 0.88571429 0.82857143]

mean value: 0.9236507936507936

key: train_recall
value: [0.88012618 0.96214511 0.94321767 0.99371069 0.73584906 0.96226415
 0.99056604 0.96855346 0.95597484 0.97484277]

mean value: 0.9367249965279845

key: test_roc_auc
value: [0.91626984 0.95753968 0.94444444 0.95833333 0.91507937 0.95714286
 0.95714286 1. 0.9 0.9 ]

mean value: 0.940595238095238

key: train_roc_auc
value: [0.93534611 0.96692161 0.95745789 0.95742317 0.86319267 0.9653592
 0.96383648 0.96069182 0.96698113 0.97327044]

mean value: 0.951048052695276

key: test_jcc
value: [0.83783784 0.92105263 0.88888889 0.92105263 0.83783784 0.91428571
 0.92105263 1. 0.81578947 0.80555556]

mean value: 0.8863353202826887

key: train_jcc
value: [0.871875 0.93558282 0.91717791 0.9212828 0.72897196 0.93292683
 0.93195266 0.92492492 0.93538462 0.94801223]

mean value: 0.9048091762362589

MCC on Blind test: 0.25

Accuracy on Blind test: 0.82

Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01513577 0.0140748 0.01494813 0.01315999 0.01602697 0.01350999
 0.01594853 0.01357126 0.01522708 0.01573467]

mean value: 0.014733719825744628

key: score_time
value: [0.0106318 0.01056147 0.01055241 0.01054549 0.01066828 0.01055503
 0.01060414 0.01057744 0.01061797 0.02540255]

mean value: 0.012071657180786132

key: test_mcc
value: [0.85952381 0.88880092 0.91885703 0.91587302 0.91885703 0.88862624
 0.91465912 0.97182532 0.80295507 0.85749293]

mean value: 0.8937470474465333

key: train_mcc
value: [0.94990974 0.90065217 0.92530412 0.95620727 0.95298581 0.9401617
 0.9311123 0.93081761 0.88444772 0.94985462]

mean value: 0.9321453053914295

key: test_accuracy
value: [0.92957746 0.94366197 0.95774648 0.95774648 0.95774648 0.94366197
 0.95714286 0.98571429 0.9 0.92857143]

mean value: 0.9461569416498994

key: train_accuracy
value: [0.97480315 0.9496063 0.96220472 0.97795276 0.97637795 0.97007874
 0.96540881 0.96540881 0.94025157 0.97484277]

mean value: 0.9656935571732779

key: test_fscore
value: [0.92957746 0.94285714 0.95652174 0.95774648 0.95890411 0.94117647
 0.95774648 0.98591549 0.89552239 0.92957746]

mean value: 0.9455545230506246

key: train_fscore
value: [0.97507788 0.94805195 0.96129032 0.97826087 0.97667185 0.97017268
 0.96496815 0.96540881 0.93729373 0.97507788]

mean value: 0.9652274125866556

key: test_precision
value: [0.94285714 0.97058824 1. 0.94444444 0.92105263 0.96969697
 0.94444444 0.97222222 0.9375 0.91666667]

mean value: 0.9519472757204955

key: train_precision
value: [0.96307692 0.97658863 0.98349835 0.96625767 0.96615385 0.96865204
 0.97741935 0.96540881 0.98611111 0.96604938]

mean value: 0.9719216107854822

key: test_recall
value: [0.91666667 0.91666667 0.91666667 0.97142857 1. 0.91428571
 0.97142857 1. 0.85714286 0.94285714]

mean value: 0.9407142857142857

key: train_recall
value: [0.9873817 0.92113565 0.94006309 0.99056604 0.98742138 0.97169811
 0.95283019 0.96540881 0.89308176 0.98427673]

mean value: 0.9593863460508303

key: test_roc_auc
value: [0.9297619 0.94404762 0.95833333 0.95793651 0.95833333 0.94325397
 0.95714286 0.98571429 0.9 0.92857143]

mean value: 0.9463095238095238

key: train_roc_auc
value: [0.97482293 0.94956153 0.96216991 0.97793286 0.97636053 0.97007619
 0.96540881 0.96540881 0.94025157 0.97484277]

mean value: 0.9656835902624844

key: test_jcc
value: [0.86842105 0.89189189 0.91666667 0.91891892 0.92105263 0.88888889
 0.91891892 0.97222222 0.81081081 0.86842105]

mean value: 0.8976213055160424

key: train_jcc
value: [0.95136778 0.90123457 0.92546584 0.95744681 0.95440729 0.94207317
 0.93230769 0.9331307 0.88198758 0.95136778]

mean value: 0.9330789211831344

MCC on Blind test: 0.16

Accuracy on Blind test: 0.65

Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', AdaBoostClassifier(random_state=42))])

key: fit_time
value: [0.12434745 0.10765958 0.10760546 0.10737205 0.10747123 0.1078012
 0.10726714 0.10751939 0.11066079 0.10946774]

mean value: 0.10971720218658447

key: score_time
value: [0.01442289 0.01422215 0.0144012 0.01428294 0.01433635 0.01430273
 0.01422977 0.0143826 0.01571679 0.01569033]

mean value: 0.014598774909973144

key: test_mcc
value: [0.91580648 1. 0.91885703 0.9451949 0.91587302 0.94511009
 0.91766294 0.97182532 0.94440028 0.8871639 ]

mean value: 0.9361893953436148

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.95774648 1. 0.95774648 0.97183099 0.95774648 0.97183099
 0.95714286 0.98571429 0.97142857 0.94285714]

mean value: 0.9674044265593561

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.95890411 1. 0.95652174 0.97222222 0.95774648 0.97058824
 0.95890411 0.98550725 0.97222222 0.94117647]

mean value: 0.9673792833885365

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.94594595 1. 1. 0.94594595 0.94444444 1.
 0.92105263 1. 0.94594595 0.96969697]

mean value: 0.96730318835582

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.97222222 1. 0.91666667 1. 0.97142857 0.94285714
 1. 0.97142857 1. 0.91428571]

mean value: 0.9688888888888889

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.95753968 1. 0.95833333 0.97222222 0.95793651 0.97142857
 0.95714286 0.98571429 0.97142857 0.94285714]

mean value: 0.9674603174603175

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.92105263 1. 0.91666667 0.94594595 0.91891892 0.94285714
 0.92105263 0.97142857 0.94594595 0.88888889]

mean value: 0.9372757343809975

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.11

Accuracy on Blind test: 0.65

Model_name: Bagging Classifier
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
  warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
  oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03845501 0.04855371 0.05894065 0.03629017 0.03888655 0.0367496
|
|
0.03712749 0.03992295 0.06053734 0.04424524]
|
|
|
|
mean value: 0.04397087097167969
|
|
|
|
key: score_time
|
|
value: [0.02625251 0.03578377 0.03374958 0.03522325 0.01706123 0.01737881
|
|
0.02301645 0.02451253 0.030761 0.02602267]
|
|
|
|
mean value: 0.026976180076599122
|
|
|
|
key: test_mcc
|
|
value: [0.94365079 0.97220047 0.91587302 0.91587302 0.9451949 0.94365079
|
|
0.91766294 1. 0.94440028 0.8871639 ]
|
|
|
|
mean value: 0.9385670099602731
|
|
|
|
key: train_mcc
|
|
value: [0.99685535 0.99055612 0.99370077 0.98425689 0.99685531 0.98425673
|
|
0.99373035 0.99057094 0.99371069 0.99061012]
|
|
|
|
mean value: 0.9915103266748221
|
|
|
|
key: test_accuracy
|
|
value: [0.97183099 0.98591549 0.95774648 0.95774648 0.97183099 0.97183099
|
|
0.95714286 1. 0.97142857 0.94285714]
|
|
|
|
mean value: 0.9688329979879275
|
|
|
|
key: train_accuracy
|
|
value: [0.9984252 0.99527559 0.99685039 0.99212598 0.9984252 0.99212598
|
|
0.99685535 0.99528302 0.99685535 0.99528302]
|
|
|
|
mean value: 0.9957505076016441
|
|
|
|
key: test_fscore
|
|
value: [0.97222222 0.98630137 0.95774648 0.95774648 0.97222222 0.97142857
|
|
0.95890411 1. 0.97222222 0.94117647]
|
|
|
|
mean value: 0.9689970145882008
|
|
|
|
key: train_fscore
|
|
value: [0.9984252 0.99527559 0.99684543 0.99212598 0.99843014 0.99215071
|
|
0.99684543 0.99529042 0.99685535 0.99530516]
|
|
|
|
mean value: 0.9957549405205315
|
|
|
|
key: test_precision
|
|
value: [0.97222222 0.97297297 0.97142857 0.94444444 0.94594595 0.97142857
|
|
0.92105263 1. 0.94594595 0.96969697]
|
|
|
|
mean value: 0.9615138275664592
|
|
|
|
key: train_precision
|
|
value: [0.99685535 0.99371069 0.99684543 0.99369085 0.9968652 0.99059561
|
|
1. 0.99373041 0.99685535 0.99065421]
|
|
|
|
mean value: 0.9949803089428332
|
|
|
|
key: test_recall
|
|
value: [0.97222222 1. 0.94444444 0.97142857 1. 0.97142857
|
|
1. 1. 1. 0.91428571]
|
|
|
|
mean value: 0.9773809523809524
|
|
|
|
key: train_recall
|
|
value: [1. 0.99684543 0.99684543 0.99056604 1. 0.99371069
|
|
0.99371069 0.99685535 0.99685535 1. ]
|
|
|
|
mean value: 0.9965388964942563
|
|
|
|
key: test_roc_auc
|
|
value: [0.9718254 0.98571429 0.95793651 0.95793651 0.97222222 0.9718254
|
|
0.95714286 1. 0.97142857 0.94285714]
|
|
|
|
mean value: 0.9688888888888889
|
|
|
|
key: train_roc_auc
|
|
value: [0.99842767 0.99527806 0.99685039 0.99212844 0.99842271 0.99212348
|
|
0.99685535 0.99528302 0.99685535 0.99528302]
|
|
|
|
mean value: 0.9957507489633554
|
|
|
|
key: test_jcc
|
|
value: [0.94594595 0.97297297 0.91891892 0.91891892 0.94594595 0.94444444
|
|
0.92105263 1. 0.94594595 0.88888889]
|
|
|
|
mean value: 0.940303461356093
|
|
|
|
key: train_jcc
|
|
value: [0.99685535 0.99059561 0.99371069 0.984375 0.9968652 0.98442368
|
|
0.99371069 0.990625 0.99373041 0.99065421]
|
|
|
|
mean value: 0.9915545833750219
|
|
|
|
MCC on Blind test: 0.0
|
|
|
|
Accuracy on Blind test: 0.38
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.25946164 0.30284429 0.27964544 0.27268171 0.31926012 0.17955041
|
|
0.16414642 0.25722194 0.18691826 0.15706849]
|
|
|
|
mean value: 0.23787987232208252
|
|
|
|
key: score_time
|
|
value: [0.02183843 0.0219655 0.02176881 0.02186322 0.02178836 0.01367235
|
|
0.02136087 0.01341414 0.01887727 0.02596784]
|
|
|
|
mean value: 0.02025167942047119
|
|
|
|
key: test_mcc
|
|
value: [0.77565853 0.66190476 0.74662454 0.88880092 0.86802778 0.83240693
|
|
0.6882472 0.81649658 0.82992752 0.65714286]
|
|
|
|
mean value: 0.7765237623635132
|
|
|
|
key: train_mcc
|
|
value: [0.87752313 0.88987659 0.86815344 0.87737406 0.88046834 0.88367504
|
|
0.88078191 0.88078191 0.87771008 0.89068168]
|
|
|
|
mean value: 0.8807026177428205
|
|
|
|
key: test_accuracy
|
|
value: [0.88732394 0.83098592 0.87323944 0.94366197 0.92957746 0.91549296
|
|
0.84285714 0.9 0.91428571 0.82857143]
|
|
|
|
mean value: 0.8865995975855131
|
|
|
|
key: train_accuracy
|
|
value: [0.93858268 0.94488189 0.93385827 0.93858268 0.94015748 0.94173228
|
|
0.94025157 0.94025157 0.93867925 0.94496855]
|
|
|
|
mean value: 0.9401946218986778
|
|
|
|
key: test_fscore
|
|
value: [0.89189189 0.83333333 0.87671233 0.94444444 0.93333333 0.91666667
|
|
0.84931507 0.90909091 0.91666667 0.82857143]
|
|
|
|
mean value: 0.8900026071258949
|
|
|
|
key: train_fscore
|
|
value: [0.93934681 0.94522692 0.93478261 0.93934681 0.94080997 0.94245723
|
|
0.94099379 0.94099379 0.93953488 0.94607088]
|
|
|
|
mean value: 0.9409563689601331
|
|
|
|
key: test_precision
|
|
value: [0.86842105 0.83333333 0.86486486 0.91891892 0.875 0.89189189
|
|
0.81578947 0.83333333 0.89189189 0.82857143]
|
|
|
|
mean value: 0.8622016189121453
|
|
|
|
key: train_precision
|
|
value: [0.92638037 0.9378882 0.9204893 0.92923077 0.93209877 0.93230769
|
|
0.92944785 0.92944785 0.9266055 0.92749245]
|
|
|
|
mean value: 0.9291388747701107
|
|
|
|
key: test_recall
|
|
value: [0.91666667 0.83333333 0.88888889 0.97142857 1. 0.94285714
|
|
0.88571429 1. 0.94285714 0.82857143]
|
|
|
|
mean value: 0.921031746031746
|
|
|
|
key: train_recall
|
|
value: [0.95268139 0.95268139 0.94952681 0.94968553 0.94968553 0.95283019
|
|
0.95283019 0.95283019 0.95283019 0.96540881]
|
|
|
|
mean value: 0.9530990218836181
|
|
|
|
key: test_roc_auc
|
|
value: [0.88690476 0.83095238 0.87301587 0.94404762 0.93055556 0.91587302
|
|
0.84285714 0.9 0.91428571 0.82857143]
|
|
|
|
mean value: 0.8867063492063492
|
|
|
|
key: train_roc_auc
|
|
value: [0.93860484 0.94489415 0.9338829 0.93856516 0.94014245 0.94171478
|
|
0.94025157 0.94025157 0.93867925 0.94496855]
|
|
|
|
mean value: 0.9401955240759479
|
|
|
|
key: test_jcc
|
|
value: [0.80487805 0.71428571 0.7804878 0.89473684 0.875 0.84615385
|
|
0.73809524 0.83333333 0.84615385 0.70731707]
|
|
|
|
mean value: 0.8040441746956509
|
|
|
|
key: train_jcc
|
|
value: [0.8856305 0.89614243 0.87755102 0.8856305 0.88823529 0.89117647
|
|
0.88856305 0.88856305 0.88596491 0.89766082]
|
|
|
|
mean value: 0.8885118046116812
|
|
|
|
MCC on Blind test: 0.36
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.31332707 0.30789495 0.30552244 0.30341172 0.30553842 0.30712295
|
|
0.30548239 0.31058264 0.30930758 0.30872202]
|
|
|
|
mean value: 0.30769121646881104
|
|
|
|
key: score_time
|
|
value: [0.00987029 0.0085175 0.00868106 0.00901246 0.00944805 0.00861239
|
|
0.00868559 0.00857615 0.00932789 0.00859332]
|
|
|
|
mean value: 0.00893247127532959
|
|
|
|
key: test_mcc
|
|
value: [0.94511009 0.97220047 0.91885703 0.88880092 0.9451949 0.94511009
|
|
0.91766294 1. 0.94440028 0.860309 ]
|
|
|
|
mean value: 0.9337645710169541
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.97183099 0.98591549 0.95774648 0.94366197 0.97183099 0.97183099
|
|
0.95714286 1. 0.97142857 0.92857143]
|
|
|
|
mean value: 0.9659959758551308
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.97297297 0.98630137 0.95652174 0.94444444 0.97222222 0.97058824
|
|
0.95890411 1. 0.97222222 0.92537313]
|
|
|
|
mean value: 0.9659550450066827
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.94736842 0.97297297 1. 0.91891892 0.94594595 1.
|
|
0.92105263 1. 0.94594595 0.96875 ]
|
|
|
|
mean value: 0.9620954836415363
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 0.91666667 0.97142857 1. 0.94285714
|
|
1. 1. 1. 0.88571429]
|
|
|
|
mean value: 0.9716666666666667
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.97142857 0.98571429 0.95833333 0.94404762 0.97222222 0.97142857
|
|
0.95714286 1. 0.97142857 0.92857143]
|
|
|
|
mean value: 0.966031746031746
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.94736842 0.97297297 0.91666667 0.89473684 0.94594595 0.94285714
|
|
0.92105263 1. 0.94594595 0.86111111]
|
|
|
|
mean value: 0.9348657680236627
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: -0.04
|
|
|
|
Accuracy on Blind test: 0.34
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.01238108 0.0148325 0.0149672 0.01481032 0.03123617 0.01486897
|
|
0.0147841 0.01480579 0.01492763 0.02490044]
|
|
|
|
mean value: 0.017251420021057128
|
|
|
|
key: score_time
|
|
value: [0.01073623 0.01106596 0.01105785 0.01112366 0.01126671 0.01110649
|
|
0.01109028 0.01363516 0.01109719 0.01138449]
|
|
|
|
mean value: 0.011356401443481445
|
|
|
|
key: test_mcc
|
|
value: [0.66068747 0.25082639 0.61746548 0.62970191 0.57247871 0.45738492
|
|
0.69954392 0.61036794 0.63245553 0.71899664]
|
|
|
|
mean value: 0.5849908918897713
|
|
|
|
key: train_mcc
|
|
value: [0.62752065 0.59587004 0.6984327 0.7396315 0.63814678 0.6036901
|
|
0.66988593 0.6066425 0.61612462 0.91194969]
|
|
|
|
mean value: 0.6707894510450889
|
|
|
|
key: test_accuracy
|
|
value: [0.8028169 0.6056338 0.77464789 0.8028169 0.76056338 0.67605634
|
|
0.82857143 0.77142857 0.78571429 0.85714286]
|
|
|
|
mean value: 0.7665392354124748
|
|
|
|
key: train_accuracy
|
|
value: [0.78267717 0.76220472 0.83622047 0.85354331 0.79055118 0.76692913
|
|
0.80974843 0.77044025 0.77515723 0.95597484]
|
|
|
|
mean value: 0.810344673896895
|
|
|
|
key: test_fscore
|
|
value: [0.75862069 0.48148148 0.71428571 0.76666667 0.69090909 0.5106383
|
|
0.79310345 0.7037037 0.72727273 0.84848485]
|
|
|
|
mean value: 0.6995166668607607
|
|
|
|
key: train_fscore
|
|
value: [0.72177419 0.6873706 0.81021898 0.82872928 0.73663366 0.69672131
|
|
0.76504854 0.70325203 0.70993915 0.95597484]
|
|
|
|
mean value: 0.7615662595724322
|
|
|
|
key: test_precision
|
|
value: [1. 0.72222222 1. 0.92 0.95 1.
|
|
1. 1. 1. 0.90322581]
|
|
|
|
mean value: 0.9495448028673835
|
|
|
|
key: train_precision
|
|
value: [1. 1. 0.96103896 1. 0.99465241 1.
|
|
1. 0.99425287 1. 0.95597484]
|
|
|
|
mean value: 0.9905919083786587
|
|
|
|
key: test_recall
|
|
value: [0.61111111 0.36111111 0.55555556 0.65714286 0.54285714 0.34285714
|
|
0.65714286 0.54285714 0.57142857 0.8 ]
|
|
|
|
mean value: 0.5642063492063492
|
|
|
|
key: train_recall
|
|
value: [0.56466877 0.52365931 0.70031546 0.70754717 0.58490566 0.53459119
|
|
0.61949686 0.54402516 0.55031447 0.95597484]
|
|
|
|
mean value: 0.6285498879034978
|
|
|
|
key: test_roc_auc
|
|
value: [0.80555556 0.60912698 0.77777778 0.80079365 0.75753968 0.67142857
|
|
0.82857143 0.77142857 0.78571429 0.85714286]
|
|
|
|
mean value: 0.7665079365079365
|
|
|
|
key: train_roc_auc
|
|
value: [0.78233438 0.76182965 0.83600679 0.85377358 0.79087554 0.7672956
|
|
0.80974843 0.77044025 0.77515723 0.95597484]
|
|
|
|
mean value: 0.8103436303394639
|
|
|
|
key: test_jcc
|
|
value: [0.61111111 0.31707317 0.55555556 0.62162162 0.52777778 0.34285714
|
|
0.65714286 0.54285714 0.57142857 0.73684211]
|
|
|
|
mean value: 0.5484267056346646
|
|
|
|
key: train_jcc
|
|
value: [0.56466877 0.52365931 0.6809816 0.70754717 0.5830721 0.53459119
|
|
0.61949686 0.54231975 0.55031447 0.91566265]
|
|
|
|
mean value: 0.6222313856468585
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.91
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02410054 0.01189899 0.01190758 0.01197672 0.01203632 0.02087998
|
|
0.03131342 0.03137183 0.03720117 0.03229856]
|
|
|
|
mean value: 0.022498512268066408
|
|
|
|
key: score_time
|
|
value: [0.02143836 0.01087689 0.01085854 0.01089215 0.01084137 0.02098989
|
|
0.02022457 0.01981997 0.01118922 0.01599312]
|
|
|
|
mean value: 0.015312409400939942
|
|
|
|
key: test_mcc
|
|
value: [0.94365079 0.9186708 1. 0.88730159 0.9451949 0.88862624
|
|
0.91465912 0.94440028 0.91465912 0.80295507]
|
|
|
|
mean value: 0.9160117910547262
|
|
|
|
key: train_mcc
|
|
value: [0.93078099 0.92126383 0.92442685 0.94646152 0.94330695 0.93700772
|
|
0.93712545 0.92454659 0.93396688 0.94025622]
|
|
|
|
mean value: 0.9339142994473958
|
|
|
|
key: test_accuracy
|
|
value: [0.97183099 0.95774648 1. 0.94366197 0.97183099 0.94366197
|
|
0.95714286 0.97142857 0.95714286 0.9 ]
|
|
|
|
mean value: 0.9574446680080483
|
|
|
|
key: train_accuracy
|
|
value: [0.96535433 0.96062992 0.96220472 0.97322835 0.97165354 0.96850394
|
|
0.96855346 0.96226415 0.96698113 0.97012579]
|
|
|
|
mean value: 0.9669499331451493
|
|
|
|
key: test_fscore
|
|
value: [0.97222222 0.96 1. 0.94285714 0.97222222 0.94117647
|
|
0.95774648 0.97222222 0.95774648 0.89552239]
|
|
|
|
mean value: 0.9571715625918226
|
|
|
|
key: train_fscore
|
|
value: [0.96507937 0.96050553 0.96202532 0.97322835 0.97169811 0.96855346
|
|
0.96845426 0.96214511 0.96692913 0.97017268]
|
|
|
|
mean value: 0.9668791316946547
|
|
|
|
key: test_precision
|
|
value: [0.97222222 0.92307692 1. 0.94285714 0.94594595 0.96969697
|
|
0.94444444 0.94594595 0.94444444 0.9375 ]
|
|
|
|
mean value: 0.9526134038634039
|
|
|
|
key: train_precision
|
|
value: [0.97124601 0.96202532 0.96507937 0.97476341 0.97169811 0.96855346
|
|
0.97151899 0.96518987 0.96845426 0.96865204]
|
|
|
|
mean value: 0.9687180824244073
|
|
|
|
key: test_recall
|
|
value: [0.97222222 1. 1. 0.94285714 1. 0.91428571
|
|
0.97142857 1. 0.97142857 0.85714286]
|
|
|
|
mean value: 0.962936507936508
|
|
|
|
key: train_recall
|
|
value: [0.95899054 0.95899054 0.95899054 0.97169811 0.97169811 0.96855346
|
|
0.96540881 0.9591195 0.96540881 0.97169811]
|
|
|
|
mean value: 0.9650556514493185
|
|
|
|
key: test_roc_auc
|
|
value: [0.9718254 0.95714286 1. 0.94365079 0.97222222 0.94325397
|
|
0.95714286 0.97142857 0.95714286 0.9 ]
|
|
|
|
mean value: 0.9573809523809523
|
|
|
|
key: train_roc_auc
|
|
value: [0.96534432 0.96062734 0.96219967 0.97323076 0.97165347 0.96850386
|
|
0.96855346 0.96226415 0.96698113 0.97012579]
|
|
|
|
mean value: 0.9669483959288138
|
|
|
|
key: test_jcc
|
|
value: [0.94594595 0.92307692 1. 0.89189189 0.94594595 0.88888889
|
|
0.91891892 0.94594595 0.91891892 0.81081081]
|
|
|
|
mean value: 0.9190344190344191
|
|
|
|
key: train_jcc
|
|
value: [0.93251534 0.92401216 0.92682927 0.94785276 0.94495413 0.93902439
|
|
0.93883792 0.92705167 0.93597561 0.94207317]
|
|
|
|
mean value: 0.9359126415900797
|
|
|
|
MCC on Blind test: 0.28
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_config.py:143: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
smnc_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_config.py:146: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
smnc_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
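For reference, a minimal sketch of how a pipeline with the shape printed above could be assembled and cross-validated so that it yields keys such as fit_time, test_mcc and train_jcc. The toy data, the choice of three column names reused from the log, and the scoring dictionary are assumptions for illustration only; the project's actual configuration is not shown in this output.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import RidgeClassifierCV
from sklearn.metrics import make_scorer, matthews_corrcoef
from sklearn.model_selection import cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Synthetic stand-in data (hypothetical values; only the column names come from the log).
rng = np.random.default_rng(42)
n = 200
X = pd.DataFrame({
    "ligand_distance": rng.uniform(2.0, 30.0, n),
    "rsa": rng.uniform(0.0, 1.0, n),
    "ss_class": np.tile(["helix", "sheet", "loop", "other"], n // 4),
})
# Hypothetical binary target loosely tied to one numeric feature.
y = (X["ligand_distance"] + rng.normal(0.0, 3.0, n) < 15.0).astype(int)

num_cols = ["ligand_distance", "rsa"]
cat_cols = ["ss_class"]

# Same two-step layout as the printed pipeline: scale numeric columns,
# one-hot encode categorical columns, then fit the classifier.
pipe = Pipeline(steps=[
    ("prep", ColumnTransformer(
        remainder="passthrough",
        transformers=[
            ("num", MinMaxScaler(), num_cols),
            ("cat", OneHotEncoder(), cat_cols),
        ],
    )),
    ("model", RidgeClassifierCV(cv=10)),
])

# Assumed scoring names, chosen so the result keys resemble those in the log.
scoring = {
    "mcc": make_scorer(matthews_corrcoef),
    "accuracy": "accuracy",
    "fscore": "f1",
    "jcc": "jaccard",
}

scores = cross_validate(pipe, X, y, cv=10, scoring=scoring, return_train_score=True)

for key, value in scores.items():
    print("key:", key)
    print("value:", value)
    print("mean value:", np.mean(value))
```

With return_train_score=True, cross_validate returns both test_* and train_* arrays, one entry per fold, which is the layout of the key/value/mean blocks in this log.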
|
|
|
|
key: fit_time
|
|
value: [0.14312363 0.11674213 0.22311258 0.18742394 0.09935522 0.20226431
|
|
0.20428991 0.13562512 0.20273662 0.20825934]
|
|
|
|
mean value: 0.1722932815551758
|
|
|
|
key: score_time
|
|
value: [0.01109982 0.01091933 0.01983857 0.01111674 0.01117826 0.02104044
|
|
0.01099682 0.01117277 0.01107144 0.01106048]
|
|
|
|
mean value: 0.012949466705322266
|
|
|
|
key: test_mcc
|
|
value: [0.94365079 0.9186708 1. 0.88730159 0.9451949 0.88862624
|
|
0.91465912 0.97182532 0.88571429 0.80295507]
|
|
|
|
mean value: 0.9158598109706015
|
|
|
|
key: train_mcc
|
|
value: [0.93702568 0.93070849 0.92759921 0.94646152 0.94016229 0.94649802
|
|
0.94029342 0.93400383 0.94025622 0.94025622]
|
|
|
|
mean value: 0.9383264898322129
|
|
|
|
key: test_accuracy
|
|
value: [0.97183099 0.95774648 1. 0.94366197 0.97183099 0.94366197
|
|
0.95714286 0.98571429 0.94285714 0.9 ]
|
|
|
|
mean value: 0.9574446680080483
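Each "mean value" line is the arithmetic mean of the ten per-fold scores printed just above it. A quick check using the test_accuracy values as printed; they are rounded to eight decimals in the log, so agreement is only expected up to rounding:

```python
from statistics import fmean

# Per-fold test_accuracy values copied from the log output above.
test_accuracy = [
    0.97183099, 0.95774648, 1.0, 0.94366197, 0.97183099,
    0.94366197, 0.95714286, 0.98571429, 0.94285714, 0.9,
]

mean_accuracy = fmean(test_accuracy)
reported_mean = 0.9574446680080483  # "mean value" line from the log

print(f"recomputed mean: {mean_accuracy:.8f}")
print(f"difference from reported: {abs(mean_accuracy - reported_mean):.2e}")
```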
|
|
|
|
key: train_accuracy
|
|
value: [0.96850394 0.96535433 0.96377953 0.97322835 0.97007874 0.97322835
|
|
0.97012579 0.96698113 0.97012579 0.97012579]
|
|
|
|
mean value: 0.9691531718912494
|
|
|
|
key: test_fscore
|
|
value: [0.97222222 0.96 1. 0.94285714 0.97222222 0.94117647
|
|
0.95774648 0.98591549 0.94285714 0.89552239]
|
|
|
|
mean value: 0.9570519560637653
|
|
|
|
key: train_fscore
|
|
value: [0.96835443 0.96529968 0.96354992 0.97322835 0.97007874 0.97339593
|
|
0.9699842 0.96682464 0.97007874 0.97017268]
|
|
|
|
mean value: 0.9690967324816946
|
|
|
|
key: test_precision
|
|
value: [0.97222222 0.92307692 1. 0.94285714 0.94594595 0.96969697
|
|
0.94444444 0.97222222 0.94285714 0.9375 ]
|
|
|
|
mean value: 0.9550823013323013
|
|
|
|
key: train_precision
|
|
value: [0.97142857 0.96529968 0.96815287 0.97476341 0.97160883 0.96884735
|
|
0.97460317 0.97142857 0.97160883 0.96865204]
|
|
|
|
mean value: 0.9706393330442624
|
|
|
|
key: test_recall
|
|
value: [0.97222222 1. 1. 0.94285714 1. 0.91428571
|
|
0.97142857 1. 0.94285714 0.85714286]
|
|
|
|
mean value: 0.9600793650793651
|
|
|
|
key: train_recall
|
|
value: [0.96529968 0.96529968 0.95899054 0.97169811 0.96855346 0.97798742
|
|
0.96540881 0.96226415 0.96855346 0.97169811]
|
|
|
|
mean value: 0.9675753427375354
|
|
|
|
key: test_roc_auc
|
|
value: [0.9718254 0.95714286 1. 0.94365079 0.97222222 0.94325397
|
|
0.95714286 0.98571429 0.94285714 0.9 ]
|
|
|
|
mean value: 0.9573809523809523
|
|
|
|
key: train_roc_auc
|
|
value: [0.9684989 0.96535424 0.963772 0.97323076 0.97008115 0.97322084
|
|
0.97012579 0.96698113 0.97012579 0.97012579]
|
|
|
|
mean value: 0.9691516377993373
|
|
|
|
key: test_jcc
|
|
value: [0.94594595 0.92307692 1. 0.89189189 0.94594595 0.88888889
|
|
0.91891892 0.97222222 0.89189189 0.81081081]
|
|
|
|
mean value: 0.918959343959344
|
|
|
|
key: train_jcc
|
|
value: [0.93865031 0.93292683 0.92966361 0.94785276 0.94189602 0.94817073
|
|
0.94171779 0.93577982 0.94189602 0.94207317]
|
|
|
|
mean value: 0.9400627064609138
|
|
|
|
MCC on Blind test: 0.28
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0334816 0.0385375 0.03184867 0.02891684 0.02883768 0.02635932
|
|
0.02719021 0.02624321 0.02558875 0.02810621]
|
|
|
|
mean value: 0.029510998725891115
|
|
|
|
key: score_time
|
|
value: [0.01067233 0.01144505 0.01100469 0.01092911 0.01091051 0.01090193
|
|
0.0108726 0.01089025 0.0111692 0.01106668]
|
|
|
|
mean value: 0.010986232757568359
|
|
|
|
key: test_mcc
|
|
value: [0.88730159 0.88862624 0.91587302 0.88880092 0.91885703 0.8594125
|
|
0.77269114 0.94440028 0.80032673 0.6882472 ]
|
|
|
|
mean value: 0.8564536645618587
|
|
|
|
key: train_mcc
|
|
value: [0.88367504 0.89923119 0.8772708 0.89606666 0.89291312 0.87122165
|
|
0.87757113 0.874283 0.90573203 0.88057281]
|
|
|
|
mean value: 0.8858537441772321
|
|
|
|
key: test_accuracy
|
|
value: [0.94366197 0.94366197 0.95774648 0.94366197 0.95774648 0.92957746
|
|
0.88571429 0.97142857 0.9 0.84285714]
|
|
|
|
mean value: 0.9276056338028169
|
|
|
|
key: train_accuracy
|
|
value: [0.94173228 0.9496063 0.93858268 0.9480315 0.94645669 0.93543307
|
|
0.93867925 0.93710692 0.95283019 0.94025157]
|
|
|
|
mean value: 0.9428710444213342
|
|
|
|
key: test_fscore
|
|
value: [0.94444444 0.94594595 0.95774648 0.94444444 0.95890411 0.92753623
|
|
0.88235294 0.97058824 0.89855072 0.8358209 ]
|
|
|
|
mean value: 0.9266334451811831
|
|
|
|
key: train_fscore
|
|
value: [0.94098884 0.94968553 0.93799682 0.94819466 0.94654088 0.93460925
|
|
0.93799682 0.93670886 0.953125 0.93987342]
|
|
|
|
mean value: 0.9425720082879654
|
|
|
|
key: test_precision
|
|
value: [0.94444444 0.92105263 0.97142857 0.91891892 0.92105263 0.94117647
|
|
0.90909091 1. 0.91176471 0.875 ]
|
|
|
|
mean value: 0.9313929283511326
|
|
|
|
key: train_precision
|
|
value: [0.9516129 0.94670846 0.94551282 0.94670846 0.94654088 0.94822006
|
|
0.94855305 0.94267516 0.94720497 0.94585987]
|
|
|
|
mean value: 0.9469596652319989
|
|
|
|
key: test_recall
|
|
value: [0.94444444 0.97222222 0.94444444 0.97142857 1. 0.91428571
|
|
0.85714286 0.94285714 0.88571429 0.8 ]
|
|
|
|
mean value: 0.9232539682539682
|
|
|
|
key: train_recall
|
|
value: [0.93059937 0.95268139 0.93059937 0.94968553 0.94654088 0.92138365
|
|
0.92767296 0.93081761 0.9591195 0.93396226]
|
|
|
|
mean value: 0.9383062516120072
|
|
|
|
key: test_roc_auc
|
|
value: [0.94365079 0.94325397 0.95793651 0.94404762 0.95833333 0.92936508
|
|
0.88571429 0.97142857 0.9 0.84285714]
|
|
|
|
mean value: 0.9276587301587301
|
|
|
|
key: train_roc_auc
|
|
value: [0.94171478 0.94961113 0.93857012 0.94802889 0.94645656 0.93545523
|
|
0.93867925 0.93710692 0.95283019 0.94025157]
|
|
|
|
mean value: 0.942870464059679
|
|
|
|
key: test_jcc
|
|
value: [0.89473684 0.8974359 0.91891892 0.89473684 0.92105263 0.86486486
|
|
0.78947368 0.94285714 0.81578947 0.71794872]
|
|
|
|
mean value: 0.8657815015709752
|
|
|
|
key: train_jcc
|
|
value: [0.88855422 0.90419162 0.88323353 0.90149254 0.89850746 0.87724551
|
|
0.88323353 0.88095238 0.91044776 0.88656716]
|
|
|
|
mean value: 0.8914425714809752
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.79
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
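The warning above names two remedies: raise max_iter or scale the inputs. The pipeline printed in this log already applies MinMaxScaler to the numeric columns, so raising max_iter is probably the more relevant one here. A minimal sketch on synthetic data follows; the value 1000 is an assumption, not a setting taken from the project. LogisticRegressionCV accepts the same max_iter parameter.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in data; not the dataset used in this log.
X, y = make_classification(n_samples=300, n_features=20, random_state=42)

pipe = Pipeline(steps=[
    ("scale", MinMaxScaler()),
    # max_iter raised from the default of 100 (assumed value).
    ("model", LogisticRegression(max_iter=1000, random_state=42)),
])

# Turn ConvergenceWarning into an error so a non-converged fit cannot pass silently.
with warnings.catch_warnings():
    warnings.simplefilter("error", category=ConvergenceWarning)
    pipe.fit(X, y)

n_iter = int(pipe.named_steps["model"].n_iter_[0])
print("lbfgs iterations used:", n_iter)
```

If the iteration count stays well below the limit, the solver stopped on its tolerance rather than on the cap.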
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegressionCV(random_state=42))])

key: fit_time
value: [0.77223945 0.88186479 0.73250079 0.70630646 0.87280202 0.74433756
0.76517749 0.89203405 0.77521062 0.80457163]

mean value: 0.7947044849395752

key: score_time
value: [0.01399922 0.01436377 0.01483202 0.01464963 0.01440787 0.01132131
0.01460695 0.01472092 0.01459765 0.01145387]

mean value: 0.013895320892333984

key: test_mcc
value: [0.88730159 0.86753285 1. 0.9451949 0.9451949 0.97222222
0.94440028 0.97182532 0.91465912 0.91465912]

mean value: 0.936299028941898

key: train_mcc
value: [0.96867777 0.9625117 0.9625117 0.96558776 0.96558776 0.96250874
0.96872591 0.9625688 0.97501633 0.96564279]

mean value: 0.9659339262257346

key: test_accuracy
value: [0.94366197 0.92957746 1. 0.97183099 0.97183099 0.98591549
0.97142857 0.98571429 0.95714286 0.95714286]

mean value: 0.9674245472837022

key: train_accuracy
value: [0.98425197 0.98110236 0.98110236 0.98267717 0.98267717 0.98110236
0.98427673 0.98113208 0.98742138 0.9827044 ]

mean value: 0.982844797702174

key: test_fscore
value: [0.94444444 0.93506494 1. 0.97222222 0.97222222 0.98591549
0.97222222 0.98591549 0.95774648 0.95652174]

mean value: 0.9682275250095214

key: train_fscore
value: [0.984375 0.98130841 0.98130841 0.98289269 0.98289269 0.98136646
0.98442368 0.98136646 0.98753894 0.98289269]

mean value: 0.9830365430046653

key: test_precision
value: [0.94444444 0.87804878 1. 0.94594595 0.94594595 0.97222222
0.94594595 0.97222222 0.94444444 0.97058824]

mean value: 0.9519808186953094

key: train_precision
value: [0.9752322 0.96923077 0.96923077 0.97230769 0.97230769 0.96932515
0.97530864 0.96932515 0.97839506 0.97230769]

mean value: 0.97229708239792

key: test_recall
value: [0.94444444 1. 1. 1. 1. 1.
1. 1. 0.97142857 0.94285714]

mean value: 0.9858730158730159

key: train_recall
value: [0.99369085 0.99369085 0.99369085 0.99371069 0.99371069 0.99371069
0.99371069 0.99371069 0.99685535 0.99371069]

mean value: 0.9940192052060394

key: test_roc_auc
value: [0.94365079 0.92857143 1. 0.97222222 0.97222222 0.98611111
0.97142857 0.98571429 0.95714286 0.95714286]

mean value: 0.9674206349206349

key: train_roc_auc
value: [0.98426681 0.98112216 0.98112216 0.98265976 0.98265976 0.98108248
0.98427673 0.98113208 0.98742138 0.9827044 ]

mean value: 0.9828447711445747

key: test_jcc
value: [0.89473684 0.87804878 1. 0.94594595 0.94594595 0.97222222
0.94594595 0.97222222 0.91891892 0.91666667]

mean value: 0.9390653490460936

key: train_jcc
value: [0.96923077 0.96330275 0.96330275 0.96636086 0.96636086 0.96341463
0.96932515 0.96341463 0.97538462 0.96636086]

mean value: 0.9666457879676796

MCC on Blind test: 0.25

Accuracy on Blind test: 0.81

Model_name: Gaussian NB
Model func: GaussianNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GaussianNB())])

key: fit_time
value: [0.01122856 0.01021957 0.00833702 0.0081079 0.00787926 0.00794339
0.00866914 0.00802183 0.0078454 0.00855756]

mean value: 0.008680963516235351

key: score_time
value: [0.01121497 0.00928688 0.00848627 0.00823808 0.00867367 0.00840926
0.00836754 0.00825548 0.00810933 0.00873709]

mean value: 0.008777856826782227

key: test_mcc
value: [0.69023056 0.57777778 0.85952381 0.75442414 0.7468254 0.77565853
0.74316054 0.6882472 0.77142857 0.600982 ]

mean value: 0.7208258525066535

key: train_mcc
value: [0.77704336 0.633035 0.7613864 0.77335915 0.75280338 0.7642249
0.77704083 0.75544945 0.75849571 0.78672387]

mean value: 0.7539562032070855

key: test_accuracy
value: [0.84507042 0.78873239 0.92957746 0.87323944 0.87323944 0.88732394
0.87142857 0.84285714 0.88571429 0.8 ]

mean value: 0.8597183098591549

key: train_accuracy
value: [0.88818898 0.80944882 0.88031496 0.88661417 0.87559055 0.88188976
0.88836478 0.87735849 0.87893082 0.89308176]

mean value: 0.8759783093151092

key: test_fscore
value: [0.84931507 0.78873239 0.92957746 0.88 0.87323944 0.88235294
0.86956522 0.84931507 0.88571429 0.79411765]

mean value: 0.8601929524101833

key: train_fscore
value: [0.89026275 0.78659612 0.88271605 0.88785047 0.87975647 0.88408037
0.88992248 0.88 0.88135593 0.89506173]

mean value: 0.875760236872007

key: test_precision
value: [0.83783784 0.8 0.94285714 0.825 0.86111111 0.90909091
0.88235294 0.81578947 0.88571429 0.81818182]

mean value: 0.8577935519653785

key: train_precision
value: [0.87272727 0.892 0.86404834 0.87962963 0.85250737 0.86930091
0.87767584 0.86144578 0.86404834 0.87878788]

mean value: 0.8712171368478436

key: test_recall
value: [0.86111111 0.77777778 0.91666667 0.94285714 0.88571429 0.85714286
0.85714286 0.88571429 0.88571429 0.77142857]

mean value: 0.8641269841269841

key: train_recall
value: [0.90851735 0.70347003 0.9022082 0.89622642 0.90880503 0.89937107
0.90251572 0.89937107 0.89937107 0.91194969]

mean value: 0.8831805646489297

key: test_roc_auc
value: [0.84484127 0.78888889 0.9297619 0.87420635 0.8734127 0.88690476
0.87142857 0.84285714 0.88571429 0.8 ]

mean value: 0.8598015873015873

key: train_roc_auc
value: [0.88822094 0.80928219 0.88034938 0.88659901 0.87553816 0.88186219
0.88836478 0.87735849 0.87893082 0.89308176]

mean value: 0.8759587722953

key: test_jcc
value: [0.73809524 0.65116279 0.86842105 0.78571429 0.775 0.78947368
0.76923077 0.73809524 0.79487179 0.65853659]

mean value: 0.756860143891296

key: train_jcc
value: [0.80222841 0.64825581 0.79005525 0.79831933 0.78532609 0.79224377
0.80167598 0.78571429 0.78787879 0.81005587]

mean value: 0.7801753573997666

MCC on Blind test: 0.28

Accuracy on Blind test: 0.76

Model_name: Naive Bayes
Model func: BernoulliNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', BernoulliNB())])

key: fit_time
|
|
value: [0.00878906 0.00839543 0.00800848 0.00811028 0.00816107 0.00810719
|
|
0.00804377 0.00802827 0.00797844 0.00811362]
|
|
|
|
mean value: 0.008173561096191407
|
|
|
|
key: score_time
|
|
value: [0.00841141 0.00832272 0.00822234 0.00826097 0.00850153 0.00808287
|
|
0.00810909 0.00806832 0.00813913 0.00800657]
|
|
|
|
mean value: 0.00821249485015869
|
|
|
|
key: test_mcc
|
|
value: [0.60881948 0.60555556 0.63940384 0.49285714 0.6153057 0.59007669
|
|
0.45883147 0.6614769 0.57735027 0.46188022]
|
|
|
|
mean value: 0.5711557273256194
|
|
|
|
key: train_mcc
|
|
value: [0.59860964 0.58693799 0.6055625 0.58980737 0.63409349 0.60603034
|
|
0.63214752 0.60053256 0.63553444 0.6201872 ]
|
|
|
|
mean value: 0.6109443059939785
|
|
|
|
key: test_accuracy
|
|
value: [0.8028169 0.8028169 0.81690141 0.74647887 0.8028169 0.78873239
|
|
0.72857143 0.82857143 0.78571429 0.72857143]
|
|
|
|
mean value: 0.7831991951710262
|
|
|
|
key: train_accuracy
|
|
value: [0.7984252 0.79212598 0.8015748 0.79370079 0.81574803 0.8
|
|
0.81289308 0.79874214 0.81761006 0.80974843]
|
|
|
|
mean value: 0.8040568513841431
|
|
|
|
key: test_fscore
|
|
value: [0.81578947 0.80555556 0.83116883 0.74285714 0.81578947 0.80519481
|
|
0.73972603 0.83783784 0.8 0.70769231]
|
|
|
|
mean value: 0.7901611455072162
|
|
|
|
key: train_fscore
|
|
value: [0.80547112 0.80120482 0.80966767 0.80300752 0.82406015 0.81350954
|
|
0.82525698 0.80838323 0.82043344 0.8141321 ]
|
|
|
|
mean value: 0.8125126581130029
|
|
|
|
key: test_precision
|
|
value: [0.775 0.80555556 0.7804878 0.74285714 0.75609756 0.73809524
|
|
0.71052632 0.79487179 0.75 0.76666667]
|
|
|
|
mean value: 0.7620158079689531
|
|
|
|
key: train_precision
|
|
value: [0.7771261 0.76657061 0.77681159 0.76945245 0.78962536 0.7630854
|
|
0.77410468 0.77142857 0.80792683 0.7957958 ]
|
|
|
|
mean value: 0.7791927388032522
|
|
|
|
key: test_recall
|
|
value: [0.86111111 0.80555556 0.88888889 0.74285714 0.88571429 0.88571429
|
|
0.77142857 0.88571429 0.85714286 0.65714286]
|
|
|
|
mean value: 0.8241269841269842
|
|
|
|
key: train_recall
|
|
value: [0.83596215 0.83911672 0.84542587 0.83962264 0.86163522 0.87106918
|
|
0.8836478 0.8490566 0.83333333 0.83333333]
|
|
|
|
mean value: 0.8492202845068746
|
|
|
|
key: test_roc_auc
|
|
value: [0.80198413 0.80277778 0.81587302 0.74642857 0.80396825 0.79007937
|
|
0.72857143 0.82857143 0.78571429 0.72857143]
|
|
|
|
mean value: 0.7832539682539683
|
|
|
|
key: train_roc_auc
|
|
value: [0.79848422 0.79219987 0.80164375 0.79362836 0.81567565 0.7998879
|
|
0.81289308 0.79874214 0.81761006 0.80974843]
|
|
|
|
mean value: 0.8040513461500308
|
|
|
|
key: test_jcc
|
|
value: [0.68888889 0.6744186 0.71111111 0.59090909 0.68888889 0.67391304
|
|
0.58695652 0.72093023 0.66666667 0.54761905]
|
|
|
|
mean value: 0.6550302096510388
|
|
|
|
key: train_jcc
|
|
value: [0.67430025 0.66834171 0.68020305 0.67085427 0.70076726 0.68564356
|
|
0.7025 0.67839196 0.69553806 0.6865285 ]
|
|
|
|
mean value: 0.6843068622772353
|
|
|
|
MCC on Blind test: 0.22
|
|
|
|
Accuracy on Blind test: 0.63
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.00771666 0.00820422 0.00826883 0.008219 0.00830936 0.00839472
|
|
0.00791407 0.00843978 0.00832844 0.00831151]
|
|
|
|
mean value: 0.00821065902709961
|
|
|
|
key: score_time
|
|
value: [0.01118207 0.01534915 0.01180792 0.01198816 0.01184177 0.01210046
|
|
0.01215458 0.01243091 0.01233387 0.01189065]
|
|
|
|
mean value: 0.012307953834533692
|
|
|
|
key: test_mcc
|
|
value: [0.75346834 0.84273607 0.69643609 0.8365327 0.8031746 0.77991323
|
|
0.74560114 0.6614769 0.57353933 0.62882815]
|
|
|
|
mean value: 0.7321706561143749
|
|
|
|
key: train_mcc
|
|
value: [0.83709453 0.83130054 0.82734834 0.80898883 0.7990627 0.80645661
|
|
0.83592055 0.80334707 0.82330288 0.81448419]
|
|
|
|
mean value: 0.8187306253278919
|
|
|
|
key: test_accuracy
|
|
value: [0.87323944 0.91549296 0.84507042 0.91549296 0.90140845 0.88732394
|
|
0.87142857 0.82857143 0.78571429 0.81428571]
|
|
|
|
mean value: 0.8638028169014085
|
|
|
|
key: train_accuracy
|
|
value: [0.91653543 0.91338583 0.91181102 0.9007874 0.8976378 0.9007874
|
|
0.91666667 0.89937107 0.91037736 0.90566038]
|
|
|
|
mean value: 0.9073020353587877
|
|
|
|
key: test_fscore
|
|
value: [0.88311688 0.92307692 0.85714286 0.91891892 0.90140845 0.89189189
|
|
0.87671233 0.83783784 0.79452055 0.8115942 ]
|
|
|
|
mean value: 0.8696220842300416
|
|
|
|
key: train_fscore
|
|
value: [0.92030075 0.91754123 0.91566265 0.90721649 0.90254873 0.90611028
|
|
0.91981846 0.90447761 0.91376702 0.90963855]
|
|
|
|
mean value: 0.9117081778217269
|
|
|
|
key: test_precision
|
|
value: [0.82926829 0.85714286 0.80487805 0.87179487 0.88888889 0.84615385
|
|
0.84210526 0.79487179 0.76315789 0.82352941]
|
|
|
|
mean value: 0.8321791169975116
|
|
|
|
key: train_precision
|
|
value: [0.87931034 0.87428571 0.87608069 0.8531856 0.86246418 0.8611898
|
|
0.88629738 0.86079545 0.88046647 0.87283237]
|
|
|
|
mean value: 0.8706908004288777
|
|
|
|
key: test_recall
|
|
value: [0.94444444 1. 0.91666667 0.97142857 0.91428571 0.94285714
|
|
0.91428571 0.88571429 0.82857143 0.8 ]
|
|
|
|
mean value: 0.9118253968253969
|
|
|
|
key: train_recall
|
|
value: [0.96529968 0.96529968 0.95899054 0.96855346 0.94654088 0.95597484
|
|
0.95597484 0.95283019 0.94968553 0.94968553]
|
|
|
|
mean value: 0.9568835188381644
|
|
|
|
key: test_roc_auc
|
|
value: [0.87222222 0.91428571 0.84404762 0.91626984 0.9015873 0.88809524
|
|
0.87142857 0.82857143 0.78571429 0.81428571]
|
|
|
|
mean value: 0.8636507936507937
|
|
|
|
key: train_roc_auc
|
|
value: [0.91661211 0.91346745 0.91188521 0.90068052 0.89756066 0.90070036
|
|
0.91666667 0.89937107 0.91037736 0.90566038]
|
|
|
|
mean value: 0.9072981766958317
|
|
|
|
key: test_jcc
|
|
value: [0.79069767 0.85714286 0.75 0.85 0.82051282 0.80487805
|
|
0.7804878 0.72093023 0.65909091 0.68292683]
|
|
|
|
mean value: 0.771666717665016
|
|
|
|
key: train_jcc
|
|
value: [0.85236769 0.84764543 0.84444444 0.83018868 0.82240437 0.82833787
|
|
0.85154062 0.82561308 0.84122563 0.83425414]
|
|
|
|
mean value: 0.837802195297192
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.74
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02249098 0.0178566 0.01815391 0.02023125 0.02107072 0.02131462
|
|
0.01788855 0.01724267 0.01735258 0.01709366]
|
|
|
|
mean value: 0.019069552421569824
|
|
|
|
key: score_time
|
|
value: [0.01088095 0.01002216 0.01010871 0.00967574 0.01091886 0.01002645
|
|
0.01017642 0.00960231 0.00959897 0.00949121]
|
|
|
|
mean value: 0.010050177574157715
|
|
|
|
key: test_mcc
|
|
value: [0.81050059 0.88862624 0.8594125 0.85952381 0.91885703 0.88880092
|
|
0.80032673 0.8660254 0.80032673 0.74316054]
|
|
|
|
mean value: 0.8435560494237176
|
|
|
|
key: train_mcc
|
|
value: [0.87720238 0.88357673 0.88033094 0.89298187 0.88350199 0.88668202
|
|
0.89644363 0.87746696 0.88994151 0.89658557]
|
|
|
|
mean value: 0.8864713594717671
|
|
|
|
key: test_accuracy
|
|
value: [0.90140845 0.94366197 0.92957746 0.92957746 0.95774648 0.94366197
|
|
0.9 0.92857143 0.9 0.87142857]
|
|
|
|
mean value: 0.9205633802816902
|
|
|
|
key: train_accuracy
|
|
value: [0.93858268 0.94173228 0.94015748 0.94645669 0.94173228 0.94330709
|
|
0.94811321 0.93867925 0.94496855 0.94811321]
|
|
|
|
mean value: 0.9431842717773485
|
|
|
|
key: test_fscore
|
|
value: [0.90909091 0.94594595 0.93150685 0.92957746 0.95890411 0.94444444
|
|
0.89855072 0.93333333 0.89855072 0.86956522]
|
|
|
|
mean value: 0.9219469723174141
|
|
|
|
key: train_fscore
|
|
value: [0.93819334 0.94209703 0.93987342 0.946875 0.94209703 0.94375
|
|
0.94867807 0.93915757 0.94488189 0.94883721]
|
|
|
|
mean value: 0.9434440551736646
|
|
|
|
key: test_precision
|
|
value: [0.85365854 0.92105263 0.91891892 0.91666667 0.92105263 0.91891892
|
|
0.91176471 0.875 0.91176471 0.88235294]
|
|
|
|
mean value: 0.9031150657188941
|
|
|
|
key: train_precision
|
|
value: [0.94267516 0.93478261 0.94285714 0.94099379 0.9376947 0.9378882
|
|
0.93846154 0.93188854 0.94637224 0.93577982]
|
|
|
|
mean value: 0.9389393742030523
|
|
|
|
key: test_recall
|
|
value: [0.97222222 0.97222222 0.94444444 0.94285714 1. 0.97142857
|
|
0.88571429 1. 0.88571429 0.85714286]
|
|
|
|
mean value: 0.9431746031746031
|
|
|
|
key: train_recall
|
|
value: [0.93375394 0.94952681 0.93690852 0.95283019 0.94654088 0.94968553
|
|
0.9591195 0.94654088 0.94339623 0.96226415]
|
|
|
|
mean value: 0.9480566632938515
|
|
|
|
key: test_roc_auc
|
|
value: [0.90039683 0.94325397 0.92936508 0.9297619 0.95833333 0.94404762
|
|
0.9 0.92857143 0.9 0.87142857]
|
|
|
|
mean value: 0.920515873015873
|
|
|
|
key: train_roc_auc
|
|
value: [0.93857508 0.94174454 0.94015237 0.94644664 0.9417247 0.94329703
|
|
0.94811321 0.93867925 0.94496855 0.94811321]
|
|
|
|
mean value: 0.9431814574529294
|
|
|
|
key: test_jcc
|
|
value: [0.83333333 0.8974359 0.87179487 0.86842105 0.92105263 0.89473684
|
|
0.81578947 0.875 0.81578947 0.76923077]
|
|
|
|
mean value: 0.8562584345479083
|
|
|
|
key: train_jcc
|
|
value: [0.88358209 0.89053254 0.88656716 0.89910979 0.89053254 0.89349112
|
|
0.90236686 0.88529412 0.89552239 0.90265487]
|
|
|
|
mean value: 0.8929653495902684
|
|
|
|
MCC on Blind test: 0.32
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.88505411 2.62316203 1.86890149 2.08626842 2.09213996 2.03358459
|
|
2.04049277 2.06235981 2.0503335 2.0242188 ]
|
|
|
|
mean value: 2.0766515493392945
|
|
|
|
key: score_time
|
|
value: [0.02021956 0.01382565 0.01210284 0.01527214 0.01395893 0.01186132
|
|
0.01189065 0.01409435 0.01404357 0.01615286]
|
|
|
|
mean value: 0.014342188835144043
|
|
|
|
key: test_mcc
|
|
value: [0.97220047 0.9186708 0.97220047 0.9451949 0.91885703 0.97222222
|
|
0.94440028 0.97182532 0.8871639 0.94440028]
|
|
|
|
mean value: 0.9447135665250179
|
|
|
|
key: train_mcc
|
|
value: [0.99685535 0.99685535 0.99685535 1. 0.99685531 0.99372043
|
|
0.99686027 0.99686027 0.99686027 0.99686027]
|
|
|
|
mean value: 0.996858287722753
|
|
|
|
key: test_accuracy
|
|
value: [0.98591549 0.95774648 0.98591549 0.97183099 0.95774648 0.98591549
|
|
0.97142857 0.98571429 0.94285714 0.97142857]
|
|
|
|
mean value: 0.9716498993963782
|
|
|
|
key: train_accuracy
|
|
value: [0.9984252 0.9984252 0.9984252 1. 0.9984252 0.99685039
|
|
0.99842767 0.99842767 0.99842767 0.99842767]
|
|
|
|
mean value: 0.9984261872926261
|
|
|
|
key: test_fscore
|
|
value: [0.98630137 0.96 0.98630137 0.97222222 0.95890411 0.98591549
|
|
0.97222222 0.98591549 0.94444444 0.97222222]
|
|
|
|
mean value: 0.9724448946341673
|
|
|
|
key: train_fscore
|
|
value: [0.9984252 0.9984252 0.9984252 1. 0.99843014 0.9968652
|
|
0.99843014 0.99843014 0.99843014 0.99843014]
|
|
|
|
mean value: 0.9984291500749357
|
|
|
|
key: test_precision
|
|
value: [0.97297297 0.92307692 0.97297297 0.94594595 0.92105263 0.97222222
|
|
0.94594595 0.97222222 0.91891892 0.94594595]
|
|
|
|
mean value: 0.9491276701803018
|
|
|
|
key: train_precision
|
|
value: [0.99685535 0.99685535 0.99685535 1. 0.9968652 0.99375
|
|
0.9968652 0.9968652 0.9968652 0.9968652 ]
|
|
|
|
mean value: 0.9968642056544627
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 1. 0.97142857 1. ]
|
|
|
|
mean value: 0.9971428571428571
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98571429 0.95714286 0.98571429 0.97222222 0.95833333 0.98611111
|
|
0.97142857 0.98571429 0.94285714 0.97142857]
|
|
|
|
mean value: 0.9716666666666667
|
|
|
|
key: train_roc_auc
|
|
value: [0.99842767 0.99842767 0.99842767 1. 0.99842271 0.99684543
|
|
0.99842767 0.99842767 0.99842767 0.99842767]
|
|
|
|
mean value: 0.9984261849493086
|
|
|
|
key: test_jcc
|
|
value: [0.97297297 0.92307692 0.97297297 0.94594595 0.92105263 0.97222222
|
|
0.94594595 0.97222222 0.89473684 0.94594595]
|
|
|
|
mean value: 0.9467094624989362
|
|
|
|
key: train_jcc
|
|
value: [0.99685535 0.99685535 0.99685535 1. 0.9968652 0.99375
|
|
0.9968652 0.9968652 0.9968652 0.9968652 ]
|
|
|
|
mean value: 0.9968642056544627
|
|
|
|
MCC on Blind test: 0.31
|
|
|
|
Accuracy on Blind test: 0.84
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01675367 0.01345181 0.01251197 0.01144505 0.01196003 0.01191974
|
|
0.01172614 0.01275802 0.01214552 0.01232648]
|
|
|
|
mean value: 0.01269984245300293
|
|
|
|
key: score_time
|
|
value: [0.01150537 0.00836968 0.00818038 0.00795603 0.00822639 0.00796032
|
|
0.00794578 0.00900435 0.00807214 0.00801206]
|
|
|
|
mean value: 0.008523249626159668
|
|
|
|
key: test_mcc
|
|
value: [0.94511009 0.97220047 0.97222222 0.89315217 0.9451949 0.91587302
|
|
0.91766294 1. 0.94440028 0.97182532]
|
|
|
|
mean value: 0.9477641393225145
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.97183099 0.98591549 0.98591549 0.94366197 0.97183099 0.95774648
|
|
0.95714286 1. 0.97142857 0.98571429]
|
|
|
|
mean value: 0.9731187122736419
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.97297297 0.98630137 0.98591549 0.94594595 0.97222222 0.95774648
|
|
0.95890411 1. 0.97222222 0.98591549]
|
|
|
|
mean value: 0.9738146307604151
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.94736842 0.97297297 1. 0.8974359 0.94594595 0.94444444
|
|
0.92105263 1. 0.94594595 0.97222222]
|
|
|
|
mean value: 0.9547388481599008
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 0.97222222 1. 1. 0.97142857
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9943650793650793
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.97142857 0.98571429 0.98611111 0.94444444 0.97222222 0.95793651
|
|
0.95714286 1. 0.97142857 0.98571429]
|
|
|
|
mean value: 0.9732142857142857
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.94736842 0.97297297 0.97222222 0.8974359 0.94594595 0.91891892
|
|
0.92105263 1. 0.94594595 0.97222222]
|
|
|
|
mean value: 0.9494085178295705
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.23
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', ExtraTreesClassifier(random_state=42))])

key: fit_time
value: [0.10958791 0.1073451 0.10643077 0.10473418 0.10406518 0.10428858
0.10580897 0.10563993 0.10519648 0.11137867]

mean value: 0.10644757747650146

key: score_time
value: [0.01716733 0.01747584 0.0184536 0.01719213 0.01720023 0.01738763
0.01717591 0.01825809 0.01838541 0.01861048]

mean value: 0.01773066520690918

key: test_mcc
value: [0.91587302 1. 1. 0.9451949 0.9451949 1.
0.94440028 1. 0.91465912 0.97182532]

mean value: 0.9637147528126756

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.95774648 1. 1. 0.97183099 0.97183099 1.
0.97142857 1. 0.95714286 0.98571429]

mean value: 0.981569416498994

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.95774648 1. 1. 0.97222222 0.97222222 1.
0.97222222 1. 0.95774648 0.98591549]

mean value: 0.9818075117370892

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.97142857 1. 1. 0.94594595 0.94594595 1.
0.94594595 1. 0.94444444 0.97222222]

mean value: 0.9725933075933075

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.94444444 1. 1. 1. 1. 1.
1. 1. 0.97142857 1. ]

mean value: 0.9915873015873016

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.95793651 1. 1. 0.97222222 0.97222222 1.
0.97142857 1. 0.95714286 0.98571429]

mean value: 0.9816666666666667

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.91891892 1. 1. 0.94594595 0.94594595 1.
0.94594595 1. 0.91891892 0.97222222]

mean value: 0.9647897897897898

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.38

Accuracy on Blind test: 0.9

Model_name: Extra Tree
Model func: ExtraTreeClassifier(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', ExtraTreeClassifier(random_state=42))])

key: fit_time
value: [0.00803304 0.00870395 0.00800323 0.0079124 0.00792861 0.00882459
0.0079906 0.00846434 0.00908756 0.00920701]

mean value: 0.008415532112121583

key: score_time
value: [0.00797582 0.00854325 0.00794959 0.00855184 0.00831938 0.00841022
0.00789499 0.00829268 0.00895119 0.00839472]

mean value: 0.00832836627960205

key: test_mcc
value: [0.58237159 0.78542356 0.91587302 0.91587302 0.89315217 0.80588933
0.8660254 0.91766294 0.94285714 0.97182532]

mean value: 0.8596953472278328

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.78873239 0.88732394 0.95774648 0.95774648 0.94366197 0.90140845
0.92857143 0.95714286 0.97142857 0.98571429]

mean value: 0.9279476861167002

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.80519481 0.8974359 0.95774648 0.95774648 0.94594595 0.90410959
0.93333333 0.95890411 0.97142857 0.98591549]

mean value: 0.9317760702672916

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.75609756 0.83333333 0.97142857 0.94444444 0.8974359 0.86842105
0.875 0.92105263 0.97142857 0.97222222]

mean value: 0.9010864285479177

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.86111111 0.97222222 0.94444444 0.97142857 1. 0.94285714
1. 1. 0.97142857 1. ]

mean value: 0.9663492063492063

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.78769841 0.88611111 0.95793651 0.95793651 0.94444444 0.90198413
0.92857143 0.95714286 0.97142857 0.98571429]

mean value: 0.9278968253968254

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.67391304 0.81395349 0.91891892 0.91891892 0.8974359 0.825
0.875 0.92105263 0.94444444 0.97222222]

mean value: 0.8760859565369703

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.21

Accuracy on Blind test: 0.81

Model_name: Random Forest
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
RandomForestClassifier(n_estimators=1000, random_state=42))])

key: fit_time
value: [1.38699269 1.40455055 1.47870827 1.44593501 1.40105391 1.46513176
1.42459655 1.40462375 1.40781617 1.41947818]

mean value: 1.4238886833190918

key: score_time
value: [0.10027957 0.09926486 0.10140967 0.0993166 0.09710431 0.09941864
0.10116935 0.0992384 0.10060048 0.10078096]

mean value: 0.09985828399658203

key: test_mcc
value: [0.94511009 1. 1. 0.9451949 0.91885703 1.
0.94440028 1. 0.94440028 0.97182532]

mean value: 0.9669787898507702

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.97183099 1. 1. 0.97183099 0.95774648 1.
0.97142857 1. 0.97142857 0.98571429]

mean value: 0.9829979879275654

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.97297297 1. 1. 0.97222222 0.95890411 1.
0.97222222 1. 0.97222222 0.98591549]

mean value: 0.9834459242186427

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.94736842 1. 1. 0.94594595 0.92105263 1.
0.94594595 1. 0.94594595 0.97222222]

mean value: 0.9678481112691639

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.97142857 1. 1. 0.97222222 0.95833333 1.
0.97142857 1. 0.97142857 0.98571429]

mean value: 0.9830555555555556

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.94736842 1. 1. 0.94594595 0.92105263 1.
0.94594595 1. 0.94594595 0.97222222]

mean value: 0.9678481112691639

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.29

Accuracy on Blind test: 0.86

Model_name: Random Forest2
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
warn(
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10,
oob_score=True, random_state=42))])

key: fit_time
value: [0.93753052 1.00080466 0.96736526 1.01301908 0.96891427 0.94553971
0.96478963 0.93041205 0.95536923 0.9131546 ]

mean value: 0.9596899032592774

key: score_time
value: [0.14722586 0.28152537 0.28021955 0.2021513 0.26646662 0.22738409
0.21272445 0.26650643 0.24508262 0.24609566]

mean value: 0.23753819465637208

key: test_mcc
value: [0.94511009 0.97220047 1. 0.9451949 0.91885703 1.
0.94440028 1. 0.91465912 0.94285714]

mean value: 0.9583279030242763

key: train_mcc
value: [0.96559014 0.96559014 0.96559014 0.96867592 0.96867592 0.96558776
0.97181825 0.96564279 0.97193362 0.9688601 ]

mean value: 0.9677964760324274

key: test_accuracy
value: [0.97183099 0.98591549 1. 0.97183099 0.95774648 1.
0.97142857 1. 0.95714286 0.97142857]

mean value: 0.9787323943661972

key: train_accuracy
value: [0.98267717 0.98267717 0.98267717 0.98425197 0.98425197 0.98267717
0.98584906 0.9827044 0.98584906 0.98427673]

mean value: 0.9837891843708215

key: test_fscore
value: [0.97297297 0.98630137 1. 0.97222222 0.95890411 1.
0.97222222 1. 0.95774648 0.97142857]

mean value: 0.9791797947171283

key: train_fscore
value: [0.98283931 0.98283931 0.98283931 0.98442368 0.98442368 0.98289269
0.98595944 0.98289269 0.98600311 0.98447205]

mean value: 0.9839585272255872

key: test_precision
value: [0.94736842 0.97297297 1. 0.94594595 0.92105263 1.
0.94594595 1. 0.94444444 0.97142857]

mean value: 0.9649158933369459

key: train_precision
value: [0.97222222 0.97222222 0.97222222 0.97530864 0.97530864 0.97230769
0.97832817 0.97230769 0.97538462 0.97239264]

mean value: 0.9738004762028707

key: test_recall
value: [1. 1. 1. 1. 1. 1.
1. 1. 0.97142857 0.97142857]

mean value: 0.9942857142857143

key: train_recall
value: [0.99369085 0.99369085 0.99369085 0.99371069 0.99371069 0.99371069
0.99371069 0.99371069 0.99685535 0.99685535]

mean value: 0.9943336706148443

key: test_roc_auc
value: [0.97142857 0.98571429 1. 0.97222222 0.95833333 1.
0.97142857 1. 0.95714286 0.97142857]

mean value: 0.9787698412698412

key: train_roc_auc
value: [0.98269448 0.98269448 0.98269448 0.98423705 0.98423705 0.98265976
0.98584906 0.9827044 0.98584906 0.98427673]

mean value: 0.9837896553776562

key: test_jcc
value: [0.94736842 0.97297297 1. 0.94594595 0.92105263 1.
0.94594595 1. 0.91891892 0.94444444]

mean value: 0.9596649280859807

key: train_jcc
value: [0.96625767 0.96625767 0.96625767 0.96932515 0.96932515 0.96636086
0.97230769 0.96636086 0.97239264 0.96941896]

mean value: 0.9684264316010812

MCC on Blind test: 0.31

Accuracy on Blind test: 0.86

Model_name: Naive Bayes
Model func: BernoulliNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', BernoulliNB())])

key: fit_time
value: [0.00934434 0.0084703 0.0083704 0.00860596 0.00874734 0.00881553
0.00802827 0.00889921 0.00797343 0.00873542]

mean value: 0.008599019050598145

key: score_time
value: [0.00872207 0.00881052 0.00853944 0.00848937 0.00812554 0.00880098
0.00822377 0.00858617 0.00838947 0.00800633]

mean value: 0.008469367027282714

key: test_mcc
value: [0.60881948 0.60555556 0.63940384 0.49285714 0.6153057 0.59007669
0.45883147 0.6614769 0.57735027 0.46188022]

mean value: 0.5711557273256194

key: train_mcc
value: [0.59860964 0.58693799 0.6055625 0.58980737 0.63409349 0.60603034
0.63214752 0.60053256 0.63553444 0.6201872 ]

mean value: 0.6109443059939785

key: test_accuracy
value: [0.8028169 0.8028169 0.81690141 0.74647887 0.8028169 0.78873239
0.72857143 0.82857143 0.78571429 0.72857143]

mean value: 0.7831991951710262

key: train_accuracy
value: [0.7984252 0.79212598 0.8015748 0.79370079 0.81574803 0.8
0.81289308 0.79874214 0.81761006 0.80974843]

mean value: 0.8040568513841431

key: test_fscore
value: [0.81578947 0.80555556 0.83116883 0.74285714 0.81578947 0.80519481
0.73972603 0.83783784 0.8 0.70769231]

mean value: 0.7901611455072162

key: train_fscore
value: [0.80547112 0.80120482 0.80966767 0.80300752 0.82406015 0.81350954
0.82525698 0.80838323 0.82043344 0.8141321 ]

mean value: 0.8125126581130029

key: test_precision
value: [0.775 0.80555556 0.7804878 0.74285714 0.75609756 0.73809524
0.71052632 0.79487179 0.75 0.76666667]

mean value: 0.7620158079689531

key: train_precision
value: [0.7771261 0.76657061 0.77681159 0.76945245 0.78962536 0.7630854
0.77410468 0.77142857 0.80792683 0.7957958 ]

mean value: 0.7791927388032522

key: test_recall
value: [0.86111111 0.80555556 0.88888889 0.74285714 0.88571429 0.88571429
0.77142857 0.88571429 0.85714286 0.65714286]

mean value: 0.8241269841269842

key: train_recall
value: [0.83596215 0.83911672 0.84542587 0.83962264 0.86163522 0.87106918
0.8836478 0.8490566 0.83333333 0.83333333]

mean value: 0.8492202845068746

key: test_roc_auc
value: [0.80198413 0.80277778 0.81587302 0.74642857 0.80396825 0.79007937
0.72857143 0.82857143 0.78571429 0.72857143]

mean value: 0.7832539682539683

key: train_roc_auc
value: [0.79848422 0.79219987 0.80164375 0.79362836 0.81567565 0.7998879
0.81289308 0.79874214 0.81761006 0.80974843]

mean value: 0.8040513461500308

key: test_jcc
value: [0.68888889 0.6744186 0.71111111 0.59090909 0.68888889 0.67391304
0.58695652 0.72093023 0.66666667 0.54761905]

mean value: 0.6550302096510388

key: train_jcc
value: [0.67430025 0.66834171 0.68020305 0.67085427 0.70076726 0.68564356
0.7025 0.67839196 0.69553806 0.6865285 ]

mean value: 0.6843068622772353

MCC on Blind test: 0.22

Accuracy on Blind test: 0.63

Model_name: XGBoost
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.07607102 0.06376791 0.0508194 0.05137157 0.05335784 0.0584743
|
|
0.05639553 0.05657625 0.0571003 0.05305147]
|
|
|
|
mean value: 0.05769855976104736
|
|
|
|
key: score_time
|
|
value: [0.01018667 0.00977993 0.00984669 0.00992823 0.00973797 0.00999999
|
|
0.00965548 0.00966048 0.00993204 0.00974274]
|
|
|
|
mean value: 0.009847021102905274
|
|
|
|
key: test_mcc
|
|
value: [0.94511009 1. 0.97222222 0.9451949 0.9451949 1.
|
|
0.94440028 1. 0.94440028 0.97182532]
|
|
|
|
mean value: 0.966834798557657
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.97183099 1. 0.98591549 0.97183099 0.97183099 1.
|
|
0.97142857 1. 0.97142857 0.98571429]
|
|
|
|
mean value: 0.9829979879275654
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.97297297 1. 0.98591549 0.97222222 0.97222222 1.
|
|
0.97222222 1. 0.97222222 0.98591549]
|
|
|
|
mean value: 0.9833692847777354
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.94736842 1. 1. 0.94594595 0.94594595 1.
|
|
0.94594595 1. 0.94594595 0.97222222]
|
|
|
|
mean value: 0.9703374427058638
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 0.97222222 1. 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9972222222222222
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.97142857 1. 0.98611111 0.97222222 0.97222222 1.
|
|
0.97142857 1. 0.97142857 0.98571429]
|
|
|
|
mean value: 0.9830555555555556
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.94736842 1. 0.97222222 0.94594595 0.94594595 1.
|
|
0.94594595 1. 0.94594595 0.97222222]
|
|
|
|
mean value: 0.967559664928086
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.27
|
|
|
|
Accuracy on Blind test: 0.85
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
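The pipeline repr above shows a ColumnTransformer that min-max scales the numerical columns, one-hot encodes the categorical columns, and passes the result to the model. A minimal sketch of that arrangement follows, assuming a toy DataFrame: the four column names are taken from the repr, but the values, the `target` labels, and the variable names are hypothetical and are not the data used in this run.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Hypothetical toy data; only the column names come from the log.
toy = pd.DataFrame({
    "ligand_distance": [3.2, 10.5, 7.1, 15.0, 4.4, 12.3],
    "rsa": [0.10, 0.55, 0.30, 0.80, 0.20, 0.65],
    "ss_class": ["helix", "sheet", "loop", "helix", "sheet", "loop"],
    "active_site": ["yes", "no", "no", "no", "yes", "no"],
})
target = [1, 0, 1, 0, 1, 0]  # hypothetical binary labels

num_cols = ["ligand_distance", "rsa"]
cat_cols = ["ss_class", "active_site"]

# Same step names ('prep', 'num', 'cat', 'model') as in the logged repr.
prep = ColumnTransformer(
    transformers=[
        ("num", MinMaxScaler(), num_cols),
        ("cat", OneHotEncoder(), cat_cols),
    ],
    remainder="passthrough",
)
pipe = Pipeline(steps=[("prep", prep), ("model", LinearDiscriminantAnalysis())])
pipe.fit(toy, target)

# 2 scaled numeric columns + 3 one-hot (ss_class) + 2 one-hot (active_site)
transformed = pipe.named_steps["prep"].transform(toy)
print(transformed.shape)
```

Keeping the scaler and encoder inside the pipeline means they are refit on each training fold, so no statistics from the held-out fold leak into preprocessing.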
|
|
|
|
key: fit_time
|
|
value: [0.01657271 0.04419208 0.04425573 0.05008912 0.05284476 0.044945
|
|
0.06070256 0.01946783 0.03588033 0.0560317 ]
|
|
|
|
mean value: 0.04249818325042724
|
|
|
|
key: score_time
|
|
value: [0.0105021 0.01969695 0.01938534 0.01104617 0.0147028 0.0137198
|
|
0.01671791 0.01098347 0.01099586 0.01136303]
|
|
|
|
mean value: 0.01391134262084961
|
|
|
|
key: test_mcc
|
|
value: [0.91580648 0.83214239 0.97222222 0.91587302 0.9451949 0.94511009
|
|
0.82992752 0.97182532 0.82992752 0.85749293]
|
|
|
|
mean value: 0.9015522377450673
|
|
|
|
key: train_mcc
|
|
value: [0.92448113 0.91225907 0.91812744 0.95928679 0.96867592 0.92760136
|
|
0.94029342 0.91825715 0.94341489 0.93083602]
|
|
|
|
mean value: 0.9343233182291167
|
|
|
|
key: test_accuracy
|
|
value: [0.95774648 0.91549296 0.98591549 0.95774648 0.97183099 0.97183099
|
|
0.91428571 0.98571429 0.91428571 0.92857143]
|
|
|
|
mean value: 0.9503420523138832
|
|
|
|
key: train_accuracy
|
|
value: [0.96220472 0.95590551 0.95905512 0.97952756 0.98425197 0.96377953
|
|
0.97012579 0.9591195 0.97169811 0.96540881]
|
|
|
|
mean value: 0.967107661070668
|
|
|
|
key: test_fscore
|
|
value: [0.95890411 0.91891892 0.98591549 0.95774648 0.97222222 0.97058824
|
|
0.91176471 0.98550725 0.91176471 0.92753623]
|
|
|
|
mean value: 0.9500868347880861
|
|
|
|
key: train_fscore
|
|
value: [0.96190476 0.95512821 0.95886076 0.97978227 0.98442368 0.96366509
|
|
0.9699842 0.95899054 0.97160883 0.96529968]
|
|
|
|
mean value: 0.9669648015872917
|
|
|
|
key: test_precision
|
|
value: [0.94594595 0.89473684 1. 0.94444444 0.94594595 1.
|
|
0.93939394 1. 0.93939394 0.94117647]
|
|
|
|
mean value: 0.9551037527817714
|
|
|
|
key: train_precision
|
|
value: [0.96805112 0.97068404 0.96190476 0.96923077 0.97530864 0.96825397
|
|
0.97460317 0.96202532 0.97468354 0.96835443]
|
|
|
|
mean value: 0.9693099764406033
|
|
|
|
key: test_recall
|
|
value: [0.97222222 0.94444444 0.97222222 0.97142857 1. 0.94285714
|
|
0.88571429 0.97142857 0.88571429 0.91428571]
|
|
|
|
mean value: 0.946031746031746
|
|
|
|
key: train_recall
|
|
value: [0.95583596 0.94006309 0.95583596 0.99056604 0.99371069 0.9591195
|
|
0.96540881 0.95597484 0.96855346 0.96226415]
|
|
|
|
mean value: 0.96473325000496
|
|
|
|
key: test_roc_auc
|
|
value: [0.95753968 0.91507937 0.98611111 0.95793651 0.97222222 0.97142857
|
|
0.91428571 0.98571429 0.91428571 0.92857143]
|
|
|
|
mean value: 0.9503174603174603
|
|
|
|
key: train_roc_auc
|
|
value: [0.96219471 0.9558806 0.95905006 0.97951015 0.98423705 0.96378688
|
|
0.97012579 0.9591195 0.97169811 0.96540881]
|
|
|
|
mean value: 0.9671011646132175
|
|
|
|
key: test_jcc
|
|
value: [0.92105263 0.85 0.97222222 0.91891892 0.94594595 0.94285714
|
|
0.83783784 0.97142857 0.83783784 0.86486486]
|
|
|
|
mean value: 0.906296597349229
|
|
|
|
key: train_jcc
|
|
value: [0.9266055 0.91411043 0.92097264 0.96036585 0.96932515 0.92987805
|
|
0.94171779 0.92121212 0.94478528 0.93292683]
|
|
|
|
mean value: 0.9361899652190242
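The key/value/mean blocks above have the shape of the dictionary returned by scikit-learn's `cross_validate` when it is given a dict of scorers and `return_train_score=True`. The sketch below reproduces that layout on synthetic data. The scorer definitions are assumptions: the log shows only the key names (`mcc`, `accuracy`, `fscore`, `precision`, `recall`, `roc_auc`, `jcc`), not how each scorer was built, and the fold strategy used in the actual run is not shown either.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import make_scorer, matthews_corrcoef
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the real feature matrix and labels.
X, y = make_classification(n_samples=200, n_features=10, random_state=42)

# Assumed scorer definitions; only the key names appear in the log.
scoring = {
    "mcc": make_scorer(matthews_corrcoef),
    "accuracy": "accuracy",
    "fscore": "f1",
    "precision": "precision",
    "recall": "recall",
    "roc_auc": "roc_auc",
    "jcc": "jaccard",
}

pipeline = Pipeline([
    ("prep", MinMaxScaler()),
    ("model", LinearDiscriminantAnalysis()),
])

# 10 folds, matching the 10 values printed per key in the log.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_validate(
    pipeline, X, y, cv=cv, scoring=scoring, return_train_score=True
)

for key, value in scores.items():
    print("key:", key)
    print("value:", value)
    print("mean value:", np.mean(value))
```

Each key holds one value per fold, so the printed mean is the plain average over the 10 folds.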
|
|
|
|
MCC on Blind test: 0.26
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02060699 0.00799441 0.00798726 0.00798249 0.00766635 0.00779152
|
|
0.00775051 0.00782704 0.00777411 0.008286 ]
|
|
|
|
mean value: 0.009166669845581055
|
|
|
|
key: score_time
|
|
value: [0.0106101 0.00825071 0.0082407 0.00789952 0.00786734 0.00836182
|
|
0.00791502 0.00782728 0.00794268 0.00842905]
|
|
|
|
mean value: 0.00833442211151123
|
|
|
|
key: test_mcc
|
|
value: [0.67079854 0.75346834 0.71917468 0.70470171 0.77239298 0.67233796
|
|
0.57735027 0.69282032 0.71545476 0.65821838]
|
|
|
|
mean value: 0.693671793586133
|
|
|
|
key: train_mcc
|
|
value: [0.70041161 0.67932093 0.69527344 0.68590643 0.68703227 0.69452345
|
|
0.70268583 0.66332496 0.69063815 0.70408235]
|
|
|
|
mean value: 0.6903199416989273
|
|
|
|
key: test_accuracy
|
|
value: [0.83098592 0.87323944 0.85915493 0.84507042 0.87323944 0.83098592
|
|
0.78571429 0.84285714 0.85714286 0.82857143]
|
|
|
|
mean value: 0.8426961770623742
|
|
|
|
key: train_accuracy
|
|
value: [0.84724409 0.83779528 0.84409449 0.83937008 0.84094488 0.84409449
|
|
0.84748428 0.82861635 0.84119497 0.84748428]
|
|
|
|
mean value: 0.8418323181300451
|
|
|
|
key: test_fscore
|
|
value: [0.84615385 0.88311688 0.86486486 0.85714286 0.88607595 0.84210526
|
|
0.8 0.85333333 0.86111111 0.82352941]
|
|
|
|
mean value: 0.8517433520012585
|
|
|
|
key: train_fscore
|
|
value: [0.8562963 0.84557721 0.85419735 0.85043988 0.85037037 0.85419735
|
|
0.8579795 0.83946981 0.85255474 0.8588064 ]
|
|
|
|
mean value: 0.8519888918765984
|
|
|
|
key: test_precision
|
|
value: [0.78571429 0.82926829 0.84210526 0.78571429 0.79545455 0.7804878
|
|
0.75 0.8 0.83783784 0.84848485]
|
|
|
|
mean value: 0.8055067163924674
|
|
|
|
key: train_precision
|
|
value: [0.80726257 0.80571429 0.80110497 0.7967033 0.80392157 0.8033241
|
|
0.80273973 0.78947368 0.79564033 0.79945799]
|
|
|
|
mean value: 0.8005342524769464
|
|
|
|
key: test_recall
|
|
value: [0.91666667 0.94444444 0.88888889 0.94285714 1. 0.91428571
|
|
0.85714286 0.91428571 0.88571429 0.8 ]
|
|
|
|
mean value: 0.9064285714285714
|
|
|
|
key: train_recall
|
|
value: [0.91167192 0.88958991 0.9148265 0.91194969 0.90251572 0.91194969
|
|
0.92138365 0.89622642 0.91823899 0.92767296]
|
|
|
|
mean value: 0.9106025434993948
|
|
|
|
key: test_roc_auc
|
|
value: [0.8297619 0.87222222 0.85873016 0.84642857 0.875 0.83214286
|
|
0.78571429 0.84285714 0.85714286 0.82857143]
|
|
|
|
mean value: 0.8428571428571429
|
|
|
|
key: train_roc_auc
|
|
value: [0.8473454 0.83787671 0.8442057 0.8392556 0.84084777 0.84398746
|
|
0.84748428 0.82861635 0.84119497 0.84748428]
|
|
|
|
mean value: 0.8418298513977343
|
|
|
|
key: test_jcc
|
|
value: [0.73333333 0.79069767 0.76190476 0.75 0.79545455 0.72727273
|
|
0.66666667 0.74418605 0.75609756 0.7 ]
|
|
|
|
mean value: 0.7425613316537877
|
|
|
|
key: train_jcc
|
|
value: [0.74870466 0.73246753 0.74550129 0.73979592 0.73969072 0.74550129
|
|
0.75128205 0.72335025 0.74300254 0.75255102]
|
|
|
|
mean value: 0.742184727641747
|
|
|
|
MCC on Blind test: 0.27
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01053691 0.01432872 0.014678 0.01592255 0.013062 0.01458049
|
|
0.01466274 0.01540923 0.01347852 0.01427984]
|
|
|
|
mean value: 0.014093899726867675
|
|
|
|
key: score_time
|
|
value: [0.00840187 0.01055169 0.01028132 0.01054978 0.01052547 0.0104835
|
|
0.01046133 0.01095486 0.01046395 0.01049304]
|
|
|
|
mean value: 0.010316681861877442
|
|
|
|
key: test_mcc
|
|
value: [0.85952381 0.9186708 0.89315217 0.91587302 0.9451949 0.94511009
|
|
0.80295507 0.97182532 0.8871639 0.78301997]
|
|
|
|
mean value: 0.8922489037444675
|
|
|
|
key: train_mcc
|
|
value: [0.9401617 0.95944236 0.89984937 0.89436086 0.96558776 0.93386306
|
|
0.86603117 0.9213882 0.97501633 0.87487332]
|
|
|
|
mean value: 0.9230574140751575
|
|
|
|
key: test_accuracy
|
|
value: [0.92957746 0.95774648 0.94366197 0.95774648 0.97183099 0.97183099
|
|
0.9 0.98571429 0.94285714 0.88571429]
|
|
|
|
mean value: 0.9446680080482898
|
|
|
|
key: train_accuracy
|
|
value: [0.97007874 0.97952756 0.9496063 0.94645669 0.98267717 0.96692913
|
|
0.93081761 0.96069182 0.98742138 0.93396226]
|
|
|
|
mean value: 0.9608168672312187
|
|
|
|
key: test_fscore
|
|
value: [0.92957746 0.96 0.94117647 0.95774648 0.97222222 0.97058824
|
|
0.89552239 0.98550725 0.94444444 0.89473684]
|
|
|
|
mean value: 0.9451521792752767
|
|
|
|
key: train_fscore
|
|
value: [0.9699842 0.97978227 0.94855305 0.94498382 0.98289269 0.96692913
|
|
0.92715232 0.96075353 0.98753894 0.93786982]
|
|
|
|
mean value: 0.960643978398039
|
|
|
|
key: test_precision
|
|
value: [0.94285714 0.92307692 1. 0.94444444 0.94594595 1.
|
|
0.9375 1. 0.91891892 0.82926829]
|
|
|
|
mean value: 0.9442011667926302
|
|
|
|
key: train_precision
|
|
value: [0.97151899 0.96625767 0.96721311 0.97333333 0.97230769 0.96845426
|
|
0.97902098 0.95924765 0.97839506 0.88547486]
|
|
|
|
mean value: 0.9621223605111022
|
|
|
|
key: test_recall
|
|
value: [0.91666667 1. 0.88888889 0.97142857 1. 0.94285714
|
|
0.85714286 0.97142857 0.97142857 0.97142857]
|
|
|
|
mean value: 0.9491269841269842
|
|
|
|
key: train_recall
|
|
value: [0.96845426 0.99369085 0.93059937 0.91823899 0.99371069 0.96540881
|
|
0.88050314 0.96226415 0.99685535 0.99685535]
|
|
|
|
mean value: 0.960658095748269
|
|
|
|
key: test_roc_auc
|
|
value: [0.9297619 0.95714286 0.94444444 0.95793651 0.97222222 0.97142857
|
|
0.9 0.98571429 0.94285714 0.88571429]
|
|
|
|
mean value: 0.9447222222222222
|
|
|
|
key: train_roc_auc
|
|
value: [0.97007619 0.97954983 0.94957641 0.9465012 0.98265976 0.96693153
|
|
0.93081761 0.96069182 0.98742138 0.93396226]
|
|
|
|
mean value: 0.9608188004682261
|
|
|
|
key: test_jcc
|
|
value: [0.86842105 0.92307692 0.88888889 0.91891892 0.94594595 0.94285714
|
|
0.81081081 0.97142857 0.89473684 0.80952381]
|
|
|
|
mean value: 0.8974608906187853
|
|
|
|
key: train_jcc
|
|
value: [0.94171779 0.96036585 0.90214067 0.89570552 0.96636086 0.93597561
|
|
0.86419753 0.9244713 0.97538462 0.88300836]
|
|
|
|
mean value: 0.9249328107238487
|
|
|
|
MCC on Blind test: 0.21
|
|
|
|
Accuracy on Blind test: 0.74
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01662755 0.01433229 0.01325893 0.01569057 0.01217318 0.01252699
|
|
0.01344371 0.01365376 0.01214552 0.01519012]
|
|
|
|
mean value: 0.013904261589050292
|
|
|
|
key: score_time
|
|
value: [0.01091886 0.01051593 0.01051092 0.01051617 0.01046252 0.01055741
|
|
0.01050019 0.01043868 0.01046753 0.01048732]
|
|
|
|
mean value: 0.010537552833557128
|
|
|
|
key: test_mcc
|
|
value: [0.88730159 0.83095238 0.97220047 0.9451949 0.86802778 0.89315217
|
|
0.91766294 0.94440028 0.8871639 0.94285714]
|
|
|
|
mean value: 0.9088913538408987
|
|
|
|
key: train_mcc
|
|
value: [0.95279762 0.92778189 0.95028807 0.9533256 0.89981019 0.862672
|
|
0.89131675 0.89152985 0.9179354 0.95321203]
|
|
|
|
mean value: 0.9200669401462821
|
|
|
|
key: test_accuracy
|
|
value: [0.94366197 0.91549296 0.98591549 0.97183099 0.92957746 0.94366197
|
|
0.95714286 0.97142857 0.94285714 0.97142857]
|
|
|
|
mean value: 0.9532997987927565
|
|
|
|
key: train_accuracy
|
|
value: [0.97637795 0.96377953 0.97480315 0.97637795 0.9480315 0.92755906
|
|
0.94339623 0.94496855 0.95754717 0.97641509]
|
|
|
|
mean value: 0.9589256177883425
|
|
|
|
key: test_fscore
|
|
value: [0.94444444 0.91666667 0.98630137 0.97222222 0.93333333 0.94594595
|
|
0.95890411 0.97058824 0.94444444 0.97142857]
|
|
|
|
mean value: 0.9544279343231801
|
|
|
|
key: train_fscore
|
|
value: [0.97622821 0.96331738 0.9752322 0.97681607 0.95037594 0.93215339
|
|
0.94610778 0.94327391 0.9591528 0.97674419]
|
|
|
|
mean value: 0.959940187333688
|
|
|
|
key: test_precision
|
|
value: [0.94444444 0.91666667 0.97297297 0.94594595 0.875 0.8974359
|
|
0.92105263 1. 0.91891892 0.97142857]
|
|
|
|
mean value: 0.9363866049392365
|
|
|
|
key: train_precision
|
|
value: [0.98089172 0.97419355 0.95744681 0.96048632 0.91066282 0.87777778
|
|
0.90285714 0.97324415 0.92419825 0.96330275]
|
|
|
|
mean value: 0.9425061293853453
|
|
|
|
key: test_recall
|
|
value: [0.94444444 0.91666667 1. 1. 1. 1.
|
|
1. 0.94285714 0.97142857 0.97142857]
|
|
|
|
mean value: 0.9746825396825397
|
|
|
|
key: train_recall
|
|
value: [0.97160883 0.95268139 0.99369085 0.99371069 0.99371069 0.99371069
|
|
0.99371069 0.91509434 0.99685535 0.99056604]
|
|
|
|
mean value: 0.9795339563121243
|
|
|
|
key: test_roc_auc
|
|
value: [0.94365079 0.91547619 0.98571429 0.97222222 0.93055556 0.94444444
|
|
0.95714286 0.97142857 0.94285714 0.97142857]
|
|
|
|
mean value: 0.9534920634920635
|
|
|
|
key: train_roc_auc
|
|
value: [0.97637045 0.96376208 0.97483285 0.97635061 0.94795945 0.92745471
|
|
0.94339623 0.94496855 0.95754717 0.97641509]
|
|
|
|
mean value: 0.9589057198976251
|
|
|
|
key: test_jcc
|
|
value: [0.89473684 0.84615385 0.97297297 0.94594595 0.875 0.8974359
|
|
0.92105263 0.94285714 0.89473684 0.94444444]
|
|
|
|
mean value: 0.9135336565599723
|
|
|
|
key: train_jcc
|
|
value: [0.95356037 0.92923077 0.95166163 0.95468278 0.90544413 0.87292818
|
|
0.89772727 0.89263804 0.92151163 0.95454545]
|
|
|
|
mean value: 0.9233930246483528
|
|
|
|
MCC on Blind test: 0.13
|
|
|
|
Accuracy on Blind test: 0.49
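The two "Blind test" lines report MCC and accuracy on data held out from cross-validation. A minimal sketch of that kind of evaluation follows, assuming synthetic imbalanced data and a simple stratified split. The names `X_blind`, `y_blind`, `bts_mcc`, and `bts_acc` are hypothetical, and how the blind set was built in this run is not shown in the log.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score, matthews_corrcoef
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic, imbalanced stand-in data (about 80% / 20% class split).
X, y = make_classification(
    n_samples=300, n_features=10, weights=[0.8, 0.2], random_state=42
)
X_train, X_blind, y_train, y_blind = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

pipeline = Pipeline([
    ("prep", MinMaxScaler()),
    ("model", SGDClassifier(random_state=42)),
])

# Fit on the training portion only, then score once on the blind portion.
pipeline.fit(X_train, y_train)
y_pred = pipeline.predict(X_blind)

bts_mcc = round(matthews_corrcoef(y_blind, y_pred), 2)
bts_acc = round(accuracy_score(y_blind, y_pred), 2)
print("MCC on Blind test:", bts_mcc)
print("Accuracy on Blind test:", bts_acc)
```

MCC uses all four cells of the confusion matrix, so it can stay low while accuracy looks high if one class dominates the held-out set. Whether that explains the gap between the two numbers here cannot be told from the log, which does not print class counts.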
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.11352706 0.09823751 0.09783912 0.09749627 0.09775257 0.09784293
|
|
0.0982132 0.0982163 0.09822345 0.09824324]
|
|
|
|
mean value: 0.09955916404724122
|
|
|
|
key: score_time
|
|
value: [0.01440597 0.0141964 0.014431 0.01412749 0.01414037 0.01425576
|
|
0.01428652 0.01439333 0.01421928 0.0141108 ]
|
|
|
|
mean value: 0.014256691932678223
|
|
|
|
key: test_mcc
|
|
value: [0.94511009 0.97220047 1. 0.9451949 0.9451949 1.
|
|
0.91766294 0.97182532 0.94440028 1. ]
|
|
|
|
mean value: 0.9641588882762016
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.97183099 0.98591549 1. 0.97183099 0.97183099 1.
|
|
0.95714286 0.98571429 0.97142857 1. ]
|
|
|
|
mean value: 0.981569416498994
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.97297297 0.98630137 1. 0.97222222 0.97222222 1.
|
|
0.95890411 0.98591549 0.97222222 1. ]
|
|
|
|
mean value: 0.9820760612049441
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.94736842 0.97297297 1. 0.94594595 0.94594595 1.
|
|
0.92105263 0.97222222 0.94594595 1. ]
|
|
|
|
mean value: 0.9651454085664612
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.97142857 0.98571429 1. 0.97222222 0.97222222 1.
|
|
0.95714286 0.98571429 0.97142857 1. ]
|
|
|
|
mean value: 0.9815873015873016
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.94736842 0.97297297 1. 0.94594595 0.94594595 1.
|
|
0.92105263 0.97222222 0.94594595 1. ]
|
|
|
|
mean value: 0.9651454085664612
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.23
|
|
|
|
Accuracy on Blind test: 0.84
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03574896 0.04156756 0.04362035 0.04388809 0.04569268 0.04029918
|
|
0.04188228 0.0483954 0.04589486 0.04156971]
|
|
|
|
mean value: 0.04285590648651123
|
|
|
|
key: score_time
|
|
value: [0.02013993 0.02285957 0.02204061 0.03455472 0.01919889 0.02451229
|
|
0.01914024 0.02869654 0.02967858 0.04260755]
|
|
|
|
mean value: 0.026342892646789552
|
|
|
|
key: test_mcc
|
|
value: [0.97220047 0.97220047 1. 0.89315217 0.9451949 1.
|
|
0.91766294 1. 0.91766294 0.97182532]
|
|
|
|
mean value: 0.9589899184279953
|
|
|
|
key: train_mcc
|
|
value: [0.99685535 0.99372055 0.99685535 0.99372043 0.99685531 0.99059524
|
|
1. 0.99373035 1. 0.98749951]
|
|
|
|
mean value: 0.9949832075818474
|
|
|
|
key: test_accuracy
|
|
value: [0.98591549 0.98591549 1. 0.94366197 0.97183099 1.
|
|
0.95714286 1. 0.95714286 0.98571429]
|
|
|
|
mean value: 0.9787323943661972
|
|
|
|
key: train_accuracy
|
|
value: [0.9984252 0.99685039 0.9984252 0.99685039 0.9984252 0.99527559
|
|
1. 0.99685535 1. 0.99371069]
|
|
|
|
mean value: 0.9974818006239786
|
|
|
|
key: test_fscore
|
|
value: [0.98630137 0.98630137 1. 0.94594595 0.97222222 1.
|
|
0.95890411 1. 0.95890411 0.98591549]
|
|
|
|
mean value: 0.9794494620030024
|
|
|
|
key: train_fscore
|
|
value: [0.9984252 0.99685535 0.9984252 0.9968652 0.99843014 0.99530516
|
|
1. 0.9968652 1. 0.99375 ]
|
|
|
|
mean value: 0.9974921452742781
|
|
|
|
key: test_precision
|
|
value: [0.97297297 0.97297297 1. 0.8974359 0.94594595 1.
|
|
0.92105263 1. 0.92105263 0.97222222]
|
|
|
|
mean value: 0.9603655274707906
|
|
|
|
key: train_precision
|
|
value: [0.99685535 0.99373041 0.99685535 0.99375 0.9968652 0.99065421
|
|
1. 0.99375 1. 0.98757764]
|
|
|
|
mean value: 0.9950038148468195
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98571429 0.98571429 1. 0.94444444 0.97222222 1.
|
|
0.95714286 1. 0.95714286 0.98571429]
|
|
|
|
mean value: 0.9788095238095238
|
|
|
|
key: train_roc_auc
|
|
value: [0.99842767 0.99685535 0.99842767 0.99684543 0.99842271 0.99526814
|
|
1. 0.99685535 1. 0.99371069]
|
|
|
|
mean value: 0.9974813007162272
|
|
|
|
key: test_jcc
|
|
value: [0.97297297 0.97297297 1. 0.8974359 0.94594595 1.
|
|
0.92105263 1. 0.92105263 0.97222222]
|
|
|
|
mean value: 0.9603655274707906
|
|
|
|
key: train_jcc
|
|
value: [0.99685535 0.99373041 0.99685535 0.99375 0.9968652 0.99065421
|
|
1. 0.99375 1. 0.98757764]
|
|
|
|
mean value: 0.9950038148468195
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.24773479 0.26326966 0.28095245 0.28693724 0.25053453 0.24942899
|
|
0.14829826 0.25048161 0.20905161 0.1913681 ]
|
|
|
|
mean value: 0.23780572414398193
|
|
|
|
key: score_time
|
|
value: [0.02167845 0.02865052 0.02786064 0.02753568 0.02866054 0.02848601
|
|
0.01395488 0.03657627 0.01424241 0.01387572]
|
|
|
|
mean value: 0.024152112007141114
|
|
|
|
key: test_mcc
|
|
value: [0.81050059 0.88862624 0.86205133 0.85952381 0.89315217 0.80588933
|
|
0.77142857 0.81649658 0.74316054 0.65821838]
|
|
|
|
mean value: 0.8109047540709404
|
|
|
|
key: train_mcc
|
|
value: [0.88980159 0.91197105 0.89606666 0.90870311 0.89610428 0.91188492
|
|
0.91509886 0.90902529 0.91202184 0.91509886]
|
|
|
|
mean value: 0.9065776480751302
|
|
|
|
key: test_accuracy
|
|
value: [0.90140845 0.94366197 0.92957746 0.92957746 0.94366197 0.90140845
|
|
0.88571429 0.9 0.87142857 0.82857143]
|
|
|
|
mean value: 0.9035010060362173
|
|
|
|
key: train_accuracy
|
|
value: [0.94488189 0.95590551 0.9480315 0.95433071 0.9480315 0.95590551
|
|
0.95754717 0.95440252 0.95597484 0.95754717]
|
|
|
|
mean value: 0.9532558312286435
|
|
|
|
key: test_fscore
|
|
value: [0.90909091 0.94594595 0.93333333 0.92957746 0.94594595 0.90410959
|
|
0.88571429 0.90909091 0.87323944 0.83333333]
|
|
|
|
mean value: 0.9069381152904209
|
|
|
|
key: train_fscore
|
|
value: [0.94453249 0.95541401 0.9478673 0.95418641 0.9478673 0.9556962
|
|
0.95748031 0.95389507 0.9556962 0.95761381]
|
|
|
|
mean value: 0.9530249118234133
|
|
|
|
key: test_precision
|
|
value: [0.85365854 0.92105263 0.8974359 0.91666667 0.8974359 0.86842105
|
|
0.88571429 0.83333333 0.86111111 0.81081081]
|
|
|
|
mean value: 0.8745640223303894
|
|
|
|
key: train_precision
|
|
value: [0.94904459 0.96463023 0.94936709 0.95873016 0.95238095 0.96178344
|
|
0.95899054 0.96463023 0.96178344 0.95611285]
|
|
|
|
mean value: 0.957745350378981
|
|
|
|
key: test_recall
|
|
value: [0.97222222 0.97222222 0.97222222 0.94285714 1. 0.94285714
|
|
0.88571429 1. 0.88571429 0.85714286]
|
|
|
|
mean value: 0.9430952380952381
|
|
|
|
key: train_recall
|
|
value: [0.94006309 0.94637224 0.94637224 0.94968553 0.94339623 0.94968553
|
|
0.95597484 0.94339623 0.94968553 0.9591195 ]
|
|
|
|
mean value: 0.9483750967204333
|
|
|
|
key: test_roc_auc
|
|
value: [0.90039683 0.94325397 0.92896825 0.9297619 0.94444444 0.90198413
|
|
0.88571429 0.9 0.87142857 0.82857143]
|
|
|
|
mean value: 0.9034523809523809
|
|
|
|
key: train_roc_auc
|
|
value: [0.94487431 0.95589052 0.94802889 0.95433804 0.94803881 0.95591532
|
|
0.95754717 0.95440252 0.95597484 0.95754717]
|
|
|
|
mean value: 0.9532557585857985
|
|
|
|
key: test_jcc
|
|
value: [0.83333333 0.8974359 0.875 0.86842105 0.8974359 0.825
|
|
0.79487179 0.83333333 0.775 0.71428571]
|
|
|
|
mean value: 0.831411702332755
|
|
|
|
key: train_jcc
|
|
value: [0.89489489 0.91463415 0.9009009 0.91238671 0.9009009 0.91515152
|
|
0.918429 0.9118541 0.91515152 0.9186747 ]
|
|
|
|
mean value: 0.9102978385449625
|
|
|
|
MCC on Blind test: 0.29
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.27617764 0.26752806 0.26734591 0.26702046 0.26937532 0.26938963
|
|
0.2754612 0.26950097 0.27531576 0.26875448]
|
|
|
|
mean value: 0.2705869436264038
|
|
|
|
key: score_time
|
|
value: [0.00917101 0.00919962 0.00898099 0.00887442 0.00896692 0.00959826
|
|
0.0095427 0.00884581 0.0090723 0.00909805]
|
|
|
|
mean value: 0.009135007858276367
|
|
|
|
key: test_mcc
|
|
value: [0.94511009 0.97220047 1. 0.91885703 0.9451949 1.
|
|
0.91766294 1. 0.94440028 0.94285714]
|
|
|
|
mean value: 0.9586282844964963
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 1. 1. 0.99686027]
|
|
|
|
mean value: 0.9996860274824667
|
|
|
|
key: test_accuracy
|
|
value: [0.97183099 0.98591549 1. 0.95774648 0.97183099 1.
|
|
0.95714286 1. 0.97142857 0.97142857]
|
|
|
|
mean value: 0.9787323943661972
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 1. 1. 0.99842767]
|
|
|
|
mean value: 0.9998427672955975
|
|
|
|
key: test_fscore
|
|
value: [0.97297297 0.98630137 1. 0.95890411 0.97222222 1.
|
|
0.95890411 1. 0.97222222 0.97142857]
|
|
|
|
mean value: 0.9792955577887085
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1.
|
|
1. 1. 0.9984252]
|
|
|
|
mean value: 0.9998425196850393
|
|
|
|
key: test_precision
|
|
value: [0.94736842 0.97297297 1. 0.92105263 0.94594595 1.
|
|
0.92105263 1. 0.94594595 0.97142857]
|
|
|
|
mean value: 0.9625767120503963
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 1. 1. 0.97142857]
|
|
|
|
mean value: 0.9971428571428571
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 1. 1. 0.99685535]
|
|
|
|
mean value: 0.999685534591195
|
|
|
|
key: test_roc_auc
|
|
value: [0.97142857 0.98571429 1. 0.95833333 0.97222222 1.
|
|
0.95714286 1. 0.97142857 0.97142857]
|
|
|
|
mean value: 0.9787698412698412
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 1. 1. 0.99842767]
|
|
|
|
mean value: 0.9998427672955975
|
|
|
|
key: test_jcc
|
|
value: [0.94736842 0.97297297 1. 0.92105263 0.94594595 1.
|
|
0.92105263 1. 0.94594595 0.94444444]
|
|
|
|
mean value: 0.9598782993519835
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 1. 1. 0.99685535]
|
|
|
|
mean value: 0.999685534591195
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.01353073 0.01510763 0.01533842 0.01545548 0.01502848 0.01551914
|
|
0.01628375 0.01565933 0.01530766 0.01601934]
|
|
|
|
mean value: 0.015324997901916503
|
|
|
|
key: score_time
|
|
value: [0.01147842 0.01132965 0.01142693 0.01131916 0.01343727 0.01433969
|
|
0.01386809 0.01397896 0.01388669 0.01877761]
|
|
|
|
mean value: 0.013384246826171875
|
|
|
|
key: test_mcc
|
|
value: [0.48866289 0.69047619 0.8365327 0.67839806 0.78542356 0.65726707
|
|
0.61631563 0.74535599 0.37904902 0.6882472 ]
|
|
|
|
mean value: 0.656572832612207
|
|
|
|
key: train_mcc
|
|
value: [0.52013388 0.76530718 0.82524897 0.7158657 0.84342302 0.69536527
|
|
0.80154559 0.82054203 0.63315169 0.8819171 ]
|
|
|
|
mean value: 0.750250042260939
|
|
|
|
key: test_accuracy
|
|
value: [0.69014085 0.84507042 0.91549296 0.83098592 0.88732394 0.8028169
|
|
0.8 0.85714286 0.64285714 0.84285714]
|
|
|
|
mean value: 0.8114688128772636
|
|
|
|
key: train_accuracy
|
|
value: [0.71496063 0.87244094 0.90866142 0.84251969 0.91968504 0.82992126
|
|
0.89308176 0.90408805 0.78616352 0.94025157]
|
|
|
|
mean value: 0.861177388203833
|
|
|
|
key: test_fscore
|
|
value: [0.56 0.84507042 0.91176471 0.80645161 0.875 0.75
|
|
0.77419355 0.83333333 0.46808511 0.8358209 ]
|
|
|
|
mean value: 0.7659719624946587
|
|
|
|
key: train_fscore
|
|
value: [0.6021978 0.85561497 0.90169492 0.81617647 0.91570248 0.79850746
|
|
0.8815331 0.89500861 0.728 0.93851133]
|
|
|
|
mean value: 0.8332947137085833
|
|
|
|
key: test_precision
|
|
value: [1. 0.85714286 0.96875 0.92592593 0.96551724 1.
|
|
0.88888889 1. 0.91666667 0.875 ]
|
|
|
|
mean value: 0.9397891580003649
|
|
|
|
key: train_precision
|
|
value: [0.99275362 0.98360656 0.97435897 0.98230088 0.96515679 0.98165138
|
|
0.98828125 0.98859316 1. 0.96666667]
|
|
|
|
mean value: 0.982336928301226
|
|
|
|
key: test_recall
|
|
value: [0.38888889 0.83333333 0.86111111 0.71428571 0.8 0.6
|
|
0.68571429 0.71428571 0.31428571 0.8 ]
|
|
|
|
mean value: 0.6711904761904762
|
|
|
|
key: train_recall
|
|
value: [0.43217666 0.75709779 0.83911672 0.69811321 0.87106918 0.67295597
|
|
0.79559748 0.81761006 0.57232704 0.91194969]
|
|
|
|
mean value: 0.7368013808701863
|
|
|
|
key: test_roc_auc
|
|
value: [0.69444444 0.8452381 0.91626984 0.82936508 0.88611111 0.8
|
|
0.8 0.85714286 0.64285714 0.84285714]
|
|
|
|
mean value: 0.8114285714285715
|
|
|
|
key: train_roc_auc
|
|
value: [0.714516 0.87225959 0.90855207 0.84274746 0.91976172 0.83016884
|
|
0.89308176 0.90408805 0.78616352 0.94025157]
|
|
|
|
mean value: 0.8611590579925799
|
|
|
|
key: test_jcc
|
|
value: [0.38888889 0.73170732 0.83783784 0.67567568 0.77777778 0.6
|
|
0.63157895 0.71428571 0.30555556 0.71794872]
|
|
|
|
mean value: 0.6381256432411759
|
|
|
|
key: train_jcc
|
|
value: [0.43081761 0.74766355 0.82098765 0.68944099 0.8445122 0.66459627
|
|
0.78816199 0.80996885 0.57232704 0.88414634]
|
|
|
|
mean value: 0.7252622504598514
|
|
|
|
MCC on Blind test: 0.34
|
|
|
|
Accuracy on Blind test: 0.93
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01351237 0.02861714 0.03247285 0.0325036 0.0317297 0.03187609
|
|
0.03195882 0.03194857 0.03198266 0.03197289]
|
|
|
|
mean value: 0.029857468605041505
|
|
|
|
key: score_time
|
|
value: [0.01157928 0.0221467 0.01110148 0.01955748 0.02112865 0.0110445
|
|
0.01989079 0.02152538 0.01919198 0.02169585]
|
|
|
|
mean value: 0.01788620948791504
|
|
|
|
key: test_mcc
|
|
value: [0.91580648 0.91580648 0.94365079 0.91587302 0.9451949 0.9186708
|
|
0.82992752 0.97182532 0.82992752 0.8871639 ]
|
|
|
|
mean value: 0.9073846735039883
|
|
|
|
key: train_mcc
|
|
value: [0.92767212 0.94649961 0.92448113 0.95276028 0.94960617 0.92137585
|
|
0.92778765 0.91531613 0.94341489 0.93083602]
|
|
|
|
mean value: 0.9339749853030713
|
|
|
|
key: test_accuracy
|
|
value: [0.95774648 0.95774648 0.97183099 0.95774648 0.97183099 0.95774648
|
|
0.91428571 0.98571429 0.91428571 0.94285714]
|
|
|
|
mean value: 0.95317907444668
|
|
|
|
key: train_accuracy
|
|
value: [0.96377953 0.97322835 0.96220472 0.97637795 0.97480315 0.96062992
|
|
0.96383648 0.95754717 0.97169811 0.96540881]
|
|
|
|
mean value: 0.966951418808498
|
|
|
|
key: test_fscore
|
|
value: [0.95890411 0.95890411 0.97222222 0.95774648 0.97222222 0.95522388
|
|
0.91176471 0.98550725 0.91176471 0.94117647]
|
|
|
|
mean value: 0.9525436151822534
|
|
|
|
key: train_fscore
|
|
value: [0.96343402 0.9733124 0.96190476 0.97645212 0.97484277 0.96038035
|
|
0.96354992 0.95707472 0.97160883 0.96529968]
|
|
|
|
mean value: 0.9667859581195395
|
|
|
|
key: test_precision
|
|
value: [0.94594595 0.94594595 0.97222222 0.94444444 0.94594595 1.
|
|
0.93939394 1. 0.93939394 0.96969697]
|
|
|
|
mean value: 0.9602989352989353
|
|
|
|
key: train_precision
|
|
value: [0.97115385 0.96875 0.96805112 0.97492163 0.97484277 0.96805112
|
|
0.97124601 0.96784566 0.97468354 0.96835443]
|
|
|
|
mean value: 0.9707900120202521
|
|
|
|
key: test_recall
|
|
value: [0.97222222 0.97222222 0.97222222 0.97142857 1. 0.91428571
|
|
0.88571429 0.97142857 0.88571429 0.91428571]
|
|
|
|
mean value: 0.9459523809523809
|
|
|
|
key: train_recall
|
|
value: [0.95583596 0.97791798 0.95583596 0.97798742 0.97484277 0.95283019
|
|
0.95597484 0.94654088 0.96855346 0.96226415]
|
|
|
|
mean value: 0.96285836160546
|
|
|
|
key: test_roc_auc
|
|
value: [0.95753968 0.95753968 0.9718254 0.95793651 0.97222222 0.95714286
|
|
0.91428571 0.98571429 0.91428571 0.94285714]
|
|
|
|
mean value: 0.9531349206349207
|
|
|
|
key: train_roc_auc
|
|
value: [0.96376704 0.97323572 0.96219471 0.97637541 0.97480309 0.96064222
|
|
0.96383648 0.95754717 0.97169811 0.96540881]
|
|
|
|
mean value: 0.9669508759399242
|
|
|
|
key: test_jcc
|
|
value: [0.92105263 0.92105263 0.94594595 0.91891892 0.94594595 0.91428571
|
|
0.83783784 0.97142857 0.83783784 0.88888889]
|
|
|
|
mean value: 0.9103194924247555
|
|
|
|
key: train_jcc
|
|
value: [0.92944785 0.94801223 0.9266055 0.95398773 0.95092025 0.92378049
|
|
0.92966361 0.91768293 0.94478528 0.93292683]
|
|
|
|
mean value: 0.9357812693762667
|
|
|
|
MCC on Blind test: 0.26
|
|
|
|
Accuracy on Blind test: 0.81
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_config.py:163: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
ros_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_config.py:166: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
ros_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.11364222 0.12605238 0.21460104 0.20560765 0.13292837 0.14832115
|
|
0.20759034 0.1914463 0.1992662 0.23236299]
|
|
|
|
mean value: 0.17718186378479003
|
|
|
|
key: score_time
|
|
value: [0.01099443 0.01911712 0.01961541 0.02059603 0.0108881 0.02057672
|
|
0.02168489 0.02046657 0.02121878 0.01984239]
|
|
|
|
mean value: 0.018500041961669923
|
|
|
|
key: test_mcc
|
|
value: [0.94365079 0.8594125 0.97222222 0.91587302 0.9451949 0.94511009
|
|
0.82992752 0.97182532 0.82992752 0.860309 ]
|
|
|
|
mean value: 0.9073452880284912
|
|
|
|
key: train_mcc
|
|
value: [0.92767212 0.93072627 0.93700772 0.96547312 0.96228025 0.92760136
|
|
0.94029342 0.91531613 0.946583 0.92771424]
|
|
|
|
mean value: 0.938066763071008
|
|
|
|
key: test_accuracy
|
|
value: [0.97183099 0.92957746 0.98591549 0.95774648 0.97183099 0.97183099
|
|
0.91428571 0.98571429 0.91428571 0.92857143]
|
|
|
|
mean value: 0.9531589537223341
|
|
|
|
key: train_accuracy
|
|
value: [0.96377953 0.96535433 0.96850394 0.98267717 0.98110236 0.96377953
|
|
0.97012579 0.95754717 0.97327044 0.96383648]
|
|
|
|
mean value: 0.9689976724607537
|
|
|
|
key: test_fscore
|
|
value: [0.97222222 0.93150685 0.98591549 0.95774648 0.97222222 0.97058824
|
|
0.91176471 0.98550725 0.91176471 0.92537313]
|
|
|
|
mean value: 0.9524611293354492
|
|
|
|
key: train_fscore
|
|
value: [0.96343402 0.96518987 0.96845426 0.98283931 0.98125 0.96366509
|
|
0.9699842 0.95707472 0.97314376 0.96366509]
|
|
|
|
mean value: 0.9688700325564479
|
|
|
|
key: test_precision
|
|
value: [0.97222222 0.91891892 1. 0.94444444 0.94594595 1.
|
|
0.93939394 1. 0.93939394 0.96875 ]
|
|
|
|
mean value: 0.9629069410319411
|
|
|
|
key: train_precision
|
|
value: [0.97115385 0.96825397 0.96845426 0.9752322 0.97515528 0.96825397
|
|
0.97460317 0.96784566 0.97777778 0.96825397]
|
|
|
|
mean value: 0.971498409878129
|
|
|
|
key: test_recall
|
|
value: [0.97222222 0.94444444 0.97222222 0.97142857 1. 0.94285714
|
|
0.88571429 0.97142857 0.88571429 0.88571429]
|
|
|
|
mean value: 0.9431746031746031
|
|
|
|
key: train_recall
|
|
value: [0.95583596 0.96214511 0.96845426 0.99056604 0.98742138 0.9591195
|
|
0.96540881 0.94654088 0.96855346 0.9591195 ]
|
|
|
|
mean value: 0.9663164890978712
|
|
|
|
key: test_roc_auc
|
|
value: [0.9718254 0.92936508 0.98611111 0.95793651 0.97222222 0.97142857
|
|
0.91428571 0.98571429 0.91428571 0.92857143]
|
|
|
|
mean value: 0.9531746031746031
|
|
|
|
key: train_roc_auc
|
|
value: [0.96376704 0.96534928 0.96850386 0.98266472 0.9810924 0.96378688
|
|
0.97012579 0.95754717 0.97327044 0.96383648]
|
|
|
|
mean value: 0.9689944050949348
|
|
|
|
key: test_jcc
|
|
value: [0.94594595 0.87179487 0.97222222 0.91891892 0.94594595 0.94285714
|
|
0.83783784 0.97142857 0.83783784 0.86111111]
|
|
|
|
mean value: 0.9105900405900406
|
|
|
|
key: train_jcc
|
|
value: [0.92944785 0.93272171 0.93883792 0.96625767 0.96319018 0.92987805
|
|
0.94171779 0.91768293 0.94769231 0.92987805]
|
|
|
|
mean value: 0.939730446204259
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.81
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01629114 0.0173347 0.01738286 0.0171926 0.01575541 0.02157068
|
|
0.01644921 0.01875186 0.01989651 0.02733588]
|
|
|
|
mean value: 0.018796086311340332
|
|
|
|
key: score_time
|
|
value: [0.01044035 0.01035213 0.01081347 0.01036024 0.01029181 0.01031828
|
|
0.01030374 0.01035714 0.01030326 0.01077175]
|
|
|
|
mean value: 0.010431218147277831
|
|
|
|
key: test_mcc
|
|
value: [0.68543653 0.89893315 0.9 0.57777778 0.89893315 1.
|
|
0.39056329 0.62994079 1. 0.68888889]
|
|
|
|
mean value: 0.7670473570531189
|
|
|
|
key: train_mcc
|
|
value: [0.81369939 0.8128591 0.8128591 0.82502766 0.82502766 0.78971132
|
|
0.85964432 0.83645826 0.81310714 0.85964432]
|
|
|
|
mean value: 0.8248038274086386
|
|
|
|
key: test_accuracy
|
|
value: [0.84210526 0.94736842 0.94736842 0.78947368 0.94736842 1.
|
|
0.68421053 0.78947368 1. 0.84210526]
|
|
|
|
mean value: 0.8789473684210526
|
|
|
|
key: train_accuracy
|
|
value: [0.90643275 0.90643275 0.90643275 0.9122807 0.9122807 0.89473684
|
|
0.92982456 0.91812865 0.90643275 0.92982456]
|
|
|
|
mean value: 0.9122807017543859
|
|
|
|
key: test_fscore
|
|
value: [0.82352941 0.94117647 0.94736842 0.77777778 0.94117647 1.
|
|
0.75 0.83333333 1. 0.84210526]
|
|
|
|
mean value: 0.8856467148262814
|
|
|
|
key: train_fscore
|
|
value: [0.90909091 0.90697674 0.90697674 0.91428571 0.91428571 0.89534884
|
|
0.92941176 0.91666667 0.90697674 0.92941176]
|
|
|
|
mean value: 0.9129431603508211
|
|
|
|
key: test_precision
|
|
value: [0.875 1. 0.9 0.77777778 1. 1.
|
|
0.64285714 0.71428571 1. 0.88888889]
|
|
|
|
mean value: 0.8798809523809524
|
|
|
|
key: train_precision
|
|
value: [0.88888889 0.90697674 0.90697674 0.8988764 0.8988764 0.88505747
|
|
0.92941176 0.92771084 0.89655172 0.92941176]
|
|
|
|
mean value: 0.9068738754437303
|
|
|
|
key: test_recall
|
|
value: [0.77777778 0.88888889 1. 0.77777778 0.88888889 1.
|
|
0.9 1. 1. 0.8 ]
|
|
|
|
mean value: 0.9033333333333333
|
|
|
|
key: train_recall
|
|
value: [0.93023256 0.90697674 0.90697674 0.93023256 0.93023256 0.90588235
|
|
0.92941176 0.90588235 0.91764706 0.92941176]
|
|
|
|
mean value: 0.9192886456908345
|
|
|
|
key: test_roc_auc
|
|
value: [0.83888889 0.94444444 0.95 0.78888889 0.94444444 1.
|
|
0.67222222 0.77777778 1. 0.84444444]
|
|
|
|
mean value: 0.8761111111111111
|
|
|
|
key: train_roc_auc
|
|
value: [0.90629275 0.90642955 0.90642955 0.9121751 0.9121751 0.89480164
|
|
0.92982216 0.91805746 0.90649795 0.92982216]
|
|
|
|
mean value: 0.912250341997264
|
|
|
|
key: test_jcc
|
|
value: [0.7 0.88888889 0.9 0.63636364 0.88888889 1.
|
|
0.6 0.71428571 1. 0.72727273]
|
|
|
|
mean value: 0.8055699855699856
|
|
|
|
key: train_jcc
|
|
value: [0.83333333 0.82978723 0.82978723 0.84210526 0.84210526 0.81052632
|
|
0.86813187 0.84615385 0.82978723 0.86813187]
|
|
|
|
mean value: 0.8399849459983838
|
|
|
|
MCC on Blind test: 0.19
|
|
|
|
Accuracy on Blind test: 0.69
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.37531638 0.38986826 0.4050014 0.3858676 0.38417578 0.39149809
|
|
0.40014076 0.39357424 0.41155505 0.40910983]
|
|
|
|
mean value: 0.39461073875427244
|
|
|
|
key: score_time
|
|
value: [0.01066661 0.01065397 0.01060176 0.01059651 0.01084471 0.01105618
|
|
0.01088595 0.01094103 0.01107144 0.01088524]
|
|
|
|
mean value: 0.010820341110229493
|
|
|
|
key: test_mcc
|
|
value: [0.80507649 0.89893315 0.80903983 0.57777778 0.89893315 0.80507649
|
|
0.80903983 0.78888889 1. 0.68888889]
|
|
|
|
mean value: 0.8081654497168143
|
|
|
|
key: train_mcc
|
|
value: [0.85964432 0.89480164 0.95321477 0.89480164 0.89480164 0.94158687
|
|
0.96497948 0.95321477 0.94158687 1. ]
|
|
|
|
mean value: 0.9298632010943912
|
|
|
|
key: test_accuracy
|
|
value: [0.89473684 0.94736842 0.89473684 0.78947368 0.94736842 0.89473684
|
|
0.89473684 0.89473684 1. 0.84210526]
|
|
|
|
mean value: 0.9
|
|
|
|
key: train_accuracy
|
|
value: [0.92982456 0.94736842 0.97660819 0.94736842 0.94736842 0.97076023
|
|
0.98245614 0.97660819 0.97076023 1. ]
|
|
|
|
mean value: 0.9649122807017544
|
|
|
|
key: test_fscore
|
|
value: [0.875 0.94117647 0.9 0.77777778 0.94117647 0.90909091
|
|
0.88888889 0.9 1. 0.84210526]
|
|
|
|
mean value: 0.8975215780091941
|
|
|
|
key: train_fscore
|
|
value: [0.93023256 0.94736842 0.97674419 0.94736842 0.94736842 0.97076023
|
|
0.98245614 0.97647059 0.97076023 1. ]
|
|
|
|
mean value: 0.9649529203766369
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.81818182 0.77777778 1. 0.83333333
|
|
1. 0.9 1. 0.88888889]
|
|
|
|
mean value: 0.9218181818181819
|
|
|
|
key: train_precision
|
|
value: [0.93023256 0.95294118 0.97674419 0.95294118 0.95294118 0.96511628
|
|
0.97674419 0.97647059 0.96511628 1. ]
|
|
|
|
mean value: 0.9649247606019151
|
|
|
|
key: test_recall
|
|
value: [0.77777778 0.88888889 1. 0.77777778 0.88888889 1.
|
|
0.8 0.9 1. 0.8 ]
|
|
|
|
mean value: 0.8833333333333333
|
|
|
|
key: train_recall
|
|
value: [0.93023256 0.94186047 0.97674419 0.94186047 0.94186047 0.97647059
|
|
0.98823529 0.97647059 0.97647059 1. ]
|
|
|
|
mean value: 0.9650205198358413
|
|
|
|
key: test_roc_auc
|
|
value: [0.88888889 0.94444444 0.9 0.78888889 0.94444444 0.88888889
|
|
0.9 0.89444444 1. 0.84444444]
|
|
|
|
mean value: 0.8994444444444444
|
|
|
|
key: train_roc_auc
|
|
value: [0.92982216 0.94740082 0.97660739 0.94740082 0.94740082 0.97079343
|
|
0.98248974 0.97660739 0.97079343 1. ]
|
|
|
|
mean value: 0.9649316005471956
|
|
|
|
key: test_jcc
|
|
value: [0.77777778 0.88888889 0.81818182 0.63636364 0.88888889 0.83333333
|
|
0.8 0.81818182 1. 0.72727273]
|
|
|
|
mean value: 0.8188888888888889
|
|
|
|
key: train_jcc
|
|
value: [0.86956522 0.9 0.95454545 0.9 0.9 0.94318182
|
|
0.96551724 0.95402299 0.94318182 1. ]
|
|
|
|
mean value: 0.9330014538185453
|
|
|
|
MCC on Blind test: 0.2
|
|
|
|
Accuracy on Blind test: 0.75
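The per-fold scores above (mcc, accuracy, fscore, precision, recall, jcc) all follow from the four confusion-matrix counts of a binary classifier. A minimal sketch in plain Python; the helper name binary_metrics is hypothetical, and the counts TP=7, FP=0, FN=2, TN=10 are not printed in this log but were inferred as one set consistent with the first test fold reported for LogisticRegressionCV:

```python
from math import sqrt


def binary_metrics(tp, fp, fn, tn):
    """Return common binary-classification scores from confusion counts."""
    total = tp + fp + fn + tn
    # Denominator of the Matthews correlation coefficient.
    mcc_denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        # F1 score written directly in terms of counts.
        "fscore": 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0,
        # Jaccard index of the positive class.
        "jcc": tp / (tp + fp + fn) if (tp + fp + fn) else 0.0,
        "mcc": (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0,
    }


# Inferred counts (assumption): 19 test samples, no false positives.
fold1 = binary_metrics(tp=7, fp=0, fn=2, tn=10)
for name, value in fold1.items():
    print(f"{name}: {value:.8f}")
```

With these counts the sketch gives accuracy 17/19, fscore 0.875 and jcc 7/9, matching the first entries of the test_accuracy, test_fscore and test_jcc arrays. For a single class the Jaccard index and F1 score are tied by jcc = fscore / (2 - fscore), which is why the two columns rise and fall together.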
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01085544 0.00919461 0.00706816 0.00698352 0.00676513 0.00690603
|
|
0.00674391 0.0070889 0.00666404 0.00664258]
|
|
|
|
mean value: 0.0074912309646606445
|
|
|
|
key: score_time
|
|
value: [0.01064849 0.00935817 0.00804448 0.00793123 0.00782728 0.00777674
|
|
0.00780153 0.00776029 0.00765848 0.00768161]
|
|
|
|
mean value: 0.00824882984161377
|
|
|
|
key: test_mcc
|
|
value: [0.48934516 0.71611487 0.9 0.26257545 0.4719399 0.78888889
|
|
0.58655573 0.57777778 0.78888889 0.57777778]
|
|
|
|
mean value: 0.6159864454279441
|
|
|
|
key: train_mcc
|
|
value: [0.67948707 0.67737019 0.65383223 0.72260902 0.69912629 0.6553202
|
|
0.72095237 0.6878315 0.70870609 0.77152203]
|
|
|
|
mean value: 0.6976756971665955
|
|
|
|
key: test_accuracy
|
|
value: [0.73684211 0.84210526 0.94736842 0.63157895 0.73684211 0.89473684
|
|
0.78947368 0.78947368 0.89473684 0.78947368]
|
|
|
|
mean value: 0.8052631578947368
|
|
|
|
key: train_accuracy
|
|
value: [0.83625731 0.83625731 0.8245614 0.85964912 0.84795322 0.8245614
|
|
0.85964912 0.84210526 0.85380117 0.88304094]
|
|
|
|
mean value: 0.8467836257309942
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.8 0.94736842 0.53333333 0.70588235 0.9
|
|
0.81818182 0.8 0.9 0.8 ]
|
|
|
|
mean value: 0.7871432592175627
|
|
|
|
key: train_fscore
|
|
value: [0.825 0.82716049 0.81481481 0.85365854 0.84146341 0.81012658
|
|
0.85365854 0.83229814 0.84848485 0.88888889]
|
|
|
|
mean value: 0.8395554252745034
|
|
|
|
key: test_precision
|
|
value: [0.83333333 1. 0.9 0.66666667 0.75 0.9
|
|
0.75 0.8 0.9 0.8 ]
|
|
|
|
mean value: 0.8300000000000001
|
|
|
|
key: train_precision
|
|
value: [0.89189189 0.88157895 0.86842105 0.8974359 0.88461538 0.87671233
|
|
0.88607595 0.88157895 0.875 0.84210526]
|
|
|
|
mean value: 0.8785415662603702
|
|
|
|
key: test_recall
|
|
value: [0.55555556 0.66666667 1. 0.44444444 0.66666667 0.9
|
|
0.9 0.8 0.9 0.8 ]
|
|
|
|
mean value: 0.7633333333333333
|
|
|
|
key: train_recall
|
|
value: [0.76744186 0.77906977 0.76744186 0.81395349 0.80232558 0.75294118
|
|
0.82352941 0.78823529 0.82352941 0.94117647]
|
|
|
|
mean value: 0.8059644322845417
|
|
|
|
key: test_roc_auc
|
|
value: [0.72777778 0.83333333 0.95 0.62222222 0.73333333 0.89444444
|
|
0.78333333 0.78888889 0.89444444 0.78888889]
|
|
|
|
mean value: 0.8016666666666666
|
|
|
|
key: train_roc_auc
|
|
value: [0.83666211 0.83659371 0.8248974 0.85991792 0.84822161 0.82414501
|
|
0.85943912 0.84179207 0.85362517 0.88337893]
|
|
|
|
mean value: 0.8468673050615595
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.66666667 0.9 0.36363636 0.54545455 0.81818182
|
|
0.69230769 0.66666667 0.81818182 0.66666667]
|
|
|
|
mean value: 0.6637762237762238
|
|
|
|
key: train_jcc
|
|
value: [0.70212766 0.70526316 0.6875 0.74468085 0.72631579 0.68085106
|
|
0.74468085 0.71276596 0.73684211 0.8 ]
|
|
|
|
mean value: 0.7241027435610302
|
|
|
|
MCC on Blind test: 0.27
|
|
|
|
Accuracy on Blind test: 0.79
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00703979 0.006778 0.00680351 0.00683117 0.00685048 0.00680065
|
|
0.00678611 0.006809 0.00679731 0.00678182]
|
|
|
|
mean value: 0.006827783584594726
|
|
|
|
key: score_time
|
|
value: [0.00774384 0.00769997 0.00771785 0.00773907 0.00769234 0.00770354
|
|
0.00767231 0.00767827 0.00772476 0.00772119]
|
|
|
|
mean value: 0.007709312438964844
|
|
|
|
key: test_mcc
|
|
value: [0.57777778 0.68888889 0.78888889 0.03580574 0.41773368 0.36666667
|
|
0.2857738 0.62994079 0.78888889 0.72456884]
|
|
|
|
mean value: 0.5304933960318973
|
|
|
|
key: train_mcc
|
|
value: [0.58646061 0.59367966 0.57166923 0.61721762 0.62711195 0.55822989
|
|
0.65085813 0.56730506 0.58506018 0.61093648]
|
|
|
|
mean value: 0.596852879815003
|
|
|
|
key: test_accuracy
|
|
value: [0.78947368 0.84210526 0.89473684 0.52631579 0.68421053 0.68421053
|
|
0.63157895 0.78947368 0.89473684 0.84210526]
|
|
|
|
mean value: 0.7578947368421053
|
|
|
|
key: train_accuracy
|
|
value: [0.78947368 0.79532164 0.78362573 0.80701754 0.8128655 0.77777778
|
|
0.8245614 0.78362573 0.78947368 0.80116959]
|
|
|
|
mean value: 0.7964912280701755
|
|
|
|
key: test_fscore
|
|
value: [0.77777778 0.84210526 0.88888889 0.4 0.72727273 0.7
|
|
0.72 0.83333333 0.9 0.82352941]
|
|
|
|
mean value: 0.7612907402195328
|
|
|
|
key: train_fscore
|
|
value: [0.80645161 0.80662983 0.79781421 0.81767956 0.82022472 0.78651685
|
|
0.82954545 0.78362573 0.8021978 0.81521739]
|
|
|
|
mean value: 0.8065903164894157
|
|
|
|
key: test_precision
|
|
value: [0.77777778 0.8 0.88888889 0.5 0.61538462 0.7
|
|
0.6 0.71428571 0.9 1. ]
|
|
|
|
mean value: 0.7496336996336996
|
|
|
|
key: train_precision
|
|
value: [0.75 0.76842105 0.75257732 0.77894737 0.79347826 0.75268817
|
|
0.8021978 0.77906977 0.75257732 0.75757576]
|
|
|
|
mean value: 0.7687532820355886
|
|
|
|
key: test_recall
|
|
value: [0.77777778 0.88888889 0.88888889 0.33333333 0.88888889 0.7
|
|
0.9 1. 0.9 0.7 ]
|
|
|
|
mean value: 0.7977777777777778
|
|
|
|
key: train_recall
|
|
value: [0.87209302 0.84883721 0.84883721 0.86046512 0.84883721 0.82352941
|
|
0.85882353 0.78823529 0.85882353 0.88235294]
|
|
|
|
mean value: 0.8490834473324214
|
|
|
|
key: test_roc_auc
|
|
value: [0.78888889 0.84444444 0.89444444 0.51666667 0.69444444 0.68333333
|
|
0.61666667 0.77777778 0.89444444 0.85 ]
|
|
|
|
mean value: 0.7561111111111111
|
|
|
|
key: train_roc_auc
|
|
value: [0.78898769 0.79500684 0.78324213 0.80670315 0.8126539 0.77804378
|
|
0.8247606 0.78365253 0.78987688 0.80164159]
|
|
|
|
mean value: 0.7964569083447333
|
|
|
|
key: test_jcc
|
|
value: [0.63636364 0.72727273 0.8 0.25 0.57142857 0.53846154
|
|
0.5625 0.71428571 0.81818182 0.7 ]
|
|
|
|
mean value: 0.6318494005994006
|
|
|
|
key: train_jcc
|
|
value: [0.67567568 0.67592593 0.66363636 0.69158879 0.6952381 0.64814815
|
|
0.70873786 0.64423077 0.66972477 0.68807339]
|
|
|
|
mean value: 0.6760979792116991
|
|
|
|
MCC on Blind test: 0.23
|
|
|
|
Accuracy on Blind test: 0.62
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.00671601 0.00724053 0.00691652 0.00643086 0.00659227 0.00642991
|
|
0.0070684 0.00696349 0.00667977 0.00645947]
|
|
|
|
mean value: 0.006749725341796875
|
|
|
|
key: score_time
|
|
value: [0.00925422 0.0137701 0.0091207 0.00863004 0.0085907 0.00899935
|
|
0.00881934 0.0092144 0.00863957 0.008605 ]
|
|
|
|
mean value: 0.00936434268951416
|
|
|
|
key: test_mcc
|
|
value: [0.4719399 0.9 0.57777778 0.25844328 0.50604808 0.56694671
|
|
0.25844328 0.36803496 0.9 0.41773368]
|
|
|
|
mean value: 0.5225367670594359
|
|
|
|
key: train_mcc
|
|
value: [0.68468598 0.59069767 0.69589603 0.63788154 0.66205542 0.65057205
|
|
0.68455664 0.67315132 0.62630196 0.66082912]
|
|
|
|
mean value: 0.6566627720071807
|
|
|
|
key: test_accuracy
|
|
value: [0.73684211 0.94736842 0.78947368 0.63157895 0.73684211 0.73684211
|
|
0.63157895 0.68421053 0.94736842 0.68421053]
|
|
|
|
mean value: 0.7526315789473684
|
|
|
|
key: train_accuracy
|
|
value: [0.84210526 0.79532164 0.84795322 0.81871345 0.83040936 0.8245614
|
|
0.84210526 0.83625731 0.8128655 0.83040936]
|
|
|
|
mean value: 0.8280701754385965
|
|
|
|
key: test_fscore
|
|
value: [0.70588235 0.94736842 0.77777778 0.58823529 0.76190476 0.66666667
|
|
0.66666667 0.72727273 0.94736842 0.625 ]
|
|
|
|
mean value: 0.7414143089452687
|
|
|
|
key: train_fscore
|
|
value: [0.84023669 0.79532164 0.84883721 0.81656805 0.82634731 0.81707317
|
|
0.83832335 0.8313253 0.80722892 0.82840237]
|
|
|
|
mean value: 0.8249663993602754
|
|
|
|
key: test_precision
|
|
value: [0.75 0.9 0.77777778 0.625 0.66666667 1.
|
|
0.63636364 0.66666667 1. 0.83333333]
|
|
|
|
mean value: 0.7855808080808081
|
|
|
|
key: train_precision
|
|
value: [0.85542169 0.8 0.84883721 0.8313253 0.85185185 0.84810127
|
|
0.85365854 0.85185185 0.82716049 0.83333333]
|
|
|
|
mean value: 0.8401541530526481
|
|
|
|
key: test_recall
|
|
value: [0.66666667 1. 0.77777778 0.55555556 0.88888889 0.5
|
|
0.7 0.8 0.9 0.5 ]
|
|
|
|
mean value: 0.7288888888888889
|
|
|
|
key: train_recall
|
|
value: [0.8255814 0.79069767 0.84883721 0.80232558 0.80232558 0.78823529
|
|
0.82352941 0.81176471 0.78823529 0.82352941]
|
|
|
|
mean value: 0.8105061559507524
|
|
|
|
key: test_roc_auc
|
|
value: [0.73333333 0.95 0.78888889 0.62777778 0.74444444 0.75
|
|
0.62777778 0.67777778 0.95 0.69444444]
|
|
|
|
mean value: 0.7544444444444445
|
|
|
|
key: train_roc_auc
|
|
value: [0.84220246 0.79534884 0.84794802 0.81880985 0.83057456 0.82435021
|
|
0.84199726 0.83611491 0.8127223 0.83036936]
|
|
|
|
mean value: 0.8280437756497948
|
|
|
|
key: test_jcc
|
|
value: [0.54545455 0.9 0.63636364 0.41666667 0.61538462 0.5
|
|
0.5 0.57142857 0.9 0.45454545]
|
|
|
|
mean value: 0.6039843489843489
|
|
|
|
key: train_jcc
|
|
value: [0.7244898 0.66019417 0.73737374 0.69 0.70408163 0.69072165
|
|
0.72164948 0.71134021 0.67676768 0.70707071]
|
|
|
|
mean value: 0.7023689064747017
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.73
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0085094 0.00796008 0.00784731 0.00864291 0.0081656 0.00886369
|
|
0.00788093 0.00836372 0.00793576 0.00788879]
|
|
|
|
mean value: 0.008205819129943847
|
|
|
|
key: score_time
|
|
value: [0.0081985 0.00789022 0.00786686 0.00799942 0.00821161 0.00802946
|
|
0.00788641 0.00793266 0.00784802 0.00792909]
|
|
|
|
mean value: 0.007979226112365723
|
|
|
|
key: test_mcc
|
|
value: [0.59554321 0.80903983 0.80903983 0.47777778 0.50604808 0.89893315
|
|
0.39056329 0.62994079 0.89893315 0.57777778]
|
|
|
|
mean value: 0.659359689118782
|
|
|
|
key: train_mcc
|
|
value: [0.68426013 0.65000183 0.66041977 0.73935782 0.70514081 0.70309379
|
|
0.72670051 0.70309379 0.6820046 0.73981073]
|
|
|
|
mean value: 0.699388376917528
|
|
|
|
key: test_accuracy
|
|
value: [0.78947368 0.89473684 0.89473684 0.73684211 0.73684211 0.94736842
|
|
0.68421053 0.78947368 0.94736842 0.78947368]
|
|
|
|
mean value: 0.8210526315789474
|
|
|
|
key: train_accuracy
|
|
value: [0.83625731 0.81871345 0.8245614 0.86549708 0.84795322 0.84795322
|
|
0.85964912 0.84795322 0.83625731 0.86549708]
|
|
|
|
mean value: 0.8450292397660819
|
|
|
|
key: test_fscore
|
|
value: [0.8 0.9 0.9 0.73684211 0.76190476 0.95238095
|
|
0.75 0.83333333 0.95238095 0.8 ]
|
|
|
|
mean value: 0.8386842105263158
|
|
|
|
key: train_fscore
|
|
value: [0.85106383 0.83597884 0.84042553 0.87567568 0.86021505 0.85714286
|
|
0.86813187 0.85714286 0.84782609 0.87431694]
|
|
|
|
mean value: 0.8567919536384895
|
|
|
|
key: test_precision
|
|
value: [0.72727273 0.81818182 0.81818182 0.7 0.66666667 0.90909091
|
|
0.64285714 0.71428571 0.90909091 0.8 ]
|
|
|
|
mean value: 0.7705627705627706
|
|
|
|
key: train_precision
|
|
value: [0.78431373 0.76699029 0.7745098 0.81818182 0.8 0.80412371
|
|
0.81443299 0.80412371 0.78787879 0.81632653]
|
|
|
|
mean value: 0.7970881369717886
|
|
|
|
key: test_recall
|
|
value: [0.88888889 1. 1. 0.77777778 0.88888889 1.
|
|
0.9 1. 1. 0.8 ]
|
|
|
|
mean value: 0.9255555555555556
|
|
|
|
key: train_recall
|
|
value: [0.93023256 0.91860465 0.91860465 0.94186047 0.93023256 0.91764706
|
|
0.92941176 0.91764706 0.91764706 0.94117647]
|
|
|
|
mean value: 0.9263064295485636
|
|
|
|
key: test_roc_auc
|
|
value: [0.79444444 0.9 0.9 0.73888889 0.74444444 0.94444444
|
|
0.67222222 0.77777778 0.94444444 0.78888889]
|
|
|
|
mean value: 0.8205555555555556
|
|
|
|
key: train_roc_auc
|
|
value: [0.83570451 0.81812585 0.82400821 0.86504788 0.84746922 0.84835841
|
|
0.86005472 0.84835841 0.83673051 0.86593707]
|
|
|
|
mean value: 0.8449794801641587
|
|
|
|
key: test_jcc
|
|
value: [0.66666667 0.81818182 0.81818182 0.58333333 0.61538462 0.90909091
|
|
0.6 0.71428571 0.90909091 0.66666667]
|
|
|
|
mean value: 0.7300882450882451
|
|
|
|
key: train_jcc
|
|
value: [0.74074074 0.71818182 0.72477064 0.77884615 0.75471698 0.75
|
|
0.76699029 0.75 0.73584906 0.77669903]
|
|
|
|
mean value: 0.7496794713094747
|
|
|
|
MCC on Blind test: 0.17
|
|
|
|
Accuracy on Blind test: 0.54
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.50620532 0.52595353 0.51024294 0.6996367 0.51282358 0.5229423
|
|
0.52637339 0.6919179 0.50890207 0.50792599]
|
|
|
|
mean value: 0.5512923717498779
|
|
|
|
key: score_time
|
|
value: [0.01093102 0.01325846 0.01348591 0.01509166 0.01097798 0.0216291
|
|
0.01327252 0.01341844 0.01341486 0.01526189]
|
|
|
|
mean value: 0.014074182510375977
|
|
|
|
key: test_mcc
|
|
value: [0.58655573 0.57777778 0.68888889 0.36666667 0.89893315 0.57777778
|
|
0.48934516 0.62994079 1. 0.68543653]
|
|
|
|
mean value: 0.6501322465616389
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.78947368 0.78947368 0.84210526 0.68421053 0.94736842 0.78947368
|
|
0.73684211 0.78947368 1. 0.84210526]
|
|
|
|
mean value: 0.8210526315789474
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.75 0.77777778 0.84210526 0.66666667 0.94117647 0.8
|
|
0.7826087 0.83333333 1. 0.85714286]
|
|
|
|
mean value: 0.8250811064318939
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.85714286 0.77777778 0.8 0.66666667 1. 0.8
|
|
0.69230769 0.71428571 1. 0.81818182]
|
|
|
|
mean value: 0.8126362526362526
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.66666667 0.77777778 0.88888889 0.66666667 0.88888889 0.8
|
|
0.9 1. 1. 0.9 ]
|
|
|
|
mean value: 0.8488888888888889
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.78333333 0.78888889 0.84444444 0.68333333 0.94444444 0.78888889
|
|
0.72777778 0.77777778 1. 0.83888889]
|
|
|
|
mean value: 0.8177777777777777
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.6 0.63636364 0.72727273 0.5 0.88888889 0.66666667
|
|
0.64285714 0.71428571 1. 0.75 ]
|
|
|
|
mean value: 0.7126334776334776
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.19
|
|
|
|
Accuracy on Blind test: 0.69
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01248384 0.00804639 0.00774622 0.00760841 0.00760221 0.00763416
|
|
0.00761247 0.00740051 0.00772691 0.00756049]
|
|
|
|
mean value: 0.00814216136932373
|
|
|
|
key: score_time
|
|
value: [0.01788831 0.00798225 0.00790691 0.00766706 0.00764203 0.00762773
|
|
0.00762391 0.00771236 0.00768185 0.00769854]
|
|
|
|
mean value: 0.008743095397949218
|
|
|
|
key: test_mcc
|
|
value: [0.80903983 1. 0.80903983 0.80903983 0.80507649 0.68888889
|
|
0.78888889 0.80507649 1. 0.68888889]
|
|
|
|
mean value: 0.8203939143333164
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.89473684 1. 0.89473684 0.89473684 0.89473684 0.84210526
|
|
0.89473684 0.89473684 1. 0.84210526]
|
|
|
|
mean value: 0.9052631578947369
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.9 1. 0.9 0.9 0.875 0.84210526
|
|
0.9 0.90909091 1. 0.84210526]
|
|
|
|
mean value: 0.9068301435406699
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.81818182 1. 0.81818182 0.81818182 1. 0.88888889
|
|
0.9 0.83333333 1. 0.88888889]
|
|
|
|
mean value: 0.8965656565656566
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 0.77777778 0.8
|
|
0.9 1. 1. 0.8 ]
|
|
|
|
mean value: 0.9277777777777778
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9 1. 0.9 0.9 0.88888889 0.84444444
|
|
0.89444444 0.88888889 1. 0.84444444]
|
|
|
|
mean value: 0.9061111111111111
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.81818182 1. 0.81818182 0.81818182 0.77777778 0.72727273
|
|
0.81818182 0.83333333 1. 0.72727273]
|
|
|
|
mean value: 0.8338383838383838
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.31
|
|
|
|
Accuracy on Blind test: 0.84
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.08216429 0.0823946 0.08215213 0.08401656 0.08392739 0.08516693
|
|
0.08451629 0.08552861 0.08529258 0.08295751]
|
|
|
|
mean value: 0.08381168842315674
|
|
|
|
key: score_time
|
|
value: [0.01642704 0.01630354 0.01634693 0.0172317 0.01676369 0.01684332
|
|
0.01709843 0.01667094 0.01670718 0.01638293]
|
|
|
|
mean value: 0.01667757034301758
|
|
|
|
key: test_mcc
|
|
value: [0.78888889 1. 0.9 0.80903983 0.89893315 1.
|
|
0.39056329 0.62994079 0.89893315 0.80903983]
|
|
|
|
mean value: 0.8125338935827562
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.89473684 1. 0.94736842 0.89473684 0.94736842 1.
|
|
0.68421053 0.78947368 0.94736842 0.89473684]
|
|
|
|
mean value: 0.9
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 1. 0.94736842 0.9 0.94117647 1.
|
|
0.75 0.83333333 0.95238095 0.88888889]
|
|
|
|
mean value: 0.910203695513293
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.88888889 1. 0.9 0.81818182 1. 1.
|
|
0.64285714 0.71428571 0.90909091 1. ]
|
|
|
|
mean value: 0.8873304473304473
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.88888889 1. 1. 1. 0.88888889 1.
|
|
0.9 1. 1. 0.8 ]
|
|
|
|
mean value: 0.9477777777777778
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.89444444 1. 0.95 0.9 0.94444444 1.
|
|
0.67222222 0.77777778 0.94444444 0.9 ]
|
|
|
|
mean value: 0.8983333333333333
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.8 1. 0.9 0.81818182 0.88888889 1.
|
|
0.6 0.71428571 0.90909091 0.8 ]
|
|
|
|
mean value: 0.843044733044733
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.3
|
|
|
|
Accuracy on Blind test: 0.76
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00726008 0.00728798 0.00723386 0.00722384 0.00689101 0.00721788
|
|
0.00701737 0.00681758 0.00731874 0.00693321]
|
|
|
|
mean value: 0.007120156288146972
|
|
|
|
key: score_time
|
|
value: [0.00801849 0.00767374 0.00813246 0.00769758 0.00767255 0.00790119
|
|
0.00813627 0.00808954 0.00791264 0.00815868]
|
|
|
|
mean value: 0.007939314842224121
|
|
|
|
key: test_mcc
|
|
value: [0.36803496 0.4719399 0.4719399 0.64450339 0.68888889 1.
|
|
0.1495142 0.48934516 0.19096397 0.26666667]
|
|
|
|
mean value: 0.4741797049483811
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.68421053 0.73684211 0.73684211 0.78947368 0.84210526 1.
|
|
0.57894737 0.73684211 0.57894737 0.63157895]
|
|
|
|
mean value: 0.731578947368421
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.625 0.70588235 0.70588235 0.81818182 0.84210526 1.
|
|
0.63636364 0.7826087 0.5 0.63157895]
|
|
|
|
mean value: 0.7247603066606297
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.71428571 0.75 0.75 0.69230769 0.8 1.
|
|
0.58333333 0.69230769 0.66666667 0.66666667]
|
|
|
|
mean value: 0.7315567765567765
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.55555556 0.66666667 0.66666667 1. 0.88888889 1.
|
|
0.7 0.9 0.4 0.6 ]
|
|
|
|
mean value: 0.7377777777777778
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.67777778 0.73333333 0.73333333 0.8 0.84444444 1.
|
|
0.57222222 0.72777778 0.58888889 0.63333333]
|
|
|
|
mean value: 0.731111111111111
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.45454545 0.54545455 0.54545455 0.69230769 0.72727273 1.
|
|
0.46666667 0.64285714 0.33333333 0.46153846]
|
|
|
|
mean value: 0.5869430569430569
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.26
|
|
|
|
Accuracy on Blind test: 0.75
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.09578466 1.03582072 1.0463326 1.02560258 1.02915144 1.02786326
|
|
1.02509475 1.03272462 1.03403187 1.02677202]
|
|
|
|
mean value: 1.0379178524017334
|
|
|
|
key: score_time
|
|
value: [0.08918476 0.08878589 0.09057307 0.08787775 0.08753514 0.08778787
|
|
0.08985972 0.08711696 0.08690763 0.08730698]
|
|
|
|
mean value: 0.08829357624053955
|
|
|
|
key: test_mcc
|
|
value: [1. 1. 0.9 0.9 1. 0.9
|
|
0.9 0.89893315 1. 0.80903983]
|
|
|
|
mean value: 0.930797298490688
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 1. 0.94736842 0.94736842 1. 0.94736842
|
|
0.94736842 0.94736842 1. 0.89473684]
|
|
|
|
mean value: 0.9631578947368421
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 1. 0.94736842 0.94736842 1. 0.94736842
|
|
0.94736842 0.95238095 1. 0.88888889]
|
|
|
|
mean value: 0.9630743525480367
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.9 0.9 1. 1.
|
|
1. 0.90909091 1. 1. ]
|
|
|
|
mean value: 0.9709090909090909
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 0.9 0.9 1. 1. 0.8]
|
|
|
|
mean value: 0.96
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 1. 0.95 0.95 1. 0.95
|
|
0.95 0.94444444 1. 0.9 ]
|
|
|
|
mean value: 0.9644444444444444
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 1. 0.9 0.9 1. 0.9
|
|
0.9 0.90909091 1. 0.8 ]
|
|
|
|
mean value: 0.9309090909090909
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.3
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.79450846 0.99196959 0.89163351 0.86481595 0.85897207 0.88148761
|
|
0.85166144 0.85085249 0.98532867 0.87325144]
|
|
|
|
mean value: 0.8844481229782104
|
|
|
|
key: score_time
|
|
value: [0.228971 0.21244311 0.2221725 0.17969465 0.22778034 0.23846292
|
|
0.18934894 0.2319572 0.19641733 0.20504928]
|
|
|
|
mean value: 0.21322972774505616
|
|
|
|
key: test_mcc
|
|
value: [0.9 1. 0.9 0.80903983 1. 0.9
|
|
0.9 0.89893315 1. 0.80903983]
|
|
|
|
mean value: 0.9117012819862771
|
|
|
|
key: train_mcc
|
|
value: [0.94157888 0.95346936 0.95321477 0.95321477 0.95346936 0.94158687
|
|
0.95348202 0.96497948 0.94158687 0.95348202]
|
|
|
|
mean value: 0.9510064395117223
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 1. 0.94736842 0.89473684 1. 0.94736842
|
|
0.94736842 0.94736842 1. 0.89473684]
|
|
|
|
mean value: 0.9526315789473684
|
|
|
|
key: train_accuracy
|
|
value: [0.97076023 0.97660819 0.97660819 0.97660819 0.97660819 0.97076023
|
|
0.97660819 0.98245614 0.97076023 0.97660819]
|
|
|
|
mean value: 0.975438596491228
|
|
|
|
key: test_fscore
|
|
value: [0.94736842 1. 0.94736842 0.9 1. 0.94736842
|
|
0.94736842 0.95238095 1. 0.88888889]
|
|
|
|
mean value: 0.9530743525480367
|
|
|
|
key: train_fscore
|
|
value: [0.97109827 0.97701149 0.97674419 0.97674419 0.97701149 0.97076023
|
|
0.97674419 0.98245614 0.97076023 0.97674419]
|
|
|
|
mean value: 0.9756074606774882
|
|
|
|
key: test_precision
|
|
value: [0.9 1. 0.9 0.81818182 1. 1.
|
|
1. 0.90909091 1. 1. ]
|
|
|
|
mean value: 0.9527272727272728
|
|
|
|
key: train_precision
|
|
value: [0.96551724 0.96590909 0.97674419 0.97674419 0.96590909 0.96511628
|
|
0.96551724 0.97674419 0.96511628 0.96551724]
|
|
|
|
mean value: 0.9688835022235183
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 0.9 0.9 1. 1. 0.8]
|
|
|
|
mean value: 0.96
|
|
|
|
key: train_recall
|
|
value: [0.97674419 0.98837209 0.97674419 0.97674419 0.98837209 0.97647059
|
|
0.98823529 0.98823529 0.97647059 0.98823529]
|
|
|
|
mean value: 0.9824623803009576
|
|
|
|
key: test_roc_auc
|
|
value: [0.95 1. 0.95 0.9 1. 0.95
|
|
0.95 0.94444444 1. 0.9 ]
|
|
|
|
mean value: 0.9544444444444444
|
|
|
|
key: train_roc_auc
|
|
value: [0.97072503 0.97653899 0.97660739 0.97660739 0.97653899 0.97079343
|
|
0.97667579 0.98248974 0.97079343 0.97667579]
|
|
|
|
mean value: 0.9754445964432285
|
|
|
|
key: test_jcc
|
|
value: [0.9 1. 0.9 0.81818182 1. 0.9
|
|
0.9 0.90909091 1. 0.8 ]
|
|
|
|
mean value: 0.9127272727272727
|
|
|
|
key: train_jcc
|
|
value: [0.94382022 0.95505618 0.95454545 0.95454545 0.95505618 0.94318182
|
|
0.95454545 0.96551724 0.94318182 0.95454545]
|
|
|
|
mean value: 0.9523995280194428
|
|
|
|
MCC on Blind test: 0.27
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00803375 0.00765824 0.00742745 0.00700617 0.00725985 0.00694323
|
|
0.00772119 0.00693536 0.00706053 0.00701404]
|
|
|
|
mean value: 0.0073059797286987305
|
|
|
|
key: score_time
|
|
value: [0.00847721 0.0100956 0.00787282 0.00814962 0.00778627 0.00804114
|
|
0.00835919 0.00783134 0.00775075 0.00776839]
|
|
|
|
mean value: 0.008213233947753907
|
|
|
|
key: test_mcc
|
|
value: [0.57777778 0.68888889 0.78888889 0.03580574 0.41773368 0.36666667
|
|
0.2857738 0.62994079 0.78888889 0.72456884]
|
|
|
|
mean value: 0.5304933960318973
|
|
|
|
key: train_mcc
|
|
value: [0.58646061 0.59367966 0.57166923 0.61721762 0.62711195 0.55822989
|
|
0.65085813 0.56730506 0.58506018 0.61093648]
|
|
|
|
mean value: 0.596852879815003
|
|
|
|
key: test_accuracy
|
|
value: [0.78947368 0.84210526 0.89473684 0.52631579 0.68421053 0.68421053
|
|
0.63157895 0.78947368 0.89473684 0.84210526]
|
|
|
|
mean value: 0.7578947368421053
|
|
|
|
key: train_accuracy
|
|
value: [0.78947368 0.79532164 0.78362573 0.80701754 0.8128655 0.77777778
|
|
0.8245614 0.78362573 0.78947368 0.80116959]
|
|
|
|
mean value: 0.7964912280701755
|
|
|
|
key: test_fscore
|
|
value: [0.77777778 0.84210526 0.88888889 0.4 0.72727273 0.7
|
|
0.72 0.83333333 0.9 0.82352941]
|
|
|
|
mean value: 0.7612907402195328
|
|
|
|
key: train_fscore
|
|
value: [0.80645161 0.80662983 0.79781421 0.81767956 0.82022472 0.78651685
|
|
0.82954545 0.78362573 0.8021978 0.81521739]
|
|
|
|
mean value: 0.8065903164894157
|
|
|
|
key: test_precision
|
|
value: [0.77777778 0.8 0.88888889 0.5 0.61538462 0.7
|
|
0.6 0.71428571 0.9 1. ]
|
|
|
|
mean value: 0.7496336996336996
|
|
|
|
key: train_precision
|
|
value: [0.75 0.76842105 0.75257732 0.77894737 0.79347826 0.75268817
|
|
0.8021978 0.77906977 0.75257732 0.75757576]
|
|
|
|
mean value: 0.7687532820355886
|
|
|
|
key: test_recall
|
|
value: [0.77777778 0.88888889 0.88888889 0.33333333 0.88888889 0.7
|
|
0.9 1. 0.9 0.7 ]
|
|
|
|
mean value: 0.7977777777777778
|
|
|
|
key: train_recall
|
|
value: [0.87209302 0.84883721 0.84883721 0.86046512 0.84883721 0.82352941
|
|
0.85882353 0.78823529 0.85882353 0.88235294]
|
|
|
|
mean value: 0.8490834473324214
|
|
|
|
key: test_roc_auc
|
|
value: [0.78888889 0.84444444 0.89444444 0.51666667 0.69444444 0.68333333
|
|
0.61666667 0.77777778 0.89444444 0.85 ]
|
|
|
|
mean value: 0.7561111111111111
|
|
|
|
key: train_roc_auc
|
|
value: [0.78898769 0.79500684 0.78324213 0.80670315 0.8126539 0.77804378
|
|
0.8247606 0.78365253 0.78987688 0.80164159]
|
|
|
|
mean value: 0.7964569083447333
|
|
|
|
key: test_jcc
|
|
value: [0.63636364 0.72727273 0.8 0.25 0.57142857 0.53846154
|
|
0.5625 0.71428571 0.81818182 0.7 ]
|
|
|
|
mean value: 0.6318494005994006
|
|
|
|
key: train_jcc
|
|
value: [0.67567568 0.67592593 0.66363636 0.69158879 0.6952381 0.64814815
|
|
0.70873786 0.64423077 0.66972477 0.68807339]
|
|
|
|
mean value: 0.6760979792116991
|
|
|
|
MCC on Blind test: 0.23
|
|
|
|
Accuracy on Blind test: 0.62
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
interaction_constraints=None, learning_rate=None,
max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan,
monotone_constraints=None, n_estimators=100,
n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None,
reg_lambda=None, scale_pos_weight=None,
subsample=None, tree_method=None,
use_label_encoder=False,
validate_parameters=None, verbosity=0))])

key: fit_time
value: [0.06458902 0.03820992 0.03959632 0.03878093 0.03928733 0.04292321
0.04065013 0.04027557 0.04257679 0.03781724]

mean value: 0.04247064590454101

key: score_time
value: [0.00953722 0.00945091 0.01028013 0.01050878 0.01019955 0.01042247
0.01043653 0.01029015 0.01024365 0.01017427]

mean value: 0.010154366493225098

key: test_mcc
value: [1. 1. 0.9 0.9 1. 0.9
0.9 0.89893315 1. 0.80903983]

mean value: 0.930797298490688

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [1. 1. 0.94736842 0.94736842 1. 0.94736842
0.94736842 0.94736842 1. 0.89473684]

mean value: 0.9631578947368421

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [1. 1. 0.94736842 0.94736842 1. 0.94736842
0.94736842 0.95238095 1. 0.88888889]

mean value: 0.9630743525480367

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [1. 1. 0.9 0.9 1. 1.
1. 0.90909091 1. 1. ]

mean value: 0.9709090909090909

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 1. 1. 1. 1. 0.9 0.9 1. 1. 0.8]

mean value: 0.96

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [1. 1. 0.95 0.95 1. 0.95
0.95 0.94444444 1. 0.9 ]

mean value: 0.9644444444444444

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [1. 1. 0.9 0.9 1. 0.9
0.9 0.90909091 1. 0.8 ]

mean value: 0.9309090909090909

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.21

Accuracy on Blind test: 0.79

Model_name: LDA
Model func: LinearDiscriminantAnalysis()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LinearDiscriminantAnalysis())])

key: fit_time
value: [0.01104665 0.03395987 0.03166819 0.03184342 0.02563572 0.027812
0.03157783 0.03174877 0.03313327 0.02740669]

mean value: 0.028583240509033204

key: score_time
value: [0.00994396 0.02052045 0.01887202 0.01847482 0.01112342 0.02053523
0.01040173 0.01528358 0.01049042 0.01989079]

mean value: 0.015553641319274902

key: test_mcc
value: [1. 0.80507649 0.68888889 0.9 0.89893315 0.89893315
0.80903983 0.78888889 1. 0.80903983]

mean value: 0.8598800233490951

key: train_mcc
value: [0.92982216 0.94157888 0.94157888 0.94158687 0.92982216 0.92982216
0.96497948 0.9649747 0.94158687 0.96497948]

mean value: 0.9450731645321953

key: test_accuracy
value: [1. 0.89473684 0.84210526 0.94736842 0.94736842 0.94736842
0.89473684 0.89473684 1. 0.89473684]

mean value: 0.9263157894736842

key: train_accuracy
value: [0.96491228 0.97076023 0.97076023 0.97076023 0.96491228 0.96491228
0.98245614 0.98245614 0.97076023 0.98245614]

mean value: 0.9725146198830409

key: test_fscore
value: [1. 0.875 0.84210526 0.94736842 0.94117647 0.95238095
0.88888889 0.9 1. 0.88888889]

mean value: 0.9235808884957493

key: train_fscore
value: [0.96511628 0.97109827 0.97109827 0.97076023 0.96511628 0.96470588
0.98245614 0.98224852 0.97076023 0.98245614]

mean value: 0.9725816241532454

key: test_precision
value: [1. 1. 0.8 0.9 1. 0.90909091
1. 0.9 1. 1. ]

mean value: 0.9509090909090909

key: train_precision
value: [0.96511628 0.96551724 0.96551724 0.97647059 0.96511628 0.96470588
0.97674419 0.98809524 0.96511628 0.97674419]

mean value: 0.970914340074442

key: test_recall
value: [1. 0.77777778 0.88888889 1. 0.88888889 1.
0.8 0.9 1. 0.8 ]

mean value: 0.9055555555555556

key: train_recall
value: [0.96511628 0.97674419 0.97674419 0.96511628 0.96511628 0.96470588
0.98823529 0.97647059 0.97647059 0.98823529]

mean value: 0.9742954856361149

key: test_roc_auc
value: [1. 0.88888889 0.84444444 0.95 0.94444444 0.94444444
0.9 0.89444444 1. 0.9 ]

mean value: 0.9266666666666666

key: train_roc_auc
value: [0.96491108 0.97072503 0.97072503 0.97079343 0.96491108 0.96491108
0.98248974 0.98242134 0.97079343 0.98248974]

mean value: 0.9725170998632011

key: test_jcc
value: [1. 0.77777778 0.72727273 0.9 0.88888889 0.90909091
0.8 0.81818182 1. 0.8 ]

mean value: 0.8621212121212122

key: train_jcc
value: [0.93258427 0.94382022 0.94382022 0.94318182 0.93258427 0.93181818
0.96551724 0.96511628 0.94318182 0.96551724]

mean value: 0.9467141568774251

MCC on Blind test: 0.22

Accuracy on Blind test: 0.78

Model_name: Multinomial
Model func: MultinomialNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MultinomialNB())])

key: fit_time
value: [0.01662064 0.00703716 0.0070796 0.00686336 0.00670743 0.00671721
0.00670624 0.0067091 0.00672913 0.00688386]

mean value: 0.007805371284484863

key: score_time
value: [0.00827503 0.00794244 0.00786853 0.00760841 0.00764394 0.0076046
0.00760007 0.00758982 0.00767159 0.007622 ]

mean value: 0.007742643356323242

key: test_mcc
value: [0.47777778 0.78888889 0.80903983 0.4719399 0.47777778 0.9
0.39056329 0.62994079 0.89893315 0.68888889]

mean value: 0.6533750299089396

key: train_mcc
value: [0.6682897 0.64075558 0.64460032 0.72230744 0.69197907 0.64522558
0.72260902 0.69674175 0.66470432 0.67154946]

mean value: 0.6768762239015662

key: test_accuracy
value: [0.73684211 0.89473684 0.89473684 0.73684211 0.73684211 0.94736842
0.68421053 0.78947368 0.94736842 0.84210526]

mean value: 0.8210526315789474

key: train_accuracy
value: [0.83040936 0.81871345 0.81871345 0.85964912 0.84210526 0.81871345
0.85964912 0.84795322 0.83040936 0.83040936]

mean value: 0.835672514619883

key: test_fscore
value: [0.73684211 0.88888889 0.9 0.70588235 0.73684211 0.94736842
0.75 0.83333333 0.95238095 0.84210526]

mean value: 0.8293643422281193

key: train_fscore
value: [0.84324324 0.82872928 0.83243243 0.86666667 0.85405405 0.83060109
0.86516854 0.85057471 0.83798883 0.84324324]

mean value: 0.8452702093088933

key: test_precision
value: [0.7 0.88888889 0.81818182 0.75 0.7 1.
0.64285714 0.71428571 0.90909091 0.88888889]

mean value: 0.8012193362193362

key: train_precision
value: [0.78787879 0.78947368 0.77777778 0.82978723 0.7979798 0.7755102
0.82795699 0.83146067 0.79787234 0.78 ]

mean value: 0.7995697489801223

key: test_recall
value: [0.77777778 0.88888889 1. 0.66666667 0.77777778 0.9
0.9 1. 1. 0.8 ]

mean value: 0.8711111111111112

key: train_recall
value: [0.90697674 0.87209302 0.89534884 0.90697674 0.91860465 0.89411765
0.90588235 0.87058824 0.88235294 0.91764706]

mean value: 0.8970588235294118

key: test_roc_auc
value: [0.73888889 0.89444444 0.9 0.73333333 0.73888889 0.95
0.67222222 0.77777778 0.94444444 0.84444444]

mean value: 0.8194444444444444

key: train_roc_auc
value: [0.82995896 0.81839945 0.81826265 0.85937073 0.84165527 0.81915185
0.85991792 0.84808482 0.83071135 0.83091655]

mean value: 0.8356429548563611

key: test_jcc
value: [0.58333333 0.8 0.81818182 0.54545455 0.58333333 0.9
0.6 0.71428571 0.90909091 0.72727273]

mean value: 0.7180952380952381

key: train_jcc
value: [0.72897196 0.70754717 0.71296296 0.76470588 0.74528302 0.71028037
0.76237624 0.74 0.72115385 0.72897196]

mean value: 0.7322253416838178

MCC on Blind test: 0.21

Accuracy on Blind test: 0.63

Model_name: Passive Aggresive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.0078938 0.01025677 0.01055956 0.01001406 0.01028895 0.01002979
0.01060987 0.01027226 0.01006579 0.01109862]

mean value: 0.01010894775390625

key: score_time
value: [0.00768161 0.01027536 0.01019096 0.01026964 0.01025581 0.01024055
0.01023459 0.01036429 0.01042557 0.01041293]

mean value: 0.010035133361816407

key: test_mcc
value: [0.78888889 0.80507649 0.78888889 0.9 0.9 0.80903983
0.48934516 0.50604808 1. 0.68888889]

mean value: 0.7676176228299976

key: train_mcc
value: [0.89769958 0.90744828 0.75930915 0.91870817 0.93006714 0.91967295
0.89779492 0.87613518 0.87279143 0.96497948]

mean value: 0.8944606272273846

key: test_accuracy
value: [0.89473684 0.89473684 0.89473684 0.94736842 0.94736842 0.89473684
0.73684211 0.73684211 1. 0.84210526]

mean value: 0.8789473684210526

key: train_accuracy
value: [0.94736842 0.95321637 0.86549708 0.95906433 0.96491228 0.95906433
0.94736842 0.93567251 0.93567251 0.98245614]

mean value: 0.9450292397660818

key: test_fscore
value: [0.88888889 0.875 0.88888889 0.94736842 0.94736842 0.88888889
0.7826087 0.70588235 1. 0.84210526]

mean value: 0.8766999820523175

key: train_fscore
value: [0.94972067 0.95238095 0.84563758 0.95857988 0.96551724 0.95757576
0.94915254 0.93167702 0.93333333 0.98245614]

mean value: 0.9426031121967136

key: test_precision
value: [0.88888889 1. 0.88888889 0.9 0.9 1.
0.69230769 0.85714286 1. 0.88888889]

mean value: 0.9016117216117217

key: train_precision
value: [0.91397849 0.97560976 1. 0.97590361 0.95454545 0.9875
0.91304348 0.98684211 0.9625 0.97674419]

mean value: 0.9646667089295042

key: test_recall
value: [0.88888889 0.77777778 0.88888889 1. 1. 0.8
0.9 0.6 1. 0.8 ]

mean value: 0.8655555555555555

key: train_recall
value: [0.98837209 0.93023256 0.73255814 0.94186047 0.97674419 0.92941176
0.98823529 0.88235294 0.90588235 0.98823529]

mean value: 0.9263885088919288

key: test_roc_auc
value: [0.89444444 0.88888889 0.89444444 0.95 0.95 0.9
0.72777778 0.74444444 1. 0.84444444]

mean value: 0.8794444444444445

key: train_roc_auc
value: [0.94712722 0.95335157 0.86627907 0.95916553 0.96484268 0.95889193
0.94760602 0.93536252 0.93549932 0.98248974]

mean value: 0.9450615595075239

key: test_jcc
value: [0.8 0.77777778 0.8 0.9 0.9 0.8
0.64285714 0.54545455 1. 0.72727273]

mean value: 0.7893362193362193

key: train_jcc
value: [0.90425532 0.90909091 0.73255814 0.92045455 0.93333333 0.91860465
0.90322581 0.87209302 0.875 0.96551724]

mean value: 0.8934132968812135

MCC on Blind test: 0.25

Accuracy on Blind test: 0.83

Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01043749 0.01005483 0.01035213 0.01050234 0.00995088 0.00993705
0.0102191 0.00978518 0.01020694 0.00993204]

mean value: 0.010137796401977539

key: score_time
value: [0.01046276 0.01042175 0.01033998 0.01044917 0.01030922 0.01031208
0.0102849 0.01033711 0.01031756 0.01019359]

mean value: 0.010342812538146973

key: test_mcc
value: [0.72456884 0.78888889 0.80903983 0.9 1. 0.80507649
0.68888889 0.71611487 0.71611487 0.68888889]

mean value: 0.7837581572910308

key: train_mcc
value: [0.6741192 0.88517311 0.72063365 0.94157888 0.94157888 0.86350542
0.92982216 0.89630221 0.7838874 0.93006714]

mean value: 0.8566668051359216

key: test_accuracy
value: [0.84210526 0.89473684 0.89473684 0.94736842 1. 0.89473684
0.84210526 0.84210526 0.84210526 0.84210526]

mean value: 0.8842105263157894

key: train_accuracy
value: [0.8128655 0.94152047 0.84210526 0.97076023 0.97076023 0.92982456
0.96491228 0.94736842 0.88304094 0.96491228]

mean value: 0.9228070175438596

key: test_fscore
value: [0.85714286 0.88888889 0.9 0.94736842 1. 0.90909091
0.84210526 0.86956522 0.86956522 0.84210526]

mean value: 0.8925832037273685

key: train_fscore
value: [0.84313725 0.94382022 0.86432161 0.97109827 0.97109827 0.93258427
0.96470588 0.94857143 0.89361702 0.96428571]

mean value: 0.9297239935602771

key: test_precision
value: [0.75 0.88888889 0.81818182 0.9 1. 0.83333333
0.88888889 0.76923077 0.76923077 0.88888889]

mean value: 0.8506643356643356

key: train_precision
value: [0.72881356 0.91304348 0.76106195 0.96551724 0.96551724 0.89247312
0.96470588 0.92222222 0.81553398 0.97590361]

mean value: 0.8904792285139268

key: test_recall
value: [1. 0.88888889 1. 1. 1. 1.
0.8 1. 1. 0.8 ]

mean value: 0.9488888888888889

key: train_recall
value: [1. 0.97674419 1. 0.97674419 0.97674419 0.97647059
0.96470588 0.97647059 0.98823529 0.95294118]

mean value: 0.97890560875513

key: test_roc_auc
value: [0.85 0.89444444 0.9 0.95 1. 0.88888889
0.84444444 0.83333333 0.83333333 0.84444444]

mean value: 0.8838888888888888

key: train_roc_auc
value: [0.81176471 0.94131327 0.84117647 0.97072503 0.97072503 0.93009576
0.96491108 0.94753762 0.88365253 0.96484268]

mean value: 0.9226744186046512

key: test_jcc
value: [0.75 0.8 0.81818182 0.9 1. 0.83333333
0.72727273 0.76923077 0.76923077 0.72727273]

mean value: 0.8094522144522145

key: train_jcc
value: [0.72881356 0.89361702 0.76106195 0.94382022 0.94382022 0.87368421
0.93181818 0.90217391 0.80769231 0.93103448]

mean value: 0.8717536072778391

MCC on Blind test: 0.3

Accuracy on Blind test: 0.87

Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', AdaBoostClassifier(random_state=42))])

key: fit_time
value: [0.07930589 0.06574082 0.06760287 0.06530023 0.06639194 0.06540418
0.06537628 0.06978536 0.06504154 0.06514907]

mean value: 0.0675098180770874

key: score_time
value: [0.01435232 0.01375031 0.01374173 0.0138588 0.01387048 0.01396298
0.01380754 0.01467633 0.01412082 0.01378512]

mean value: 0.013992643356323243

key: test_mcc
value: [0.89893315 0.80507649 0.9 0.9 1. 0.9
0.9 0.89893315 1. 0.68888889]

mean value: 0.8891831674690281

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.94736842 0.89473684 0.94736842 0.94736842 1. 0.94736842
0.94736842 0.94736842 1. 0.84210526]

mean value: 0.9421052631578947

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.94117647 0.875 0.94736842 0.94736842 1. 0.94736842
0.94736842 0.95238095 1. 0.84210526]

mean value: 0.940013637033761

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [1. 1. 0.9 0.9 1. 1.
1. 0.90909091 1. 0.88888889]

mean value: 0.9597979797979798

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.88888889 0.77777778 1. 1. 1. 0.9
0.9 1. 1. 0.8 ]

mean value: 0.9266666666666666

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.94444444 0.88888889 0.95 0.95 1. 0.95
0.95 0.94444444 1. 0.84444444]

mean value: 0.9422222222222222

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.88888889 0.77777778 0.9 0.9 1. 0.9
0.9 0.90909091 1. 0.72727273]

mean value: 0.8903030303030303

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.2

Accuracy on Blind test: 0.8

Model_name: Bagging Classifier
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03435493 0.03850365 0.03137898 0.02753544 0.03600645 0.03681159
0.03525805 0.0235014 0.02443385 0.03159785]
|
|
|
|
mean value: 0.03193821907043457
|
|
|
|
key: score_time
|
|
value: [0.02038193 0.028126 0.01892233 0.02161765 0.02731395 0.02152586
0.0160749 0.01497722 0.01738143 0.03216863]
|
|
|
|
mean value: 0.02184898853302002
|
|
|
|
key: test_mcc
|
|
value: [1. 1. 0.9 0.9 1. 0.80903983
0.9 0.89893315 1. 0.80903983]
|
|
|
|
mean value: 0.9217012819862771
|
|
|
|
key: train_mcc
|
|
value: [0.97687783 0.96497948 0.97660739 0.9655126 0.97687783 0.97687158
0.98837051 0.96497948 0.98837051 1. ]
|
|
|
|
mean value: 0.9779447211260681
|
|
|
|
key: test_accuracy
|
|
value: [1. 1. 0.94736842 0.94736842 1. 0.89473684
0.94736842 0.94736842 1. 0.89473684]
|
|
|
|
mean value: 0.9578947368421052
|
|
|
|
key: train_accuracy
|
|
value: [0.98830409 0.98245614 0.98830409 0.98245614 0.98830409 0.98830409
0.99415205 0.98245614 0.99415205 1. ]
|
|
|
|
mean value: 0.9888888888888888
|
|
|
|
key: test_fscore
|
|
value: [1. 1. 0.94736842 0.94736842 1. 0.88888889
0.94736842 0.95238095 1. 0.88888889]
|
|
|
|
mean value: 0.9572263993316625
|
|
|
|
key: train_fscore
|
|
value: [0.98823529 0.98245614 0.98837209 0.98224852 0.98823529 0.98809524
0.99408284 0.98245614 0.99408284 1. ]
|
|
|
|
mean value: 0.9888264401238974
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.9 0.9 1. 1.
1. 0.90909091 1. 1. ]
|
|
|
|
mean value: 0.9709090909090909
|
|
|
|
key: train_precision
|
|
value: [1. 0.98823529 0.98837209 1. 1. 1.
1. 0.97674419 1. 1. ]
|
|
|
|
mean value: 0.9953351573187414
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 0.8 0.9 1. 1. 0.8]
|
|
|
|
mean value: 0.95
|
|
|
|
key: train_recall
|
|
value: [0.97674419 0.97674419 0.98837209 0.96511628 0.97674419 0.97647059
0.98823529 0.98823529 0.98823529 1. ]
|
|
|
|
mean value: 0.9824897400820793
|
|
|
|
key: test_roc_auc
|
|
value: [1. 1. 0.95 0.95 1. 0.9
0.95 0.94444444 1. 0.9 ]
|
|
|
|
mean value: 0.9594444444444444
|
|
|
|
key: train_roc_auc
|
|
value: [0.98837209 0.98248974 0.98830369 0.98255814 0.98837209 0.98823529
0.99411765 0.98248974 0.99411765 1. ]
|
|
|
|
mean value: 0.98890560875513
|
|
|
|
key: test_jcc
|
|
value: [1. 1. 0.9 0.9 1. 0.8
0.9 0.90909091 1. 0.8 ]
|
|
|
|
mean value: 0.9209090909090909
|
|
|
|
key: train_jcc
|
|
value: [0.97674419 0.96551724 0.97701149 0.96511628 0.97674419 0.97647059
0.98823529 0.96551724 0.98823529 1. ]
|
|
|
|
mean value: 0.9779591804644874
|
|
|
|
MCC on Blind test: 0.32
|
|
|
|
Accuracy on Blind test: 0.84
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03849268 0.04577231 0.04582977 0.04652309 0.0430398 0.0437479
0.04057932 0.04710174 0.04087782 0.04346371]
|
|
|
|
mean value: 0.04354281425476074
|
|
|
|
key: score_time
|
|
value: [0.01976991 0.01946115 0.01306677 0.01965237 0.02220845 0.02236199
0.02163577 0.02175426 0.02835393 0.01248932]
|
|
|
|
mean value: 0.020075392723083497
|
|
|
|
key: test_mcc
|
|
value: [0.4719399 0.78888889 0.47777778 0.25844328 0.47777778 0.59554321
0.26257545 0.62994079 0.89893315 0.47777778]
|
|
|
|
mean value: 0.5339598010514092
|
|
|
|
key: train_mcc
|
|
value: [0.9300862 0.88329458 0.89480164 0.90642955 0.90642955 0.91867501
0.89526317 0.91867501 0.89526317 0.91818307]
|
|
|
|
mean value: 0.9067100950674556
|
|
|
|
key: test_accuracy
|
|
value: [0.73684211 0.89473684 0.73684211 0.63157895 0.73684211 0.78947368
0.63157895 0.78947368 0.94736842 0.73684211]
|
|
|
|
mean value: 0.763157894736842
|
|
|
|
key: train_accuracy
|
|
value: [0.96491228 0.94152047 0.94736842 0.95321637 0.95321637 0.95906433
0.94736842 0.95906433 0.94736842 0.95906433]
|
|
|
|
mean value: 0.9532163742690059
|
|
|
|
key: test_fscore
|
|
value: [0.70588235 0.88888889 0.73684211 0.58823529 0.73684211 0.77777778
0.69565217 0.83333333 0.95238095 0.73684211]
|
|
|
|
mean value: 0.7652677089142292
|
|
|
|
key: train_fscore
|
|
value: [0.96470588 0.94117647 0.94736842 0.95348837 0.95348837 0.95808383
0.94610778 0.95808383 0.94610778 0.95857988]
|
|
|
|
mean value: 0.9527190633369593
|
|
|
|
key: test_precision
|
|
value: [0.75 0.88888889 0.7 0.625 0.7 0.875
0.61538462 0.71428571 0.90909091 0.77777778]
|
|
|
|
mean value: 0.7555427905427905
|
|
|
|
key: train_precision
|
|
value: [0.97619048 0.95238095 0.95294118 0.95348837 0.95348837 0.97560976
0.96341463 0.97560976 0.96341463 0.96428571]
|
|
|
|
mean value: 0.9630823844001583
|
|
|
|
key: test_recall
|
|
value: [0.66666667 0.88888889 0.77777778 0.55555556 0.77777778 0.7
0.8 1. 1. 0.7 ]
|
|
|
|
mean value: 0.7866666666666666
|
|
|
|
key: train_recall
|
|
value: [0.95348837 0.93023256 0.94186047 0.95348837 0.95348837 0.94117647
0.92941176 0.94117647 0.92941176 0.95294118]
|
|
|
|
mean value: 0.9426675786593708
|
|
|
|
key: test_roc_auc
|
|
value: [0.73333333 0.89444444 0.73888889 0.62777778 0.73888889 0.79444444
0.62222222 0.77777778 0.94444444 0.73888889]
|
|
|
|
mean value: 0.7611111111111111
|
|
|
|
key: train_roc_auc
|
|
value: [0.96497948 0.94158687 0.94740082 0.95321477 0.95321477 0.95896033
0.94726402 0.95896033 0.94726402 0.95902873]
|
|
|
|
mean value: 0.953187414500684
|
|
|
|
key: test_jcc
|
|
value: [0.54545455 0.8 0.58333333 0.41666667 0.58333333 0.63636364
0.53333333 0.71428571 0.90909091 0.58333333]
|
|
|
|
mean value: 0.6305194805194805
|
|
|
|
key: train_jcc
|
|
value: [0.93181818 0.88888889 0.9 0.91111111 0.91111111 0.91954023
0.89772727 0.91954023 0.89772727 0.92045455]
|
|
|
|
mean value: 0.9097918843608499
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.67
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.12574768 0.10982108 0.10891128 0.11000681 0.10834146 0.1092
0.10903215 0.11250973 0.11054492 0.10791087]
|
|
|
|
mean value: 0.11120259761810303
|
|
|
|
key: score_time
|
|
value: [0.00880623 0.00850511 0.00842857 0.00843477 0.00855279 0.00872946
0.00836015 0.00840807 0.00873184 0.0089736 ]
|
|
|
|
mean value: 0.008593058586120606
|
|
|
|
key: test_mcc
|
|
value: [0.9 1. 0.9 0.9 1. 0.80903983
0.68888889 0.80507649 1. 0.68888889]
|
|
|
|
mean value: 0.8691894098633082
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 1. 0.94736842 0.94736842 1. 0.89473684
0.84210526 0.89473684 1. 0.84210526]
|
|
|
|
mean value: 0.9315789473684211
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94736842 1. 0.94736842 0.94736842 1. 0.88888889
0.84210526 0.90909091 1. 0.84210526]
|
|
|
|
mean value: 0.9324295587453483
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.9 1. 0.9 0.9 1. 1.
0.88888889 0.83333333 1. 0.88888889]
|
|
|
|
mean value: 0.9311111111111111
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 0.8 0.8 1. 1. 0.8]
|
|
|
|
mean value: 0.9400000000000001
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.95 1. 0.95 0.95 1. 0.9
0.84444444 0.88888889 1. 0.84444444]
|
|
|
|
mean value: 0.9327777777777778
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.9 1. 0.9 0.9 1. 0.8
0.72727273 0.83333333 1. 0.72727273]
|
|
|
|
mean value: 0.8787878787878788
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
  warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
  warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
  warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
  warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
  warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
  warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
  warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
  warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
  warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
  warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
  warnings.warn("Variables are collinear")
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.00977683 0.01165676 0.01222324 0.01371932 0.01175737 0.01175189
0.01182127 0.01176047 0.01254773 0.01222157]
|
|
|
|
mean value: 0.011923646926879883
|
|
|
|
key: score_time
|
|
value: [0.01100063 0.01107335 0.01077461 0.01081491 0.0108037 0.01076961
0.01068234 0.01070929 0.01347923 0.01104093]
|
|
|
|
mean value: 0.011114859580993652
|
|
|
|
key: test_mcc
|
|
value: [0.59554321 0.45643546 0.80903983 0.54433105 0.38204659 0.56694671
0.36666667 0.25844328 0.71611487 0.48934516]
|
|
|
|
mean value: 0.5184912848824843
|
|
|
|
key: train_mcc
|
|
value: [0.94158687 0.88403644 0.97687783 0.68754923 0.95321477 0.77792524
0.82502766 0.89967314 0.77850962 0.76887959]
|
|
|
|
mean value: 0.8493280396728016
|
|
|
|
key: test_accuracy
|
|
value: [0.78947368 0.68421053 0.89473684 0.73684211 0.68421053 0.73684211
0.68421053 0.63157895 0.84210526 0.73684211]
|
|
|
|
mean value: 0.7421052631578947
|
|
|
|
key: train_accuracy
|
|
value: [0.97076023 0.94152047 0.98830409 0.8245614 0.97660819 0.87719298
0.9122807 0.94736842 0.87719298 0.87134503]
|
|
|
|
mean value: 0.9187134502923976
|
|
|
|
key: test_fscore
|
|
value: [0.8 0.5 0.9 0.61538462 0.7 0.66666667
0.7 0.66666667 0.86956522 0.7826087 ]
|
|
|
|
mean value: 0.7200891861761427
|
|
|
|
key: train_fscore
|
|
value: [0.97076023 0.94047619 0.98823529 0.79166667 0.97674419 0.8590604
0.91017964 0.94409938 0.89005236 0.88541667]
|
|
|
|
mean value: 0.9156691016197868
|
|
|
|
key: test_precision
|
|
value: [0.72727273 1. 0.81818182 1. 0.63636364 1.
0.7 0.63636364 0.76923077 0.69230769]
|
|
|
|
mean value: 0.7979720279720279
|
|
|
|
key: train_precision
|
|
value: [0.97647059 0.96341463 1. 0.98275862 0.97674419 1.
0.92682927 1. 0.80188679 0.79439252]
|
|
|
|
mean value: 0.9422496613227801
|
|
|
|
key: test_recall
|
|
value: [0.88888889 0.33333333 1. 0.44444444 0.77777778 0.5
0.7 0.7 1. 0.9 ]
|
|
|
|
mean value: 0.7244444444444444
|
|
|
|
key: train_recall
|
|
value: [0.96511628 0.91860465 0.97674419 0.6627907 0.97674419 0.75294118
0.89411765 0.89411765 1. 1. ]
|
|
|
|
mean value: 0.9041176470588235
|
|
|
|
key: test_roc_auc
|
|
value: [0.79444444 0.66666667 0.9 0.72222222 0.68888889 0.75
0.68333333 0.62777778 0.83333333 0.72777778]
|
|
|
|
mean value: 0.7394444444444445
|
|
|
|
key: train_roc_auc
|
|
value: [0.97079343 0.94165527 0.98837209 0.825513 0.97660739 0.87647059
0.9121751 0.94705882 0.87790698 0.87209302]
|
|
|
|
mean value: 0.9188645690834473
|
|
|
|
key: test_jcc
|
|
value: [0.66666667 0.33333333 0.81818182 0.44444444 0.53846154 0.5
0.53846154 0.5 0.76923077 0.64285714]
|
|
|
|
mean value: 0.5751637251637252
|
|
|
|
key: train_jcc
|
|
value: [0.94318182 0.88764045 0.97674419 0.65517241 0.95454545 0.75294118
0.83516484 0.89411765 0.80188679 0.79439252]
|
|
|
|
mean value: 0.8495787296516654
|
|
|
|
MCC on Blind test: 0.23
|
|
|
|
Accuracy on Blind test: 0.77
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01883245 0.02752137 0.02750969 0.0275383 0.02753377 0.0268929
0.02757883 0.02765822 0.0218935 0.0287528 ]
|
|
|
|
mean value: 0.026171183586120604
|
|
|
|
key: score_time
|
|
value: [0.01898217 0.01984358 0.01071167 0.01931334 0.0211885 0.02128792
0.0107007 0.01979136 0.02170277 0.02104163]
|
|
|
|
mean value: 0.018456363677978517
|
|
|
|
key: test_mcc
|
|
value: [0.89893315 0.89893315 0.9 0.57777778 0.89893315 1.
0.58655573 0.68543653 1. 0.72456884]
|
|
|
|
mean value: 0.8171138317218849
|
|
|
|
key: train_mcc
|
|
value: [0.91819425 0.92982216 0.91870817 0.91870817 0.91819425 0.90666492
0.88303694 0.90739811 0.90666492 0.92982216]
|
|
|
|
mean value: 0.9137214046919647
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.94736842 0.94736842 0.78947368 0.94736842 1.
0.78947368 0.84210526 1. 0.84210526]
|
|
|
|
mean value: 0.9052631578947368
|
|
|
|
key: train_accuracy
|
|
value: [0.95906433 0.96491228 0.95906433 0.95906433 0.95906433 0.95321637
0.94152047 0.95321637 0.95321637 0.96491228]
|
|
|
|
mean value: 0.9567251461988304
|
|
|
|
key: test_fscore
|
|
value: [0.94117647 0.94117647 0.94736842 0.77777778 0.94117647 1.
0.81818182 0.85714286 1. 0.82352941]
|
|
|
|
mean value: 0.9047529697684497
|
|
|
|
key: train_fscore
|
|
value: [0.95906433 0.96511628 0.95857988 0.95857988 0.95906433 0.95238095
0.94117647 0.95180723 0.95238095 0.96470588]
|
|
|
|
mean value: 0.9562856183972881
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.9 0.77777778 1. 1.
0.75 0.81818182 1. 1. ]
|
|
|
|
mean value: 0.9245959595959596
|
|
|
|
key: train_precision
|
|
value: [0.96470588 0.96511628 0.97590361 0.97590361 0.96470588 0.96385542
0.94117647 0.97530864 0.96385542 0.96470588]
|
|
|
|
mean value: 0.9655237110981292
|
|
|
|
key: test_recall
|
|
value: [0.88888889 0.88888889 1. 0.77777778 0.88888889 1.
0.9 0.9 1. 0.7 ]
|
|
|
|
mean value: 0.8944444444444444
|
|
|
|
key: train_recall
|
|
value: [0.95348837 0.96511628 0.94186047 0.94186047 0.95348837 0.94117647
0.94117647 0.92941176 0.94117647 0.96470588]
|
|
|
|
mean value: 0.9473461012311901
|
|
|
|
key: test_roc_auc
|
|
value: [0.94444444 0.94444444 0.95 0.78888889 0.94444444 1.
0.78333333 0.83888889 1. 0.85 ]
|
|
|
|
mean value: 0.9044444444444445
|
|
|
|
key: train_roc_auc
|
|
value: [0.95909713 0.96491108 0.95916553 0.95916553 0.95909713 0.95314637
0.94151847 0.95307798 0.95314637 0.96491108]
|
|
|
|
mean value: 0.9567236662106703
|
|
|
|
key: test_jcc
|
|
value: [0.88888889 0.88888889 0.9 0.63636364 0.88888889 1.
0.69230769 0.75 1. 0.7 ]
|
|
|
|
mean value: 0.8345337995337995
|
|
|
|
key: train_jcc
|
|
value: [0.92134831 0.93258427 0.92045455 0.92045455 0.92134831 0.90909091
0.88888889 0.90804598 0.90909091 0.93181818]
|
|
|
|
mean value: 0.9163124855685878
|
|
|
|
MCC on Blind test: 0.19
|
|
|
|
Accuracy on Blind test: 0.76
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_config.py:183: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  rus_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_config.py:186: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  rus_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.19279718 0.17178965 0.18140745 0.1711936 0.17231464 0.17179489
|
|
0.17664242 0.17488527 0.17687273 0.20818853]
|
|
|
|
mean value: 0.17978863716125487
|
|
|
|
key: score_time
|
|
value: [0.01818156 0.01183128 0.01928973 0.02106404 0.02174473 0.02162886
|
|
0.01080251 0.02007103 0.02017498 0.02091074]
|
|
|
|
mean value: 0.0185699462890625
|
|
|
|
key: test_mcc
|
|
value: [1. 0.71611487 0.80903983 0.9 0.89893315 0.89893315
|
|
0.80903983 0.78888889 1. 0.9 ]
|
|
|
|
mean value: 0.8720949732742082
|
|
|
|
key: train_mcc
|
|
value: [0.92982216 0.94157888 0.9300862 0.9300862 0.94157888 0.92982216
|
|
0.94158687 0.9649747 0.94158687 0.94158687]
|
|
|
|
mean value: 0.9392709798490188
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.84210526 0.89473684 0.94736842 0.94736842 0.94736842
|
|
0.89473684 0.89473684 1. 0.94736842]
|
|
|
|
mean value: 0.9315789473684211
|
|
|
|
key: train_accuracy
|
|
value: [0.96491228 0.97076023 0.96491228 0.96491228 0.97076023 0.96491228
|
|
0.97076023 0.98245614 0.97076023 0.97076023]
|
|
|
|
mean value: 0.9695906432748538
|
|
|
|
key: test_fscore
|
|
value: [1. 0.8 0.9 0.94736842 0.94117647 0.95238095
|
|
0.88888889 0.9 1. 0.94736842]
|
|
|
|
mean value: 0.927718315396334
|
|
|
|
key: train_fscore
|
|
value: [0.96511628 0.97109827 0.96470588 0.96470588 0.97109827 0.96470588
|
|
0.97076023 0.98224852 0.97076023 0.97076023]
|
|
|
|
mean value: 0.9695959680384943
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.81818182 0.9 1. 0.90909091
|
|
1. 0.9 1. 1. ]
|
|
|
|
mean value: 0.9527272727272728
|
|
|
|
key: train_precision
|
|
value: [0.96511628 0.96551724 0.97619048 0.97619048 0.96551724 0.96470588
|
|
0.96511628 0.98809524 0.96511628 0.96511628]
|
|
|
|
mean value: 0.9696681671866823
|
|
|
|
key: test_recall
|
|
value: [1. 0.66666667 1. 1. 0.88888889 1.
|
|
0.8 0.9 1. 0.9 ]
|
|
|
|
mean value: 0.9155555555555556
|
|
|
|
key: train_recall
|
|
value: [0.96511628 0.97674419 0.95348837 0.95348837 0.97674419 0.96470588
|
|
0.97647059 0.97647059 0.97647059 0.97647059]
|
|
|
|
mean value: 0.9696169630642955
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.83333333 0.9 0.95 0.94444444 0.94444444
|
|
0.9 0.89444444 1. 0.95 ]
|
|
|
|
mean value: 0.9316666666666666
|
|
|
|
key: train_roc_auc
|
|
value: [0.96491108 0.97072503 0.96497948 0.96497948 0.97072503 0.96491108
|
|
0.97079343 0.98242134 0.97079343 0.97079343]
|
|
|
|
mean value: 0.9696032831737347
|
|
|
|
key: test_jcc
|
|
value: [1. 0.66666667 0.81818182 0.9 0.88888889 0.90909091
|
|
0.8 0.81818182 1. 0.9 ]
|
|
|
|
mean value: 0.8701010101010102
|
|
|
|
key: train_jcc
|
|
value: [0.93258427 0.94382022 0.93181818 0.93181818 0.94382022 0.93181818
|
|
0.94318182 0.96511628 0.94318182 0.94318182]
|
|
|
|
mean value: 0.9410340998170891
|
|
|
|
MCC on Blind test: 0.2
|
|
|
|
Accuracy on Blind test: 0.78
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02805471 0.02848077 0.05038261 0.0256362 0.0284946 0.02617216
|
|
0.02909684 0.03196239 0.02643657 0.03116679]
|
|
|
|
mean value: 0.030588364601135253
|
|
|
|
key: score_time
|
|
value: [0.01084852 0.01086903 0.01122189 0.01082182 0.01088285 0.01087236
|
|
0.01082659 0.010993 0.01088881 0.01089334]
|
|
|
|
mean value: 0.010911822319030762
|
|
|
|
key: test_mcc
|
|
value: [0.83214239 0.91580648 0.85952381 0.86205133 0.9186708 0.77460317
|
|
0.77269114 0.91465912 0.80829038 0.94440028]
|
|
|
|
mean value: 0.8602838910751969
|
|
|
|
key: train_mcc
|
|
value: [0.88350545 0.88047545 0.88677561 0.89921235 0.90558158 0.88987113
|
|
0.87117688 0.88686262 0.88065992 0.86512643]
|
|
|
|
mean value: 0.8849247400940407
|
|
|
|
key: test_accuracy
|
|
value: [0.91549296 0.95774648 0.92957746 0.92957746 0.95774648 0.88732394
|
|
0.88571429 0.95714286 0.9 0.97142857]
|
|
|
|
mean value: 0.9291750503018108
|
|
|
|
key: train_accuracy
|
|
value: [0.94173228 0.94015748 0.94330709 0.9496063 0.95275591 0.94488189
|
|
0.93553459 0.94339623 0.94025157 0.93238994]
|
|
|
|
mean value: 0.9424013271925915
|
|
|
|
key: test_fscore
|
|
value: [0.91176471 0.95652174 0.92957746 0.93333333 0.96 0.88888889
|
|
0.88235294 0.95774648 0.89230769 0.97058824]
|
|
|
|
mean value: 0.9283081479675263
|
|
|
|
key: train_fscore
|
|
value: [0.94154818 0.93968254 0.94285714 0.94952681 0.95238095 0.94435612
|
|
0.93502377 0.94303797 0.93968254 0.93141946]
|
|
|
|
mean value: 0.9419515496773954
|
|
|
|
key: test_precision
|
|
value: [0.93939394 0.97058824 0.91666667 0.8974359 0.92307692 0.88888889
|
|
0.90909091 0.94444444 0.96666667 1. ]
|
|
|
|
mean value: 0.9356252570958453
|
|
|
|
key: train_precision
|
|
value: [0.94603175 0.94871795 0.95192308 0.94952681 0.95846645 0.95192308
|
|
0.94249201 0.94904459 0.94871795 0.94498382]
|
|
|
|
mean value: 0.9491827482405085
|
|
|
|
key: test_recall
|
|
value: [0.88571429 0.94285714 0.94285714 0.97222222 1. 0.88888889
|
|
0.85714286 0.97142857 0.82857143 0.94285714]
|
|
|
|
mean value: 0.9232539682539682
|
|
|
|
key: train_recall
|
|
value: [0.93710692 0.93081761 0.93396226 0.94952681 0.94637224 0.93690852
|
|
0.92767296 0.93710692 0.93081761 0.91823899]
|
|
|
|
mean value: 0.934853084141817
|
|
|
|
key: test_roc_auc
|
|
value: [0.91507937 0.95753968 0.9297619 0.92896825 0.95714286 0.88730159
|
|
0.88571429 0.95714286 0.9 0.97142857]
|
|
|
|
mean value: 0.9290079365079364
|
|
|
|
key: train_roc_auc
|
|
value: [0.94173958 0.94017221 0.94332183 0.94960617 0.95274587 0.94486935
|
|
0.93553459 0.94339623 0.94025157 0.93238994]
|
|
|
|
mean value: 0.9424027339642481
|
|
|
|
key: test_jcc
|
|
value: [0.83783784 0.91666667 0.86842105 0.875 0.92307692 0.8
|
|
0.78947368 0.91891892 0.80555556 0.94285714]
|
|
|
|
mean value: 0.867780778175515
|
|
|
|
key: train_jcc
|
|
value: [0.88955224 0.88622754 0.89189189 0.9039039 0.90909091 0.89457831
|
|
0.87797619 0.89221557 0.88622754 0.87164179]
|
|
|
|
mean value: 0.8903305897149288
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.79
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegressionCV(random_state=42))])

key: fit_time
value: [0.75583076 0.91039634 0.74130368 0.72916102 0.87464905 0.73867345 0.78031874 0.86983085 0.73920441 0.86410165]

mean value: 0.8003469944000244

key: score_time
value: [0.0140388 0.01402926 0.01415515 0.01423001 0.01435184 0.01453185 0.01459599 0.01456857 0.01456761 0.01473331]

mean value: 0.014380240440368652

key: test_mcc
value: [0.88880092 0.97220047 0.91885703 0.9186708 0.83240693 0.94365079 0.94440028 0.91766294 0.94285714 0.97182532]

mean value: 0.9251332620272206

key: train_mcc
value: [0.96881022 0.95928679 0.96558776 0.96559014 0.96228175 0.96574383 0.96564279 0.9625688 0.96243548 0.95935195]

mean value: 0.9637299511779713

key: test_accuracy
value: [0.94366197 0.98591549 0.95774648 0.95774648 0.91549296 0.97183099 0.97142857 0.95714286 0.97142857 0.98571429]

mean value: 0.9618108651911469

key: train_accuracy
value: [0.98425197 0.97952756 0.98267717 0.98267717 0.98110236 0.98267717 0.9827044 0.98113208 0.98113208 0.97955975]

mean value: 0.9817441687713564

key: test_fscore
value: [0.94444444 0.98550725 0.95890411 0.96 0.91428571 0.97222222 0.97222222 0.95890411 0.97142857 0.98550725]

mean value: 0.962342588653488

key: train_fscore
value: [0.98447205 0.97978227 0.98289269 0.98283931 0.98119122 0.98289269 0.98289269 0.98136646 0.98130841 0.97978227]

mean value: 0.981942006942752

key: test_precision
value: [0.91891892 1. 0.92105263 0.92307692 0.94117647 0.97222222 0.94594595 0.92105263 0.97142857 1. ]

mean value: 0.9514874315338712

key: train_precision
value: [0.97239264 0.96923077 0.97230769 0.97222222 0.97507788 0.96932515 0.97230769 0.96932515 0.97222222 0.96923077]

mean value: 0.9713642193926582

key: test_recall
value: [0.97142857 0.97142857 1. 1. 0.88888889 0.97222222 1. 1. 0.97142857 0.97142857]

mean value: 0.9746825396825397

key: train_recall
value: [0.99685535 0.99056604 0.99371069 0.99369085 0.9873817 0.99684543 0.99371069 0.99371069 0.99056604 0.99056604]

mean value: 0.992760351566375

key: test_roc_auc
value: [0.94404762 0.98571429 0.95833333 0.95714286 0.91587302 0.9718254 0.97142857 0.95714286 0.97142857 0.98571429]

mean value: 0.9618650793650794

key: train_roc_auc
value: [0.98423209 0.97951015 0.98265976 0.98269448 0.98111224 0.98269944 0.9827044 0.98113208 0.98113208 0.97955975]

mean value: 0.981743646211535

key: test_jcc
value: [0.89473684 0.97142857 0.92105263 0.92307692 0.84210526 0.94594595 0.94594595 0.92105263 0.94444444 0.97142857]

mean value: 0.9281217770691454

key: train_jcc
value: [0.96941896 0.96036585 0.96636086 0.96625767 0.96307692 0.96636086 0.96636086 0.96341463 0.96330275 0.96036585]

mean value: 0.964528521459756

MCC on Blind test: 0.25

Accuracy on Blind test: 0.81

Model_name: Gaussian NB
Model func: GaussianNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GaussianNB())])

key: fit_time
value: [0.01126242 0.01005054 0.00817871 0.00798202 0.00799775 0.00875211 0.00785851 0.00802612 0.00777888 0.00775719]

mean value: 0.008564424514770509

key: score_time
value: [0.01088428 0.00858426 0.00835347 0.00822639 0.00815606 0.00880837 0.00810361 0.00805187 0.00803757 0.00845098]

mean value: 0.008565688133239746

key: test_mcc
value: [0.83095238 0.77991323 0.88880092 0.83214239 0.78542356 0.63412698 0.47304992 0.72501849 0.7581754 0.80032673]

mean value: 0.7507929990980453

key: train_mcc
value: [0.78353551 0.76086142 0.77667955 0.77992485 0.75779503 0.77726182 0.66166919 0.76767526 0.77154353 0.76445717]

mean value: 0.7601403313212555

key: test_accuracy
value: [0.91549296 0.88732394 0.94366197 0.91549296 0.88732394 0.81690141 0.72857143 0.85714286 0.87142857 0.9 ]

mean value: 0.8723340040241448

key: train_accuracy
value: [0.89133858 0.88031496 0.88818898 0.88976378 0.87874016 0.88818898 0.82704403 0.8836478 0.88522013 0.88207547]

mean value: 0.8794522854454514

key: test_fscore
value: [0.91428571 0.89189189 0.94444444 0.91891892 0.8974359 0.81690141 0.68852459 0.86842105 0.85714286 0.90140845]

mean value: 0.8699375226070167

key: train_fscore
value: [0.89400922 0.88198758 0.88992248 0.89130435 0.88024883 0.89060092 0.81292517 0.88544892 0.88820827 0.88372093]

mean value: 0.8798376667002142

key: test_precision
value: [0.91428571 0.84615385 0.91891892 0.89473684 0.83333333 0.82857143 0.80769231 0.80487805 0.96428571 0.88888889]

mean value: 0.8701745043015903

key: train_precision
value: [0.87387387 0.87116564 0.87767584 0.87767584 0.86809816 0.87048193 0.88518519 0.87195122 0.86567164 0.87155963]

mean value: 0.8733338966738833

key: test_recall
value: [0.91428571 0.94285714 0.97142857 0.94444444 0.97222222 0.80555556 0.6 0.94285714 0.77142857 0.91428571]

mean value: 0.8779365079365079

key: train_recall
value: [0.91509434 0.89308176 0.90251572 0.90536278 0.89274448 0.91167192 0.75157233 0.89937107 0.91194969 0.89622642]

mean value: 0.8879590500565443

key: test_roc_auc
value: [0.91547619 0.88809524 0.94404762 0.91507937 0.88611111 0.81706349 0.72857143 0.85714286 0.87142857 0.9 ]

mean value: 0.8723015873015872

key: train_roc_auc
value: [0.89130111 0.88029482 0.88816638 0.88978831 0.87876218 0.8882259 0.82704403 0.8836478 0.88522013 0.88207547]

mean value: 0.8794526119477015

key: test_jcc
value: [0.84210526 0.80487805 0.89473684 0.85 0.81395349 0.69047619 0.525 0.76744186 0.75 0.82051282]

mean value: 0.7759104513869866

key: train_jcc
value: [0.80833333 0.78888889 0.80167598 0.80392157 0.78611111 0.80277778 0.68481375 0.79444444 0.79889807 0.79166667]

mean value: 0.786153159371031

MCC on Blind test: 0.28

Accuracy on Blind test: 0.76

Model_name: Naive Bayes
Model func: BernoulliNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', BernoulliNB())])

key: fit_time
value: [0.00870681 0.00830054 0.00821733 0.00855494 0.00802279 0.00820494 0.00809073 0.00801396 0.0081532 0.0081017 ]

mean value: 0.0082366943359375

key: score_time
value: [0.00878096 0.00862241 0.00818205 0.0089016 0.00812292 0.00837469 0.00819492 0.00814867 0.00829911 0.00819397]

mean value: 0.008382129669189452

key: test_mcc
value: [0.69047619 0.63383658 0.69762232 0.60881948 0.53699395 0.52233453 0.48891771 0.7581754 0.65821838 0.60901553]

mean value: 0.6204410069366437

key: train_mcc
value: [0.60949181 0.61718891 0.63374209 0.58769936 0.60523202 0.64225486 0.62174197 0.61279592 0.60729861 0.63335019]

mean value: 0.6170795744074016

key: test_accuracy
value: [0.84507042 0.81690141 0.84507042 0.8028169 0.76056338 0.76056338 0.74285714 0.87142857 0.82857143 0.8 ]

mean value: 0.8073843058350101

key: train_accuracy
value: [0.8015748 0.80787402 0.81574803 0.79212598 0.8015748 0.82047244 0.80974843 0.80503145 0.80031447 0.81289308]

mean value: 0.8067357500123805

key: test_fscore
value: [0.84507042 0.8115942 0.85333333 0.81578947 0.79012346 0.77333333 0.75675676 0.88311688 0.83333333 0.81578947]

mean value: 0.8178240669465947

key: train_fscore
value: [0.81524927 0.81458967 0.82352941 0.80239521 0.80909091 0.82568807 0.81749623 0.81381381 0.81405564 0.82627737]

mean value: 0.8162185588580184

key: test_precision
value: [0.83333333 0.82352941 0.8 0.775 0.71111111 0.74358974 0.71794872 0.80952381 0.81081081 0.75609756]

mean value: 0.7780944499057842

key: train_precision
value: [0.76373626 0.78823529 0.79130435 0.76353276 0.77842566 0.80118694 0.78550725 0.77873563 0.76164384 0.77111717]

mean value: 0.7783425149199308

key: test_recall
value: [0.85714286 0.8 0.91428571 0.86111111 0.88888889 0.80555556 0.8 0.97142857 0.85714286 0.88571429]

mean value: 0.8641269841269841

key: train_recall
value: [0.87421384 0.8427673 0.85849057 0.84542587 0.84227129 0.85173502 0.85220126 0.85220126 0.87421384 0.88993711]

mean value: 0.8583457333888856

key: test_roc_auc
value: [0.8452381 0.81666667 0.84603175 0.80198413 0.75873016 0.75992063 0.74285714 0.87142857 0.82857143 0.8 ]

mean value: 0.8071428571428572

key: train_roc_auc
value: [0.80146023 0.80781898 0.81568061 0.79220979 0.80163879 0.8205216 0.80974843 0.80503145 0.80031447 0.81289308]

mean value: 0.8067317421582049

key: test_jcc
value: [0.73170732 0.68292683 0.74418605 0.68888889 0.65306122 0.63043478 0.60869565 0.79069767 0.71428571 0.68888889]

mean value: 0.6933773018607593

key: train_jcc
value: [0.68811881 0.68717949 0.7 0.67 0.67938931 0.703125 0.69132653 0.68607595 0.68641975 0.7039801 ]

mean value: 0.6895614944606016

MCC on Blind test: 0.22

Accuracy on Blind test: 0.63

Model_name: K-Nearest Neighbors
Model func: KNeighborsClassifier()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.00879741 0.00835109 0.00783396 0.00787377 0.00823879 0.00841093
|
|
0.00836062 0.00857162 0.00833774 0.00834942]
|
|
|
|
mean value: 0.008312535285949708
|
|
|
|
key: score_time
|
|
value: [0.01278543 0.01182032 0.0121007 0.01211429 0.01177168 0.01202226
|
|
0.01196551 0.01235938 0.01185727 0.01215744]
|
|
|
|
mean value: 0.012095427513122559
|
|
|
|
key: test_mcc
|
|
value: [0.7468254 0.81122596 0.78640246 0.56233478 0.72811105 0.69047619
|
|
0.65714286 0.80829038 0.85749293 0.72501849]
|
|
|
|
mean value: 0.7373320483363375
|
|
|
|
key: train_mcc
|
|
value: [0.81913455 0.82781019 0.80765457 0.82304906 0.81230309 0.83247297
|
|
0.84055828 0.80334707 0.80678833 0.83153352]
|
|
|
|
mean value: 0.82046516428445
|
|
|
|
key: test_accuracy
|
|
value: [0.87323944 0.90140845 0.88732394 0.77464789 0.85915493 0.84507042
|
|
0.82857143 0.9 0.92857143 0.85714286]
|
|
|
|
mean value: 0.8655130784708249
|
|
|
|
key: train_accuracy
|
|
value: [0.90708661 0.91181102 0.9007874 0.91023622 0.90393701 0.91338583
|
|
0.91981132 0.89937107 0.9009434 0.91352201]
|
|
|
|
mean value: 0.9080891893230327
|
|
|
|
key: test_fscore
|
|
value: [0.87323944 0.90666667 0.89473684 0.8 0.87179487 0.84507042
|
|
0.82857143 0.90666667 0.92957746 0.86842105]
|
|
|
|
mean value: 0.8724744852380137
|
|
|
|
key: train_fscore
|
|
value: [0.91207154 0.91616766 0.90666667 0.91350531 0.90854573 0.91803279
|
|
0.92165899 0.90447761 0.90611028 0.91778774]
|
|
|
|
mean value: 0.9125024315633475
|
|
|
|
key: test_precision
|
|
value: [0.86111111 0.85 0.82926829 0.72727273 0.80952381 0.85714286
|
|
0.82857143 0.85 0.91666667 0.80487805]
|
|
|
|
mean value: 0.8334434941752015
|
|
|
|
key: train_precision
|
|
value: [0.86685552 0.87428571 0.85714286 0.88011696 0.86571429 0.8700565
|
|
0.9009009 0.86079545 0.8611898 0.87464387]
|
|
|
|
mean value: 0.8711701869251592
|
|
|
|
key: test_recall
|
|
value: [0.88571429 0.97142857 0.97142857 0.88888889 0.94444444 0.83333333
|
|
0.82857143 0.97142857 0.94285714 0.94285714]
|
|
|
|
mean value: 0.9180952380952381
|
|
|
|
key: train_recall
|
|
value: [0.96226415 0.96226415 0.96226415 0.94952681 0.95583596 0.97160883
|
|
0.94339623 0.95283019 0.95597484 0.96540881]
|
|
|
|
mean value: 0.9581374124556078
|
|
|
|
key: test_roc_auc
|
|
value: [0.8734127 0.90238095 0.88849206 0.77301587 0.85793651 0.8452381
|
|
0.82857143 0.9 0.92857143 0.85714286]
|
|
|
|
mean value: 0.8654761904761905
|
|
|
|
key: train_roc_auc
|
|
value: [0.90699958 0.91173144 0.90069044 0.910298 0.90401861 0.91347737
|
|
0.91981132 0.89937107 0.9009434 0.91352201]
|
|
|
|
mean value: 0.9080863242267325
|
|
|
|
key: test_jcc
|
|
value: [0.775 0.82926829 0.80952381 0.66666667 0.77272727 0.73170732
|
|
0.70731707 0.82926829 0.86842105 0.76744186]
|
|
|
|
mean value: 0.77573416376242
|
|
|
|
key: train_jcc
|
|
value: [0.83835616 0.84530387 0.82926829 0.84078212 0.83241758 0.84848485
|
|
0.85470085 0.82561308 0.82833787 0.8480663 ]
|
|
|
|
mean value: 0.8391330984999132
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.74
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02019525 0.02194238 0.02207088 0.01722932 0.0174396 0.01698589
|
|
0.0200386 0.01803017 0.01915526 0.01805472]
|
|
|
|
mean value: 0.019114208221435548
|
|
|
|
key: score_time
|
|
value: [0.00993657 0.0108788 0.01091743 0.00970459 0.00995064 0.00961328
|
|
0.00987148 0.01081896 0.00993538 0.00960827]
|
|
|
|
mean value: 0.010123538970947265
|
|
|
|
key: test_mcc
|
|
value: [0.8594125 0.89315217 0.9451949 0.8594125 0.9186708 0.72329377
|
|
0.77142857 0.860309 0.8340361 0.82992752]
|
|
|
|
mean value: 0.8494837832110852
|
|
|
|
key: train_mcc
|
|
value: [0.89928172 0.88663036 0.88976737 0.88987659 0.88663261 0.88661389
|
|
0.874283 0.87739754 0.88368712 0.88695034]
|
|
|
|
mean value: 0.8861120546487431
|
|
|
|
key: test_accuracy
|
|
value: [0.92957746 0.94366197 0.97183099 0.92957746 0.95774648 0.85915493
|
|
0.88571429 0.92857143 0.91428571 0.91428571]
|
|
|
|
mean value: 0.923440643863179
|
|
|
|
key: train_accuracy
|
|
value: [0.9496063 0.94330709 0.94488189 0.94488189 0.94330709 0.94330709
|
|
0.93710692 0.93867925 0.9418239 0.94339623]
|
|
|
|
mean value: 0.9430297627890853
|
|
|
|
key: test_fscore
|
|
value: [0.92753623 0.94594595 0.97222222 0.93150685 0.96 0.85294118
|
|
0.88571429 0.93150685 0.90909091 0.91666667]
|
|
|
|
mean value: 0.9233131136624813
|
|
|
|
key: train_fscore
|
|
value: [0.95 0.94357367 0.94505495 0.94522692 0.94339623 0.94321767
|
|
0.9375 0.93838863 0.94209703 0.94392523]
|
|
|
|
mean value: 0.9432380307696029
|
|
|
|
key: test_precision
|
|
value: [0.94117647 0.8974359 0.94594595 0.91891892 0.92307692 0.90625
|
|
0.88571429 0.89473684 0.96774194 0.89189189]
|
|
|
|
mean value: 0.9172889111161232
|
|
|
|
key: train_precision
|
|
value: [0.94409938 0.940625 0.94357367 0.9378882 0.94043887 0.94321767
|
|
0.93167702 0.94285714 0.9376947 0.93518519]
|
|
|
|
mean value: 0.9397256833165559
|
|
|
|
key: test_recall
|
|
value: [0.91428571 1. 1. 0.94444444 1. 0.80555556
|
|
0.88571429 0.97142857 0.85714286 0.94285714]
|
|
|
|
mean value: 0.9321428571428572
|
|
|
|
key: train_recall
|
|
value: [0.95597484 0.94654088 0.94654088 0.95268139 0.94637224 0.94321767
|
|
0.94339623 0.93396226 0.94654088 0.95283019]
|
|
|
|
mean value: 0.9468057456897407
|
|
|
|
key: test_roc_auc
|
|
value: [0.92936508 0.94444444 0.97222222 0.92936508 0.95714286 0.85992063
|
|
0.88571429 0.92857143 0.91428571 0.91428571]
|
|
|
|
mean value: 0.923531746031746
|
|
|
|
key: train_roc_auc
|
|
value: [0.94959625 0.94330199 0.94487927 0.94489415 0.94331191 0.94330695
|
|
0.93710692 0.93867925 0.9418239 0.94339623]
|
|
|
|
mean value: 0.9430296807729699
|
|
|
|
key: test_jcc
|
|
value: [0.86486486 0.8974359 0.94594595 0.87179487 0.92307692 0.74358974
|
|
0.79487179 0.87179487 0.83333333 0.84615385]
|
|
|
|
mean value: 0.8592862092862092
|
|
|
|
key: train_jcc
|
|
value: [0.9047619 0.89317507 0.89583333 0.89614243 0.89285714 0.89253731
|
|
0.88235294 0.88392857 0.89053254 0.89380531]
|
|
|
|
mean value: 0.8925926568521868
|
|
|
|
MCC on Blind test: 0.32
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.9707489 1.80206823 2.04756761 2.01823282 1.97800541 1.954983
|
|
2.09475994 1.9617672 1.91035748 1.84747243]
|
|
|
|
mean value: 1.9585963010787963
|
|
|
|
key: score_time
|
|
value: [0.01395583 0.01141071 0.02199006 0.01435852 0.01450992 0.01351452
|
|
0.01374483 0.01391625 0.02986383 0.01390171]
|
|
|
|
mean value: 0.016116619110107422
|
|
|
|
key: test_mcc
|
|
value: [0.88730159 0.97222222 0.91885703 0.91580648 0.94511009 0.97220047
|
|
0.94440028 0.94440028 0.97182532 0.97182532]
|
|
|
|
mean value: 0.944394907660236
|
|
|
|
key: train_mcc
|
|
value: [0.99370077 0.99372043 0.99372043 0.99685535 0.99685535 0.99685535
|
|
0.99686027 1. 0.99686027 0.99686027]
|
|
|
|
mean value: 0.9962288484844167
|
|
|
|
key: test_accuracy
|
|
value: [0.94366197 0.98591549 0.95774648 0.95774648 0.97183099 0.98591549
|
|
0.97142857 0.97142857 0.98571429 0.98571429]
|
|
|
|
mean value: 0.9717102615694165
|
|
|
|
key: train_accuracy
|
|
value: [0.99685039 0.99685039 0.99685039 0.9984252 0.9984252 0.9984252
|
|
0.99842767 1. 0.99842767 0.99842767]
|
|
|
|
mean value: 0.9981109790521467
|
|
|
|
key: test_fscore
|
|
value: [0.94285714 0.98591549 0.95890411 0.95890411 0.97297297 0.98630137
|
|
0.97222222 0.97222222 0.98591549 0.98591549]
|
|
|
|
mean value: 0.9722130628188895
|
|
|
|
key: train_fscore
|
|
value: [0.99685535 0.9968652 0.9968652 0.9984252 0.9984252 0.9984252
|
|
0.99843014 1. 0.99843014 0.99843014]
|
|
|
|
mean value: 0.9981151767848494
|
|
|
|
key: test_precision
|
|
value: [0.94285714 0.97222222 0.92105263 0.94594595 0.94736842 0.97297297
|
|
0.94594595 0.94594595 0.97222222 0.97222222]
|
|
|
|
mean value: 0.9538755672966199
|
|
|
|
key: train_precision
|
|
value: [0.99685535 0.99375 0.99375 0.99685535 0.99685535 0.99685535
|
|
0.9968652 1. 0.9968652 0.9968652 ]
|
|
|
|
mean value: 0.9965516994933066
|
|
|
|
key: test_recall
|
|
value: [0.94285714 1. 1. 0.97222222 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9915079365079364
|
|
|
|
key: train_recall
|
|
value: [0.99685535 1. 1. 1. 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.999685534591195
|
|
|
|
key: test_roc_auc
|
|
value: [0.94365079 0.98611111 0.95833333 0.95753968 0.97142857 0.98571429
|
|
0.97142857 0.97142857 0.98571429 0.98571429]
|
|
|
|
mean value: 0.9717063492063491
|
|
|
|
key: train_roc_auc
|
|
value: [0.99685039 0.99684543 0.99684543 0.99842767 0.99842767 0.99842767
|
|
0.99842767 1. 0.99842767 0.99842767]
|
|
|
|
mean value: 0.9981107275360594
|
|
|
|
key: test_jcc
|
|
value: [0.89189189 0.97222222 0.92105263 0.92105263 0.94736842 0.97297297
|
|
0.94594595 0.94594595 0.97222222 0.97222222]
|
|
|
|
mean value: 0.946289710763395
|
|
|
|
key: train_jcc
|
|
value: [0.99373041 0.99375 0.99375 0.99685535 0.99685535 0.99685535
|
|
0.9968652 1. 0.9968652 0.9968652 ]
|
|
|
|
mean value: 0.9962392056544627
|
|
|
|
MCC on Blind test: 0.27
|
|
|
|
Accuracy on Blind test: 0.84
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02684522 0.01258469 0.0121913 0.01171684 0.01236367 0.01212668
|
|
0.01183128 0.01180315 0.01206851 0.01324224]
|
|
|
|
mean value: 0.013677358627319336
|
|
|
|
key: score_time
|
|
value: [0.00874949 0.00807166 0.00804973 0.00800586 0.0087328 0.00819206
|
|
0.00814438 0.00796533 0.00803995 0.00842094]
|
|
|
|
mean value: 0.008237218856811524
|
|
|
|
key: test_mcc
|
|
value: [0.9451949 0.97220047 0.97220047 0.97220047 0.94511009 0.94511009
|
|
0.94440028 0.89155583 0.97182532 0.97182532]
|
|
|
|
mean value: 0.9531623215602024
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.97183099 0.98591549 0.98591549 0.98591549 0.97183099 0.97183099
|
|
0.97142857 0.94285714 0.98571429 0.98571429]
|
|
|
|
mean value: 0.9758953722334004
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.97222222 0.98550725 0.98550725 0.98630137 0.97297297 0.97297297
|
|
0.97222222 0.94594595 0.98591549 0.98591549]
|
|
|
|
mean value: 0.9765483184868466
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.94594595 1. 1. 0.97297297 0.94736842 0.94736842
|
|
0.94594595 0.8974359 0.97222222 0.97222222]
|
|
|
|
mean value: 0.9601482048850469
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.97142857 0.97142857 1. 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9942857142857143
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.97222222 0.98571429 0.98571429 0.98571429 0.97142857 0.97142857
|
|
0.97142857 0.94285714 0.98571429 0.98571429]
|
|
|
|
mean value: 0.9757936507936508
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.94594595 0.97142857 0.97142857 0.97297297 0.94736842 0.94736842
|
|
0.94594595 0.8974359 0.97222222 0.97222222]
|
|
|
|
mean value: 0.9544339191707613
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.23
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.11004734 0.11200666 0.105335 0.10864019 0.1060133 0.10523963
|
|
0.1030283 0.10427046 0.10395288 0.10813165]
|
|
|
|
mean value: 0.10666654109954835
|
|
|
|
key: score_time
|
|
value: [0.01880479 0.01825428 0.01717353 0.01854348 0.01716185 0.01714182
|
|
0.01717067 0.01719475 0.01825452 0.01732445]
|
|
|
|
mean value: 0.017702412605285645
|
|
|
|
key: test_mcc
|
|
value: [0.94365079 1. 0.97222222 0.94365079 0.91580648 0.88880092
|
|
0.97182532 0.97182532 0.97182532 1. ]
|
|
|
|
mean value: 0.9579607157012351
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.97183099 1. 0.98591549 0.97183099 0.95774648 0.94366197
|
|
0.98571429 0.98571429 0.98571429 1. ]
|
|
|
|
mean value: 0.9788128772635815
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.97142857 1. 0.98591549 0.97222222 0.95890411 0.94285714
|
|
0.98591549 0.98591549 0.98591549 1. ]
|
|
|
|
mean value: 0.9789074017927963
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.97142857 1. 0.97222222 0.97222222 0.94594595 0.97058824
|
|
0.97222222 0.97222222 0.97222222 1. ]
|
|
|
|
mean value: 0.9749073863779746
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.97142857 1. 1. 0.97222222 0.97222222 0.91666667
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9832539682539683
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9718254 1. 0.98611111 0.9718254 0.95753968 0.94404762
|
|
0.98571429 0.98571429 0.98571429 1. ]
|
|
|
|
mean value: 0.9788492063492064
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.94444444 1. 0.97222222 0.94594595 0.92105263 0.89189189
|
|
0.97222222 0.97222222 0.97222222 1. ]
|
|
|
|
mean value: 0.9592223802750118
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.38
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00825858 0.00866151 0.00814795 0.0079174 0.00801253 0.00788665
|
|
0.00870991 0.00868416 0.00885129 0.00809765]
|
|
|
|
mean value: 0.008322763442993163
|
|
|
|
key: score_time
|
|
value: [0.00829744 0.00852799 0.00861549 0.00824738 0.0080688 0.00800133
|
|
0.0085113 0.00867772 0.00855446 0.00791121]
|
|
|
|
mean value: 0.008341312408447266
|
|
|
|
key: test_mcc
|
|
value: [0.94365079 0.8365327 0.91885703 0.86205133 0.81050059 0.91885703
|
|
0.89155583 0.8871639 0.91766294 0.8340361 ]
|
|
|
|
mean value: 0.8820868251428485
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.97183099 0.91549296 0.95774648 0.92957746 0.90140845 0.95774648
|
|
0.94285714 0.94285714 0.95714286 0.91428571]
|
|
|
|
mean value: 0.9390945674044265
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.97142857 0.91891892 0.95890411 0.93333333 0.90909091 0.95652174
|
|
0.94594595 0.94117647 0.95890411 0.91891892]
|
|
|
|
mean value: 0.941314302653335
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.97142857 0.87179487 0.92105263 0.8974359 0.85365854 1.
|
|
0.8974359 0.96969697 0.92105263 0.87179487]
|
|
|
|
mean value: 0.9175350879330341
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.97142857 0.97142857 1. 0.97222222 0.97222222 0.91666667
|
|
1. 0.91428571 1. 0.97142857]
|
|
|
|
mean value: 0.9689682539682539
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9718254 0.91626984 0.95833333 0.92896825 0.90039683 0.95833333
|
|
0.94285714 0.94285714 0.95714286 0.91428571]
|
|
|
|
mean value: 0.9391269841269841
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.94444444 0.85 0.92105263 0.875 0.83333333 0.91666667
|
|
0.8974359 0.88888889 0.92105263 0.85 ]
|
|
|
|
mean value: 0.8897874493927125
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.21
|
|
|
|
Accuracy on Blind test: 0.81
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.382689 1.36185718 1.36501622 1.36442327 1.36342573 1.37483025
|
|
1.44966221 1.39965343 1.40602112 1.3880384 ]
|
|
|
|
mean value: 1.3855616807937623
|
|
|
|
key: score_time
|
|
value: [0.09274054 0.09215426 0.09248495 0.0925467 0.09295416 0.09729934
|
|
0.10089469 0.09895921 0.09574556 0.15336585]
|
|
|
|
mean value: 0.10091452598571778
|
|
|
|
key: test_mcc
|
|
value: [0.9451949 1. 0.9451949 0.97220047 0.94511009 0.97220047
|
|
0.94440028 0.97182532 0.97182532 1. ]
|
|
|
|
mean value: 0.9667951728615773
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.97183099 1. 0.97183099 0.98591549 0.97183099 0.98591549
|
|
0.97142857 0.98571429 0.98571429 1. ]
|
|
|
|
mean value: 0.9830181086519115
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.97222222 1. 0.97222222 0.98630137 0.97297297 0.98630137
|
|
0.97222222 0.98591549 0.98591549 1. ]
|
|
|
|
mean value: 0.983407336528116
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.94594595 1. 0.94594595 0.97297297 0.94736842 0.97297297
|
|
0.94594595 0.97222222 0.97222222 1. ]
|
|
|
|
mean value: 0.967559664928086
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.97222222 1. 0.97222222 0.98571429 0.97142857 0.98571429
|
|
0.97142857 0.98571429 0.98571429 1. ]
|
|
|
|
mean value: 0.983015873015873
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.94594595 1. 0.94594595 0.97297297 0.94736842 0.97297297
|
|
0.94594595 0.97222222 0.97222222 1. ]
|
|
|
|
mean value: 0.967559664928086
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.29
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.91464925 0.9240458 0.96008396 0.89609337 0.91591477 0.93456602
|
|
0.91718268 0.93010211 0.98904967 0.94583178]
|
|
|
|
mean value: 0.9327519416809082
|
|
|
|
key: score_time
|
|
value: [0.24406385 0.26979637 0.23408437 0.26847029 0.24405241 0.2887404
|
|
0.27261448 0.26284695 0.19351506 0.25960517]
|
|
|
|
mean value: 0.25377893447875977
|
|
|
|
key: test_mcc
|
|
value: [0.91587302 1. 0.9451949 0.97220047 0.94511009 0.94365079
|
|
0.97182532 0.97182532 0.97182532 1. ]
|
|
|
|
mean value: 0.9637505209774089
|
|
|
|
key: train_mcc
|
|
value: [0.96881022 0.96250874 0.96867592 0.96867777 0.97177468 0.96881268
|
|
0.96564279 0.96872591 0.96564279 0.9625688 ]
|
|
|
|
mean value: 0.9671840283439759
|
|
|
|
key: test_accuracy
|
|
value: [0.95774648 1. 0.97183099 0.98591549 0.97183099 0.97183099
|
|
0.98571429 0.98571429 0.98571429 1. ]
|
|
|
|
mean value: 0.9816297786720323
|
|
|
|
key: train_accuracy
|
|
value: [0.98425197 0.98110236 0.98425197 0.98425197 0.98582677 0.98425197
|
|
0.9827044 0.98427673 0.9827044 0.98113208]
|
|
|
|
mean value: 0.9834754617936908
|
|
|
|
key: test_fscore
|
|
value: [0.95774648 1. 0.97222222 0.98630137 0.97297297 0.97222222
|
|
0.98591549 0.98591549 0.98591549 1. ]
|
|
|
|
mean value: 0.981921174502691
|
|
|
|
key: train_fscore
|
|
value: [0.98447205 0.98136646 0.98442368 0.984375 0.98591549 0.98442368
|
|
0.98289269 0.98442368 0.98289269 0.98136646]
|
|
|
|
mean value: 0.9836551870965667
|
|
|
|
key: test_precision
|
|
value: [0.94444444 1. 0.94594595 0.97297297 0.94736842 0.97222222
|
|
0.97222222 0.97222222 0.97222222 1. ]
|
|
|
|
mean value: 0.9699620673304884
|
|
|
|
key: train_precision
|
|
value: [0.97239264 0.96932515 0.97530864 0.9752322 0.97826087 0.97230769
|
|
0.97230769 0.97530864 0.97230769 0.96932515]
|
|
|
|
mean value: 0.9732076373366603
|
|
|
|
key: test_recall
|
|
value: [0.97142857 1. 1. 1. 1. 0.97222222
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9943650793650793
|
|
|
|
key: train_recall
|
|
value: [0.99685535 0.99371069 0.99371069 0.99369085 0.99369085 0.99684543
|
|
0.99371069 0.99371069 0.99371069 0.99371069]
|
|
|
|
mean value: 0.9943346626192886
|
|
|
|
key: test_roc_auc
|
|
value: [0.95793651 1. 0.97222222 0.98571429 0.97142857 0.9718254
|
|
0.98571429 0.98571429 0.98571429 1. ]
|
|
|
|
mean value: 0.9816269841269841
|
|
|
|
key: train_roc_auc
|
|
value: [0.98423209 0.98108248 0.98423705 0.98426681 0.98583914 0.98427177
|
|
0.9827044 0.98427673 0.9827044 0.98113208]
|
|
|
|
mean value: 0.983474693966629
|
|
|
|
key: test_jcc
|
|
value: [0.91891892 1. 0.94594595 0.97297297 0.94736842 0.94594595
|
|
0.97222222 0.97222222 0.97222222 1. ]
|
|
|
|
mean value: 0.9647818871503082
|
|
|
|
key: train_jcc
|
|
value: [0.96941896 0.96341463 0.96932515 0.96923077 0.97222222 0.96932515
|
|
0.96636086 0.96932515 0.96636086 0.96341463]
|
|
|
|
mean value: 0.9678398392651248
|
|
|
|
MCC on Blind test: 0.31
|
|
|
|
Accuracy on Blind test: 0.85
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02012014 0.00854611 0.00811195 0.00855899 0.0088582 0.00802565
|
|
0.00814772 0.00833344 0.00853729 0.00875807]
|
|
|
|
mean value: 0.009599757194519044
|
|
|
|
key: score_time
|
|
value: [0.01067162 0.00829363 0.00833917 0.00915647 0.00838065 0.00874257
|
|
0.00873756 0.00849104 0.00841093 0.00848961]
|
|
|
|
mean value: 0.008771324157714843
|
|
|
|
key: test_mcc
|
|
value: [0.69047619 0.63383658 0.69762232 0.60881948 0.53699395 0.52233453
|
|
0.48891771 0.7581754 0.65821838 0.60901553]
|
|
|
|
mean value: 0.6204410069366437
|
|
|
|
key: train_mcc
|
|
value: [0.60949181 0.61718891 0.63374209 0.58769936 0.60523202 0.64225486
|
|
0.62174197 0.61279592 0.60729861 0.63335019]
|
|
|
|
mean value: 0.6170795744074016
|
|
|
|
key: test_accuracy
|
|
value: [0.84507042 0.81690141 0.84507042 0.8028169 0.76056338 0.76056338
|
|
0.74285714 0.87142857 0.82857143 0.8 ]
|
|
|
|
mean value: 0.8073843058350101
|
|
|
|
key: train_accuracy
|
|
value: [0.8015748 0.80787402 0.81574803 0.79212598 0.8015748 0.82047244
|
|
0.80974843 0.80503145 0.80031447 0.81289308]
|
|
|
|
mean value: 0.8067357500123805
|
|
|
|
key: test_fscore
|
|
value: [0.84507042 0.8115942 0.85333333 0.81578947 0.79012346 0.77333333
|
|
0.75675676 0.88311688 0.83333333 0.81578947]
|
|
|
|
mean value: 0.8178240669465947
|
|
|
|
key: train_fscore
|
|
value: [0.81524927 0.81458967 0.82352941 0.80239521 0.80909091 0.82568807
|
|
0.81749623 0.81381381 0.81405564 0.82627737]
|
|
|
|
mean value: 0.8162185588580184
|
|
|
|
key: test_precision
|
|
value: [0.83333333 0.82352941 0.8 0.775 0.71111111 0.74358974
|
|
0.71794872 0.80952381 0.81081081 0.75609756]
|
|
|
|
mean value: 0.7780944499057842
|
|
|
|
key: train_precision
|
|
value: [0.76373626 0.78823529 0.79130435 0.76353276 0.77842566 0.80118694
|
|
0.78550725 0.77873563 0.76164384 0.77111717]
|
|
|
|
mean value: 0.7783425149199308
|
|
|
|
key: test_recall
|
|
value: [0.85714286 0.8 0.91428571 0.86111111 0.88888889 0.80555556
|
|
0.8 0.97142857 0.85714286 0.88571429]
|
|
|
|
mean value: 0.8641269841269841
|
|
|
|
key: train_recall
|
|
value: [0.87421384 0.8427673 0.85849057 0.84542587 0.84227129 0.85173502
|
|
0.85220126 0.85220126 0.87421384 0.88993711]
|
|
|
|
mean value: 0.8583457333888856
|
|
|
|
key: test_roc_auc
|
|
value: [0.8452381 0.81666667 0.84603175 0.80198413 0.75873016 0.75992063
|
|
0.74285714 0.87142857 0.82857143 0.8 ]
|
|
|
|
mean value: 0.8071428571428572
|
|
|
|
key: train_roc_auc
|
|
value: [0.80146023 0.80781898 0.81568061 0.79220979 0.80163879 0.8205216
|
|
0.80974843 0.80503145 0.80031447 0.81289308]
|
|
|
|
mean value: 0.8067317421582049
|
|
|
|
key: test_jcc
|
|
value: [0.73170732 0.68292683 0.74418605 0.68888889 0.65306122 0.63043478
|
|
0.60869565 0.79069767 0.71428571 0.68888889]
|
|
|
|
mean value: 0.6933773018607593
|
|
|
|
key: train_jcc
|
|
value: [0.68811881 0.68717949 0.7 0.67 0.67938931 0.703125
|
|
0.69132653 0.68607595 0.68641975 0.7039801 ]
|
|
|
|
mean value: 0.6895614944606016
|
|
|
|
MCC on Blind test: 0.22
|
|
|
|
Accuracy on Blind test: 0.63
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
value: [0.08229804 0.05557299 0.05145884 0.07025552 0.0688889 0.06463599
 0.06412101 0.06187201 0.06107402 0.06366992]

mean value: 0.06438472270965576

key: score_time
value: [0.01020217 0.01022315 0.00988221 0.01065993 0.01005268 0.01064777
 0.01069522 0.00998998 0.01045489 0.00996876]

mean value: 0.010277676582336425

key: test_mcc
value: [0.9451949 1. 0.91587302 0.97220047 0.94511009 0.97220047
 0.97182532 0.97182532 0.97182532 1. ]

mean value: 0.9666054881651751

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.97183099 1. 0.95774648 0.98591549 0.97183099 0.98591549
 0.98571429 0.98571429 0.98571429 1. ]

mean value: 0.9830382293762576

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.97222222 1. 0.95774648 0.98630137 0.97297297 0.98630137
 0.98591549 0.98591549 0.98591549 1. ]

mean value: 0.9833290892667701

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.94594595 1. 0.94444444 0.97297297 0.94736842 0.97297297
 0.97222222 0.97222222 0.97222222 1. ]

mean value: 0.9700371424055635

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 1. 0.97142857 1. 1. 1.
 1. 1. 1. 1. ]

mean value: 0.9971428571428571

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.97222222 1. 0.95793651 0.98571429 0.97142857 0.98571429
 0.98571429 0.98571429 0.98571429 1. ]

mean value: 0.983015873015873

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.94594595 1. 0.91891892 0.97297297 0.94736842 0.97297297
 0.97222222 0.97222222 0.97222222 1. ]

mean value: 0.9674845898530109

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.27

Accuracy on Blind test: 0.85

Model_name: LDA
Model func: LinearDiscriminantAnalysis()
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
       'electro_rr', 'electro_mm', 'electro_sm', 'electr...
       'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
       'logorI', 'lineage_proportion', 'dist_lineage_proportion',
       'lineage_count_all', 'lineage_count_unique'],
      dtype='object')),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', LinearDiscriminantAnalysis())])

key: fit_time
value: [0.01726913 0.04849577 0.04532862 0.04679036 0.04754329 0.04615712
 0.04615998 0.01974487 0.06901121 0.04615664]

mean value: 0.043265700340270996

key: score_time
value: [0.01070952 0.01685834 0.02024412 0.01978326 0.017102 0.01800013
 0.01840425 0.01667547 0.01130104 0.0193522 ]

mean value: 0.016843032836914063

key: test_mcc
value: [0.91580648 0.94511009 0.88730159 0.88862624 0.77460317 0.8594125
 0.82857143 0.94440028 0.91465912 0.94440028]

mean value: 0.890289119006275

key: train_mcc
value: [0.9401617 0.93702568 0.95276028 0.93702693 0.9433251 0.92759921
 0.93083602 0.93718106 0.94025622 0.94976067]

mean value: 0.9395932881039084

key: test_accuracy
value: [0.95774648 0.97183099 0.94366197 0.94366197 0.88732394 0.92957746
 0.91428571 0.97142857 0.95714286 0.97142857]

mean value: 0.9448088531187122

key: train_accuracy
value: [0.97007874 0.96850394 0.97637795 0.96850394 0.97165354 0.96377953
 0.96540881 0.96855346 0.97012579 0.97484277]

mean value: 0.9697828455405338

key: test_fscore
value: [0.95652174 0.97058824 0.94285714 0.94594595 0.88888889 0.93150685
 0.91428571 0.97222222 0.95652174 0.97058824]

mean value: 0.9449926712364087

key: train_fscore
value: [0.97017268 0.96865204 0.97645212 0.96855346 0.97151899 0.96354992
 0.96529968 0.96875 0.97017268 0.975 ]

mean value: 0.9698121577608168

key: test_precision
value: [0.97058824 1. 0.94285714 0.92105263 0.88888889 0.91891892
 0.91428571 0.94594595 0.97058824 1. ]

mean value: 0.9473125713063794

key: train_precision
value: [0.96865204 0.965625 0.97492163 0.96551724 0.97460317 0.96815287
 0.96835443 0.96273292 0.96865204 0.9689441 ]

mean value: 0.9686155436566964

key: test_recall
value: [0.94285714 0.94285714 0.94285714 0.97222222 0.88888889 0.94444444
 0.91428571 1. 0.94285714 0.94285714]

mean value: 0.9434126984126984

key: train_recall
value: [0.97169811 0.97169811 0.97798742 0.97160883 0.96845426 0.95899054
 0.96226415 0.97484277 0.97169811 0.98113208]

mean value: 0.9710374382477234

key: test_roc_auc
value: [0.95753968 0.97142857 0.94365079 0.94325397 0.88730159 0.92936508
 0.91428571 0.97142857 0.95714286 0.97142857]

mean value: 0.9446825396825397

key: train_roc_auc
value: [0.97007619 0.9684989 0.97637541 0.96850882 0.97164851 0.963772
 0.96540881 0.96855346 0.97012579 0.97484277]

mean value: 0.9697810646191696

key: test_jcc
value: [0.91666667 0.94285714 0.89189189 0.8974359 0.8 0.87179487
 0.84210526 0.94594595 0.91666667 0.94285714]

mean value: 0.8968221489274121

key: train_jcc
value: [0.94207317 0.93920973 0.95398773 0.93902439 0.94461538 0.92966361
 0.93292683 0.93939394 0.94207317 0.95121951]

mean value: 0.9414187462247865

MCC on Blind test: 0.26

Accuracy on Blind test: 0.8

Model_name: Multinomial
Model func: MultinomialNB()
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
       'electro_rr', 'electro_mm', 'electro_sm', 'electr...
       'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
       'logorI', 'lineage_proportion', 'dist_lineage_proportion',
       'lineage_count_all', 'lineage_count_unique'],
      dtype='object')),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', MultinomialNB())])

key: fit_time
value: [0.01100755 0.01057196 0.00890303 0.00805974 0.00824904 0.00848842
 0.00868011 0.00886703 0.00807524 0.00789809]

mean value: 0.008880019187927246

key: score_time
value: [0.01157236 0.00949931 0.00915504 0.00873566 0.00878334 0.00882101
 0.00864768 0.00887275 0.00798082 0.00794458]

mean value: 0.00900125503540039

key: test_mcc
value: [0.69292162 0.71961897 0.75442414 0.70310369 0.72472613 0.61348603
 0.45883147 0.68572751 0.74560114 0.69985421]

mean value: 0.6798294910722227

key: train_mcc
value: [0.68941392 0.68469209 0.68590643 0.6803341 0.69349825 0.71023697
 0.66497357 0.69736279 0.69416896 0.69447562]

mean value: 0.6895062697027032

key: test_accuracy
value: [0.84507042 0.85915493 0.87323944 0.84507042 0.84507042 0.8028169
 0.72857143 0.82857143 0.87142857 0.84285714]

mean value: 0.8341851106639839

key: train_accuracy
value: [0.84094488 0.83937008 0.83937008 0.83622047 0.84409449 0.8519685
 0.83176101 0.84591195 0.8427673 0.84433962]

mean value: 0.8416748378150845

key: test_fscore
value: [0.84931507 0.86111111 0.88 0.86075949 0.86746988 0.82051282
 0.73972603 0.85 0.86567164 0.85714286]

mean value: 0.8451708899637203

key: train_fscore
value: [0.85212299 0.84955752 0.85043988 0.84750733 0.85289747 0.86094675
 0.83713851 0.85502959 0.85422741 0.85376662]

mean value: 0.8513634059429992

key: test_precision
value: [0.81578947 0.83783784 0.825 0.79069767 0.76595745 0.76190476
 0.71052632 0.75555556 0.90625 0.78571429]

mean value: 0.795523335171324

key: train_precision
value: [0.79726027 0.8 0.7967033 0.79178082 0.80617978 0.81058496
 0.81120944 0.80726257 0.79619565 0.80501393]

mean value: 0.8022190715202817

key: test_recall
value: [0.88571429 0.88571429 0.94285714 0.94444444 1. 0.88888889
 0.77142857 0.97142857 0.82857143 0.94285714]

mean value: 0.9061904761904762

key: train_recall
value: [0.91509434 0.90566038 0.91194969 0.91167192 0.90536278 0.91798107
 0.86477987 0.90880503 0.92138365 0.90880503]

mean value: 0.9071493760292046

key: test_roc_auc
value: [0.84563492 0.85952381 0.87420635 0.84365079 0.84285714 0.8015873
 0.72857143 0.82857143 0.87142857 0.84285714]

mean value: 0.8338888888888889

key: train_roc_auc
value: [0.84082793 0.83926552 0.8392556 0.83633911 0.84419082 0.8520723
 0.83176101 0.84591195 0.8427673 0.84433962]

mean value: 0.8416731146955538

key: test_jcc
value: [0.73809524 0.75609756 0.78571429 0.75555556 0.76595745 0.69565217
 0.58695652 0.73913043 0.76315789 0.75 ]

mean value: 0.7336317112320825

key: train_jcc
value: [0.74234694 0.73846154 0.73979592 0.73536896 0.74352332 0.75584416
 0.71989529 0.74677003 0.74554707 0.74484536]

mean value: 0.7412398572667729

MCC on Blind test: 0.27

Accuracy on Blind test: 0.73

Model_name: Passive Aggresive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
       'electro_rr', 'electro_mm', 'electro_sm', 'electr...
       'logorI', 'lineage_proportion', 'dist_lineage_proportion',
       'lineage_count_all', 'lineage_count_unique'],
      dtype='object')),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model',
                 PassiveAggressiveClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01256037 0.0129602 0.0124402 0.01418591 0.01519251 0.01379037
 0.0145843 0.01559997 0.01467609 0.01388359]

mean value: 0.013987350463867187

key: score_time
value: [0.00821686 0.01022291 0.01050091 0.01071692 0.01076841 0.01073503
 0.01092577 0.01097298 0.01072097 0.01067448]

mean value: 0.010445523262023925

key: test_mcc
value: [0.7380153 0.94511009 0.88730159 0.81839321 0.9186708 0.86205133
 0.78301997 0.91465912 0.91465912 0.97182532]

mean value: 0.8753705850345086

key: train_mcc
value: [0.86062704 0.94027246 0.90337782 0.93520499 0.91737981 0.92960161
 0.82836275 0.89394851 0.84815773 0.95036243]

mean value: 0.9007295140283917

key: test_accuracy
value: [0.85915493 0.97183099 0.94366197 0.90140845 0.95774648 0.92957746
 0.88571429 0.95714286 0.95714286 0.98571429]

mean value: 0.9349094567404427

key: train_accuracy
value: [0.92598425 0.97007874 0.9511811 0.96692913 0.95748031 0.96377953
 0.91037736 0.94654088 0.91981132 0.97484277]

mean value: 0.9487005397910167

key: test_fscore
value: [0.87179487 0.97058824 0.94285714 0.91139241 0.96 0.93333333
 0.875 0.95774648 0.95774648 0.98591549]

mean value: 0.9366374439046983

key: train_fscore
value: [0.93098385 0.97035881 0.95008052 0.96774194 0.95890411 0.9648855
 0.90387858 0.94533762 0.92511013 0.97530864]

mean value: 0.9492589696376544

key: test_precision
value: [0.79069767 1. 0.94285714 0.8372093 0.92307692 0.8974359
 0.96551724 0.94444444 0.94444444 0.97222222]

mean value: 0.9217905292604571

key: train_precision
value: [0.87327824 0.9628483 0.97359736 0.94311377 0.92647059 0.93491124
 0.97454545 0.96710526 0.8677686 0.95757576]

mean value: 0.938121456747856

key: test_recall
value: [0.97142857 0.94285714 0.94285714 1. 1. 0.97222222
 0.8 0.97142857 0.97142857 1. ]

mean value: 0.9572222222222222

key: train_recall
value: [0.99685535 0.97798742 0.92767296 0.99369085 0.99369085 0.99684543
 0.8427673 0.9245283 0.99056604 0.99371069]

mean value: 0.9638315179652005

key: test_roc_auc
value: [0.86071429 0.97142857 0.94365079 0.9 0.95714286 0.92896825
 0.88571429 0.95714286 0.95714286 0.98571429]

mean value: 0.9347619047619048

key: train_roc_auc
value: [0.92587247 0.97006627 0.95121818 0.96697121 0.95753725 0.96383152
 0.91037736 0.94654088 0.91981132 0.97484277]

mean value: 0.9487069222070115

key: test_jcc
value: [0.77272727 0.94285714 0.89189189 0.8372093 0.92307692 0.875
 0.77777778 0.91891892 0.91891892 0.97222222]

mean value: 0.883060037071665

key: train_jcc
value: [0.87087912 0.94242424 0.90490798 0.9375 0.92105263 0.93215339
 0.82461538 0.89634146 0.86065574 0.95180723]

mean value: 0.9042337177323416

MCC on Blind test: 0.22

Accuracy on Blind test: 0.78

Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
       'electro_rr', 'electro_mm', 'electro_sm', 'electr...
       'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
       'logorI', 'lineage_proportion', 'dist_lineage_proportion',
       'lineage_count_all', 'lineage_count_unique'],
      dtype='object')),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', SGDClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01248407 0.01530766 0.01435423 0.01306081 0.01362562 0.01316786
 0.01642776 0.01362872 0.01625896 0.0138433 ]

mean value: 0.014215898513793946

key: score_time
value: [0.01122069 0.01088285 0.01063156 0.01069832 0.01098776 0.01070976
 0.01086879 0.01070619 0.01069164 0.01066446]

mean value: 0.01080620288848877

key: test_mcc
value: [0.88730159 0.91885703 0.85952381 0.79446135 0.83240693 0.7364297
 0.88571429 0.91766294 0.8871639 0.81649658]

mean value: 0.8536018118121076

key: train_mcc
value: [0.94338294 0.85426212 0.93386306 0.92923073 0.95276028 0.80610644
 0.93712545 0.95036243 0.94029342 0.80723238]

mean value: 0.9054619258434897

key: test_accuracy
value: [0.94366197 0.95774648 0.92957746 0.88732394 0.91549296 0.85915493
 0.94285714 0.95714286 0.94285714 0.9 ]

mean value: 0.9235814889336016

key: train_accuracy
value: [0.97165354 0.92283465 0.96692913 0.96377953 0.97637795 0.89448819
 0.96855346 0.97484277 0.97012579 0.89937107]

mean value: 0.950895607388699

key: test_fscore
value: [0.94285714 0.95890411 0.92957746 0.9 0.91428571 0.875
 0.94285714 0.95890411 0.94117647 0.88888889]

mean value: 0.9252451043443939

key: train_fscore
value: [0.97151899 0.92804699 0.96692913 0.96477795 0.97630332 0.90414878
 0.96845426 0.97530864 0.9699842 0.89152542]

mean value: 0.9516997686957204

key: test_precision
value: [0.94285714 0.92105263 0.91666667 0.81818182 0.94117647 0.79545455
 0.94285714 0.92105263 0.96969697 1. ]

mean value: 0.9168996019460416

key: train_precision
value: [0.97770701 0.87052342 0.96845426 0.9375 0.9778481 0.82722513
 0.97151899 0.95757576 0.97460317 0.96691176]

mean value: 0.9429867597404928

key: test_recall
value: [0.94285714 1. 0.94285714 1. 0.88888889 0.97222222
 0.94285714 1. 0.91428571 0.8 ]

mean value: 0.9403968253968253

key: train_recall
value: [0.96540881 0.99371069 0.96540881 0.99369085 0.97476341 0.99684543
 0.96540881 0.99371069 0.96540881 0.82704403]

mean value: 0.9641400313473405

key: test_roc_auc
value: [0.94365079 0.95833333 0.9297619 0.88571429 0.91587302 0.85753968
 0.94285714 0.95714286 0.94285714 0.9 ]

mean value: 0.9233730158730159

key: train_roc_auc
value: [0.97166339 0.92272285 0.96693153 0.96382656 0.97637541 0.89464913
 0.96855346 0.97484277 0.97012579 0.89937107]

mean value: 0.9509061960597583

key: test_jcc
value: [0.89189189 0.92105263 0.86842105 0.81818182 0.84210526 0.77777778
 0.89189189 0.92105263 0.88888889 0.8 ]

mean value: 0.8621263847579637

key: train_jcc
value: [0.94461538 0.86575342 0.93597561 0.93195266 0.9537037 0.82506527
 0.93883792 0.95180723 0.94171779 0.80428135]

mean value: 0.9093710345987801

MCC on Blind test: 0.18

Accuracy on Blind test: 0.71

Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
       'electro_rr', 'electro_mm', 'electro_sm', 'electr...
       'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
       'logorI', 'lineage_proportion', 'dist_lineage_proportion',
       'lineage_count_all', 'lineage_count_unique'],
      dtype='object')),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', AdaBoostClassifier(random_state=42))])

key: fit_time
value: [0.11450005 0.09942293 0.10065579 0.09938693 0.10254526 0.09947491
 0.10343981 0.10581398 0.10528588 0.10459161]

mean value: 0.10351171493530273

key: score_time
value: [0.01439619 0.01475668 0.0148623 0.01507521 0.01486588 0.01494408
 0.01576829 0.01550889 0.01577449 0.01586604]

mean value: 0.015181803703308105

key: test_mcc
value: [0.97222222 1. 0.9451949 0.94511009 0.94511009 0.97220047
 0.97182532 0.94440028 0.97182532 1. ]

mean value: 0.9667888678525607

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.98591549 1. 0.97183099 0.97183099 0.97183099 0.98591549
 0.98571429 0.97142857 0.98571429 1. ]

mean value: 0.9830181086519115

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.98591549 1. 0.97222222 0.97297297 0.97297297 0.98630137
 0.98591549 0.97222222 0.98591549 1. ]

mean value: 0.9834438239126644

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.97222222 1. 0.94594595 0.94736842 0.94736842 0.97297297
 0.97222222 0.94594595 0.97222222 1. ]

mean value: 0.9676268373636795

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.98611111 1. 0.97222222 0.97142857 0.97142857 0.98571429
 0.98571429 0.97142857 0.98571429 1. ]

mean value: 0.9829761904761904

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.97222222 1. 0.94594595 0.94736842 0.94736842 0.97297297
 0.97222222 0.94594595 0.97222222 1. ]

mean value: 0.9676268373636795

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.23

Accuracy on Blind test: 0.85

Model_name: Bagging Classifier
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.04095316 0.0422883 0.04308581 0.03728151 0.03888655 0.03602791
|
|
0.04428387 0.04721117 0.03666067 0.03684378]
|
|
|
|
mean value: 0.040352272987365725
|
|
|
|
key: score_time
|
|
value: [0.02391195 0.03111863 0.02407122 0.02295971 0.02479601 0.02285576
|
|
0.02889371 0.01769805 0.01712298 0.01948595]
|
|
|
|
mean value: 0.023291397094726562
|
|
|
|
key: test_mcc
|
|
value: [0.9451949 1. 0.97222222 0.97220047 0.94511009 0.94511009
|
|
0.94440028 0.91766294 0.97182532 0.97182532]
|
|
|
|
mean value: 0.9585551614007853
|
|
|
|
key: train_mcc
|
|
value: [1. 0.99685531 0.99059524 1. 1. 1.
|
|
0.99686027 0.99686027 0.99373035 0.99686027]
|
|
|
|
mean value: 0.9971761724807292
|
|
|
|
key: test_accuracy
|
|
value: [0.97183099 1. 0.98591549 0.98591549 0.97183099 0.97183099
|
|
0.97142857 0.95714286 0.98571429 0.98571429]
|
|
|
|
mean value: 0.9787323943661972
|
|
|
|
key: train_accuracy
|
|
value: [1. 0.9984252 0.99527559 1. 1. 1.
|
|
0.99842767 0.99842767 0.99685535 0.99842767]
|
|
|
|
mean value: 0.9985839152181448
|
|
|
|
key: test_fscore
|
|
value: [0.97222222 1. 0.98591549 0.98630137 0.97297297 0.97297297
|
|
0.97222222 0.95890411 0.98591549 0.98591549]
|
|
|
|
mean value: 0.9793342348715685
|
|
|
|
key: train_fscore
|
|
value: [1. 0.99843014 0.99530516 1. 1. 1.
|
|
0.99843014 0.99843014 0.9968652 0.99843014]
|
|
|
|
mean value: 0.9985890933230142
|
|
|
|
key: test_precision
|
|
value: [0.94594595 1. 0.97222222 0.97297297 0.94736842 0.94736842
|
|
0.94594595 0.92105263 0.97222222 0.97222222]
|
|
|
|
mean value: 0.9597321005215742
|
|
|
|
key: train_precision
|
|
value: [1. 0.9968652 0.99065421 1. 1. 1.
|
|
0.9968652 0.9968652 0.99375 0.9968652 ]
|
|
|
|
mean value: 0.9971865020654499
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.97222222 1. 0.98611111 0.98571429 0.97142857 0.97142857
|
|
0.97142857 0.95714286 0.98571429 0.98571429]
|
|
|
|
mean value: 0.9786904761904761
|
|
|
|
key: train_roc_auc
|
|
value: [1. 0.99842271 0.99526814 1. 1. 1.
|
|
0.99842767 0.99842767 0.99685535 0.99842767]
|
|
|
|
mean value: 0.998582921651489
|
|
|
|
key: test_jcc
|
|
value: [0.94594595 1. 0.97222222 0.97297297 0.94736842 0.94736842
|
|
0.94594595 0.92105263 0.97222222 0.97222222]
|
|
|
|
mean value: 0.9597321005215742
|
|
|
|
key: train_jcc
|
|
value: [1. 0.9968652 0.99065421 1. 1. 1.
|
|
0.9968652 0.9968652 0.99375 0.9968652 ]
|
|
|
|
mean value: 0.9971865020654499
|
|
|
|
MCC on Blind test: 0.24
|
|
|
|
Accuracy on Blind test: 0.85
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.28364706 0.2668128 0.25616741 0.16965199 0.18673849 0.24170947
|
|
0.24973774 0.27796173 0.29994655 0.25314331]
|
|
|
|
mean value: 0.24855165481567382
|
|
|
|
key: score_time
|
|
value: [0.02206802 0.02942705 0.0219636 0.02204132 0.01348066 0.02202201
|
|
0.02197385 0.02503347 0.02198958 0.02210307]
|
|
|
|
mean value: 0.02221026420593262
|
|
|
|
key: test_mcc
|
|
value: [0.83214239 0.84343471 0.9451949 0.70310369 0.81050059 0.69047619
|
|
0.71545476 0.84102145 0.8871639 0.74560114]
|
|
|
|
mean value: 0.801409370970335
|
|
|
|
key: train_mcc
|
|
value: [0.90553048 0.89928572 0.90247629 0.90236595 0.90558158 0.90866524
|
|
0.91509886 0.90582163 0.90566038 0.90573203]
|
|
|
|
mean value: 0.9056218165156902
|
|
|
|
key: test_accuracy
|
|
value: [0.91549296 0.91549296 0.97183099 0.84507042 0.90140845 0.84507042
|
|
0.85714286 0.91428571 0.94285714 0.87142857]
|
|
|
|
mean value: 0.8980080482897385
|
|
|
|
key: train_accuracy
|
|
value: [0.95275591 0.9496063 0.9511811 0.9511811 0.95275591 0.95433071
|
|
0.95754717 0.95283019 0.95283019 0.95283019]
|
|
|
|
mean value: 0.9527848759471104
|
|
|
|
key: test_fscore
|
|
value: [0.91176471 0.92105263 0.97222222 0.86075949 0.90909091 0.84507042
|
|
0.86111111 0.92105263 0.94117647 0.87671233]
|
|
|
|
mean value: 0.9020012927025947
|
|
|
|
key: train_fscore
|
|
value: [0.95268139 0.94936709 0.95087163 0.95102686 0.95238095 0.95418641
|
|
0.95748031 0.95238095 0.95283019 0.95253165]
|
|
|
|
mean value: 0.9525737433063429
|
|
|
|
key: test_precision
|
|
value: [0.93939394 0.85365854 0.94594595 0.79069767 0.85365854 0.85714286
|
|
0.83783784 0.85365854 0.96969697 0.84210526]
|
|
|
|
mean value: 0.8743796097350147
|
|
|
|
key: train_precision
|
|
value: [0.9556962 0.95541401 0.95846645 0.95253165 0.95846645 0.9556962
|
|
0.95899054 0.96153846 0.95283019 0.95859873]
|
|
|
|
mean value: 0.9568228883329967
|
|
|
|
key: test_recall
|
|
value: [0.88571429 1. 1. 0.94444444 0.97222222 0.83333333
|
|
0.88571429 1. 0.91428571 0.91428571]
|
|
|
|
mean value: 0.9349999999999999
|
|
|
|
key: train_recall
|
|
value: [0.94968553 0.94339623 0.94339623 0.94952681 0.94637224 0.95268139
|
|
0.95597484 0.94339623 0.95283019 0.94654088]
|
|
|
|
mean value: 0.9483800567426542
|
|
|
|
key: test_roc_auc
|
|
value: [0.91507937 0.91666667 0.97222222 0.84365079 0.90039683 0.8452381
|
|
0.85714286 0.91428571 0.94285714 0.87142857]
|
|
|
|
mean value: 0.8978968253968254
|
|
|
|
key: train_roc_auc
|
|
value: [0.95276075 0.94961609 0.95119338 0.9511785 0.95274587 0.95432812
|
|
0.95754717 0.95283019 0.95283019 0.95283019]
|
|
|
|
mean value: 0.9527860444814793
|
|
|
|
key: test_jcc
|
|
value: [0.83783784 0.85365854 0.94594595 0.75555556 0.83333333 0.73170732
|
|
0.75609756 0.85365854 0.88888889 0.7804878 ]
|
|
|
|
mean value: 0.8237171317659122
|
|
|
|
key: train_jcc
|
|
value: [0.90963855 0.90361446 0.90634441 0.90662651 0.90909091 0.91238671
|
|
0.918429 0.90909091 0.90990991 0.90936556]
|
|
|
|
mean value: 0.9094496925922325
|
|
|
|
MCC on Blind test: 0.29
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.28502631 0.27786803 0.2669673 0.26025629 0.26016879 0.26106143
|
|
0.26042795 0.26144886 0.26163244 0.25996804]
|
|
|
|
mean value: 0.2654825448989868
|
|
|
|
key: score_time
|
|
value: [0.00984454 0.00959039 0.00850868 0.00844693 0.00874805 0.00843954
|
|
0.00860071 0.00844836 0.00860429 0.00847673]
|
|
|
|
mean value: 0.00877082347869873
|
|
|
|
key: test_mcc
|
|
value: [0.9451949 1. 0.9451949 0.97220047 0.94511009 0.94365079
|
|
0.94440028 0.97182532 0.97182532 0.97182532]
|
|
|
|
mean value: 0.9611227372545662
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 0.99685531
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9996855314765581
|
|
|
|
key: test_accuracy
|
|
value: [0.97183099 1. 0.97183099 0.98591549 0.97183099 0.97183099
|
|
0.97142857 0.98571429 0.98571429 0.98571429]
|
|
|
|
mean value: 0.9801810865191147
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 0.9984252 1.
|
|
1. 1. 1. ]
|
|
|
|
mean value: 0.9998425196850393
|
|
|
|
key: test_fscore
|
|
value: [0.97222222 1. 0.97222222 0.98630137 0.97297297 0.97222222
|
|
0.97222222 0.98591549 0.98591549 0.98591549]
|
|
|
|
mean value: 0.9805909710598115
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 0.99842022
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9998420221169037
|
|
|
|
key: test_precision
|
|
value: [0.94594595 1. 0.94594595 0.97297297 0.94736842 0.97222222
|
|
0.94594595 0.97222222 0.97222222 0.97222222]
|
|
|
|
mean value: 0.9647068120752331
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 1. 0.97222222
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9972222222222222
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 0.99684543
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9996845425867508
|
|
|
|
key: test_roc_auc
|
|
value: [0.97222222 1. 0.97222222 0.98571429 0.97142857 0.9718254
|
|
0.97142857 0.98571429 0.98571429 0.98571429]
|
|
|
|
mean value: 0.9801984126984127
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 0.99842271
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9998422712933754
|
|
|
|
key: test_jcc
|
|
value: [0.94594595 1. 0.94594595 0.97297297 0.94736842 0.94594595
|
|
0.94594595 0.97222222 0.97222222 0.97222222]
|
|
|
|
mean value: 0.9620791844476055
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 0.99684543
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9996845425867508
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.01253176 0.01580763 0.01565361 0.01494575 0.01510954 0.01479077
|
|
0.01480365 0.01490068 0.014853 0.01492882]
|
|
|
|
mean value: 0.014832520484924316
|
|
|
|
key: score_time
|
|
value: [0.01115751 0.01106358 0.01114917 0.01116085 0.01098204 0.01356149
|
|
0.01323414 0.01438808 0.01097846 0.01121426]
|
|
|
|
mean value: 0.011888957023620606
|
|
|
|
key: test_mcc
|
|
value: [0.60561605 0.74766718 0.65726707 0.44129696 0.5532359 0.58548477
|
|
0.66212219 0.72501849 0.78301997 0.5923057 ]
|
|
|
|
mean value: 0.6353034273077338
|
|
|
|
key: train_mcc
|
|
value: [0.76004007 0.72245201 0.66278625 0.65560022 0.57404094 0.83304874
|
|
0.80325449 0.83067537 0.79045363 0.75380319]
|
|
|
|
mean value: 0.7386154903774159
|
|
|
|
key: test_accuracy
|
|
value: [0.78873239 0.85915493 0.8028169 0.67605634 0.73239437 0.77464789
|
|
0.81428571 0.85714286 0.88571429 0.77142857]
|
|
|
|
mean value: 0.7962374245472836
|
|
|
|
key: train_accuracy
|
|
value: [0.86771654 0.8503937 0.80629921 0.80314961 0.7496063 0.91338583
|
|
0.89465409 0.91037736 0.88836478 0.86477987]
|
|
|
|
mean value: 0.8548727281731293
|
|
|
|
key: test_fscore
|
|
value: [0.74576271 0.83333333 0.75 0.54901961 0.64150943 0.73333333
|
|
0.77966102 0.84375 0.875 0.71428571]
|
|
|
|
mean value: 0.7465655151571342
|
|
|
|
key: train_fscore
|
|
value: [0.84892086 0.83005367 0.76116505 0.75633528 0.66666667 0.90756303
|
|
0.88388215 0.90289608 0.87694974 0.84532374]
|
|
|
|
mean value: 0.8279756265504205
|
|
|
|
key: test_precision
|
|
value: [0.91666667 1. 1. 0.93333333 1. 0.91666667
|
|
0.95833333 0.93103448 0.96551724 0.95238095]
|
|
|
|
mean value: 0.9573932676518884
|
|
|
|
key: train_precision
|
|
value: [0.99159664 0.9626556 0.99492386 0.98979592 0.99375 0.97122302
|
|
0.98455598 0.98513011 0.97683398 0.98739496]
|
|
|
|
mean value: 0.9837860069030633
|
|
|
|
key: test_recall
|
|
value: [0.62857143 0.71428571 0.6 0.38888889 0.47222222 0.61111111
|
|
0.65714286 0.77142857 0.8 0.57142857]
|
|
|
|
mean value: 0.6215079365079366
|
|
|
|
key: train_recall
|
|
value: [0.74213836 0.72955975 0.6163522 0.61198738 0.50157729 0.85173502
|
|
0.80188679 0.83333333 0.79559748 0.73899371]
|
|
|
|
mean value: 0.7223161319762712
|
|
|
|
key: test_roc_auc
|
|
value: [0.78650794 0.85714286 0.8 0.68015873 0.73611111 0.77698413
|
|
0.81428571 0.85714286 0.88571429 0.77142857]
|
|
|
|
mean value: 0.7965476190476191
|
|
|
|
key: train_roc_auc
|
|
value: [0.86791461 0.85058429 0.80659881 0.80284904 0.74921632 0.91328889
|
|
0.89465409 0.91037736 0.88836478 0.86477987]
|
|
|
|
mean value: 0.8548628057853699
|
|
|
|
key: test_jcc
|
|
value: [0.59459459 0.71428571 0.6 0.37837838 0.47222222 0.57894737
|
|
0.63888889 0.72972973 0.77777778 0.55555556]
|
|
|
|
mean value: 0.6040380229853914
|
|
|
|
key: train_jcc
|
|
value: [0.7375 0.70948012 0.61442006 0.60815047 0.5 0.83076923
|
|
0.79192547 0.82298137 0.7808642 0.73208723]
|
|
|
|
mean value: 0.7128178143252082
|
|
|
|
MCC on Blind test: 0.29
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
       'electro_rr', 'electro_mm', 'electro_sm', 'electr...
       'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
       'logorI', 'lineage_proportion', 'dist_lineage_proportion',
       'lineage_count_all', 'lineage_count_unique'],
      dtype='object')),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', RidgeClassifier(random_state=42))])

key: fit_time
value: [0.02688622 0.03091717 0.02439308 0.03099203 0.03109002 0.03095508
 0.03092408 0.01178432 0.01177502 0.011724 ]

mean value: 0.0241441011428833

key: score_time
value: [0.01939631 0.01934052 0.01894784 0.01915073 0.01910853 0.01867843
 0.01939607 0.01078653 0.01074886 0.01071095]

mean value: 0.016626477241516113

key: test_mcc
value: [0.91580648 0.94511009 0.88730159 0.94365079 0.8031746 0.85952381
 0.85749293 0.94440028 0.8871639 0.94440028]

mean value: 0.8988024758003245

key: train_mcc
value: [0.93078373 0.92442835 0.93397556 0.9059564 0.9433251 0.92457213
 0.92469291 0.94654556 0.92469291 0.92771424]

mean value: 0.9286686898188696

key: test_accuracy
value: [0.95774648 0.97183099 0.94366197 0.97183099 0.90140845 0.92957746
 0.92857143 0.97142857 0.94285714 0.97142857]

mean value: 0.9490342052313884

key: train_accuracy
value: [0.96535433 0.96220472 0.96692913 0.95275591 0.97165354 0.96220472
 0.96226415 0.97327044 0.96226415 0.96383648]

mean value: 0.9642737582330511

key: test_fscore
value: [0.95652174 0.97058824 0.94285714 0.97222222 0.90140845 0.92957746
 0.92753623 0.97222222 0.94117647 0.97058824]

mean value: 0.9484698414985508

key: train_fscore
value: [0.96518987 0.96214511 0.96671949 0.95192308 0.97151899 0.96178344
 0.96190476 0.9733124 0.96190476 0.96366509]

mean value: 0.9640066993032763

key: test_precision
value: [0.97058824 1. 0.94285714 0.97222222 0.91428571 0.94285714
 0.94117647 0.94594595 0.96969697 1. ]

mean value: 0.959962984374749

key: train_precision
value: [0.97133758 0.96518987 0.97444089 0.96742671 0.97460317 0.97106109
 0.97115385 0.97178683 0.97115385 0.96825397]

mean value: 0.9706407819970189

key: test_recall
value: [0.94285714 0.94285714 0.94285714 0.97222222 0.88888889 0.91666667
 0.91428571 1. 0.91428571 0.94285714]

mean value: 0.9377777777777777

key: train_recall
value: [0.9591195 0.9591195 0.9591195 0.93690852 0.96845426 0.95268139
 0.95283019 0.97484277 0.95283019 0.9591195 ]

mean value: 0.9575025296113326

key: test_roc_auc
value: [0.95753968 0.97142857 0.94365079 0.9718254 0.9015873 0.9297619
 0.92857143 0.97142857 0.94285714 0.97142857]

mean value: 0.9490079365079365

key: train_roc_auc
value: [0.96536416 0.96220959 0.96694145 0.95273099 0.97164851 0.96218975
 0.96226415 0.97327044 0.96226415 0.96383648]

mean value: 0.9642719679384164

key: test_jcc
value: [0.91666667 0.94285714 0.89189189 0.94594595 0.82051282 0.86842105
 0.86486486 0.94594595 0.88888889 0.94285714]

mean value: 0.9028852363062889

key: train_jcc
value: [0.93272171 0.92705167 0.93558282 0.90825688 0.94461538 0.92638037
 0.9266055 0.94801223 0.9266055 0.92987805]

mean value: 0.930571013017483

MCC on Blind test: 0.26

Accuracy on Blind test: 0.81

Model_name: Ridge ClassifierCV
Model func: RidgeClassifierCV(cv=10)
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_config.py:203: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  rouC_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
/home/tanu/git/LSHTM_analysis/scripts/ml/./embb_config.py:206: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  rouC_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
                       n_estimators=1000, n_jobs=10, oob_score=True,
                       random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
              colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
              gamma=0, gpu_id=-1, importance_type=None,
              interaction_constraints='', learning_rate=0.300000012,
              max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
              monotone_constraints='()', n_estimators=100, n_jobs=12,
              num_parallel_tree=1, predictor='auto', random_state=42,
              reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
              tree_method='exact', use_label_encoder=False,
              validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
       'electro_rr', 'electro_mm', 'electro_sm', 'electr...
       'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
       'logorI', 'lineage_proportion', 'dist_lineage_proportion',
       'lineage_count_all', 'lineage_count_unique'],
      dtype='object')),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', RidgeClassifierCV(cv=10))])

key: fit_time
value: [0.19827652 0.19935083 0.11779952 0.14097071 0.20005989 0.23936892
 0.23391032 0.19965649 0.19985843 0.20182014]

mean value: 0.19310717582702636

key: score_time
value: [0.02154684 0.02093148 0.01095414 0.0209043 0.01941705 0.01915574
 0.01094842 0.02140784 0.02034569 0.02129364]

mean value: 0.01869051456451416

key: test_mcc
value: [0.91580648 0.94511009 0.88730159 0.91580648 0.77460317 0.88730159
 0.91465912 0.94440028 0.91465912 0.94440028]

mean value: 0.9044048212742926

key: train_mcc
value: [0.94330695 0.93702568 0.94646152 0.94330695 0.9433251 0.93078099
 0.95287259 0.94341489 0.94339623 0.95287259]

mean value: 0.9436763480331438

key: test_accuracy
value: [0.95774648 0.97183099 0.94366197 0.95774648 0.88732394 0.94366197
 0.95714286 0.97142857 0.95714286 0.97142857]

mean value: 0.9519114688128772

key: train_accuracy
value: [0.97165354 0.96850394 0.97322835 0.97165354 0.97165354 0.96535433
 0.97641509 0.97169811 0.97169811 0.97641509]

mean value: 0.9718273659188827

key: test_fscore
value: [0.95652174 0.97058824 0.94285714 0.95890411 0.88888889 0.94444444
 0.95652174 0.97222222 0.95652174 0.97058824]

mean value: 0.9518058495981279

key: train_fscore
value: [0.97169811 0.96865204 0.97322835 0.97160883 0.97151899 0.96507937
 0.97652582 0.97178683 0.97169811 0.97652582]

mean value: 0.9718322272766338

key: test_precision
value: [0.97058824 1. 0.94285714 0.94594595 0.88888889 0.94444444
 0.97058824 0.94594595 0.97058824 1. ]

mean value: 0.9579847073964721

key: train_precision
value: [0.97169811 0.965625 0.97476341 0.97160883 0.97460317 0.97124601
 0.97196262 0.96875 0.97169811 0.97196262]

mean value: 0.9713917880800539

key: test_recall
value: [0.94285714 0.94285714 0.94285714 0.97222222 0.88888889 0.94444444
 0.94285714 1. 0.94285714 0.94285714]

mean value: 0.9462698412698413

key: train_recall
value: [0.97169811 0.97169811 0.97169811 0.97160883 0.96845426 0.95899054
 0.98113208 0.97484277 0.97169811 0.98113208]

mean value: 0.9722952998829435

key: test_roc_auc
value: [0.95753968 0.97142857 0.94365079 0.95753968 0.88730159 0.94365079
 0.95714286 0.97142857 0.95714286 0.97142857]

mean value: 0.9518253968253968

key: train_roc_auc
value: [0.97165347 0.9684989 0.97323076 0.97165347 0.97164851 0.96534432
 0.97641509 0.97169811 0.97169811 0.97641509]

mean value: 0.9718255857786243

key: test_jcc
value: [0.91666667 0.94285714 0.89189189 0.92105263 0.8 0.89473684
 0.91666667 0.94594595 0.91666667 0.94285714]

mean value: 0.9089341597236333

key: train_jcc
value: [0.94495413 0.93920973 0.94785276 0.94478528 0.94461538 0.93251534
 0.95412844 0.94512195 0.94495413 0.95412844]

mean value: 0.9452265574126474

MCC on Blind test: 0.25

Accuracy on Blind test: 0.81