/home/tanu/git/LSHTM_analysis/scripts/ml/ml_data.py:550: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
mask_check.sort_values(by = ['ligand_distance'], ascending = True, inplace = True)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/xgboost/compat.py:36: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
|
|
from pandas import MultiIndex, Int64Index
|
|
1.22.4
|
|
1.4.1
|
|
|
|
aaindex_df contains non-numerical data
|
|
|
|
Total no. of non-numerical columns: 2
|
|
|
|
Selecting numerical data only
|
|
|
|
PASS: successfully selected numerical columns only for aaindex_df
|
|
|
|
Now checking for NA in the remaining aaindex_cols
|
|
|
|
Counting aaindex_df cols with NA
|
|
ncols with NA: 4 columns
|
|
Dropping these...
|
|
Original ncols: 127
|
|
|
|
Revised df ncols: 123
|
|
|
|
Checking NA in revised df...
|
|
|
|
PASS: cols with NA successfully dropped from aaindex_df
|
|
Proceeding with combining aa_df with other features_df
|
|
|
|
PASS: ncols match
|
|
Expected ncols: 123
|
|
Got: 123
|
|
|
|
Total no. of columns in clean aa_df: 123
|
|
|
|
Proceeding to merge, expected nrows in merged_df: 817
|
|
|
|
PASS: my_features_df and aa_df successfully combined
|
|
nrows: 817
|
|
ncols: 269
|
|
count of NULL values before imputation
|
|
|
|
or_mychisq 244
|
|
log10_or_mychisq 244
|
|
dtype: int64
|
|
count of NULL values AFTER imputation
|
|
|
|
mutationinformation 0
|
|
or_rawI 0
|
|
logorI 0
|
|
dtype: int64
|
|
|
|
PASS: OR values imputed, data ready for ML
|
|
|
|
No. of numerical features: 45
|
|
No. of categorical features: 7
|
|
|
|
index: 0
|
|
ind: 1
|
|
|
|
Mask count check: True
|
|
|
|
index: 1
|
|
ind: 2
|
|
|
|
Mask count check: True
|
|
Original Data
|
|
Counter({1: 309, 0: 158}) Data dim: (467, 52)
|
|
|
|
-------------------------------------------------------------
|
|
Successfully split data: UQ [no aa_index but active site included] training
|
|
actual values: training set
|
|
imputed values: blind test set
|
|
Train data size: (467, 52)
|
|
Test data size: (350, 52)
|
|
y_train numbers: Counter({1: 309, 0: 158})
|
|
y_train ratio: 0.511326860841424
|
|
|
|
y_test_numbers: Counter({0: 315, 1: 35})
|
|
y_test ratio: 9.0
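The two ratios reported above are consistent with dividing the count of class 0 by the count of class 1 (158/309 for the training set, 315/35 for the blind test set). The following is a minimal sketch of that arithmetic, assuming this is how the value is derived; `class_ratio` is a hypothetical helper name and is not taken from ml_data.py.

```python
from collections import Counter


def class_ratio(counts):
    """Return count(class 0) / count(class 1) for a binary target.

    Assumption: the logged 'ratio' is class 0 over class 1; this matches
    both logged values but is not confirmed by the script itself.
    """
    return counts[0] / counts[1]


# Class counts copied from the log lines above.
y_train_counts = Counter({1: 309, 0: 158})
y_test_counts = Counter({0: 315, 1: 35})

train_ratio = class_ratio(y_train_counts)
test_ratio = class_ratio(y_test_counts)

print("y_train ratio:", train_ratio)
print("y_test ratio:", test_ratio)
```

Read this way, the training set has roughly two class-1 samples for every class-0 sample, while the blind test set is skewed 9:1 in the opposite direction.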
|
|
-------------------------------------------------------------
|
|
Simple Random OverSampling
|
|
Counter({1: 309, 0: 309})
|
|
(618, 52)
|
|
Simple Random UnderSampling
|
|
Counter({0: 158, 1: 158})
|
|
(316, 52)
|
|
Simple Combined Over and UnderSampling
|
|
Counter({0: 309, 1: 309})
|
|
(618, 52)
|
|
SMOTE_NC OverSampling
|
|
Counter({1: 309, 0: 309})
|
|
(618, 52)
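The row counts above (618 after oversampling, 316 after undersampling) follow directly from the training class counts of 309 and 158. The sketch below reproduces only the simple random over- and undersampling counts using the standard library. It is an illustration of the technique under stated assumptions, not the script's implementation: `random_oversample` and `random_undersample` are hypothetical helpers, and SMOTE_NC (which synthesises new samples rather than duplicating existing ones) is not covered.

```python
import random
from collections import Counter


def random_oversample(labels, seed=42):
    """Return row indices in which every class is padded, by random
    duplication, up to the size of the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    indices = list(range(len(labels)))
    for cls, n in counts.items():
        pool = [i for i, y in enumerate(labels) if y == cls]
        # Draw with replacement; k is 0 for the majority class.
        indices.extend(rng.choices(pool, k=target - n))
    return indices


def random_undersample(labels, seed=42):
    """Return row indices in which every class is cut down, by random
    selection without replacement, to the size of the smallest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    indices = []
    for cls in counts:
        pool = [i for i, y in enumerate(labels) if y == cls]
        indices.extend(rng.sample(pool, target))
    return sorted(indices)


# Training labels with the class counts reported in the log.
labels = [1] * 309 + [0] * 158

over = random_oversample(labels)
under = random_undersample(labels)

over_counts = Counter(labels[i] for i in over)
under_counts = Counter(labels[i] for i in under)
print(over_counts, len(over))
print(under_counts, len(under))
```

Working on row indices rather than on the rows themselves keeps the sketch independent of how the 52 feature columns are stored.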
|
|
|
|
#####################################################################
|
|
|
|
Running ML analysis: UQ [without AA index but with active site annotations]
|
|
Gene name: katG
|
|
Drug name: isoniazid
|
|
|
|
Output directory: /home/tanu/git/Data/isoniazid/output/ml/uq_v1/
|
|
|
|
Sanity checks:
|
|
Total input features: 52
|
|
|
|
Training data size: (467, 52)
|
|
Test data size: (350, 52)
|
|
|
|
Target feature numbers (training data): Counter({1: 309, 0: 158})
|
|
Target features ratio (training data): 0.511326860841424
|
|
|
|
Target feature numbers (test data): Counter({0: 315, 1: 35})
|
|
Target features ratio (test data): 9.0
|
|
|
|
#####################################################################
|
|
|
|
|
|
================================================================
|
|
|
|
Structural features (n): 36
|
|
These are:
|
|
Common stability features: ['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist']
|
|
FoldX columns: ['electro_rr', 'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr', 'disulfide_mm', 'disulfide_sm', 'disulfide_ss', 'hbonds_rr', 'hbonds_mm', 'hbonds_sm', 'hbonds_ss', 'partcov_rr', 'partcov_mm', 'partcov_sm', 'partcov_ss', 'vdwclashes_rr', 'vdwclashes_mm', 'vdwclashes_sm', 'vdwclashes_ss', 'volumetric_rr', 'volumetric_mm', 'volumetric_ss']
|
|
Other struc columns: ['rsa', 'kd_values', 'rd_values']
|
|
================================================================
|
|
|
|
Evolutionary features (n): 3
|
|
These are:
|
|
['consurf_score', 'snap2_score', 'provean_score']
|
|
================================================================
|
|
|
|
Genomic features (n): 6
|
|
These are:
|
|
['maf', 'logorI']
|
|
['lineage_proportion', 'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique']
|
|
================================================================
|
|
|
|
Categorical features (n): 7
|
|
These are:
|
|
['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site']
|
|
================================================================
|
|
|
|
|
|
Pass: No. of features match
|
|
|
|
#####################################################################
|
|
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02167606 0.02372026 0.03166604 0.02357769 0.02548194 0.02195692
|
|
0.02136278 0.02161574 0.02221417 0.02264333]
|
|
|
|
mean value: 0.023591494560241698
|
|
|
|
key: score_time
|
|
value: [0.0109992 0.01075363 0.01093793 0.01066351 0.01062679 0.01058674
|
|
0.01058102 0.01062608 0.0105927 0.01066446]
|
|
|
|
mean value: 0.010703206062316895
|
|
|
|
key: test_mcc
|
|
value: [0.90662544 0.66402366 0.60908698 0.90662544 0.86070252 0.66337469
|
|
0.67402153 0.80215054 0.66040066 0.85943956]
|
|
|
|
mean value: 0.7606451028769974
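The `test_mcc` key presumably holds the Matthews correlation coefficient for each cross-validation fold; ten values are logged, which suggests ten folds. The sketch below shows the standard MCC formula computed from confusion-matrix counts and checks that the logged mean is the plain arithmetic mean of the ten fold values. The function name `mcc` and the zero-denominator handling are assumptions made for illustration, not taken from the script.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero, a common convention
    (assumed here) for degenerate confusion matrices.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Per-fold test_mcc values copied from the log (rounded to 8 decimals there).
fold_mcc = [
    0.90662544, 0.66402366, 0.60908698, 0.90662544, 0.86070252,
    0.66337469, 0.67402153, 0.80215054, 0.66040066, 0.85943956,
]

mean_mcc = sum(fold_mcc) / len(fold_mcc)
print("mean test_mcc:", mean_mcc)
```

Because the fold values are printed with eight decimal places, the recomputed mean agrees with the logged `mean value` only up to that rounding.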
|
|
|
|
key: train_mcc
|
|
value: [0.83338837 0.82273265 0.789683 0.77877628 0.76217448 0.80630977
|
|
0.79579908 0.77434754 0.7963019 0.80086095]
|
|
|
|
mean value: 0.7960374023577294
|
|
|
|
key: test_accuracy
|
|
value: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[0.95744681 0.85106383 0.82978723 0.95744681 0.93617021 0.85106383
|
|
0.85106383 0.91304348 0.84782609 0.93478261]
|
|
|
|
mean value: 0.8929694727104533
|
|
|
|
key: train_accuracy
|
|
value: [0.92619048 0.92142857 0.90714286 0.90238095 0.8952381 0.91428571
|
|
0.90952381 0.90023753 0.90973872 0.91211401]
|
|
|
|
mean value: 0.9098280737473137
|
|
|
|
key: test_fscore
|
|
value: [0.96875 0.88888889 0.87878788 0.96875 0.95384615 0.89552239
|
|
0.8852459 0.93548387 0.8852459 0.95238095]
|
|
|
|
mean value: 0.9212901936210006
|
|
|
|
key: train_fscore
|
|
value: [0.94532628 0.94240838 0.93169877 0.92869565 0.92334495 0.93728223
|
|
0.93425606 0.92682927 0.93379791 0.93542757]
|
|
|
|
mean value: 0.9339067066812484
|
|
|
|
key: test_precision
|
|
value: [0.93939394 0.875 0.82857143 0.93939394 0.91176471 0.83333333
|
|
0.9 0.93548387 0.9 0.90909091]
|
|
|
|
mean value: 0.8972032126633644
|
|
|
|
key: train_precision
|
|
value: [0.92733564 0.91525424 0.90784983 0.8989899 0.89527027 0.90878378
|
|
0.9 0.89864865 0.90540541 0.91156463]
|
|
|
|
mean value: 0.9069102339726427
|
|
|
|
key: test_recall
|
|
value: [1. 0.90322581 0.93548387 1. 1. 0.96774194
|
|
0.87096774 0.93548387 0.87096774 1. ]
|
|
|
|
mean value: 0.9483870967741935
|
|
|
|
key: train_recall
|
|
value: [0.96402878 0.97122302 0.95683453 0.96043165 0.95323741 0.9676259
|
|
0.97122302 0.95683453 0.96402878 0.96057348]
|
|
|
|
mean value: 0.962604110260179
|
|
|
|
key: test_roc_auc
|
|
value: [0.9375 0.8266129 0.78024194 0.9375 0.90625 0.79637097
|
|
0.84173387 0.90107527 0.83548387 0.90625 ]
|
|
|
|
mean value: 0.8669018817204301
|
|
|
|
key: train_roc_auc
|
|
value: [0.90807073 0.89758334 0.88334684 0.87458202 0.86746378 0.88874253
|
|
0.87997771 0.87352216 0.88411229 0.88873744]
|
|
|
|
mean value: 0.8846138841461438
|
|
|
|
key: test_jcc
|
|
value: [0.93939394 0.8 0.78378378 0.93939394 0.91176471 0.81081081
|
|
0.79411765 0.87878788 0.79411765 0.90909091]
|
|
|
|
mean value: 0.8561261261261262
|
|
|
|
key: train_jcc
|
|
value: [0.89632107 0.89108911 0.87213115 0.86688312 0.85760518 0.88196721
|
|
0.87662338 0.86363636 0.87581699 0.87868852]
|
|
|
|
mean value: 0.8760762092991343
|
|
|
|
MCC on Blind test: 0.23
|
|
|
|
Accuracy on Blind test: 0.45
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.74151611 1.08886981 0.69769788 0.70535755 0.87346554 0.72519445
|
|
0.73284912 0.83741045 0.65530038 0.68675303]
|
|
|
|
mean value: 0.7744414329528808
|
|
|
|
key: score_time
|
|
value: [0.01378059 0.01389503 0.01416969 0.01405454 0.01443934 0.0140748
|
|
0.01437092 0.01120043 0.0144248 0.01425123]
|
|
|
|
mean value: 0.013866138458251954
|
|
|
|
key: test_mcc
|
|
value: [1. 0.8566725 1. 0.95299692 0.90662544 0.76032282
|
|
0.90524194 0.9085301 0.85513419 0.85513419]
|
|
|
|
mean value: 0.90006580934109
|
|
|
|
key: train_mcc
|
|
value: [0.93593571 0.96269263 0.94130059 0.93593571 0.95736701 0.95734993
|
|
0.94131391 0.9469026 0.95756757 0.95740101]
|
|
|
|
mean value: 0.9493766673456756
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.93617021 1. 0.9787234 0.95744681 0.89361702
|
|
0.95744681 0.95652174 0.93478261 0.93478261]
|
|
|
|
mean value: 0.9549491211840888
|
|
|
|
key: train_accuracy
|
|
value: [0.97142857 0.98333333 0.97380952 0.97142857 0.98095238 0.98095238
|
|
0.97380952 0.97624703 0.98099762 0.98099762]
|
|
|
|
mean value: 0.9773956565999321
|
|
|
|
key: test_fscore
|
|
value: [1. 0.95238095 1. 0.98412698 0.96875 0.92307692
|
|
0.96774194 0.96666667 0.95081967 0.95081967]
|
|
|
|
mean value: 0.9664382805997692
|
|
|
|
key: train_fscore
|
|
value: [0.97857143 0.98747764 0.980322 0.97857143 0.98571429 0.98566308
|
|
0.98039216 0.98214286 0.98571429 0.98571429]
|
|
|
|
mean value: 0.983028345294684
|
|
|
|
key: test_precision
|
|
value: [1. 0.9375 1. 0.96875 0.93939394 0.88235294
|
|
0.96774194 1. 0.96666667 0.93548387]
|
|
|
|
mean value: 0.959788935368869
|
|
|
|
key: train_precision
|
|
value: [0.97163121 0.98220641 0.97508897 0.97163121 0.9787234 0.98214286
|
|
0.97173145 0.9751773 0.9787234 0.98220641]
|
|
|
|
mean value: 0.9769262610088233
|
|
|
|
key: test_recall
|
|
value: [1. 0.96774194 1. 1. 1. 0.96774194
|
|
0.96774194 0.93548387 0.93548387 0.96666667]
|
|
|
|
mean value: 0.9740860215053764
|
|
|
|
key: train_recall
|
|
value: [0.98561151 0.99280576 0.98561151 0.98561151 0.99280576 0.98920863
|
|
0.98920863 0.98920863 0.99280576 0.98924731]
|
|
|
|
mean value: 0.9892125009669683
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.92137097 1. 0.96875 0.9375 0.85887097
|
|
0.95262097 0.96774194 0.9344086 0.92083333]
|
|
|
|
mean value: 0.9462096774193549
|
|
|
|
key: train_roc_auc
|
|
value: [0.96463674 0.97879724 0.96815787 0.96463674 0.97527612 0.97699868
|
|
0.9664353 0.97012879 0.97542386 0.97701802]
|
|
|
|
mean value: 0.9717509367831001
|
|
|
|
key: test_jcc
|
|
value: [1. 0.90909091 1. 0.96875 0.93939394 0.85714286
|
|
0.9375 0.93548387 0.90625 0.90625 ]
|
|
|
|
mean value: 0.9359861576595447
|
|
|
|
key: train_jcc
|
|
value: [0.95804196 0.97526502 0.96140351 0.95804196 0.97183099 0.97173145
|
|
0.96153846 0.96491228 0.97183099 0.97183099]
|
|
|
|
mean value: 0.9666427591273636
|
|
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
|
|
MCC on Blind test: 0.14
|
|
|
|
Accuracy on Blind test: 0.32
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01048064 0.00996375 0.00781631 0.00742173 0.00739574 0.00739932
|
|
0.00733399 0.00878787 0.0088315 0.00836849]
|
|
|
|
mean value: 0.008379936218261719
|
|
|
|
key: score_time
|
|
value: [0.01066589 0.00898504 0.0085032 0.0080471 0.00794482 0.0084784
|
|
0.00797868 0.00964499 0.00877905 0.00852466]
|
|
|
|
mean value: 0.008755183219909668
|
|
|
|
key: test_mcc
|
|
value: [0.8566725 0.50614703 0.62096774 0.76032282 0.81048387 0.71572581
|
|
0.59764284 0.75776742 0.60430108 0.36514837]
|
|
|
|
mean value: 0.6595179479313003
|
|
|
|
key: train_mcc
|
|
value: [0.70671585 0.70811111 0.71695894 0.68716403 0.71727396 0.73126698
|
|
0.71138479 0.71852622 0.74194944 0.54109586]
|
|
|
|
mean value: 0.6980447184919443
|
|
|
|
key: test_accuracy
|
|
value: [0.93617021 0.76595745 0.82978723 0.89361702 0.91489362 0.87234043
|
|
0.80851064 0.89130435 0.82608696 0.67391304]
|
|
|
|
mean value: 0.8412580943570768
|
|
|
|
key: train_accuracy
|
|
value: [0.87142857 0.86666667 0.86904762 0.85714286 0.87142857 0.87857143
|
|
0.86904762 0.87173397 0.88361045 0.74821853]
|
|
|
|
mean value: 0.8586896278701505
|
|
|
|
key: test_fscore
|
|
value: [0.95238095 0.81355932 0.87096774 0.92307692 0.93548387 0.90322581
|
|
0.84745763 0.91803279 0.87096774 0.71698113]
|
|
|
|
mean value: 0.8752133904861458
|
|
|
|
key: train_fscore
|
|
value: [0.90721649 0.8974359 0.89833641 0.89010989 0.90145985 0.90744102
|
|
0.89981785 0.90145985 0.91139241 0.77916667]
|
|
|
|
mean value: 0.8893836343169823
|
|
|
|
key: test_precision
|
|
value: [0.9375 0.85714286 0.87096774 0.88235294 0.93548387 0.90322581
|
|
0.89285714 0.93333333 0.87096774 0.82608696]
|
|
|
|
mean value: 0.8909918392321865
|
|
|
|
key: train_precision
|
|
value: [0.86842105 0.9141791 0.92395437 0.90671642 0.91481481 0.91575092
|
|
0.91143911 0.91481481 0.91636364 0.93034826]
|
|
|
|
mean value: 0.9116802502485006
|
|
|
|
key: test_recall
|
|
value: [0.96774194 0.77419355 0.87096774 0.96774194 0.93548387 0.90322581
|
|
0.80645161 0.90322581 0.87096774 0.63333333]
|
|
|
|
mean value: 0.8633333333333333
|
|
|
|
key: train_recall
|
|
value: [0.94964029 0.88129496 0.87410072 0.87410072 0.88848921 0.89928058
|
|
0.88848921 0.88848921 0.90647482 0.6702509 ]
|
|
|
|
mean value: 0.8720610608287563
|
|
|
|
key: test_roc_auc
|
|
value: [0.92137097 0.76209677 0.81048387 0.85887097 0.90524194 0.8578629
|
|
0.80947581 0.88494624 0.80215054 0.69166667]
|
|
|
|
mean value: 0.8304166666666667
|
|
|
|
key: train_roc_auc
|
|
value: [0.83397507 0.85966157 0.86662782 0.84902219 0.86325869 0.86865437
|
|
0.85973756 0.86382502 0.87281783 0.78582967]
|
|
|
|
mean value: 0.8523409805276452
|
|
|
|
key: test_jcc
|
|
value: [0.90909091 0.68571429 0.77142857 0.85714286 0.87878788 0.82352941
|
|
0.73529412 0.84848485 0.77142857 0.55882353]
|
|
|
|
mean value: 0.7839724980901451
|
|
|
|
key: train_jcc
|
|
value: [0.83018868 0.81395349 0.81543624 0.8019802 0.82059801 0.83056478
|
|
0.81788079 0.82059801 0.8372093 0.63822526]
|
|
|
|
mean value: 0.8026634757590374
|
|
|
|
MCC on Blind test: 0.22
|
|
|
|
Accuracy on Blind test: 0.56
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00818086 0.0079577 0.00792146 0.00760269 0.00763583 0.00755167
|
|
0.00756431 0.00766754 0.00792885 0.00768661]
|
|
|
|
mean value: 0.00776975154876709
|
|
|
|
key: score_time
|
|
value: [0.00826931 0.00858402 0.0080297 0.00802231 0.00803876 0.00794721
|
|
0.00808096 0.00816536 0.00821137 0.0079844 ]
|
|
|
|
mean value: 0.008133339881896972
|
|
|
|
key: test_mcc
|
|
value: [0.76746995 0.61207663 0.31752781 0.71206211 0.76032282 0.6139232
|
|
0.66402366 0.59332241 0.38733878 0.70954337]
|
|
|
|
mean value: 0.6137610732708011
|
|
|
|
key: train_mcc
|
|
value: [0.62791789 0.64521328 0.66619129 0.63945586 0.63982246 0.63982246
|
|
0.6506538 0.65794031 0.65846852 0.63442864]
|
|
|
|
mean value: 0.6459914516114823
|
|
|
|
key: test_accuracy
|
|
value: [0.89361702 0.82978723 0.70212766 0.87234043 0.89361702 0.82978723
|
|
0.85106383 0.82608696 0.73913043 0.86956522]
|
|
|
|
mean value: 0.8307123034227567
|
|
|
|
key: train_accuracy
|
|
value: [0.83809524 0.8452381 0.85238095 0.84285714 0.84285714 0.84285714
|
|
0.84761905 0.85035629 0.85035629 0.84085511]
|
|
|
|
mean value: 0.8453472457866757
|
|
|
|
key: test_fscore
|
|
value: [0.91803279 0.875 0.78125 0.90909091 0.92307692 0.88235294
|
|
0.88888889 0.875 0.8125 0.90625 ]
|
|
|
|
mean value: 0.8771442449118437
|
|
|
|
key: train_fscore
|
|
value: [0.88316151 0.88773748 0.89007092 0.8862069 0.88581315 0.88581315
|
|
0.88965517 0.89156627 0.89081456 0.88468158]
|
|
|
|
mean value: 0.8875520685563664
|
|
|
|
key: test_precision
|
|
value: [0.93333333 0.84848485 0.75757576 0.85714286 0.88235294 0.81081081
|
|
0.875 0.84848485 0.78787879 0.85294118]
|
|
|
|
mean value: 0.8454005361358302
|
|
|
|
key: train_precision
|
|
value: [0.84539474 0.8538206 0.87762238 0.85099338 0.85333333 0.85333333
|
|
0.85430464 0.85478548 0.85953177 0.85099338]
|
|
|
|
mean value: 0.8554113020989377
|
|
|
|
key: test_recall
|
|
value: [0.90322581 0.90322581 0.80645161 0.96774194 0.96774194 0.96774194
|
|
0.90322581 0.90322581 0.83870968 0.96666667]
|
|
|
|
mean value: 0.9127956989247312
|
|
|
|
key: train_recall
|
|
value: [0.92446043 0.92446043 0.9028777 0.92446043 0.92086331 0.92086331
|
|
0.92805755 0.93165468 0.92446043 0.92114695]
|
|
|
|
mean value: 0.9223305226786314
|
|
|
|
key: test_roc_auc
|
|
value: [0.8891129 0.7953629 0.65322581 0.82762097 0.85887097 0.76512097
|
|
0.8266129 0.78494624 0.68602151 0.82708333]
|
|
|
|
mean value: 0.7913978494623656
|
|
|
|
key: train_roc_auc
|
|
value: [0.79673726 0.80730064 0.82819941 0.80377951 0.80550208 0.80550208
|
|
0.8090992 0.81198118 0.81537707 0.80212277]
|
|
|
|
mean value: 0.8085601200017799
|
|
|
|
key: test_jcc
|
|
value: [0.84848485 0.77777778 0.64102564 0.83333333 0.85714286 0.78947368
|
|
0.8 0.77777778 0.68421053 0.82857143]
|
|
|
|
mean value: 0.783779787463998
|
|
|
|
key: train_jcc
|
|
value: [0.79076923 0.79813665 0.80191693 0.79566563 0.79503106 0.79503106
|
|
0.80124224 0.80434783 0.803125 0.79320988]
|
|
|
|
mean value: 0.7978475494770487
|
|
|
|
MCC on Blind test: 0.24
|
|
|
|
Accuracy on Blind test: 0.47
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.00735164 0.00838447 0.00824022 0.00810122 0.00807238 0.0080514
|
|
0.00810838 0.00790358 0.00764203 0.00775075]
|
|
|
|
mean value: 0.00796060562133789
|
|
|
|
key: score_time
|
|
value: [0.09431863 0.01160264 0.01148391 0.01503587 0.01451468 0.0130167
|
|
0.01415229 0.01103735 0.01092792 0.01100278]
|
|
|
|
mean value: 0.02070927619934082
|
|
|
|
key: test_mcc
|
|
value: [0.76746995 0.76034808 0.4031367 0.65994312 0.71025956 0.61207663
|
|
0.56769924 0.58251534 0.49033059 0.48102958]
|
|
|
|
mean value: 0.6034808785180602
|
|
|
|
key: train_mcc
|
|
value: [0.69858559 0.69632669 0.75172804 0.69676775 0.73520628 0.71297421
|
|
0.70164234 0.70915156 0.73690278 0.72050578]
|
|
|
|
mean value: 0.7159791011797761
|
|
|
|
key: test_accuracy
|
|
value: [0.89361702 0.89361702 0.74468085 0.85106383 0.87234043 0.82978723
|
|
0.80851064 0.80434783 0.7826087 0.76086957]
|
|
|
|
mean value: 0.8241443108233117
|
|
|
|
key: train_accuracy
|
|
value: [0.86666667 0.86666667 0.89047619 0.86666667 0.88333333 0.87380952
|
|
0.86904762 0.87173397 0.88361045 0.87648456]
|
|
|
|
mean value: 0.8748495645288994
|
|
|
|
key: test_fscore
|
|
value: [0.91803279 0.92063492 0.81818182 0.89230769 0.90625 0.875
|
|
0.85714286 0.84745763 0.84375 0.81355932]
|
|
|
|
mean value: 0.8692317024305076
|
|
|
|
key: train_fscore
|
|
value: [0.90070922 0.9020979 0.91901408 0.90175439 0.91388401 0.90718039
|
|
0.90401396 0.90526316 0.91358025 0.90812721]
|
|
|
|
mean value: 0.9075624559641324
|
|
|
|
key: test_precision
|
|
value: [0.93333333 0.90625 0.77142857 0.85294118 0.87878788 0.84848485
|
|
0.84375 0.89285714 0.81818182 0.82758621]
|
|
|
|
mean value: 0.8573600976440733
|
|
|
|
key: train_precision
|
|
value: [0.88811189 0.87755102 0.9 0.88013699 0.89347079 0.88395904
|
|
0.8779661 0.88356164 0.89619377 0.89547038]
|
|
|
|
mean value: 0.887642163000012
|
|
|
|
key: test_recall
|
|
value: [0.90322581 0.93548387 0.87096774 0.93548387 0.93548387 0.90322581
|
|
0.87096774 0.80645161 0.87096774 0.8 ]
|
|
|
|
mean value: 0.8832258064516129
|
|
|
|
key: train_recall
|
|
value: [0.91366906 0.92805755 0.93884892 0.92446043 0.9352518 0.93165468
|
|
0.93165468 0.92805755 0.93165468 0.92114695]
|
|
|
|
mean value: 0.9284456305923003
|
|
|
|
key: test_roc_auc
|
|
value: [0.8891129 0.87399194 0.68548387 0.81149194 0.84274194 0.7953629
|
|
0.77923387 0.80322581 0.73548387 0.74375 ]
|
|
|
|
mean value: 0.7959879032258065
|
|
|
|
key: train_roc_auc
|
|
value: [0.84415848 0.83726821 0.86731178 0.83899078 0.85847097 0.84610903
|
|
0.83906677 0.84514766 0.86093223 0.85493967]
|
|
|
|
mean value: 0.8492395591157109
|
|
|
|
key: test_jcc
|
|
value: [0.84848485 0.85294118 0.69230769 0.80555556 0.82857143 0.77777778
|
|
0.75 0.73529412 0.72972973 0.68571429]
|
|
|
|
mean value: 0.7706376612258965
|
|
|
|
key: train_jcc
|
|
value: [0.81935484 0.82165605 0.85016287 0.82108626 0.84142395 0.83012821
|
|
0.82484076 0.82692308 0.84090909 0.83171521]
|
|
|
|
mean value: 0.8308200313963069
|
|
|
|
MCC on Blind test: 0.2
|
|
|
|
Accuracy on Blind test: 0.45
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01428485 0.0115571 0.0116024 0.01192141 0.0118351 0.01425028
|
|
0.01206684 0.01400876 0.01199913 0.01239347]
|
|
|
|
mean value: 0.012591934204101563
|
|
|
|
key: score_time
|
|
value: [0.00854349 0.00847244 0.00854373 0.00855494 0.00946355 0.00848031
|
|
0.00859547 0.00851321 0.0085175 0.00893998]
|
|
|
|
mean value: 0.00866246223449707
|
|
|
|
key: test_mcc
|
|
value: [0.8566725 0.71206211 0.50611184 0.76032282 0.66337469 0.6139232
|
|
0.65994312 0.64852426 0.38733878 0.72168784]
|
|
|
|
mean value: 0.6529961162737778
|
|
|
|
key: train_mcc
|
|
value: [0.69022744 0.66164278 0.68466145 0.65612626 0.66739922 0.67302425
|
|
0.67350891 0.66972224 0.68052658 0.67334868]
|
|
|
|
mean value: 0.6730187805126882
|
|
|
|
key: test_accuracy
|
|
value: [0.93617021 0.87234043 0.78723404 0.89361702 0.85106383 0.82978723
|
|
0.85106383 0.84782609 0.73913043 0.86956522]
|
|
|
|
mean value: 0.8477798334875115
|
|
|
|
key: train_accuracy
|
|
value: [0.86428571 0.85238095 0.86190476 0.85 0.8547619 0.85714286
|
|
0.85714286 0.85510689 0.85985748 0.85748219]
|
|
|
|
mean value: 0.8570065603438525
|
|
|
|
key: test_fscore
|
|
value: [0.95238095 0.90909091 0.84848485 0.92307692 0.89552239 0.88235294
|
|
0.89230769 0.88888889 0.8125 0.90909091]
|
|
|
|
mean value: 0.8913696452557295
|
|
|
|
key: train_fscore
|
|
value: [0.90289608 0.89419795 0.90136054 0.89303905 0.89608177 0.89761092
|
|
0.89830508 0.89678511 0.89948893 0.89795918]
|
|
|
|
mean value: 0.8977724625814629
|
|
|
|
key: test_precision
|
|
value: [0.9375 0.85714286 0.8 0.88235294 0.83333333 0.81081081
|
|
0.85294118 0.875 0.78787879 0.83333333]
|
|
|
|
mean value: 0.8470293240146182
|
|
|
|
key: train_precision
|
|
value: [0.85760518 0.85064935 0.85483871 0.84565916 0.85113269 0.8538961
|
|
0.84935897 0.84664537 0.85436893 0.85436893]
|
|
|
|
mean value: 0.8518523398136467
|
|
|
|
key: test_recall
|
|
value: [0.96774194 0.96774194 0.90322581 0.96774194 0.96774194 0.96774194
|
|
0.93548387 0.90322581 0.83870968 1. ]
|
|
|
|
mean value: 0.9419354838709677
|
|
|
|
key: train_recall
|
|
value: [0.95323741 0.94244604 0.95323741 0.94604317 0.94604317 0.94604317
|
|
0.95323741 0.95323741 0.94964029 0.94623656]
|
|
|
|
mean value: 0.9489402026765684
|
|
|
|
key: test_roc_auc
|
|
value: [0.92137097 0.82762097 0.7328629 0.85887097 0.79637097 0.76512097
|
|
0.81149194 0.81827957 0.68602151 0.8125 ]
|
|
|
|
mean value: 0.8030510752688172
|
|
|
|
key: train_roc_auc
|
|
value: [0.82168913 0.80925119 0.818168 0.8040075 0.81104975 0.81457088
|
|
0.81112575 0.80878654 0.81747749 0.81466758]
|
|
|
|
mean value: 0.813079379384182
|
|
|
|
key: test_jcc
|
|
value: [0.90909091 0.83333333 0.73684211 0.85714286 0.81081081 0.78947368
|
|
0.80555556 0.8 0.68421053 0.83333333]
|
|
|
|
mean value: 0.8059793115056273
|
|
|
|
key: train_jcc
|
|
value: [0.82298137 0.80864198 0.82043344 0.80674847 0.8117284 0.81424149
|
|
0.81538462 0.81288344 0.81733746 0.81481481]
|
|
|
|
mean value: 0.8145195452770848
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.45
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.31461072 1.40823627 1.28151274 1.4231658 1.334095 1.30517697
|
|
1.41587329 1.28833318 1.49593544 1.34908724]
|
|
|
|
mean value: 1.3616026639938354
|
|
|
|
key: score_time
|
|
value: [0.01176286 0.01351857 0.0135088 0.01388788 0.01229548 0.01362157
|
|
0.01102948 0.01351404 0.01373792 0.01853848]
|
|
|
|
mean value: 0.013541507720947265
|
|
|
|
key: test_mcc
|
|
value: [1. 0.8084425 0.90662544 1. 0.95299692 0.76032282
|
|
0.90524194 0.90107527 0.74930844 0.80833333]
|
|
|
|
mean value: 0.8792346661083966
|
|
|
|
key: train_mcc
|
|
value: [0.9680267 0.95736701 0.94674008 0.9680267 0.96269263 0.9680267
|
|
0.9628398 0.96296053 0.95222181 0.99470992]
|
|
|
|
mean value: 0.9643611879690016
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.91489362 0.95744681 1. 0.9787234 0.89361702
|
|
0.95744681 0.95652174 0.89130435 0.91304348]
|
|
|
|
mean value: 0.9462997224791859
|
|
|
|
key: train_accuracy
|
|
value: [0.98571429 0.98095238 0.97619048 0.98571429 0.98333333 0.98571429
|
|
0.98333333 0.98337292 0.97862233 0.9976247 ]
|
|
|
|
mean value: 0.9840572333446442
|
|
|
|
key: test_fscore
|
|
value: [1. 0.9375 0.96875 1. 0.98412698 0.92307692
|
|
0.96774194 0.96774194 0.92063492 0.93333333]
|
|
|
|
mean value: 0.9602906032139903
|
|
|
|
key: train_fscore
|
|
value: [0.98924731 0.98571429 0.98220641 0.98924731 0.98747764 0.98924731
|
|
0.98738739 0.98752228 0.98389982 0.99820467]
|
|
|
|
mean value: 0.9880154423532531
|
|
|
|
key: test_precision
|
|
value: [1. 0.90909091 0.93939394 1. 0.96875 0.88235294
|
|
0.96774194 0.96774194 0.90625 0.93333333]
|
|
|
|
mean value: 0.9474654993962395
|
|
|
|
key: train_precision
|
|
value: [0.98571429 0.9787234 0.97183099 0.98571429 0.98220641 0.98571429
|
|
0.98916968 0.97879859 0.97864769 1. ]
|
|
|
|
mean value: 0.9836519601503051
|
|
|
|
key: test_recall
|
|
value: [1. 0.96774194 1. 1. 1. 0.96774194
|
|
0.96774194 0.96774194 0.93548387 0.93333333]
|
|
|
|
mean value: 0.9739784946236559
|
|
|
|
key: train_recall
|
|
value: [0.99280576 0.99280576 0.99280576 0.99280576 0.99280576 0.99280576
|
|
0.98561151 0.99640288 0.98920863 0.99641577]
|
|
|
|
mean value: 0.9924473324566154
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.89012097 0.9375 1. 0.96875 0.85887097
|
|
0.95262097 0.95053763 0.86774194 0.90416667]
|
|
|
|
mean value: 0.9330309139784947
|
|
|
|
key: train_roc_auc
|
|
value: [0.98231837 0.97527612 0.96823386 0.98231837 0.97879724 0.98231837
|
|
0.98224238 0.97722242 0.9736253 0.99820789]
|
|
|
|
mean value: 0.980056031046588
|
|
|
|
key: test_jcc
|
|
value: [1. 0.88235294 0.93939394 1. 0.96875 0.85714286
|
|
0.9375 0.9375 0.85294118 0.875 ]
|
|
|
|
mean value: 0.9250580914183856
|
|
|
|
key: train_jcc
|
|
value: [0.9787234 0.97183099 0.96503497 0.9787234 0.97526502 0.9787234
|
|
0.97508897 0.97535211 0.96830986 0.99641577]
|
|
|
|
mean value: 0.9763467891796095
|
|
|
|
MCC on Blind test: 0.13
|
|
|
|
Accuracy on Blind test: 0.31
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01342225 0.01069498 0.00975204 0.01033854 0.00994968 0.01040697
|
|
0.01035452 0.01077914 0.01066208 0.01090336]
|
|
|
|
mean value: 0.010726356506347656
|
|
|
|
key: score_time
|
|
value: [0.01061678 0.00818062 0.00800824 0.00842381 0.00850368 0.00858855
|
|
0.00848293 0.00844717 0.00845146 0.00849843]
|
|
|
|
mean value: 0.008620166778564453
|
|
|
|
key: test_mcc
|
|
value: [0.95299692 0.8566725 0.91188882 1. 0.86091836 0.8566725
|
|
0.87213027 0.95250095 0.90107527 0.80833333]
|
|
|
|
mean value: 0.8973188916801316
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9787234 0.93617021 0.95744681 1. 0.93617021 0.93617021
|
|
0.93617021 0.97826087 0.95652174 0.91304348]
|
|
|
|
mean value: 0.9528677150786309
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98412698 0.95238095 0.96666667 1. 0.95081967 0.95238095
|
|
0.94915254 0.98360656 0.96774194 0.93333333]
|
|
|
|
mean value: 0.9640209596253838
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96875 0.9375 1. 1. 0.96666667 0.9375
|
|
1. 1. 0.96774194 0.93333333]
|
|
|
|
mean value: 0.9711491935483871
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.96774194 0.93548387 1. 0.93548387 0.96774194
|
|
0.90322581 0.96774194 0.96774194 0.93333333]
|
|
|
|
mean value: 0.9578494623655914
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96875 0.92137097 0.96774194 1. 0.93649194 0.92137097
|
|
0.9516129 0.98387097 0.95053763 0.90416667]
|
|
|
|
mean value: 0.9505913978494623
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96875 0.90909091 0.93548387 1. 0.90625 0.90909091
|
|
0.90322581 0.96774194 0.9375 0.875 ]
|
|
|
|
mean value: 0.9312133431085043
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.2
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.10349464 0.09808111 0.10309243 0.10495615 0.10384583 0.10514021
|
|
0.10287976 0.10301304 0.10425162 0.10234761]
|
|
|
|
mean value: 0.10311024188995362
|
|
|
|
key: score_time
|
|
value: [0.01685739 0.01713133 0.01867747 0.01792812 0.01854682 0.01873803
|
|
0.01870346 0.01733375 0.01832008 0.01786637]
|
|
|
|
mean value: 0.018010282516479494
|
|
|
|
key: test_mcc
|
|
value: [0.90662544 0.8084425 0.81503725 0.90662544 0.86070252 0.76032282
|
|
0.81048387 0.85009261 0.8059304 0.90571105]
|
|
|
|
mean value: 0.8429973908395795
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.95744681 0.91489362 0.91489362 0.95744681 0.93617021 0.89361702
|
|
0.91489362 0.93478261 0.91304348 0.95652174]
|
|
|
|
mean value: 0.9293709528214616
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.96875 0.9375 0.93939394 0.96875 0.95384615 0.92307692
|
|
0.93548387 0.95238095 0.93939394 0.96774194]
|
|
|
|
mean value: 0.9486317714543521
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.93939394 0.90909091 0.88571429 0.93939394 0.91176471 0.88235294
|
|
0.93548387 0.9375 0.88571429 0.9375 ]
|
|
|
|
mean value: 0.9163908877333925
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.96774194 1. 1. 1. 0.96774194
|
|
0.93548387 0.96774194 1. 1. ]
|
|
|
|
mean value: 0.9838709677419355
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9375 0.89012097 0.875 0.9375 0.90625 0.85887097
|
|
0.90524194 0.9172043 0.86666667 0.9375 ]
|
|
|
|
mean value: 0.9031854838709678
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.93939394 0.88235294 0.88571429 0.93939394 0.91176471 0.85714286
|
|
0.87878788 0.90909091 0.88571429 0.9375 ]
|
|
|
|
mean value: 0.9026855742296919
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.2
|
|
|
|
Accuracy on Blind test: 0.36
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00836992 0.00826001 0.00835204 0.00824237 0.00821042 0.00798821
|
|
0.00828552 0.00832677 0.00851941 0.00838804]
|
|
|
|
mean value: 0.008294272422790527
|
|
|
|
key: score_time
|
|
value: [0.00871825 0.00869298 0.00867295 0.0086019 0.00861168 0.00866127
|
|
0.0086937 0.00873017 0.0088346 0.0087533 ]
|
|
|
|
mean value: 0.008697080612182616
|
|
|
|
key: test_mcc
|
|
value: [0.86091836 0.71206211 0.65309894 0.81952077 0.8084425 0.65994312
|
|
0.50614703 0.60602162 0.44695591 0.72379255]
|
|
|
|
mean value: 0.6796902925193711
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.93617021 0.87234043 0.82978723 0.91489362 0.91489362 0.85106383
|
|
0.76595745 0.80434783 0.76086957 0.86956522]
|
|
|
|
mean value: 0.8519888991674376
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.95081967 0.90909091 0.86206897 0.93333333 0.9375 0.89230769
|
|
0.81355932 0.84210526 0.82539683 0.89655172]
|
|
|
|
mean value: 0.8862733707106872
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96666667 0.85714286 0.92592593 0.96551724 0.90909091 0.85294118
|
|
0.85714286 0.92307692 0.8125 0.92857143]
|
|
|
|
mean value: 0.8998575985467466
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.93548387 0.96774194 0.80645161 0.90322581 0.96774194 0.93548387
|
|
0.77419355 0.77419355 0.83870968 0.86666667]
|
|
|
|
mean value: 0.8769892473118279
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.93649194 0.82762097 0.84072581 0.9203629 0.89012097 0.81149194
|
|
0.76209677 0.82043011 0.71935484 0.87083333]
|
|
|
|
mean value: 0.8399529569892473
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.90625 0.83333333 0.75757576 0.875 0.88235294 0.80555556
|
|
0.68571429 0.72727273 0.7027027 0.8125 ]
|
|
|
|
mean value: 0.7988257303330832
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.09
|
|
|
|
Accuracy on Blind test: 0.38
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.29370928 1.2518785 1.24979663 1.24180865 1.26994014 1.25986075
|
|
1.2572484 1.2555747 1.23349094 1.23494911]
|
|
|
|
mean value: 1.2548257112503052
|
|
|
|
key: score_time
|
|
value: [0.09408879 0.09164119 0.08997083 0.09628367 0.09728193 0.1462996
|
|
0.09323502 0.08956718 0.08982635 0.08968997]
|
|
|
|
mean value: 0.09778845310211182
|
|
|
|
key: test_mcc
|
|
value: [1. 0.8566725 1. 1. 0.90662544 0.81503725
|
|
1. 0.95250095 0.95087679 0.85513419]
|
|
|
|
mean value: 0.9336847119207848
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.93617021 1. 1. 0.95744681 0.91489362
|
|
1. 0.97826087 0.97826087 0.93478261]
|
|
|
|
mean value: 0.9699814986123959
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.95238095 1. 1. 0.96875 0.93939394
|
|
1. 0.98360656 0.98412698 0.95081967]
|
|
|
|
mean value: 0.9779078105410073
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.9375 1. 1. 0.93939394 0.88571429
|
|
1. 1. 0.96875 0.93548387]
|
|
|
|
mean value: 0.9666842096075967
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.96774194 1. 1. 1. 1.
|
|
1. 0.96774194 1. 0.96666667]
|
|
|
|
mean value: 0.9902150537634409
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.92137097 1. 1. 0.9375 0.875
|
|
1. 0.98387097 0.96666667 0.92083333]
|
|
|
|
mean value: 0.9605241935483871
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.90909091 1. 1. 0.93939394 0.88571429
|
|
1. 0.96774194 0.96875 0.90625 ]
|
|
|
|
mean value: 0.9576941069683005
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.07
|
|
|
|
Accuracy on Blind test: 0.18
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.75414562 0.86297989 0.94754958 0.91948628 0.92758203 1.00107074
|
|
0.93352938 0.92137861 0.88540673 0.90511346]
|
|
|
|
mean value: 1.0058242321014403
|
|
|
|
key: score_time
|
|
value: [0.23915219 0.2850039 0.25384307 0.23436403 0.24242306 0.2717557
|
|
0.25083756 0.22900653 0.23912811 0.27642059]
|
|
|
|
mean value: 0.2521934747695923
|
|
|
|
key: test_mcc
|
|
value: [1. 0.8084425 0.90662544 1. 0.90662544 0.81503725
|
|
1. 0.90107527 0.95087679 0.80651412]
|
|
|
|
mean value: 0.9095196821072326
|
|
|
|
key: train_mcc
|
|
value: [0.94694186 0.96278526 0.94694186 0.94694186 0.94694186 0.96278526
|
|
0.95221511 0.95793986 0.95769694 0.96282875]
|
|
|
|
mean value: 0.9544018630875426
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.91489362 0.95744681 1. 0.95744681 0.91489362
|
|
1. 0.95652174 0.97826087 0.91304348]
|
|
|
|
mean value: 0.9592506938020352
|
|
|
|
key: train_accuracy
|
|
value: [0.97619048 0.98333333 0.97619048 0.97619048 0.97619048 0.98333333
|
|
0.97857143 0.98099762 0.98099762 0.98337292]
|
|
|
|
mean value: 0.9795368171021377
|
|
|
|
key: test_fscore
|
|
value: [1. 0.9375 0.96875 1. 0.96875 0.93939394
|
|
1. 0.96774194 0.98412698 0.93548387]
|
|
|
|
mean value: 0.9701746729972536
|
|
|
|
key: train_fscore
|
|
value: [0.9822695 0.98752228 0.9822695 0.9822695 0.9822695 0.98752228
|
|
0.98401421 0.9858156 0.98576512 0.98756661]
|
|
|
|
mean value: 0.9847284121907804
|
|
|
|
key: test_precision
|
|
value: [1. 0.90909091 0.93939394 1. 0.93939394 0.88571429
|
|
1. 0.96774194 0.96875 0.90625 ]
|
|
|
|
mean value: 0.9516335009076945
|
|
|
|
key: train_precision
|
|
value: [0.96853147 0.97879859 0.96853147 0.96853147 0.96853147 0.97879859
|
|
0.97192982 0.97202797 0.97535211 0.97887324]
|
|
|
|
mean value: 0.9729906195972802
|
|
|
|
key: test_recall
|
|
value: [1. 0.96774194 1. 1. 1. 1.
|
|
1. 0.96774194 1. 0.96666667]
|
|
|
|
mean value: 0.9902150537634409
|
|
|
|
key: train_recall
|
|
value: [0.99640288 0.99640288 0.99640288 0.99640288 0.99640288 0.99640288
|
|
0.99640288 1. 0.99640288 0.99641577]
|
|
|
|
mean value: 0.9967638792192053
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.89012097 0.9375 1. 0.9375 0.875
|
|
1. 0.95053763 0.96666667 0.88958333]
|
|
|
|
mean value: 0.9446908602150538
|
|
|
|
key: train_roc_auc
|
|
value: [0.9665113 0.97707468 0.9665113 0.9665113 0.9665113 0.97707468
|
|
0.97003242 0.97202797 0.97372591 0.97708112]
|
|
|
|
mean value: 0.9713061984493545
|
|
|
|
key: test_jcc
|
|
value: [1. 0.88235294 0.93939394 1. 0.93939394 0.88571429
|
|
1. 0.9375 0.96875 0.87878788]
|
|
|
|
mean value: 0.9431892984466514
|
|
|
|
key: train_jcc
|
|
value: [0.96515679 0.97535211 0.96515679 0.96515679 0.96515679 0.97535211
|
|
0.96853147 0.97202797 0.97192982 0.9754386 ]
|
|
|
|
mean value: 0.9699259264664533
|
|
|
|
MCC on Blind test: 0.08
|
|
|
|
Accuracy on Blind test: 0.19
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01817513 0.0076189 0.00755739 0.00759125 0.0075407 0.00757432
|
|
0.00772119 0.0076313 0.0076437 0.00751853]
|
|
|
|
mean value: 0.008657240867614746
|
|
|
|
key: score_time
|
|
value: [0.0108695 0.00802517 0.00807548 0.00796461 0.00793648 0.00793743
|
|
0.00874805 0.00799894 0.00802493 0.00804806]
|
|
|
|
mean value: 0.008362865447998047
|
|
|
|
key: test_mcc
|
|
value: [0.76746995 0.61207663 0.31752781 0.71206211 0.76032282 0.6139232
|
|
0.66402366 0.59332241 0.38733878 0.70954337]
|
|
|
|
mean value: 0.6137610732708011
|
|
|
|
key: train_mcc
|
|
value: [0.62791789 0.64521328 0.66619129 0.63945586 0.63982246 0.63982246
|
|
0.6506538 0.65794031 0.65846852 0.63442864]
|
|
|
|
mean value: 0.6459914516114823
|
|
|
|
key: test_accuracy
|
|
value: [0.89361702 0.82978723 0.70212766 0.87234043 0.89361702 0.82978723
|
|
0.85106383 0.82608696 0.73913043 0.86956522]
|
|
|
|
mean value: 0.8307123034227567
|
|
|
|
key: train_accuracy
|
|
value: [0.83809524 0.8452381 0.85238095 0.84285714 0.84285714 0.84285714
|
|
0.84761905 0.85035629 0.85035629 0.84085511]
|
|
|
|
mean value: 0.8453472457866757
|
|
|
|
key: test_fscore
|
|
value: [0.91803279 0.875 0.78125 0.90909091 0.92307692 0.88235294
|
|
0.88888889 0.875 0.8125 0.90625 ]
|
|
|
|
mean value: 0.8771442449118437
|
|
|
|
key: train_fscore
|
|
value: [0.88316151 0.88773748 0.89007092 0.8862069 0.88581315 0.88581315
|
|
0.88965517 0.89156627 0.89081456 0.88468158]
|
|
|
|
mean value: 0.8875520685563664
|
|
|
|
key: test_precision
|
|
value: [0.93333333 0.84848485 0.75757576 0.85714286 0.88235294 0.81081081
|
|
0.875 0.84848485 0.78787879 0.85294118]
|
|
|
|
mean value: 0.8454005361358302
|
|
|
|
key: train_precision
|
|
value: [0.84539474 0.8538206 0.87762238 0.85099338 0.85333333 0.85333333
|
|
0.85430464 0.85478548 0.85953177 0.85099338]
|
|
|
|
mean value: 0.8554113020989377
|
|
|
|
key: test_recall
|
|
value: [0.90322581 0.90322581 0.80645161 0.96774194 0.96774194 0.96774194
|
|
0.90322581 0.90322581 0.83870968 0.96666667]
|
|
|
|
mean value: 0.9127956989247312
|
|
|
|
key: train_recall
|
|
value: [0.92446043 0.92446043 0.9028777 0.92446043 0.92086331 0.92086331
|
|
0.92805755 0.93165468 0.92446043 0.92114695]
|
|
|
|
mean value: 0.9223305226786314
|
|
|
|
key: test_roc_auc
|
|
value: [0.8891129 0.7953629 0.65322581 0.82762097 0.85887097 0.76512097
|
|
0.8266129 0.78494624 0.68602151 0.82708333]
|
|
|
|
mean value: 0.7913978494623656
|
|
|
|
key: train_roc_auc
|
|
value: [0.79673726 0.80730064 0.82819941 0.80377951 0.80550208 0.80550208
|
|
0.8090992 0.81198118 0.81537707 0.80212277]
|
|
|
|
mean value: 0.8085601200017799
|
|
|
|
key: test_jcc
|
|
value: [0.84848485 0.77777778 0.64102564 0.83333333 0.85714286 0.78947368
|
|
0.8 0.77777778 0.68421053 0.82857143]
|
|
|
|
mean value: 0.783779787463998
|
|
|
|
key: train_jcc
|
|
value: [0.79076923 0.79813665 0.80191693 0.79566563 0.79503106 0.79503106
|
|
0.80124224 0.80434783 0.803125 0.79320988]
|
|
|
|
mean value: 0.7978475494770487
|
|
|
|
MCC on Blind test: 0.24
|
|
|
|
Accuracy on Blind test: 0.47
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.08977652 0.0446949 0.05097675 0.05265212 0.04966545 0.04864025
|
|
0.2227385 0.04262686 0.0463593 0.04706073]
|
|
|
|
mean value: 0.06951913833618165
|
|
|
|
key: score_time
|
|
value: [0.00969934 0.00960755 0.00962806 0.0097065 0.00962687 0.01001763
|
|
0.01041269 0.01037621 0.0100019 0.01042318]
|
|
|
|
mean value: 0.009949994087219239
|
|
|
|
key: test_mcc
|
|
value: [1. 0.8566725 1. 1. 0.90524194 0.86070252
|
|
1. 0.95250095 0.95087679 0.85513419]
|
|
|
|
mean value: 0.9381128880260178
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.93617021 1. 1. 0.95744681 0.93617021
|
|
1. 0.97826087 0.97826087 0.93478261]
|
|
|
|
mean value: 0.972109158186864
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.95238095 1. 1. 0.96774194 0.95384615
|
|
1. 0.98360656 0.98412698 0.95081967]
|
|
|
|
mean value: 0.9792522255346158
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.9375 1. 1. 0.96774194 0.91176471
|
|
1. 1. 0.96875 0.93548387]
|
|
|
|
mean value: 0.9721240512333966
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.96774194 1. 1. 0.96774194 1.
|
|
1. 0.96774194 1. 0.96666667]
|
|
|
|
mean value: 0.986989247311828
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.92137097 1. 1. 0.95262097 0.90625
|
|
1. 0.98387097 0.96666667 0.92083333]
|
|
|
|
mean value: 0.9651612903225807
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.90909091 1. 1. 0.9375 0.91176471
|
|
1. 0.96774194 0.96875 0.90625 ]
|
|
|
|
mean value: 0.9601097550457133
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.08
|
|
|
|
Accuracy on Blind test: 0.2
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.01633263 0.01602507 0.03087282 0.03793883 0.03829098 0.03755164
|
|
0.03849244 0.04578662 0.03847647 0.0386765 ]
|
|
|
|
mean value: 0.033844399452209475
|
|
|
|
key: score_time
|
|
value: [0.01047325 0.01068068 0.02036643 0.01072168 0.01989603 0.02082086
|
|
0.02522516 0.01081634 0.0206635 0.02184916]
|
|
|
|
mean value: 0.017151308059692384
|
|
|
|
key: test_mcc
|
|
value: [0.95436677 0.8566725 1. 1. 0.90662544 0.81503725
|
|
1. 0.9085301 0.90107527 0.75776742]
|
|
|
|
mean value: 0.9100074758399945
|
|
|
|
key: train_mcc
|
|
value: [0.94131391 0.95204958 0.93598399 0.94131391 0.94674008 0.95734993
|
|
0.93066133 0.9469026 0.9469923 0.95754545]
|
|
|
|
mean value: 0.9456853089391832
|
|
|
|
key: test_accuracy
|
|
value: [0.9787234 0.93617021 1. 1. 0.95744681 0.91489362
|
|
1. 0.95652174 0.95652174 0.89130435]
|
|
|
|
mean value: 0.9591581868640148
|
|
|
|
key: train_accuracy
|
|
value: [0.97380952 0.97857143 0.97142857 0.97380952 0.97619048 0.98095238
|
|
0.96904762 0.97624703 0.97624703 0.98099762]
|
|
|
|
mean value: 0.9757301210270332
|
|
|
|
key: test_fscore
|
|
value: [0.98360656 0.95238095 1. 1. 0.96875 0.93939394
|
|
1. 0.96666667 0.96774194 0.91803279]
|
|
|
|
mean value: 0.9696572838187725
|
|
|
|
key: train_fscore
|
|
value: [0.98039216 0.98395722 0.97864769 0.98039216 0.98220641 0.98566308
|
|
0.97690941 0.98214286 0.98220641 0.9858156 ]
|
|
|
|
mean value: 0.9818332987468832
|
|
|
|
key: test_precision
|
|
value: [1. 0.9375 1. 1. 0.93939394 0.88571429
|
|
1. 1. 0.96774194 0.90322581]
|
|
|
|
mean value: 0.9633575967043709
|
|
|
|
key: train_precision
|
|
value: [0.97173145 0.97526502 0.96830986 0.97173145 0.97183099 0.98214286
|
|
0.96491228 0.9751773 0.97183099 0.9754386 ]
|
|
|
|
mean value: 0.972837078548064
|
|
|
|
key: test_recall
|
|
value: [0.96774194 0.96774194 1. 1. 1. 1.
|
|
1. 0.93548387 0.96774194 0.93333333]
|
|
|
|
mean value: 0.9772043010752688
|
|
|
|
key: train_recall
|
|
value: [0.98920863 0.99280576 0.98920863 0.98920863 0.99280576 0.98920863
|
|
0.98920863 0.98920863 0.99280576 0.99641577]
|
|
|
|
mean value: 0.991008483535752
|
|
|
|
key: test_roc_auc
|
|
value: [0.98387097 0.92137097 1. 1. 0.9375 0.875
|
|
1. 0.96774194 0.95053763 0.87291667]
|
|
|
|
mean value: 0.9508938172043011
|
|
|
|
key: train_roc_auc
|
|
value: [0.9664353 0.97175499 0.96291418 0.9664353 0.96823386 0.97699868
|
|
0.95939305 0.97012879 0.96843085 0.97356 ]
|
|
|
|
mean value: 0.9684285006076279
|
|
|
|
key: test_jcc
|
|
value: [0.96774194 0.90909091 1. 1. 0.93939394 0.88571429
|
|
1. 0.93548387 0.9375 0.84848485]
|
|
|
|
mean value: 0.9423409789135595
|
|
|
|
key: train_jcc
|
|
value: [0.96153846 0.96842105 0.95818815 0.96153846 0.96503497 0.97173145
|
|
0.95486111 0.96491228 0.96503497 0.97202797]
|
|
|
|
mean value: 0.9643288871692625
|
|
|
|
MCC on Blind test: 0.13
|
|
|
|
Accuracy on Blind test: 0.3
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.0186336 0.00766277 0.00761127 0.00744534 0.00747466 0.00750709
|
|
0.00777817 0.00826359 0.00808811 0.00809813]
|
|
|
|
mean value: 0.00885627269744873
|
|
|
|
key: score_time
|
|
value: [0.00869727 0.00830197 0.00809574 0.00786757 0.00828147 0.00783634
|
|
0.0086298 0.00836444 0.00861764 0.00868964]
|
|
|
|
mean value: 0.008338189125061036
|
|
|
|
key: test_mcc
|
|
value: [0.8566725 0.65994312 0.45918373 0.76032282 0.66337469 0.6139232
|
|
0.52620968 0.64852426 0.50537634 0.76764947]
|
|
|
|
mean value: 0.6461179816200634
|
|
|
|
key: train_mcc
|
|
value: [0.62766379 0.63945586 0.68424763 0.64471064 0.6504316 0.67304969
|
|
0.67293578 0.65214979 0.67466169 0.65101792]
|
|
|
|
mean value: 0.6570324374013666
|
|
|
|
key: test_accuracy
|
|
value: [0.93617021 0.85106383 0.76595745 0.89361702 0.85106383 0.82978723
|
|
0.78723404 0.84782609 0.7826087 0.89130435]
|
|
|
|
mean value: 0.8436632747456059
|
|
|
|
key: train_accuracy
|
|
value: [0.83809524 0.84285714 0.86190476 0.8452381 0.84761905 0.85714286
|
|
0.85714286 0.847981 0.85748219 0.847981 ]
|
|
|
|
mean value: 0.8503444180522566
|
|
|
|
key: test_fscore
|
|
value: [0.95238095 0.89230769 0.83076923 0.92307692 0.89552239 0.88235294
|
|
0.83870968 0.88888889 0.83870968 0.92307692]
|
|
|
|
mean value: 0.8865795294575493
|
|
|
|
key: train_fscore
|
|
value: [0.88356164 0.8862069 0.9 0.88850772 0.89003436 0.89655172
|
|
0.89726027 0.89041096 0.89726027 0.89003436]
|
|
|
|
mean value: 0.8919828218593321
|
|
|
|
key: test_precision
|
|
value: [0.9375 0.85294118 0.79411765 0.88235294 0.83333333 0.81081081
|
|
0.83870968 0.875 0.83870968 0.85714286]
|
|
|
|
mean value: 0.8520618120831593
|
|
|
|
key: train_precision
|
|
value: [0.84313725 0.85099338 0.86423841 0.84918033 0.85197368 0.86092715
|
|
0.85620915 0.8496732 0.85620915 0.85478548]
|
|
|
|
mean value: 0.8537327189194519
|
|
|
|
key: test_recall
|
|
value: [0.96774194 0.93548387 0.87096774 0.96774194 0.96774194 0.96774194
|
|
0.83870968 0.90322581 0.83870968 1. ]
|
|
|
|
mean value: 0.9258064516129032
|
|
|
|
key: train_recall
|
|
value: [0.92805755 0.92446043 0.93884892 0.93165468 0.93165468 0.9352518
|
|
0.94244604 0.9352518 0.94244604 0.92831541]
|
|
|
|
mean value: 0.9338387354632423
|
|
|
|
key: test_roc_auc
|
|
value: [0.92137097 0.81149194 0.71673387 0.85887097 0.79637097 0.76512097
|
|
0.76310484 0.81827957 0.75268817 0.84375 ]
|
|
|
|
mean value: 0.8047782258064516
|
|
|
|
key: train_roc_auc
|
|
value: [0.79501469 0.80377951 0.82505826 0.80385551 0.80737663 0.81973858
|
|
0.81629344 0.80678674 0.81737687 0.80922813]
|
|
|
|
mean value: 0.8104508362630897
|
|
|
|
key: test_jcc
|
|
value: [0.90909091 0.80555556 0.71052632 0.85714286 0.81081081 0.78947368
|
|
0.72222222 0.8 0.72222222 0.85714286]
|
|
|
|
mean value: 0.7984187434187434
|
|
|
|
key: train_jcc
|
|
value: [0.79141104 0.79566563 0.81818182 0.79938272 0.80185759 0.8125
|
|
0.8136646 0.80246914 0.8136646 0.80185759]
|
|
|
|
mean value: 0.80506547104786
|
|
|
|
MCC on Blind test: 0.21
|
|
|
|
Accuracy on Blind test: 0.45
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00986862 0.01313043 0.0124433 0.01303792 0.01351666 0.01400781
|
|
0.01293039 0.01448417 0.01343918 0.01248717]
|
|
|
|
mean value: 0.012934565544128418
|
|
|
|
key: score_time
|
|
value: [0.00865817 0.00993657 0.0099678 0.01048064 0.01074982 0.0105195
|
|
0.01045227 0.01052117 0.01057601 0.01054454]
|
|
|
|
mean value: 0.010240650177001953
|
|
|
|
key: test_mcc
|
|
value: [1. 0.8566725 1. 0.95436677 0.90662544 0.81503725
|
|
0.90524194 0.9085301 0.7725558 0.85513419]
|
|
|
|
mean value: 0.8974163989404769
|
|
|
|
key: train_mcc
|
|
value: [0.93593571 0.9627116 0.92552437 0.92120646 0.92557595 0.85221677
|
|
0.93598399 0.94195411 0.93206488 0.89469123]
|
|
|
|
mean value: 0.9227865066682192
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.93617021 1. 0.9787234 0.95744681 0.91489362
|
|
0.95744681 0.95652174 0.89130435 0.93478261]
|
|
|
|
mean value: 0.9527289546716003
|
|
|
|
key: train_accuracy
|
|
value: [0.97142857 0.98333333 0.96666667 0.96428571 0.96666667 0.93333333
|
|
0.97142857 0.97387173 0.96912114 0.95249406]
|
|
|
|
mean value: 0.965262979300984
|
|
|
|
key: test_fscore
|
|
value: [1. 0.95238095 1. 0.98360656 0.96875 0.93939394
|
|
0.96774194 0.96666667 0.91525424 0.95081967]
|
|
|
|
mean value: 0.9644613960721762
|
|
|
|
key: train_fscore
|
|
value: [0.97857143 0.98743268 0.97482014 0.97277677 0.97526502 0.95172414
|
|
0.97864769 0.98053097 0.97640653 0.96527778]
|
|
|
|
mean value: 0.9741453144247227
|
|
|
|
key: test_precision
|
|
value: [1. 0.9375 1. 1. 0.93939394 0.88571429
|
|
0.96774194 1. 0.96428571 0.93548387]
|
|
|
|
mean value: 0.9630119745845552
|
|
|
|
key: train_precision
|
|
value: [0.97163121 0.98566308 0.97482014 0.98168498 0.95833333 0.91390728
|
|
0.96830986 0.96515679 0.98534799 0.93602694]
|
|
|
|
mean value: 0.9640881606737391
|
|
|
|
key: test_recall
|
|
value: [1. 0.96774194 1. 0.96774194 1. 1.
|
|
0.96774194 0.93548387 0.87096774 0.96666667]
|
|
|
|
mean value: 0.9676344086021506
|
|
|
|
key: train_recall
|
|
value: [0.98561151 0.98920863 0.97482014 0.96402878 0.99280576 0.99280576
|
|
0.98920863 0.99640288 0.9676259 0.99641577]
|
|
|
|
mean value: 0.984893375622083
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.92137097 1. 0.98387097 0.9375 0.875
|
|
0.95262097 0.96774194 0.90215054 0.92083333]
|
|
|
|
mean value: 0.946108870967742
|
|
|
|
key: train_roc_auc
|
|
value: [0.96463674 0.98051981 0.96276218 0.96440875 0.95414936 0.90485358
|
|
0.96291418 0.9632364 0.96982694 0.93130648]
|
|
|
|
mean value: 0.9558614420708662
|
|
|
|
key: test_jcc
|
|
value: [1. 0.90909091 1. 0.96774194 0.93939394 0.88571429
|
|
0.9375 0.93548387 0.84375 0.90625 ]
|
|
|
|
mean value: 0.9324924940650747
|
|
|
|
key: train_jcc
|
|
value: [0.95804196 0.9751773 0.95087719 0.94699647 0.95172414 0.90789474
|
|
0.95818815 0.96180556 0.95390071 0.93288591]
|
|
|
|
mean value: 0.9497492121318976
|
|
|
|
MCC on Blind test: 0.09
|
|
|
|
Accuracy on Blind test: 0.27
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0119803 0.012357 0.01353669 0.01262236 0.01239181 0.01188588
|
|
0.01398087 0.01292968 0.01270461 0.01215243]
|
|
|
|
mean value: 0.01265416145324707
|
|
|
|
key: score_time
|
|
value: [0.01043653 0.01049995 0.01050258 0.0104773 0.01051712 0.01048827
|
|
0.0106318 0.01071954 0.01067996 0.01075029]
|
|
|
|
mean value: 0.010570335388183593
|
|
|
|
key: test_mcc
|
|
value: [1. 0.8084425 0.87213027 0.95299692 0.90662544 0.78063446
|
|
0.95299692 0.85009261 0.81245565 0.76471368]
|
|
|
|
mean value: 0.8701088462869901
|
|
|
|
key: train_mcc
|
|
value: [0.93057824 0.96269263 0.86379539 0.93066133 0.86786568 0.85610492
|
|
0.94674008 0.88991881 0.94166847 0.91286344]
|
|
|
|
mean value: 0.9102888984394315
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.91489362 0.93617021 0.9787234 0.95744681 0.89361702
|
|
0.9787234 0.93478261 0.91304348 0.89130435]
|
|
|
|
mean value: 0.9398704902867715
|
|
|
|
key: train_accuracy
|
|
value: [0.96904762 0.98333333 0.93571429 0.96904762 0.94047619 0.93095238
|
|
0.97619048 0.95011876 0.97387173 0.95961995]
|
|
|
|
mean value: 0.9588372356068318
|
|
|
|
key: test_fscore
|
|
value: [1. 0.9375 0.94915254 0.98412698 0.96875 0.91525424
|
|
0.98412698 0.95238095 0.93333333 0.91525424]
|
|
|
|
mean value: 0.9539879270917406
|
|
|
|
key: train_fscore
|
|
value: [0.97682709 0.98747764 0.94990724 0.97690941 0.95667244 0.94579439
|
|
0.98220641 0.96347826 0.98025135 0.96892139]
|
|
|
|
mean value: 0.9688445621247324
|
|
|
|
key: test_precision
|
|
value: [1. 0.90909091 1. 0.96875 0.93939394 0.96428571
|
|
0.96875 0.9375 0.96551724 0.93103448]
|
|
|
|
mean value: 0.9584322286908494
|
|
|
|
key: train_precision
|
|
value: [0.96819788 0.98220641 0.98084291 0.96491228 0.92307692 0.9844358
|
|
0.97183099 0.93265993 0.97849462 0.98880597]
|
|
|
|
mean value: 0.9675463711254643
|
|
|
|
key: test_recall
|
|
value: [1. 0.96774194 0.90322581 1. 1. 0.87096774
|
|
1. 0.96774194 0.90322581 0.9 ]
|
|
|
|
mean value: 0.9512903225806452
|
|
|
|
key: train_recall
|
|
value: [0.98561151 0.99280576 0.92086331 0.98920863 0.99280576 0.91007194
|
|
0.99280576 0.99640288 0.98201439 0.94982079]
|
|
|
|
mean value: 0.971241071658802
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.89012097 0.9516129 0.96875 0.9375 0.90423387
|
|
0.96875 0.9172043 0.91827957 0.8875 ]
|
|
|
|
mean value: 0.9343951612903226
|
|
|
|
key: train_roc_auc
|
|
value: [0.96111561 0.97879724 0.94282602 0.95939305 0.91541696 0.94095146
|
|
0.96823386 0.92827137 0.97002817 0.96434701]
|
|
|
|
mean value: 0.9529380774427173
|
|
|
|
key: test_jcc
|
|
value: [1. 0.88235294 0.90322581 0.96875 0.93939394 0.84375
|
|
0.96875 0.90909091 0.875 0.84375 ]
|
|
|
|
mean value: 0.9134063596112932
|
|
|
|
key: train_jcc
|
|
value: [0.95470383 0.97526502 0.90459364 0.95486111 0.91694352 0.89716312
|
|
0.96503497 0.9295302 0.96126761 0.93971631]
|
|
|
|
mean value: 0.9399079327337388
|
|
|
|
MCC on Blind test: 0.07
|
|
|
|
Accuracy on Blind test: 0.21
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.1008575 0.08776975 0.08655286 0.0874176 0.08782935 0.08918238
|
|
0.09195852 0.09141636 0.09183121 0.08885765]
|
|
|
|
mean value: 0.09036731719970703
|
|
|
|
key: score_time
|
|
value: [0.01442814 0.0153048 0.01412559 0.01522112 0.01434422 0.0145371
|
|
0.01523519 0.01551008 0.0142715 0.01540041]
|
|
|
|
mean value: 0.014837813377380372
|
|
|
|
key: test_mcc
|
|
value: [0.90524194 0.8566725 0.95436677 1. 0.90662544 0.81503725
|
|
0.95436677 0.95250095 0.95087679 0.75806977]
|
|
|
|
mean value: 0.9053758183184529
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.95744681 0.93617021 0.9787234 1. 0.95744681 0.91489362
|
|
0.9787234 0.97826087 0.97826087 0.89130435]
|
|
|
|
mean value: 0.9571230342275671
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.96774194 0.95238095 0.98360656 1. 0.96875 0.93939394
|
|
0.98360656 0.98360656 0.98412698 0.92063492]
|
|
|
|
mean value: 0.9683848404151815
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96774194 0.9375 1. 1. 0.93939394 0.88571429
|
|
1. 1. 0.96875 0.87878788]
|
|
|
|
mean value: 0.9577888039379975
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96774194 0.96774194 0.96774194 1. 1. 1.
|
|
0.96774194 0.96774194 1. 0.96666667]
|
|
|
|
mean value: 0.9805376344086022
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.95262097 0.92137097 0.98387097 1. 0.9375 0.875
|
|
0.98387097 0.98387097 0.96666667 0.85833333]
|
|
|
|
mean value: 0.9463104838709677
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.9375 0.90909091 0.96774194 1. 0.93939394 0.88571429
|
|
0.96774194 0.96774194 0.96875 0.85294118]
|
|
|
|
mean value: 0.9396616117121336
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.05
|
|
|
|
Accuracy on Blind test: 0.19
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
BaggingClassifier(n_jobs=10, oob_score=True,
random_state=42))])

key: fit_time
value: [0.03681731 0.03343534 0.04631495 0.04816437 0.05419993 0.04136348
0.03020048 0.03114557 0.05245137 0.04301977]

mean value: 0.04171125888824463

key: score_time
value: [0.02169442 0.01837158 0.02844691 0.01603293 0.03581977 0.02635193
0.0178473 0.01740122 0.02229071 0.01603532]

mean value: 0.02202920913696289

key: test_mcc
value: [0.95299692 0.8566725 1. 1. 0.8566725 0.81503725
0.91188882 0.95250095 0.95087679 0.85927505]

mean value: 0.9155920774240871

key: train_mcc
value: [0.97879832 1. 0.99468526 0.98945277 0.98408467 0.99468526
0.98945277 0.99472781 0.98940987 0.98946562]

mean value: 0.9904762341887853

key: test_accuracy
value: [0.9787234 0.93617021 1. 1. 0.93617021 0.91489362
0.95744681 0.97826087 0.97826087 0.93478261]

mean value: 0.9614708603145236

key: train_accuracy
value: [0.99047619 1. 0.99761905 0.9952381 0.99285714 0.99761905
0.9952381 0.9976247 0.99524941 0.99524941]

mean value: 0.995717113448705

key: test_fscore
value: [0.98412698 0.95238095 1. 1. 0.95238095 0.93939394
0.96666667 0.98360656 0.98412698 0.94915254]

mean value: 0.9711835578826409

key: train_fscore
value: [0.99285714 1. 0.99820467 0.99638989 0.99463327 0.99820467
0.99638989 0.9981982 0.99640288 0.99640288]

mean value: 0.9967683489274677

key: test_precision
value: [0.96875 0.9375 1. 1. 0.9375 0.88571429
1. 1. 0.96875 0.96551724]

mean value: 0.9663731527093596

key: train_precision
value: [0.9858156 1. 0.99641577 1. 0.98932384 0.99641577
1. 1. 0.99640288 1. ]

mean value: 0.9964373865169729

key: test_recall
value: [1. 0.96774194 1. 1. 0.96774194 1.
0.93548387 0.96774194 1. 0.93333333]

mean value: 0.9772043010752688

key: train_recall
value: [1. 1. 1. 0.99280576 1. 1.
0.99280576 0.99640288 0.99640288 0.99283154]

mean value: 0.9971248807405688

key: test_roc_auc
value: [0.96875 0.92137097 1. 1. 0.92137097 0.875
0.96774194 0.98387097 0.96666667 0.93541667]

mean value: 0.954018817204301

key: train_roc_auc
value: [0.98591549 1. 0.99647887 0.99640288 0.98943662 0.99647887
0.99640288 0.99820144 0.99470494 0.99641577]

mean value: 0.995043775936127

key: test_jcc
value: [0.96875 0.90909091 1. 1. 0.90909091 0.88571429
0.93548387 0.96774194 0.96875 0.90322581]

mean value: 0.9447847716799329

key: train_jcc
value: [0.9858156 1. 0.99641577 0.99280576 0.98932384 0.99641577
0.99280576 0.99640288 0.99283154 0.99283154]

mean value: 0.9935648458398372

MCC on Blind test: 0.07

Accuracy on Blind test: 0.19

Model_name: Gaussian Process
Model func: GaussianProcessClassifier(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GaussianProcessClassifier(random_state=42))])

key: fit_time
value: [0.06595278 0.07601976 0.07935166 0.12056971 0.07869649 0.06834197
0.13297486 0.15896058 0.14390469 0.13192463]

mean value: 0.10566971302032471

key: score_time
value: [0.01232171 0.01868176 0.01198316 0.01882172 0.01206756 0.01199055
0.01884794 0.02548599 0.02592111 0.0252378 ]

mean value: 0.018135929107666017

key: test_mcc
value: [0.90662544 0.60908698 0.4512753 0.65994312 0.71206211 0.6139232
0.66402366 0.59332241 0.43161973 0.76764947]

mean value: 0.6409531430663058

key: train_mcc
value: [0.80273059 0.7991351 0.79087061 0.79295441 0.78611575 0.79743374
0.78683895 0.80017613 0.80374289 0.79643548]

mean value: 0.7956433649163105

key: test_accuracy
value: [0.95744681 0.82978723 0.76595745 0.85106383 0.87234043 0.82978723
0.85106383 0.82608696 0.76086957 0.89130435]

mean value: 0.8435707678075856

key: train_accuracy
value: [0.91190476 0.90952381 0.90714286 0.90714286 0.9047619 0.90952381
0.9047619 0.90973872 0.91211401 0.90973872]

mean value: 0.9086353353693021

key: test_fscore
value: [0.96875 0.87878788 0.8358209 0.89230769 0.90909091 0.88235294
0.88888889 0.875 0.83076923 0.92307692]

mean value: 0.8884845359620381

key: train_fscore
value: [0.93653516 0.93537415 0.93287435 0.93356048 0.93150685 0.93493151
0.93174061 0.93537415 0.93653516 0.9347079 ]

mean value: 0.9343140331061971

key: test_precision
value: [0.93939394 0.82857143 0.77777778 0.85294118 0.85714286 0.81081081
0.875 0.84848485 0.79411765 0.85714286]

mean value: 0.8441383342853931

key: train_precision
value: [0.89508197 0.88709677 0.89438944 0.88673139 0.88888889 0.89215686
0.88636364 0.88709677 0.89508197 0.89768977]

mean value: 0.8910577470317502

key: test_recall
value: [1. 0.93548387 0.90322581 0.93548387 0.96774194 0.96774194
0.90322581 0.90322581 0.87096774 1. ]

mean value: 0.9387096774193548

key: train_recall
value: [0.98201439 0.98920863 0.97482014 0.98561151 0.97841727 0.98201439
0.98201439 0.98920863 0.98201439 0.97491039]

mean value: 0.9820234135272428

key: test_roc_auc
value: [0.9375 0.78024194 0.7016129 0.81149194 0.82762097 0.76512097
0.8266129 0.78494624 0.70215054 0.84375 ]

mean value: 0.7981048387096774

key: train_roc_auc
value: [0.87833114 0.87136488 0.87473402 0.86956632 0.86949032 0.87481001
0.86776776 0.87222669 0.87911908 0.87830027]

mean value: 0.8735710488300057

key: test_jcc
value: [0.93939394 0.78378378 0.71794872 0.80555556 0.83333333 0.78947368
0.8 0.77777778 0.71052632 0.85714286]

mean value: 0.8014935964935965

key: train_jcc
value: [0.88064516 0.87859425 0.87419355 0.87539936 0.87179487 0.8778135
0.87220447 0.87859425 0.88064516 0.87741935]

mean value: 0.8767303934692845

MCC on Blind test: 0.21

Accuracy on Blind test: 0.42

Model_name: Gradient Boosting
Model func: GradientBoostingClassifier(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GradientBoostingClassifier(random_state=42))])

key: fit_time
value: [0.2230103 0.21394682 0.20187664 0.20994234 0.20898438 0.21629405
0.21086693 0.20910215 0.21160555 0.20683503]

mean value: 0.21124641895294188

key: score_time
value: [0.00933719 0.00840378 0.00872827 0.00917697 0.00930619 0.00924182
0.00842547 0.00914001 0.00950432 0.00904679]

mean value: 0.009031081199645996

key: test_mcc
value: [1. 0.8566725 1. 1. 0.95299692 0.81503725
1. 0.95250095 0.95087679 0.80833333]

mean value: 0.9336417737001077

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [1. 0.93617021 1. 1. 0.9787234 0.91489362
1. 0.97826087 0.97826087 0.91304348]

mean value: 0.9699352451433858

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [1. 0.95238095 1. 1. 0.98412698 0.93939394
1. 0.98360656 0.98412698 0.93333333]

mean value: 0.9776968750739242

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [1. 0.9375 1. 1. 0.96875 0.88571429
1. 1. 0.96875 0.93333333]

mean value: 0.9694047619047619

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 0.96774194 1. 1. 1. 1.
1. 0.96774194 1. 0.93333333]

mean value: 0.9868817204301076

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [1. 0.92137097 1. 1. 0.96875 0.875
1. 0.98387097 0.96666667 0.90416667]

mean value: 0.9619825268817205

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [1. 0.90909091 1. 1. 0.96875 0.88571429
1. 0.96774194 0.96875 0.875 ]

mean value: 0.9575047130289066

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.08

Accuracy on Blind test: 0.19

Model_name: QDA
Model func: QuadraticDiscriminantAnalysis()
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
warnings.warn("Variables are collinear")
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', QuadraticDiscriminantAnalysis())])

key: fit_time
value: [0.0117166 0.01313019 0.01318526 0.01325989 0.01305079 0.01312709
0.01312232 0.01324248 0.01329851 0.0137887 ]

mean value: 0.01309218406677246

key: score_time
value: [0.0111506 0.01089978 0.01084971 0.0108676 0.01087546 0.01084447
0.01105189 0.01162434 0.01162648 0.01165462]

mean value: 0.011144495010375977

key: test_mcc
value: [0.46502704 0.68913865 0.66402366 0.71206211 0.6139232 0.67402153
0.62096774 0.74844698 0.44695591 0.53674504]

mean value: 0.6171311872005444

key: train_mcc
value: [0.6778431 0.7128472 0.85474068 0.79307454 0.73273261 0.88954988
0.79770673 0.82923345 0.77993671 0.88249782]

mean value: 0.7950162701330918

key: test_accuracy
value: [0.70212766 0.85106383 0.85106383 0.87234043 0.82978723 0.85106383
0.82978723 0.89130435 0.76086957 0.7826087 ]

mean value: 0.8222016651248844

key: train_accuracy
value: [0.82142857 0.8452381 0.93333333 0.9047619 0.88095238 0.95
0.90714286 0.9239905 0.90261283 0.94536817]

mean value: 0.9014828639294198

key: test_fscore
value: [0.73076923 0.88135593 0.88888889 0.90909091 0.88235294 0.8852459
0.87096774 0.92307692 0.82539683 0.82758621]

mean value: 0.8624731501074018

key: train_fscore
value: [0.84662577 0.86973948 0.94871795 0.92647059 0.91582492 0.96188748
0.92844037 0.94425087 0.92794376 0.95779817]

mean value: 0.9227699340095629

key: test_precision
value: [0.9047619 0.92857143 0.875 0.85714286 0.81081081 0.9
0.87096774 0.88235294 0.8125 0.85714286]

mean value: 0.8699250541541813

key: train_precision
value: [0.98104265 0.98190045 0.96641791 0.94736842 0.86075949 0.97069597
0.94756554 0.91554054 0.90721649 0.98120301]

mean value: 0.9459710488360232

key: test_recall
value: [0.61290323 0.83870968 0.90322581 0.96774194 0.96774194 0.87096774
0.87096774 0.96774194 0.83870968 0.8 ]

mean value: 0.8638709677419355

key: train_recall
value: [0.74460432 0.78057554 0.93165468 0.90647482 0.97841727 0.95323741
0.91007194 0.97482014 0.94964029 0.93548387]

mean value: 0.906498027384544

key: test_roc_auc
value: [0.74395161 0.85685484 0.8266129 0.82762097 0.76512097 0.84173387
0.81048387 0.85053763 0.71935484 0.775 ]

mean value: 0.8017271505376344

key: train_roc_auc
value: [0.85821765 0.87620326 0.9341372 0.90394164 0.83427906 0.94844969
0.9057402 0.89999748 0.88041455 0.9501363 ]

mean value: 0.8991517025527074

key: test_jcc
value: [0.57575758 0.78787879 0.8 0.83333333 0.78947368 0.79411765
0.77142857 0.85714286 0.7027027 0.70588235]

mean value: 0.7617717512454355

key: train_jcc
value: [0.73404255 0.76950355 0.90243902 0.8630137 0.8447205 0.92657343
0.86643836 0.89438944 0.86557377 0.91901408]

mean value: 0.8585708395886121

MCC on Blind test: 0.13

Accuracy on Blind test: 0.63

Model_name: Ridge Classifier
Model func: RidgeClassifier(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', RidgeClassifier(random_state=42))])

key: fit_time
value: [0.02074528 0.02051353 0.01858282 0.02953506 0.02967763 0.0314672
0.02946687 0.02941847 0.02958179 0.02945447]

mean value: 0.026844310760498046

key: score_time
value: [0.02140307 0.01061296 0.01083541 0.02066278 0.0109098 0.01821399
0.02039957 0.01891303 0.02117467 0.01968718]

mean value: 0.017281246185302735

key: test_mcc
value: [0.95299692 0.8084425 0.8566725 0.95299692 0.90662544 0.76032282
0.90662544 0.80215054 0.75776742 0.85513419]

mean value: 0.8559734697377736

key: train_mcc
value: [0.92003671 0.92030205 0.87684521 0.89326029 0.93085643 0.90414739
0.88770942 0.9151442 0.88322214 0.90932054]

mean value: 0.9040844381960059

key: test_accuracy
value: [0.9787234 0.91489362 0.93617021 0.9787234 0.95744681 0.89361702
0.95744681 0.91304348 0.89130435 0.93478261]

mean value: 0.9356151711378353

key: train_accuracy
value: [0.96428571 0.96428571 0.9452381 0.95238095 0.96904762 0.95714286
0.95 0.96199525 0.94774347 0.95961995]

mean value: 0.9571739622214681

key: test_fscore
value: [0.98412698 0.9375 0.95238095 0.98412698 0.96875 0.92307692
0.96875 0.93548387 0.91803279 0.95081967]

mean value: 0.9523048173695978

key: train_fscore
value: [0.97345133 0.97354497 0.95943563 0.96478873 0.97699115 0.96830986
0.96296296 0.97173145 0.96140351 0.97001764]

mean value: 0.9682637226255115

key: test_precision
value: [0.96875 0.90909091 0.9375 0.96875 0.93939394 0.88235294
0.93939394 0.93548387 0.93333333 0.93548387]

mean value: 0.9349532804324076

key: train_precision
value: [0.95818815 0.9550173 0.94117647 0.94482759 0.96167247 0.94827586
0.94463668 0.95486111 0.93835616 0.95486111]

mean value: 0.9501872911886335

key: test_recall
value: [1. 0.96774194 0.96774194 1. 1. 0.96774194
1. 0.93548387 0.90322581 0.96666667]

mean value: 0.9708602150537634

key: train_recall
value: [0.98920863 0.99280576 0.97841727 0.98561151 0.99280576 0.98920863
0.98201439 0.98920863 0.98561151 0.98566308]

mean value: 0.9870555168768211

key: test_roc_auc
value: [0.96875 0.89012097 0.92137097 0.96875 0.9375 0.85887097
0.9375 0.90107527 0.88494624 0.92083333]

mean value: 0.9189717741935484

key: train_roc_auc
value: [0.9523508 0.95062823 0.92934948 0.93646773 0.95767048 0.94178742
0.93466917 0.94914977 0.92986869 0.94705689]

mean value: 0.9428998652048836

key: test_jcc
value: [0.96875 0.88235294 0.90909091 0.96875 0.93939394 0.85714286
0.93939394 0.87878788 0.84848485 0.90625 ]

mean value: 0.9098397313470843

key: train_jcc
value: [0.94827586 0.94845361 0.9220339 0.93197279 0.9550173 0.93856655
0.92857143 0.94501718 0.92567568 0.94178082]

mean value: 0.9385365119971703

MCC on Blind test: 0.18

Accuracy on Blind test: 0.4

Model_name: Ridge ClassifierCV
Model func: RidgeClassifierCV(cv=10)
List of models: /home/tanu/git/LSHTM_analysis/scripts/ml/./katg_config.py:122: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
baseline_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_config.py:125: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
baseline_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', RidgeClassifierCV(cv=10))])
key: fit_time
value: [0.20578527 0.09308887 0.1543951 0.2597549 0.22493172 0.19128704
0.15882158 0.10618186 0.18721414 0.18994951]
mean value: 0.17714099884033202
key: score_time
value: [0.01115108 0.01121378 0.0221827 0.02100086 0.02120638 0.02156854
0.01102662 0.0211966 0.01679158 0.01401901]
mean value: 0.01713571548461914
key: test_mcc
value: [1. 0.8566725 1. 0.95299692 0.90662544 0.81503725
0.95299692 0.9085301 0.90107527 0.85513419]
mean value: 0.914906858369952
key: train_mcc
value: [0.92522791 0.94131391 0.91988445 0.92534566 0.93598399 0.94131391
0.93066133 0.94171645 0.93099139 0.94680199]
mean value: 0.9339241011569926
key: test_accuracy
value: [1. 0.93617021 1. 0.9787234 0.95744681 0.91489362
0.9787234 0.95652174 0.95652174 0.93478261]
mean value: 0.9613783533765032
key: train_accuracy
value: [0.96666667 0.97380952 0.96428571 0.96666667 0.97142857 0.97380952
0.96904762 0.97387173 0.96912114 0.97624703]
mean value: 0.970495419070241
key: test_fscore
value: [1. 0.95238095 1. 0.98412698 0.96875 0.93939394
0.98412698 0.96666667 0.96774194 0.95081967]
mean value: 0.9714007134310545
key: train_fscore
value: [0.97508897 0.98039216 0.97335702 0.9751773 0.97864769 0.98039216
0.97690941 0.98046181 0.97690941 0.9822695 ]
mean value: 0.9779605432457805
key: test_precision
value: [1. 0.9375 1. 0.96875 0.93939394 0.88571429
0.96875 1. 0.96774194 0.93548387]
mean value: 0.9603334031559838
key: train_precision
value: [0.96478873 0.97173145 0.96140351 0.96153846 0.96830986 0.97173145
0.96491228 0.96842105 0.96491228 0.97192982]
mean value: 0.9669678897982681
key: test_recall
value: [1. 0.96774194 1. 1. 1. 1.
1. 0.93548387 0.96774194 0.96666667]
mean value: 0.983763440860215
key: train_recall
value: [0.98561151 0.98920863 0.98561151 0.98920863 0.98920863 0.98920863
0.98920863 0.99280576 0.98920863 0.99283154]
mean value: 0.9892112116758206
key: test_roc_auc
value: [1. 0.92137097 1. 0.96875 0.9375 0.875
0.96875 0.96774194 0.95053763 0.92083333]
mean value: 0.9510483870967742
key: train_roc_auc
value: [0.95759449 0.9664353 0.95407336 0.95587192 0.96291418 0.9664353
0.95939305 0.96493435 0.95963928 0.96824676]
mean value: 0.9615537984903284
key: test_jcc
value: [1. 0.90909091 1. 0.96875 0.93939394 0.88571429
0.96875 0.93548387 0.9375 0.90625 ]
mean value: 0.9450933005166876
key: train_jcc
value: [0.95138889 0.96153846 0.94809689 0.95155709 0.95818815 0.96153846
0.95486111 0.96167247 0.95486111 0.96515679]
mean value: 0.9568859435029576
MCC on Blind test: 0.14
Accuracy on Blind test: 0.33
Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegression(random_state=42))])
key: fit_time
value: [0.02589321 0.0243125 0.02480197 0.02414203 0.02598643 0.0267787
0.02300882 0.02308178 0.02694511 0.02658701]
mean value: 0.025153756141662598
key: score_time
value: [0.01105022 0.01108479 0.02707553 0.01083922 0.01093078 0.01091671
0.01093459 0.01086307 0.01094246 0.01091409]
mean value: 0.012555146217346191
key: test_mcc
value: [1. 0.7130241 0.77784447 0.83914639 0.87096774 0.87096774
0.74193548 0.84266484 0.67314268 0.8688172 ]
mean value: 0.8198510652102912
key: train_mcc
value: [0.87415162 0.85611511 0.87052613 0.84894283 0.84894283 0.84892086
0.85256763 0.84537297 0.86364692 0.85997009]
mean value: 0.8569156981998511
key: test_accuracy
value: [1. 0.85483871 0.88709677 0.91935484 0.93548387 0.93548387
0.87096774 0.91935484 0.83606557 0.93442623]
mean value: 0.9093072448439978
key: train_accuracy
value: [0.93705036 0.92805755 0.9352518 0.92446043 0.92446043 0.92446043
0.92625899 0.92266187 0.93177738 0.92998205]
mean value: 0.9284421295997314
key: test_fscore
value: [1. 0.86153846 0.89230769 0.92063492 0.93548387 0.93548387
0.87096774 0.91525424 0.84375 0.93333333]
mean value: 0.9108754128973511
key: train_fscore
value: [0.93670886 0.92805755 0.93548387 0.92473118 0.92473118 0.92446043
0.92665474 0.92307692 0.93214286 0.92998205]
mean value: 0.9286029650436789
key: test_precision
value: [1. 0.82352941 0.85294118 0.90625 0.93548387 0.93548387
0.87096774 0.96428571 0.81818182 0.93333333]
mean value: 0.9040456937907128
key: train_precision
value: [0.94181818 0.92805755 0.93214286 0.92142857 0.92142857 0.92446043
0.92170819 0.91814947 0.92553191 0.93165468]
mean value: 0.9266380409827853
key: test_recall
value: [1. 0.90322581 0.93548387 0.93548387 0.93548387 0.93548387
0.87096774 0.87096774 0.87096774 0.93333333]
mean value: 0.9191397849462365
key: train_recall
value: [0.93165468 0.92805755 0.93884892 0.92805755 0.92805755 0.92446043
0.93165468 0.92805755 0.93884892 0.92831541]
mean value: 0.9306013253912999
key: test_roc_auc
value: [1. 0.85483871 0.88709677 0.91935484 0.93548387 0.93548387
0.87096774 0.91935484 0.83548387 0.9344086 ]
mean value: 0.909247311827957
key: train_roc_auc
value: [0.93705036 0.92805755 0.9352518 0.92446043 0.92446043 0.92446043
0.92625899 0.92266187 0.93179005 0.92998504]
mean value: 0.9284436966555788
key: test_jcc
value: [1. 0.75675676 0.80555556 0.85294118 0.87878788 0.87878788
0.77142857 0.84375 0.72972973 0.875 ]
mean value: 0.839273754751696
key: train_jcc
value: [0.88095238 0.86577181 0.87878788 0.86 0.86 0.85953177
0.86333333 0.85714286 0.8729097 0.86912752]
mean value: 0.8667557250647416
MCC on Blind test: 0.21
Accuracy on Blind test: 0.5
Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(random_state=42)
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
    n_estimators=1000, n_jobs=10, oob_score=True,
    random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
    colsample_bynode=None, colsample_bytree=None,
    enable_categorical=False, gamma=None, gpu_id=None,
    importance_type=None, interaction_constraints=None,
    learning_rate=None, max_delta_step=None, max_depth=None,
    min_child_weight=None, missing=nan, monotone_constraints=None,
    n_estimators=100, n_jobs=None, num_parallel_tree=None,
    predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
    scale_pos_weight=None, subsample=None, tree_method=None,
    use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
    ColumnTransformer(remainder='passthrough',
    transformers=[('num', MinMaxScaler(),
    Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
    'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
    'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
    'electro_rr', 'electro_mm', 'electro_sm', 'electr...
    'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
    'logorI', 'lineage_proportion', 'dist_lineage_proportion',
    'lineage_count_all', 'lineage_count_unique'],
    dtype='object')),
    ('cat', OneHotEncoder(),
    Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
    'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
    dtype='object'))])),
    ('model', LogisticRegressionCV(random_state=42))])
key: fit_time
value: [0.84429264 0.72940993 0.72171068 0.85939646 0.69464445 0.72773337
 0.77860117 0.70124364 0.78092885 0.7428112 ]

mean value: 0.7580772399902344

key: score_time
value: [0.01205468 0.01223755 0.01254439 0.01247644 0.02100563 0.01274776
 0.01243854 0.01249003 0.01463079 0.01232004]

mean value: 0.013494586944580078

key: test_mcc
value: [0.96824584 0.93548387 0.96824584 0.90748521 0.90369611 0.93548387
 1. 0.87278605 0.90215054 0.8688172 ]

mean value: 0.9262394532240339

key: train_mcc
value: [0.94966486 0.96412858 0.94604929 0.96763216 0.96405373 0.96405373
 0.94604929 0.97482645 0.96774069 0.96783888]

mean value: 0.9612037646576601

key: test_accuracy
value: [0.98387097 0.96774194 0.98387097 0.9516129 0.9516129 0.96774194
 1. 0.93548387 0.95081967 0.93442623]

mean value: 0.9627181385510312

key: train_accuracy
value: [0.97482014 0.98201439 0.97302158 0.98381295 0.98201439 0.98201439
 0.97302158 0.98741007 0.98384201 0.98384201]

mean value: 0.9805813517946863

key: test_fscore
value: [0.98412698 0.96774194 0.98360656 0.95384615 0.95081967 0.96774194
 1. 0.93333333 0.95081967 0.93333333]

mean value: 0.9625369577246891

key: train_fscore
value: [0.97491039 0.98214286 0.97307002 0.98384201 0.98207885 0.98207885
 0.97307002 0.98743268 0.98389982 0.98401421]

mean value: 0.9806539709925397

key: test_precision
value: [0.96875 0.96774194 1. 0.91176471 0.96666667 0.96774194
 1. 0.96551724 0.96666667 0.93333333]

mean value: 0.9648182484896072

key: train_precision
value: [0.97142857 0.9751773 0.97132616 0.98207885 0.97857143 0.97857143
 0.97132616 0.98566308 0.97864769 0.97535211]

mean value: 0.9768142798277739

key: test_recall
value: [1. 0.96774194 0.96774194 1. 0.93548387 0.96774194
 1. 0.90322581 0.93548387 0.93333333]

mean value: 0.9610752688172043

key: train_recall
value: [0.97841727 0.98920863 0.97482014 0.98561151 0.98561151 0.98561151
 0.97482014 0.98920863 0.98920863 0.99283154]

mean value: 0.9845349526830148

key: test_roc_auc
value: [0.98387097 0.96774194 0.98387097 0.9516129 0.9516129 0.96774194
 1. 0.93548387 0.95107527 0.9344086 ]

mean value: 0.962741935483871

key: train_roc_auc
value: [0.97482014 0.98201439 0.97302158 0.98381295 0.98201439 0.98201439
 0.97302158 0.98741007 0.98385163 0.98382584]

mean value: 0.9805806967329362

key: test_jcc
value: [0.96875 0.9375 0.96774194 0.91176471 0.90625 0.9375
 1. 0.875 0.90625 0.875 ]

mean value: 0.9285756641366224

key: train_jcc
value: [0.95104895 0.96491228 0.94755245 0.96819788 0.96478873 0.96478873
 0.94755245 0.9751773 0.96830986 0.96853147]

mean value: 0.9620860104153928

MCC on Blind test: 0.14

Accuracy on Blind test: 0.35
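Note on the metrics above: MCC, accuracy and the Jaccard score (`jcc`) can all be derived from binary confusion-matrix counts. The sketch below shows the arithmetic. The helper names are hypothetical, and the counts (31 true positives, 30 true negatives, 1 false positive, 0 false negatives) are inferred here because they reproduce the first fold's reported test accuracy, precision and recall for this model; the log itself does not print confusion matrices.

```python
import math


def mcc_from_counts(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Undefined when any marginal total is zero; 0.0 is returned by convention here.
        return 0.0
    return (tp * tn - fp * fn) / denom


def accuracy_from_counts(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def jaccard_from_counts(tp, fp, fn):
    """Jaccard score of the positive class: TP / (TP + FP + FN)."""
    return tp / (tp + fp + fn)


# Inferred counts (assumption, see the note above): 62 samples in the fold.
counts = dict(tp=31, tn=30, fp=1, fn=0)

fold_mcc = mcc_from_counts(**counts)
fold_accuracy = accuracy_from_counts(**counts)
fold_jcc = jaccard_from_counts(counts["tp"], counts["fp"], counts["fn"])

print("MCC:", round(fold_mcc, 8))
print("Accuracy:", round(fold_accuracy, 8))
print("Jaccard:", round(fold_jcc, 8))
```

Because MCC uses all four cells of the confusion matrix, it can be low even when accuracy looks acceptable, which is one reason the log reports both on the blind test set.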
Model_name: Gaussian NB
Model func: GaussianNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
    n_estimators=1000, n_jobs=10, oob_score=True,
    random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
    colsample_bynode=None, colsample_bytree=None,
    enable_categorical=False, gamma=None, gpu_id=None,
    importance_type=None, interaction_constraints=None,
    learning_rate=None, max_delta_step=None, max_depth=None,
    min_child_weight=None, missing=nan, monotone_constraints=None,
    n_estimators=100, n_jobs=None, num_parallel_tree=None,
    predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
    scale_pos_weight=None, subsample=None, tree_method=None,
    use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
    ColumnTransformer(remainder='passthrough',
    transformers=[('num', MinMaxScaler(),
    Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
    'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
    'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
    'electro_rr', 'electro_mm', 'electro_sm', 'electr...
    'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
    'logorI', 'lineage_proportion', 'dist_lineage_proportion',
    'lineage_count_all', 'lineage_count_unique'],
    dtype='object')),
    ('cat', OneHotEncoder(),
    Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
    'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
    dtype='object'))])),
    ('model', GaussianNB())])
key: fit_time
value: [0.01086211 0.01025057 0.00859928 0.00814486 0.00849771 0.00856209
 0.0083437 0.00834632 0.00836563 0.0083468 ]

mean value: 0.00883190631866455

key: score_time
value: [0.01082826 0.00907207 0.00906467 0.00892687 0.00863767 0.00889754
 0.00867772 0.00834537 0.00863099 0.00864482]

mean value: 0.008972597122192384

key: test_mcc
value: [0.83914639 0.64820372 0.71004695 0.81325006 0.80645161 0.74348441
 0.61418277 0.87278605 0.60645161 0.70505961]

mean value: 0.7359063194782269

key: train_mcc
value: [0.75529076 0.7627676 0.76266888 0.74820144 0.73741484 0.74837576
 0.74460913 0.73025835 0.76301539 0.75249226]

mean value: 0.7505094421634964

key: test_accuracy
value: [0.91935484 0.82258065 0.85483871 0.90322581 0.90322581 0.87096774
 0.80645161 0.93548387 0.80327869 0.85245902]

mean value: 0.8671866737176097

key: train_accuracy
value: [0.87410072 0.88129496 0.88129496 0.87410072 0.86870504 0.87410072
 0.87230216 0.86510791 0.88150808 0.87612208]

mean value: 0.8748637355824497

key: test_fscore
value: [0.92063492 0.83076923 0.85245902 0.90909091 0.90322581 0.875
 0.8 0.93333333 0.80645161 0.84745763]

mean value: 0.8678422456695318

key: train_fscore
value: [0.88215488 0.88 0.88214286 0.87410072 0.86894075 0.87272727
 0.87253142 0.86437613 0.88129496 0.87477314]

mean value: 0.8753042137774966

key: test_precision
value: [0.90625 0.79411765 0.86666667 0.85714286 0.90322581 0.84848485
 0.82758621 0.96551724 0.80645161 0.86206897]

mean value: 0.8637511852501137

key: train_precision
value: [0.82911392 0.88970588 0.87588652 0.87410072 0.86738351 0.88235294
 0.87096774 0.86909091 0.88129496 0.88602941]

mean value: 0.8725926531191879

key: test_recall
value: [0.93548387 0.87096774 0.83870968 0.96774194 0.90322581 0.90322581
 0.77419355 0.90322581 0.80645161 0.83333333]

mean value: 0.8736559139784946

key: train_recall
value: [0.94244604 0.8705036 0.88848921 0.87410072 0.8705036 0.86330935
 0.87410072 0.85971223 0.88129496 0.86379928]

mean value: 0.8788259714808798

key: test_roc_auc
value: [0.91935484 0.82258065 0.85483871 0.90322581 0.90322581 0.87096774
 0.80645161 0.93548387 0.80322581 0.85215054]

mean value: 0.8671505376344086

key: train_roc_auc
value: [0.87410072 0.88129496 0.88129496 0.87410072 0.86870504 0.87410072
 0.87230216 0.86510791 0.8815077 0.87614425]

mean value: 0.8748659137206364

key: test_jcc
value: [0.85294118 0.71052632 0.74285714 0.83333333 0.82352941 0.77777778
 0.66666667 0.875 0.67567568 0.73529412]

mean value: 0.7693601617982423

key: train_jcc
value: [0.78915663 0.78571429 0.78913738 0.77635783 0.76825397 0.77419355
 0.77388535 0.7611465 0.78778135 0.77741935]

mean value: 0.778304618898389

MCC on Blind test: 0.2

Accuracy on Blind test: 0.54
Model_name: Naive Bayes
Model func: BernoulliNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
    n_estimators=1000, n_jobs=10, oob_score=True,
    random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
    colsample_bynode=None, colsample_bytree=None,
    enable_categorical=False, gamma=None, gpu_id=None,
    importance_type=None, interaction_constraints=None,
    learning_rate=None, max_delta_step=None, max_depth=None,
    min_child_weight=None, missing=nan, monotone_constraints=None,
    n_estimators=100, n_jobs=None, num_parallel_tree=None,
    predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
    scale_pos_weight=None, subsample=None, tree_method=None,
    use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
    ColumnTransformer(remainder='passthrough',
    transformers=[('num', MinMaxScaler(),
    Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
    'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
    'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
    'electro_rr', 'electro_mm', 'electro_sm', 'electr...
    'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
    'logorI', 'lineage_proportion', 'dist_lineage_proportion',
    'lineage_count_all', 'lineage_count_unique'],
    dtype='object')),
    ('cat', OneHotEncoder(),
    Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
    'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
    dtype='object'))])),
    ('model', BernoulliNB())])
key: fit_time
value: [0.00874949 0.00879979 0.0084486 0.00865364 0.00868559 0.00862646
 0.00848246 0.00858378 0.00882339 0.00859857]

mean value: 0.008645176887512207

key: score_time
value: [0.00911641 0.00886369 0.00856209 0.00891066 0.00887418 0.00860906
 0.00873876 0.00870085 0.00862622 0.008708 ]

mean value: 0.008770990371704101

key: test_mcc
value: [0.64820372 0.68313005 0.48488114 0.74348441 0.80813523 0.74348441
 0.64820372 0.74193548 0.63978495 0.67204301]

mean value: 0.6813286129520032

key: train_mcc
value: [0.69129181 0.69623388 0.69785979 0.69872831 0.69209976 0.70569372
 0.7019886 0.70220704 0.69929441 0.69881448]

mean value: 0.698421180066379

key: test_accuracy
value: [0.82258065 0.83870968 0.74193548 0.87096774 0.90322581 0.87096774
 0.82258065 0.87096774 0.81967213 0.83606557]

mean value: 0.8397673188789001

key: train_accuracy
value: [0.84532374 0.8471223 0.84892086 0.84892086 0.84532374 0.85251799
 0.85071942 0.85071942 0.8491921 0.8491921 ]

mean value: 0.8487952546400941

key: test_fscore
value: [0.81355932 0.84848485 0.75 0.875 0.9 0.875
 0.83076923 0.87096774 0.81967213 0.83333333]

mean value: 0.8416786607704336

key: train_fscore
value: [0.84859155 0.85268631 0.84837545 0.85263158 0.85017422 0.8556338
 0.85361552 0.85413005 0.85263158 0.85211268]

mean value: 0.8520582734853629

key: test_precision
value: [0.85714286 0.8 0.72727273 0.84848485 0.93103448 0.84848485
 0.79411765 0.87096774 0.83333333 0.83333333]

mean value: 0.8344171819804876

key: train_precision
value: [0.83103448 0.82274247 0.85144928 0.83219178 0.82432432 0.83793103
 0.83737024 0.83505155 0.83219178 0.83737024]

mean value: 0.8341657184309065

key: test_recall
value: [0.77419355 0.90322581 0.77419355 0.90322581 0.87096774 0.90322581
 0.87096774 0.87096774 0.80645161 0.83333333]

mean value: 0.8510752688172043

key: train_recall
value: [0.86690647 0.88489209 0.84532374 0.87410072 0.87769784 0.87410072
 0.8705036 0.87410072 0.87410072 0.86738351]

mean value: 0.8709110131249839

key: test_roc_auc
value: [0.82258065 0.83870968 0.74193548 0.87096774 0.90322581 0.87096774
 0.82258065 0.87096774 0.81989247 0.83602151]

mean value: 0.8397849462365592

key: train_roc_auc
value: [0.84532374 0.8471223 0.84892086 0.84892086 0.84532374 0.85251799
 0.85071942 0.85071942 0.84923674 0.84915938]

mean value: 0.8487964467135968

key: test_jcc
value: [0.68571429 0.73684211 0.6 0.77777778 0.81818182 0.77777778
 0.71052632 0.77142857 0.69444444 0.71428571]

mean value: 0.7286978810663021

key: train_jcc
value: [0.73700306 0.74320242 0.73667712 0.74311927 0.73939394 0.74769231
 0.74461538 0.74539877 0.74311927 0.74233129]

mean value: 0.7422552816171282

MCC on Blind test: 0.19

Accuracy on Blind test: 0.49
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.00824833 0.00824618 0.00818968 0.00823784 0.00803328 0.00808263
|
|
0.0080018 0.00826025 0.00804806 0.00802422]
|
|
|
|
mean value: 0.008137226104736328
|
|
|
|
key: score_time
|
|
value: [0.02001548 0.0169642 0.01295042 0.01175404 0.01527023 0.01145744
|
|
0.01146245 0.01176381 0.01168776 0.01165533]
|
|
|
|
mean value: 0.01349811553955078
|
|
|
|
key: test_mcc
|
|
value: [0.75623534 0.67741935 0.64820372 0.83914639 0.80813523 0.74193548
|
|
0.61418277 0.68313005 0.67204301 0.67721392]
|
|
|
|
mean value: 0.7117645281572317
|
|
|
|
key: train_mcc
|
|
value: [0.75664991 0.80977699 0.79501032 0.78789723 0.7814304 0.77770329
|
|
0.79138739 0.77342633 0.78180276 0.78587941]
|
|
|
|
mean value: 0.7840964017204444
|
|
|
|
key: test_accuracy
|
|
value: [0.87096774 0.83870968 0.82258065 0.91935484 0.90322581 0.87096774
|
|
0.80645161 0.83870968 0.83606557 0.83606557]
|
|
|
|
mean value: 0.8543098889476468
|
|
|
|
key: train_accuracy
|
|
value: [0.87769784 0.90467626 0.89748201 0.89388489 0.89028777 0.88848921
|
|
0.89568345 0.88669065 0.89048474 0.89228007]
|
|
|
|
mean value: 0.8917656897821061
|
|
|
|
key: test_fscore
|
|
value: [0.85714286 0.83870968 0.81355932 0.92063492 0.90625 0.87096774
|
|
0.8125 0.82758621 0.83870968 0.82142857]
|
|
|
|
mean value: 0.8507488974910993
|
|
|
|
key: train_fscore
|
|
value: [0.87407407 0.90310786 0.89692586 0.89292196 0.88766114 0.88602941
|
|
0.89605735 0.88607595 0.88766114 0.88929889]
|
|
|
|
mean value: 0.8899813639558726
|
|
|
|
key: test_precision
|
|
value: [0.96 0.83870968 0.85714286 0.90625 0.87878788 0.87096774
|
|
0.78787879 0.88888889 0.83870968 0.88461538]
|
|
|
|
mean value: 0.8711950894087991
|
|
|
|
key: train_precision
|
|
value: [0.90076336 0.91821561 0.90181818 0.9010989 0.90943396 0.90601504
|
|
0.89285714 0.89090909 0.90943396 0.91634981]
|
|
|
|
mean value: 0.9046895060853061
|
|
|
|
key: test_recall
|
|
value: [0.77419355 0.83870968 0.77419355 0.93548387 0.93548387 0.87096774
|
|
0.83870968 0.77419355 0.83870968 0.76666667]
|
|
|
|
mean value: 0.8347311827956989
|
|
|
|
key: train_recall
|
|
value: [0.84892086 0.88848921 0.89208633 0.88489209 0.86690647 0.86690647
|
|
0.89928058 0.88129496 0.86690647 0.86379928]
|
|
|
|
mean value: 0.8759482736391532
|
|
|
|
key: test_roc_auc
|
|
value: [0.87096774 0.83870968 0.82258065 0.91935484 0.90322581 0.87096774
|
|
0.80645161 0.83870968 0.83602151 0.83494624]
|
|
|
|
mean value: 0.8541935483870968
|
|
|
|
key: train_roc_auc
|
|
value: [0.87769784 0.90467626 0.89748201 0.89388489 0.89028777 0.88848921
|
|
0.89568345 0.88669065 0.89044248 0.8923313 ]
|
|
|
|
mean value: 0.8917665867306155
|
|
|
|
key: test_jcc
|
|
value: [0.75 0.72222222 0.68571429 0.85294118 0.82857143 0.77142857
|
|
0.68421053 0.70588235 0.72222222 0.6969697 ]
|
|
|
|
mean value: 0.7420162482855981
|
|
|
|
key: train_jcc
|
|
value: [0.77631579 0.82333333 0.81311475 0.80655738 0.79801325 0.79537954
|
|
0.81168831 0.79545455 0.79801325 0.80066445]
|
|
|
|
mean value: 0.8018534590944678
|
|
|
|
MCC on Blind test: 0.17
|
|
|
|
Accuracy on Blind test: 0.56
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01743269 0.01715159 0.01687074 0.01602173 0.01673126 0.01642895
|
|
0.01583314 0.01826096 0.01568484 0.0176158 ]
|
|
|
|
mean value: 0.016803169250488283
|
|
|
|
key: score_time
|
|
value: [0.01035261 0.00926685 0.01012945 0.00933409 0.00936246 0.01025271
|
|
0.00945616 0.01029134 0.00932956 0.0092721 ]
|
|
|
|
mean value: 0.00970473289489746
|
|
|
|
key: test_mcc
|
|
value: [0.93548387 0.69047575 0.62471615 0.77784447 0.77784447 0.75623534
|
|
0.58338335 0.74348441 0.61090565 0.81062315]
|
|
|
|
mean value: 0.7310996615906107
|
|
|
|
key: train_mcc
|
|
value: [0.82186847 0.79485081 0.75204143 0.78877892 0.78485761 0.7611094
|
|
0.79209132 0.77560672 0.78260516 0.81085297]
|
|
|
|
mean value: 0.7864662785132636
|
|
|
|
key: test_accuracy
|
|
value: [0.96774194 0.83870968 0.80645161 0.88709677 0.88709677 0.87096774
|
|
0.79032258 0.87096774 0.80327869 0.90163934]
|
|
|
|
mean value: 0.8624272871496562
|
|
|
|
key: train_accuracy
|
|
value: [0.91007194 0.89568345 0.87230216 0.89208633 0.89028777 0.87769784
|
|
0.89388489 0.88489209 0.88868941 0.9048474 ]
|
|
|
|
mean value: 0.8910443279128941
|
|
|
|
key: test_fscore
|
|
value: [0.96774194 0.85294118 0.82352941 0.89230769 0.89230769 0.88235294
|
|
0.8 0.875 0.81818182 0.90625 ]
|
|
|
|
mean value: 0.8710612667692839
|
|
|
|
key: train_fscore
|
|
value: [0.91289199 0.90034364 0.88067227 0.89761092 0.8957265 0.88474576
|
|
0.8991453 0.89152542 0.89455782 0.90750436]
|
|
|
|
mean value: 0.8964723986527141
|
|
|
|
key: test_precision
|
|
value: [0.96774194 0.78378378 0.75675676 0.85294118 0.85294118 0.81081081
|
|
0.76470588 0.84848485 0.77142857 0.85294118]
|
|
|
|
mean value: 0.8262536118513348
|
|
|
|
key: train_precision
|
|
value: [0.88513514 0.86184211 0.82649842 0.8538961 0.8534202 0.83653846
|
|
0.85667752 0.84294872 0.8483871 0.88435374]
|
|
|
|
mean value: 0.8549697504635009
|
|
|
|
key: test_recall
|
|
value: [0.96774194 0.93548387 0.90322581 0.93548387 0.93548387 0.96774194
|
|
0.83870968 0.90322581 0.87096774 0.96666667]
|
|
|
|
mean value: 0.9224731182795699
|
|
|
|
key: train_recall
|
|
value: [0.94244604 0.94244604 0.94244604 0.94604317 0.94244604 0.93884892
|
|
0.94604317 0.94604317 0.94604317 0.93189964]
|
|
|
|
mean value: 0.9424705396972745
|
|
|
|
key: test_roc_auc
|
|
value: [0.96774194 0.83870968 0.80645161 0.88709677 0.88709677 0.87096774
|
|
0.79032258 0.87096774 0.80215054 0.90268817]
|
|
|
|
mean value: 0.8624193548387098
|
|
|
|
key: train_roc_auc
|
|
value: [0.91007194 0.89568345 0.87230216 0.89208633 0.89028777 0.87769784
|
|
0.89388489 0.88489209 0.88879219 0.90479874]
|
|
|
|
mean value: 0.8910497408524792
|
|
|
|
key: test_jcc
|
|
value: [0.9375 0.74358974 0.7 0.80555556 0.80555556 0.78947368
|
|
0.66666667 0.77777778 0.69230769 0.82857143]
|
|
|
|
mean value: 0.7746998104234947
|
|
|
|
key: train_jcc
|
|
value: [0.83974359 0.81875 0.78678679 0.81424149 0.81114551 0.79331307
|
|
0.81677019 0.80428135 0.80923077 0.83067093]
|
|
|
|
mean value: 0.812493367099271
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.48
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.58698964 1.47472477 1.58772182 1.53978014 1.45300221 1.59102941
|
|
1.59130311 1.50179529 1.59061027 1.5674026 ]
|
|
|
|
mean value: 1.548435926437378
|
|
|
|
key: score_time
|
|
value: [0.01429367 0.01347637 0.01343799 0.0135057 0.01355076 0.01363134
|
|
0.01368833 0.01345825 0.01342821 0.01383781]
|
|
|
|
mean value: 0.013630843162536621
|
|
|
|
key: test_mcc
|
|
value: [0.96824584 0.84266484 0.87278605 0.93743687 0.93743687 0.90369611
|
|
0.90369611 0.90748521 0.83638369 0.8688172 ]
|
|
|
|
mean value: 0.8978648796280239
|
|
|
|
key: train_mcc
|
|
value: [0.99283145 0.98921503 0.98921503 0.98561151 0.98921503 0.98921503
|
|
0.98561151 0.99640932 0.99284416 0.99641577]
|
|
|
|
mean value: 0.9906583855147647
|
|
|
|
key: test_accuracy
|
|
value: [0.98387097 0.91935484 0.93548387 0.96774194 0.96774194 0.9516129
|
|
0.9516129 0.9516129 0.91803279 0.93442623]
|
|
|
|
mean value: 0.9481491274457959
|
|
|
|
key: train_accuracy
|
|
value: [0.99640288 0.99460432 0.99460432 0.99280576 0.99460432 0.99460432
|
|
0.99280576 0.99820144 0.99640934 0.99820467]
|
|
|
|
mean value: 0.9953247097115845
|
|
|
|
key: test_fscore
|
|
value: [0.98412698 0.92307692 0.9375 0.96875 0.96875 0.95081967
|
|
0.95238095 0.94915254 0.92063492 0.93333333]
|
|
|
|
mean value: 0.9488525328057142
|
|
|
|
key: train_fscore
|
|
value: [0.99638989 0.99459459 0.99459459 0.99280576 0.99459459 0.99459459
|
|
0.99280576 0.9981982 0.99638989 0.99820467]
|
|
|
|
mean value: 0.9953172538625
|
|
|
|
key: test_precision
|
|
value: [0.96875 0.88235294 0.90909091 0.93939394 0.93939394 0.96666667
|
|
0.9375 1. 0.90625 0.93333333]
|
|
|
|
mean value: 0.9382731729055258
|
|
|
|
key: train_precision
|
|
value: [1. 0.99638989 0.99638989 0.99280576 0.99638989 0.99638989
|
|
0.99280576 1. 1. 1. ]
|
|
|
|
mean value: 0.997117107757837
|
|
|
|
key: test_recall
|
|
value: [1. 0.96774194 0.96774194 1. 1. 0.93548387
|
|
0.96774194 0.90322581 0.93548387 0.93333333]
|
|
|
|
mean value: 0.9610752688172043
|
|
|
|
key: train_recall
|
|
value: [0.99280576 0.99280576 0.99280576 0.99280576 0.99280576 0.99280576
|
|
0.99280576 0.99640288 0.99280576 0.99641577]
|
|
|
|
mean value: 0.9935264691472628
|
|
|
|
key: test_roc_auc
|
|
value: [0.98387097 0.91935484 0.93548387 0.96774194 0.96774194 0.9516129
|
|
0.9516129 0.9516129 0.91774194 0.9344086 ]
|
|
|
|
mean value: 0.9481182795698926
|
|
|
|
key: train_roc_auc
|
|
value: [0.99640288 0.99460432 0.99460432 0.99280576 0.99460432 0.99460432
|
|
0.99280576 0.99820144 0.99640288 0.99820789]
|
|
|
|
mean value: 0.9953243856527682
|
|
|
|
key: test_jcc
|
|
value: [0.96875 0.85714286 0.88235294 0.93939394 0.93939394 0.90625
|
|
0.90909091 0.90322581 0.85294118 0.875 ]
|
|
|
|
mean value: 0.9033541569120317
|
|
|
|
key: train_jcc
|
|
value: [0.99280576 0.98924731 0.98924731 0.98571429 0.98924731 0.98924731
|
|
0.98571429 0.99640288 0.99280576 0.99641577]
|
|
|
|
mean value: 0.9906847977838927
|
|
|
|
MCC on Blind test: 0.18
|
|
|
|
Accuracy on Blind test: 0.36
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01465797 0.01315546 0.01135731 0.01076937 0.01108599 0.01082325
|
|
0.01092386 0.01045871 0.01094151 0.01048708]
|
|
|
|
mean value: 0.011466050148010254
|
|
|
|
key: score_time
|
|
value: [0.01047611 0.00827646 0.00819159 0.00810289 0.00804043 0.00810003
|
|
0.00786495 0.00790858 0.00797296 0.00794983]
|
|
|
|
mean value: 0.008288383483886719
|
|
|
|
key: test_mcc
|
|
value: [0.96824584 0.90369611 1. 0.90748521 0.90369611 0.87831007
|
|
0.84266484 0.96824584 0.8688172 0.87055472]
|
|
|
|
mean value: 0.9111715945488771
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98387097 0.9516129 1. 0.9516129 0.9516129 0.93548387
|
|
0.91935484 0.98387097 0.93442623 0.93442623]
|
|
|
|
mean value: 0.9546271813855103
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98412698 0.95081967 1. 0.95384615 0.95081967 0.93103448
|
|
0.91525424 0.98360656 0.93548387 0.93103448]
|
|
|
|
mean value: 0.9536026113385601
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96875 0.96666667 1. 0.91176471 0.96666667 1.
|
|
0.96428571 1. 0.93548387 0.96428571]
|
|
|
|
mean value: 0.9677903338754856
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.93548387 1. 1. 0.93548387 0.87096774
|
|
0.87096774 0.96774194 0.93548387 0.9 ]
|
|
|
|
mean value: 0.9416129032258065
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98387097 0.9516129 1. 0.9516129 0.9516129 0.93548387
|
|
0.91935484 0.98387097 0.9344086 0.93387097]
|
|
|
|
mean value: 0.9545698924731183
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96875 0.90625 1. 0.91176471 0.90625 0.87096774
|
|
0.84375 0.96774194 0.87878788 0.87096774]
|
|
|
|
mean value: 0.912523000402507
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.12
|
|
|
|
Accuracy on Blind test: 0.58
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.10264206 0.10256195 0.10940504 0.10670924 0.10698891 0.10749125
|
|
0.10634494 0.1044426 0.1049583 0.10701418]
|
|
|
|
mean value: 0.10585584640502929
|
|
|
|
key: score_time
|
|
value: [0.01734233 0.01776242 0.01870441 0.01858568 0.01841116 0.01849437
|
|
0.01723242 0.01806641 0.01844049 0.01706672]
|
|
|
|
mean value: 0.018010640144348146
|
|
|
|
key: test_mcc
|
|
value: [0.93743687 0.81325006 0.87096774 0.87278605 0.93743687 0.90369611
|
|
0.80645161 0.93743687 0.8688172 0.90215054]
|
|
|
|
mean value: 0.8850429919540547
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.96774194 0.90322581 0.93548387 0.93548387 0.96774194 0.9516129
|
|
0.90322581 0.96774194 0.93442623 0.95081967]
|
|
|
|
mean value: 0.9417503966155474
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.96875 0.90909091 0.93548387 0.9375 0.96875 0.95238095
|
|
0.90322581 0.96666667 0.93548387 0.95081967]
|
|
|
|
mean value: 0.9428151748656772
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.93939394 0.85714286 0.93548387 0.90909091 0.93939394 0.9375
|
|
0.90322581 1. 0.93548387 0.93548387]
|
|
|
|
mean value: 0.9292199064376484
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.96774194 0.93548387 0.96774194 1. 0.96774194
|
|
0.90322581 0.93548387 0.93548387 0.96666667]
|
|
|
|
mean value: 0.9579569892473119
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96774194 0.90322581 0.93548387 0.93548387 0.96774194 0.9516129
|
|
0.90322581 0.96774194 0.9344086 0.95107527]
|
|
|
|
mean value: 0.9417741935483872
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.93939394 0.83333333 0.87878788 0.88235294 0.93939394 0.90909091
|
|
0.82352941 0.93548387 0.87878788 0.90625 ]
|
|
|
|
mean value: 0.8926404102696797
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.22
|
|
|
|
Accuracy on Blind test: 0.4
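Note: each "mean value" line in these blocks appears to be the arithmetic mean of the ten per-fold values printed just above it. A minimal check, using the Extra Trees test_mcc values copied from this log; the variable names are illustrative and are not taken from the original script:

```python
from statistics import fmean

# Per-fold test_mcc values for Extra Trees, copied from the log above
# (the log prints them rounded to 8 decimal places).
test_mcc = [
    0.93743687, 0.81325006, 0.87096774, 0.87278605, 0.93743687,
    0.90369611, 0.80645161, 0.93743687, 0.8688172, 0.90215054,
]

# "mean value" as printed in the log for test_mcc.
logged_mean = 0.8850429919540547

mean_mcc = fmean(test_mcc)
difference = abs(mean_mcc - logged_mean)

print(f"recomputed mean: {mean_mcc:.8f}")
print(f"difference from logged mean: {difference:.2e}")
```

Because the per-fold values are rounded in the log, the recomputed mean should only be expected to agree with the logged one to about eight decimal places.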
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00817394 0.0080502 0.0084374 0.00861168 0.00831032 0.00796342
|
|
0.00770545 0.00855494 0.00841975 0.00792027]
|
|
|
|
mean value: 0.008214735984802246
|
|
|
|
key: score_time
|
|
value: [0.00823379 0.00851464 0.00845742 0.00857925 0.00831676 0.00783062
|
|
0.00798845 0.00863981 0.00799108 0.00859761]
|
|
|
|
mean value: 0.008314943313598633
|
|
|
|
key: test_mcc
|
|
value: [0.71004695 0.5809475 0.67883359 0.59603956 0.64549722 0.77784447
|
|
0.65372045 0.67883359 0.77072165 0.77096774]
|
|
|
|
mean value: 0.68634527326083
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.85483871 0.79032258 0.83870968 0.79032258 0.82258065 0.88709677
|
|
0.82258065 0.83870968 0.8852459 0.8852459 ]
|
|
|
|
mean value: 0.841565309360127
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.85245902 0.79365079 0.83333333 0.76363636 0.81967213 0.88135593
|
|
0.80701754 0.83333333 0.88888889 0.8852459 ]
|
|
|
|
mean value: 0.835859323808608
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.86666667 0.78125 0.86206897 0.875 0.83333333 0.92857143
|
|
0.88461538 0.86206897 0.875 0.87096774]
|
|
|
|
mean value: 0.8639542486156779
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.83870968 0.80645161 0.80645161 0.67741935 0.80645161 0.83870968
|
|
0.74193548 0.80645161 0.90322581 0.9 ]
|
|
|
|
mean value: 0.8125806451612902
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.85483871 0.79032258 0.83870968 0.79032258 0.82258065 0.88709677
|
|
0.82258065 0.83870968 0.88494624 0.88548387]
|
|
|
|
mean value: 0.8415591397849462
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.74285714 0.65789474 0.71428571 0.61764706 0.69444444 0.78787879
|
|
0.67647059 0.71428571 0.8 0.79411765]
|
|
|
|
mean value: 0.7199881834711557
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.07
|
|
|
|
Accuracy on Blind test: 0.43
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.35455632 1.36369705 1.44976234 1.43192887 1.36655641 1.41406369
|
|
1.44773722 1.37284899 1.38293886 1.38959265]
|
|
|
|
mean value: 1.3973682403564454
|
|
|
|
key: score_time
|
|
value: [0.09139943 0.09957314 0.09985614 0.09845757 0.09911156 0.0994699
|
|
0.09767675 0.09422445 0.09873199 0.09957123]
|
|
|
|
mean value: 0.09780721664428711
|
|
|
|
key: test_mcc
|
|
value: [0.96824584 0.93548387 0.96824584 0.90748521 0.93743687 0.96824584
|
|
1. 0.96824584 0.96770777 0.8688172 ]
|
|
|
|
mean value: 0.9489914273704848
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98387097 0.96774194 0.98387097 0.9516129 0.96774194 0.98387097
|
|
1. 0.98387097 0.98360656 0.93442623]
|
|
|
|
mean value: 0.9740613432046537
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98412698 0.96774194 0.98412698 0.95384615 0.96875 0.98360656
|
|
1. 0.98360656 0.98412698 0.93333333]
|
|
|
|
mean value: 0.9743265489798408
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96875 0.96774194 0.96875 0.91176471 0.93939394 1.
|
|
1. 1. 0.96875 0.93333333]
|
|
|
|
mean value: 0.9658483914093496
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.96774194 1. 1. 1. 0.96774194
|
|
1. 0.96774194 1. 0.93333333]
|
|
|
|
mean value: 0.9836559139784946
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98387097 0.96774194 0.98387097 0.9516129 0.96774194 0.98387097
|
|
1. 0.98387097 0.98333333 0.9344086 ]
|
|
|
|
mean value: 0.9740322580645162
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96875 0.9375 0.96875 0.91176471 0.93939394 0.96774194
|
|
1. 0.96774194 0.96875 0.875 ]
|
|
|
|
mean value: 0.9505392516244034
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.16
|
|
|
|
Accuracy on Blind test: 0.35
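Note: the per-fold metrics can be cross-checked against a single confusion matrix. The sketch below assumes that fold 1 of the Random Forest run had TP=31, FP=1, FN=0, TN=30. These counts are inferred from the logged values and are not printed by the script. Under that assumption the standard formulas reproduce the first-fold test_accuracy, test_precision, test_recall, test_fscore, test_jcc and test_mcc values shown above.

```python
from math import sqrt


def fold_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "fscore": 2 * tp / (2 * tp + fp + fn),
        "jcc": tp / (tp + fp + fn),  # Jaccard index of the positive class
        "mcc": (tp * tn - fp * fn) / mcc_denominator,
    }


# ASSUMPTION: counts inferred from the log, not reported by the script.
metrics = fold_metrics(tp=31, fp=1, fn=0, tn=30)

for name, value in metrics.items():
    print(f"{name}: {value:.8f}")
```

With only about 62 samples per test fold, one misclassification moves accuracy by roughly 0.016, which is why the per-fold values take so few distinct values.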
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.87023854 0.93618393 0.92621827 0.95837045 1.00464082 0.93836141
|
|
0.98241544 0.91587925 0.8986578 0.98821497]
|
|
|
|
mean value: 0.9419180870056152
|
|
|
|
key: score_time
|
|
value: [0.23300123 0.2598815 0.26490426 0.22142696 0.22671819 0.23441744
|
|
0.25722957 0.27357078 0.23566699 0.21245193]
|
|
|
|
mean value: 0.2419268846511841
|
|
|
|
key: test_mcc
|
|
value: [0.96824584 0.87278605 0.93743687 0.90748521 0.93743687 0.96824584
|
|
0.96824584 0.96824584 0.93635873 0.83655914]
|
|
|
|
mean value: 0.9301046213212982
|
|
|
|
key: train_mcc
|
|
value: [0.96778244 0.97132357 0.96778244 0.97487691 0.96768225 0.96768225
|
|
0.96778244 0.96778244 0.97137553 0.9784809 ]
|
|
|
|
mean value: 0.9702551166949516
|
|
|
|
key: test_accuracy
|
|
value: [0.98387097 0.93548387 0.96774194 0.9516129 0.96774194 0.98387097
|
|
0.98387097 0.98387097 0.96721311 0.91803279]
|
|
|
|
mean value: 0.9643310417768377
|
|
|
|
key: train_accuracy
|
|
value: [0.98381295 0.98561151 0.98381295 0.98741007 0.98381295 0.98381295
|
|
0.98381295 0.98381295 0.98563734 0.98922801]
|
|
|
|
mean value: 0.9850764630665306
|
|
|
|
key: test_fscore
|
|
value: [0.98412698 0.9375 0.96875 0.95384615 0.96875 0.98360656
|
|
0.98412698 0.98360656 0.96875 0.91803279]
|
|
|
|
mean value: 0.9651096023739466
|
|
|
|
key: train_fscore
|
|
value: [0.98395722 0.98571429 0.98395722 0.98747764 0.98389982 0.98389982
|
|
0.98395722 0.98395722 0.98571429 0.98928571]
|
|
|
|
mean value: 0.985182044357831
|
|
|
|
key: test_precision
|
|
value: [0.96875 0.90909091 0.93939394 0.91176471 0.93939394 1.
|
|
0.96875 1. 0.93939394 0.90322581]
|
|
|
|
mean value: 0.9479763239606693
|
|
|
|
key: train_precision
|
|
value: [0.97526502 0.9787234 0.97526502 0.98220641 0.97864769 0.97864769
|
|
0.97526502 0.97526502 0.9787234 0.98576512]
|
|
|
|
mean value: 0.9783773783096608
|
|
|
|
key: test_recall
|
|
value: [1. 0.96774194 1. 1. 1. 0.96774194
|
|
1. 0.96774194 1. 0.93333333]
|
|
|
|
mean value: 0.9836559139784946
|
|
|
|
key: train_recall
|
|
value: [0.99280576 0.99280576 0.99280576 0.99280576 0.98920863 0.98920863
|
|
0.99280576 0.99280576 0.99280576 0.99283154]
|
|
|
|
mean value: 0.9920889095175472
|
|
|
|
key: test_roc_auc
|
|
value: [0.98387097 0.93548387 0.96774194 0.9516129 0.96774194 0.98387097
|
|
0.98387097 0.98387097 0.96666667 0.91827957]
|
|
|
|
mean value: 0.9643010752688173
|
|
|
|
key: train_roc_auc
|
|
value: [0.98381295 0.98561151 0.98381295 0.98741007 0.98381295 0.98381295
|
|
0.98381295 0.98381295 0.98565019 0.98922153]
|
|
|
|
mean value: 0.9850770996106342
|
|
|
|
key: test_jcc
|
|
value: [0.96875 0.88235294 0.93939394 0.91176471 0.93939394 0.96774194
|
|
0.96875 0.96774194 0.93939394 0.84848485]
|
|
|
|
mean value: 0.9333768184693232
|
|
|
|
key: train_jcc
|
|
value: [0.96842105 0.97183099 0.96842105 0.97526502 0.96830986 0.96830986
|
|
0.96842105 0.96842105 0.97183099 0.97879859]
|
|
|
|
mean value: 0.9708029504907444
|
|
|
|
MCC on Blind test: 0.15
|
|
|
|
Accuracy on Blind test: 0.4
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01861191 0.00832939 0.00832176 0.00824809 0.00866604 0.00814605
|
|
0.00879598 0.00842834 0.00795603 0.00822353]
|
|
|
|
mean value: 0.009372711181640625
|
|
|
|
key: score_time
|
|
value: [0.00951219 0.00824142 0.00890613 0.00865459 0.00877047 0.00828552
|
|
0.00873399 0.00808811 0.00830865 0.00846767]
|
|
|
|
mean value: 0.00859687328338623
|
|
|
|
key: test_mcc
|
|
value: [0.64820372 0.68313005 0.48488114 0.74348441 0.80813523 0.74348441
|
|
0.64820372 0.74193548 0.63978495 0.67204301]
|
|
|
|
mean value: 0.6813286129520032
|
|
|
|
key: train_mcc
|
|
value: [0.69129181 0.69623388 0.69785979 0.69872831 0.69209976 0.70569372
|
|
0.7019886 0.70220704 0.69929441 0.69881448]
|
|
|
|
mean value: 0.698421180066379
|
|
|
|
key: test_accuracy
|
|
value: [0.82258065 0.83870968 0.74193548 0.87096774 0.90322581 0.87096774
|
|
0.82258065 0.87096774 0.81967213 0.83606557]
|
|
|
|
mean value: 0.8397673188789001
|
|
|
|
key: train_accuracy
|
|
value: [0.84532374 0.8471223 0.84892086 0.84892086 0.84532374 0.85251799
|
|
0.85071942 0.85071942 0.8491921 0.8491921 ]
|
|
|
|
mean value: 0.8487952546400941
|
|
|
|
key: test_fscore
|
|
value: [0.81355932 0.84848485 0.75 0.875 0.9 0.875
|
|
0.83076923 0.87096774 0.81967213 0.83333333]
|
|
|
|
mean value: 0.8416786607704336
|
|
|
|
key: train_fscore
|
|
value: [0.84859155 0.85268631 0.84837545 0.85263158 0.85017422 0.8556338
|
|
0.85361552 0.85413005 0.85263158 0.85211268]
|
|
|
|
mean value: 0.8520582734853629
|
|
|
|
key: test_precision
|
|
value: [0.85714286 0.8 0.72727273 0.84848485 0.93103448 0.84848485
|
|
0.79411765 0.87096774 0.83333333 0.83333333]
|
|
|
|
mean value: 0.8344171819804876
|
|
|
|
key: train_precision
|
|
value: [0.83103448 0.82274247 0.85144928 0.83219178 0.82432432 0.83793103
|
|
0.83737024 0.83505155 0.83219178 0.83737024]
|
|
|
|
mean value: 0.8341657184309065
|
|
|
|
key: test_recall
|
|
value: [0.77419355 0.90322581 0.77419355 0.90322581 0.87096774 0.90322581
|
|
0.87096774 0.87096774 0.80645161 0.83333333]
|
|
|
|
mean value: 0.8510752688172043
|
|
|
|
key: train_recall
|
|
value: [0.86690647 0.88489209 0.84532374 0.87410072 0.87769784 0.87410072
|
|
0.8705036 0.87410072 0.87410072 0.86738351]
|
|
|
|
mean value: 0.8709110131249839
|
|
|
|
key: test_roc_auc
|
|
value: [0.82258065 0.83870968 0.74193548 0.87096774 0.90322581 0.87096774
|
|
0.82258065 0.87096774 0.81989247 0.83602151]
|
|
|
|
mean value: 0.8397849462365592
|
|
|
|
key: train_roc_auc
|
|
value: [0.84532374 0.8471223 0.84892086 0.84892086 0.84532374 0.85251799
|
|
0.85071942 0.85071942 0.84923674 0.84915938]
|
|
|
|
mean value: 0.8487964467135968
|
|
|
|
key: test_jcc
|
|
value: [0.68571429 0.73684211 0.6 0.77777778 0.81818182 0.77777778
|
|
0.71052632 0.77142857 0.69444444 0.71428571]
|
|
|
|
mean value: 0.7286978810663021
|
|
|
|
key: train_jcc
|
|
value: [0.73700306 0.74320242 0.73667712 0.74311927 0.73939394 0.74769231
|
|
0.74461538 0.74539877 0.74311927 0.74233129]
|
|
|
|
mean value: 0.7422552816171282
|
|
|
|
MCC on Blind test: 0.19
|
|
|
|
Accuracy on Blind test: 0.49
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
interaction_constraints=None, learning_rate=None,
max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan,
monotone_constraints=None, n_estimators=100,
n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None,
reg_lambda=None, scale_pos_weight=None,
subsample=None, tree_method=None,
use_label_encoder=False,
validate_parameters=None, verbosity=0))])

key: fit_time
value: [0.08428884 0.04923701 0.12757492 0.1029861 0.05474067 0.05481815
0.06141877 0.06270385 0.06345892 0.05934381]

mean value: 0.07205710411071778

key: score_time
value: [0.01002645 0.00963044 0.01171899 0.01000237 0.00956392 0.00953889
0.00953102 0.00952125 0.00951862 0.00952578]

mean value: 0.009857773780822754

key: test_mcc
value: [0.96824584 0.90369611 0.93743687 0.90748521 0.90369611 0.93743687
1. 0.96824584 0.96770777 0.8688172 ]

mean value: 0.9362767824424617

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.98387097 0.9516129 0.96774194 0.9516129 0.9516129 0.96774194
1. 0.98387097 0.98360656 0.93442623]

mean value: 0.9676097303014278

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.98412698 0.95238095 0.96875 0.95384615 0.95238095 0.96666667
1. 0.98360656 0.98412698 0.93333333]

mean value: 0.9679218584239075

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.96875 0.9375 0.93939394 0.91176471 0.9375 1.
1. 1. 0.96875 0.93333333]

mean value: 0.9596991978609626

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 0.96774194 1. 1. 0.96774194 0.93548387
1. 0.96774194 1. 0.93333333]

mean value: 0.9772043010752688

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.98387097 0.9516129 0.96774194 0.9516129 0.9516129 0.96774194
1. 0.98387097 0.98333333 0.9344086 ]

mean value: 0.9675806451612904

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.96875 0.90909091 0.93939394 0.91176471 0.90909091 0.93548387
1. 0.96774194 0.96875 0.875 ]

mean value: 0.9385066269909723

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.1

Accuracy on Blind test: 0.61

Model_name: LDA
Model func: LinearDiscriminantAnalysis()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LinearDiscriminantAnalysis())])

key: fit_time
value: [0.01458907 0.04201937 0.02599144 0.01775765 0.04186487 0.0279336
0.01842332 0.04155302 0.04179025 0.01767874]

mean value: 0.028960132598876955

key: score_time
value: [0.01030087 0.02038527 0.01068902 0.01067185 0.01916838 0.01074195
0.01076746 0.01074457 0.02005053 0.010741 ]

mean value: 0.013426089286804199

key: test_mcc
value: [0.96824584 0.87278605 1. 0.90748521 0.96824584 0.96824584
1. 0.93743687 0.87082935 0.83655914]

mean value: 0.9329834129888399

key: train_mcc
value: [0.95329292 0.9497386 0.95329292 0.96048758 0.94966486 0.94966486
0.93900081 0.95339163 0.95693712 0.96065614]

mean value: 0.9526127442796535

key: test_accuracy
value: [0.98387097 0.93548387 1. 0.9516129 0.98387097 0.98387097
1. 0.96774194 0.93442623 0.91803279]

mean value: 0.9658910629296669

key: train_accuracy
value: [0.97661871 0.97482014 0.97661871 0.98021583 0.97482014 0.97482014
0.96942446 0.97661871 0.97845601 0.98025135]

mean value: 0.9762664195394134

key: test_fscore
value: [0.98360656 0.9375 1. 0.95384615 0.98412698 0.98360656
1. 0.96666667 0.93333333 0.91803279]

mean value: 0.9660719039612482

key: train_fscore
value: [0.97674419 0.975 0.97674419 0.980322 0.97491039 0.97491039
0.96969697 0.97682709 0.97849462 0.98046181]

mean value: 0.9764111663751257

key: test_precision
value: [1. 0.90909091 1. 0.91176471 0.96875 1.
1. 1. 0.96551724 0.90322581]

mean value: 0.9658348662804185

key: train_precision
value: [0.97153025 0.96808511 0.97153025 0.97508897 0.97142857 0.97142857
0.96113074 0.96819788 0.975 0.97183099]

mean value: 0.9705251323255912

key: test_recall
value: [0.96774194 0.96774194 1. 1. 1. 0.96774194
1. 0.93548387 0.90322581 0.93333333]

mean value: 0.9675268817204301

key: train_recall
value: [0.98201439 0.98201439 0.98201439 0.98561151 0.97841727 0.97841727
0.97841727 0.98561151 0.98201439 0.98924731]

mean value: 0.9823779685928676

key: test_roc_auc
value: [0.98387097 0.93548387 1. 0.9516129 0.98387097 0.98387097
1. 0.96774194 0.93494624 0.91827957]

mean value: 0.9659677419354838

key: train_roc_auc
value: [0.97661871 0.97482014 0.97661871 0.98021583 0.97482014 0.97482014
0.96942446 0.97661871 0.97846239 0.98023517]

mean value: 0.976265439261494

key: test_jcc
value: [0.96774194 0.88235294 1. 0.91176471 0.96875 0.96774194
1. 0.93548387 0.875 0.84848485]

mean value: 0.9357320237479156

key: train_jcc
value: [0.95454545 0.95121951 0.95454545 0.96140351 0.95104895 0.95104895
0.94117647 0.95470383 0.95789474 0.96167247]

mean value: 0.9539259346206412

MCC on Blind test: 0.17

Accuracy on Blind test: 0.38

Model_name: Multinomial
Model func: MultinomialNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MultinomialNB())])

key: fit_time
value: [0.02236176 0.00778937 0.00771594 0.00752807 0.0074892 0.00744605
0.00749993 0.007586 0.00749612 0.00748873]

mean value: 0.009040117263793945

key: score_time
value: [0.01843238 0.00818586 0.00802255 0.00780058 0.00774455 0.00785375
0.00774026 0.00784397 0.00779438 0.00780678]

mean value: 0.008922505378723144

key: test_mcc
value: [0.77459667 0.65372045 0.55301004 0.74819006 0.74819006 0.7190925
0.58338335 0.77459667 0.57576971 0.81062315]

mean value: 0.6941172654572817

key: train_mcc
value: [0.70194087 0.71536572 0.73033254 0.70140848 0.70140848 0.70194087
0.72031981 0.70528679 0.72419371 0.70094494]

mean value: 0.7103142230314713

key: test_accuracy
value: [0.88709677 0.82258065 0.77419355 0.87096774 0.87096774 0.85483871
0.79032258 0.88709677 0.78688525 0.90163934]

mean value: 0.8446589106292967

key: train_accuracy
value: [0.84892086 0.85611511 0.86330935 0.84892086 0.84892086 0.84892086
0.85791367 0.85071942 0.85996409 0.8491921 ]

mean value: 0.8532897201090115

key: test_fscore
value: [0.8852459 0.8358209 0.78787879 0.87878788 0.87878788 0.86567164
0.8 0.88888889 0.8 0.90625 ]

mean value: 0.8527331873296211

key: train_fscore
value: [0.85665529 0.86254296 0.86986301 0.85616438 0.85616438 0.85665529
0.86541738 0.85811966 0.8668942 0.8556701 ]

mean value: 0.8604146652008448

key: test_precision
value: [0.9 0.77777778 0.74285714 0.82857143 0.82857143 0.80555556
0.76470588 0.875 0.76470588 0.85294118]

mean value: 0.8140686274509804

key: train_precision
value: [0.81493506 0.82565789 0.83006536 0.81699346 0.81699346 0.81493506
0.82200647 0.81758958 0.82467532 0.82178218]

mean value: 0.8205633864120958

key: test_recall
value: [0.87096774 0.90322581 0.83870968 0.93548387 0.93548387 0.93548387
0.83870968 0.90322581 0.83870968 0.96666667]

mean value: 0.8966666666666666

key: train_recall
value: [0.9028777 0.9028777 0.91366906 0.89928058 0.89928058 0.9028777
0.91366906 0.9028777 0.91366906 0.89247312]

mean value: 0.9043552254970217

key: test_roc_auc
value: [0.88709677 0.82258065 0.77419355 0.87096774 0.87096774 0.85483871
0.79032258 0.88709677 0.78602151 0.90268817]

mean value: 0.8446774193548388

key: train_roc_auc
value: [0.84892086 0.85611511 0.86330935 0.84892086 0.84892086 0.84892086
0.85791367 0.85071942 0.86006034 0.84911426]

mean value: 0.853291560300147

key: test_jcc
value: [0.79411765 0.71794872 0.65 0.78378378 0.78378378 0.76315789
0.66666667 0.8 0.66666667 0.82857143]

mean value: 0.7454696589216713

key: train_jcc
value: [0.74925373 0.75830816 0.76969697 0.74850299 0.74850299 0.74925373
0.76276276 0.75149701 0.76506024 0.74774775]

mean value: 0.7550586334969577

MCC on Blind test: 0.2

Accuracy on Blind test: 0.48

Model_name: Passive Aggresive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01076055 0.01259518 0.01484299 0.0152657 0.01288152 0.01513839
0.01497507 0.01241827 0.01552725 0.01354051]

mean value: 0.013794541358947754

key: score_time
value: [0.00853276 0.01013088 0.01017213 0.01044273 0.01037955 0.01046228
0.01040554 0.01038742 0.01037264 0.01043701]

mean value: 0.010172295570373534

key: test_mcc
value: [0.93743687 0.81325006 0.84983659 0.87831007 0.93548387 0.96824584
0.93743687 0.90748521 0.87082935 0.70997538]

mean value: 0.8808290098706804

key: train_mcc
value: [0.89396219 0.81804143 0.8410572 0.96058703 0.93914669 0.95329292
0.9354697 0.94266562 0.95337563 0.78144333]

mean value: 0.9019041746413544

key: test_accuracy
value: [0.96774194 0.90322581 0.91935484 0.93548387 0.96774194 0.98387097
0.96774194 0.9516129 0.93442623 0.83606557]

mean value: 0.9367265996827076

key: train_accuracy
value: [0.94604317 0.9028777 0.91546763 0.98021583 0.96942446 0.97661871
0.9676259 0.97122302 0.97666068 0.88150808]

mean value: 0.9487665164098523

key: test_fscore
value: [0.96666667 0.89655172 0.92537313 0.93939394 0.96774194 0.98360656
0.96666667 0.94915254 0.93333333 0.8 ]

mean value: 0.9328486499760696

key: train_fscore
value: [0.94423792 0.89370079 0.92153589 0.98039216 0.96903461 0.97674419
0.96727273 0.97153025 0.97649186 0.86746988]

mean value: 0.9468410268529506

key: test_precision
value: [1. 0.96296296 0.86111111 0.88571429 0.96774194 1.
1. 1. 0.96551724 1. ]

mean value: 0.9643047536651541

key: train_precision
value: [0.97692308 0.98695652 0.85981308 0.97173145 0.98154982 0.97153025
0.97794118 0.96126761 0.98181818 0.98630137]

mean value: 0.9655832529931669

key: test_recall
value: [0.93548387 0.83870968 1. 1. 0.96774194 0.96774194
0.93548387 0.90322581 0.90322581 0.66666667]

mean value: 0.9118279569892473

key: train_recall
value: [0.91366906 0.81654676 0.99280576 0.98920863 0.95683453 0.98201439
0.95683453 0.98201439 0.97122302 0.77419355]

mean value: 0.9335344627523787

key: test_roc_auc
value: [0.96774194 0.90322581 0.91935484 0.93548387 0.96774194 0.98387097
0.96774194 0.9516129 0.93494624 0.83333333]

mean value: 0.936505376344086

key: train_roc_auc
value: [0.94604317 0.9028777 0.91546763 0.98021583 0.96942446 0.97661871
0.9676259 0.97122302 0.97665094 0.88170109]

mean value: 0.9487848430932674

key: test_jcc
value: [0.93548387 0.8125 0.86111111 0.88571429 0.9375 0.96774194
0.93548387 0.90322581 0.875 0.66666667]

mean value: 0.8780427547363031

key: train_jcc
value: [0.8943662 0.80782918 0.85448916 0.96153846 0.93992933 0.95454545
0.93661972 0.94463668 0.9540636 0.76595745]

mean value: 0.9013975235029617

MCC on Blind test: 0.13

Accuracy on Blind test: 0.31

Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.0138278 0.01442528 0.01401639 0.01388168 0.01454473 0.0136025
0.0134716 0.01331687 0.01288104 0.01246238]

mean value: 0.013643026351928711

key: score_time
value: [0.01061082 0.01157665 0.01069307 0.01073432 0.01059294 0.01056838
0.01065302 0.01038933 0.01044464 0.01039839]

mean value: 0.010666155815124511

key: test_mcc
value: [0.93743687 0.87278605 0.93743687 0.90748521 0.90369611 0.87831007
1. 0.78446454 0.72318666 0.50305191]

mean value: 0.8447854282682599

key: train_mcc
value: [0.90882979 0.95705746 0.95025527 0.94305636 0.92239227 0.89154571
0.94604929 0.77463214 0.83507476 0.45405525]

mean value: 0.858294830889454

key: test_accuracy
value: [0.96774194 0.93548387 0.96774194 0.9516129 0.9516129 0.93548387
1. 0.88709677 0.85245902 0.70491803]

mean value: 0.9154151242728715

key: train_accuracy
value: [0.95323741 0.97841727 0.97482014 0.97122302 0.96043165 0.9442446
0.97302158 0.87589928 0.91202873 0.67504488]

mean value: 0.9218368572646372

key: test_fscore
value: [0.96666667 0.9375 0.96875 0.95384615 0.95081967 0.93103448
1. 0.89552239 0.86956522 0.57142857]

mean value: 0.9045133152282165

key: train_fscore
value: [0.95149254 0.97864769 0.97526502 0.97069597 0.95925926 0.94183865
0.97307002 0.88924559 0.91846922 0.52493438]

mean value: 0.908291832592524

key: test_precision
value: [1. 0.90909091 0.93939394 0.91176471 0.96666667 1.
1. 0.83333333 0.78947368 1. ]

mean value: 0.9349723238577727

key: train_precision
value: [0.98837209 0.96830986 0.95833333 0.98880597 0.98854962 0.98431373
0.97132616 0.80289855 0.85448916 0.98039216]

mean value: 0.9485790636020202

key: test_recall
value: [0.93548387 0.96774194 1. 1. 0.93548387 0.87096774
1. 0.96774194 0.96774194 0.4 ]

mean value: 0.9045161290322581

key: train_recall
value: [0.91726619 0.98920863 0.99280576 0.95323741 0.93165468 0.9028777
0.97482014 0.99640288 0.99280576 0.35842294]

mean value: 0.9009502075758747

key: test_roc_auc
value: [0.96774194 0.93548387 0.96774194 0.9516129 0.9516129 0.93548387
1. 0.88709677 0.85053763 0.7 ]

mean value: 0.914731182795699

key: train_roc_auc
value: [0.95323741 0.97841727 0.97482014 0.97122302 0.96043165 0.9442446
0.97302158 0.87589928 0.91217349 0.67561435]

mean value: 0.9219082798277507

key: test_jcc
value: [0.93548387 0.88235294 0.93939394 0.91176471 0.90625 0.87096774
1. 0.81081081 0.76923077 0.4 ]

mean value: 0.8426254779397568

key: train_jcc
value: [0.90747331 0.95818815 0.95172414 0.9430605 0.92170819 0.89007092
0.94755245 0.80057803 0.84923077 0.35587189]

mean value: 0.8525458343695811

MCC on Blind test: 0.14

Accuracy on Blind test: 0.32

Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', AdaBoostClassifier(random_state=42))])

key: fit_time
value: [0.11406326 0.10405397 0.10169864 0.10252857 0.09923482 0.10144997
0.09957933 0.10238481 0.10498977 0.10294104]

mean value: 0.10329241752624511

key: score_time
value: [0.01416016 0.01535344 0.01559019 0.01440263 0.01463914 0.01422262
0.01545978 0.01572537 0.01503325 0.0141983 ]

mean value: 0.014878487586975098

key: test_mcc
value: [0.96824584 0.93548387 0.96824584 0.90748521 0.90748521 0.93743687
1. 0.90369611 1. 0.8688172 ]

mean value: 0.9396896154994742

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.98387097 0.96774194 0.98387097 0.9516129 0.9516129 0.96774194
1. 0.9516129 1. 0.93442623]

mean value: 0.9692490745637229

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.98412698 0.96774194 0.98360656 0.95384615 0.95384615 0.96666667
1. 0.95081967 1. 0.93333333]

mean value: 0.9693987456811359

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.96875 0.96774194 1. 0.91176471 0.91176471 1.
1. 0.96666667 1. 0.93333333]

mean value: 0.9660021347248577

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 0.96774194 0.96774194 1. 1. 0.93548387
1. 0.93548387 1. 0.93333333]

mean value: 0.9739784946236559

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.98387097 0.96774194 0.98387097 0.9516129 0.9516129 0.96774194
1. 0.9516129 1. 0.9344086 ]

mean value: 0.969247311827957

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.96875 0.9375 0.96774194 0.91176471 0.91176471 0.93548387
1. 0.90625 1. 0.875 ]

mean value: 0.9414255218216319

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.09

Accuracy on Blind test: 0.31

Model_name: Bagging Classifier
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline:
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03851056 0.03913617 0.03798318 0.04739237 0.0397296 0.04984927
|
|
0.04227161 0.05067539 0.05400753 0.04927731]
|
|
|
|
mean value: 0.04488329887390137
|
|
|
|
key: score_time
|
|
value: [0.02179551 0.02289391 0.02226377 0.01712132 0.03155065 0.0246129
|
|
0.03463507 0.02148271 0.02362227 0.01659489]
|
|
|
|
mean value: 0.02365729808807373
|
|
|
|
key: test_mcc
|
|
value: [1. 0.90369611 1. 0.93743687 0.87096774 0.90748521
|
|
0.83914639 0.96824584 0.93635873 0.90204573]
|
|
|
|
mean value: 0.9265382629263172
|
|
|
|
key: train_mcc
|
|
value: [0.99640932 0.99640932 0.99280576 0.99640932 0.98563702 0.99280576
|
|
0.99640932 0.99640932 0.98923442 0.99284434]
|
|
|
|
mean value: 0.9935373910332435
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.9516129 1. 0.96774194 0.93548387 0.9516129
|
|
0.91935484 0.98387097 0.96721311 0.95081967]
|
|
|
|
mean value: 0.9627710206240084
|
|
|
|
key: train_accuracy
|
|
value: [0.99820144 0.99820144 0.99640288 0.99820144 0.99280576 0.99640288
|
|
0.99820144 0.99820144 0.994614 0.99640934]
|
|
|
|
mean value: 0.9967642044353745
|
|
|
|
key: test_fscore
|
|
value: [1. 0.95081967 1. 0.96875 0.93548387 0.94915254
|
|
0.91803279 0.98360656 0.96875 0.94915254]
|
|
|
|
mean value: 0.9623747972106947
|
|
|
|
key: train_fscore
|
|
value: [0.9981982 0.9981982 0.99640288 0.9981982 0.99277978 0.99640288
|
|
0.99820467 0.9981982 0.994614 0.99640288]
|
|
|
|
mean value: 0.9967599880734039
|
|
|
|
key: test_precision
|
|
value: [1. 0.96666667 1. 0.93939394 0.93548387 1.
|
|
0.93333333 1. 0.93939394 0.96551724]
|
|
|
|
mean value: 0.9679788991134931
|
|
|
|
key: train_precision
|
|
value: [1. 1. 0.99640288 1. 0.99637681 0.99640288
|
|
0.99641577 1. 0.99283154 1. ]
|
|
|
|
mean value: 0.9978429878817844
|
|
|
|
key: test_recall
|
|
value: [1. 0.93548387 1. 1. 0.93548387 0.90322581
|
|
0.90322581 0.96774194 1. 0.93333333]
|
|
|
|
mean value: 0.9578494623655914
|
|
|
|
key: train_recall
|
|
value: [0.99640288 0.99640288 0.99640288 0.99640288 0.98920863 0.99640288
|
|
1. 0.99640288 0.99640288 0.99283154]
|
|
|
|
mean value: 0.9956860318197055
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.9516129 1. 0.96774194 0.93548387 0.9516129
|
|
0.91935484 0.98387097 0.96666667 0.95053763]
|
|
|
|
mean value: 0.9626881720430108
|
|
|
|
key: train_roc_auc
|
|
value: [0.99820144 0.99820144 0.99640288 0.99820144 0.99280576 0.99640288
|
|
0.99820144 0.99820144 0.99461721 0.99641577]
|
|
|
|
mean value: 0.996765168510353
|
|
|
|
key: test_jcc
|
|
value: [1. 0.90625 1. 0.93939394 0.87878788 0.90322581
|
|
0.84848485 0.96774194 0.93939394 0.90322581]
|
|
|
|
mean value: 0.9286504154447703
|
|
|
|
key: train_jcc
|
|
value: [0.99640288 0.99640288 0.99283154 0.99640288 0.98566308 0.99283154
|
|
0.99641577 0.99640288 0.98928571 0.99283154]
|
|
|
|
mean value: 0.9935470701779591
|
|
|
|
MCC on Blind test: 0.07
|
|
|
|
Accuracy on Blind test: 0.58
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.12759042 0.22521901 0.21887374 0.2211132 0.17997479 0.20122313
|
|
0.19672465 0.20488429 0.276335 0.25733685]
|
|
|
|
mean value: 0.21092751026153564
|
|
|
|
key: score_time
|
|
value: [0.01269174 0.02497721 0.02092695 0.02029276 0.01257658 0.0126636
|
|
0.01265192 0.02021074 0.02772164 0.02012014]
|
|
|
|
mean value: 0.0184833288192749
|
|
|
|
key: test_mcc
|
|
value: [0.90748521 0.61807005 0.7130241 0.80813523 0.77784447 0.77459667
|
|
0.61807005 0.80645161 0.57576971 0.70780713]
|
|
|
|
mean value: 0.7307254226729265
|
|
|
|
key: train_mcc
|
|
value: [0.87086426 0.86386843 0.84312418 0.83904739 0.85318614 0.85376169
|
|
0.85720277 0.84009387 0.86412027 0.86022912]
|
|
|
|
mean value: 0.8545498119930119
|
|
|
|
key: test_accuracy
|
|
value: [0.9516129 0.80645161 0.85483871 0.90322581 0.88709677 0.88709677
|
|
0.80645161 0.90322581 0.78688525 0.85245902]
|
|
|
|
mean value: 0.8639344262295082
|
|
|
|
key: train_accuracy
|
|
value: [0.9352518 0.93165468 0.92086331 0.91906475 0.92625899 0.92625899
|
|
0.92805755 0.91906475 0.93177738 0.92998205]
|
|
|
|
mean value: 0.9268234245637601
|
|
|
|
key: test_fscore
|
|
value: [0.94915254 0.81818182 0.86153846 0.90625 0.89230769 0.88888889
|
|
0.81818182 0.90322581 0.8 0.84210526]
|
|
|
|
mean value: 0.8679832291081068
|
|
|
|
key: train_fscore
|
|
value: [0.93617021 0.93286219 0.92307692 0.92091388 0.92768959 0.92819615
|
|
0.92982456 0.92173913 0.93286219 0.93097345]
|
|
|
|
mean value: 0.9284308286107671
|
|
|
|
key: test_precision
|
|
value: [1. 0.77142857 0.82352941 0.87878788 0.85294118 0.875
|
|
0.77142857 0.90322581 0.76470588 0.88888889]
|
|
|
|
mean value: 0.8529936187573759
|
|
|
|
key: train_precision
|
|
value: [0.92307692 0.91666667 0.89795918 0.90034364 0.9100346 0.90443686
|
|
0.90753425 0.89225589 0.91666667 0.91958042]
|
|
|
|
mean value: 0.9088555103251448
|
|
|
|
key: test_recall
|
|
value: [0.90322581 0.87096774 0.90322581 0.93548387 0.93548387 0.90322581
|
|
0.87096774 0.90322581 0.83870968 0.8 ]
|
|
|
|
mean value: 0.8864516129032258
|
|
|
|
key: train_recall
|
|
value: [0.94964029 0.94964029 0.94964029 0.94244604 0.94604317 0.95323741
|
|
0.95323741 0.95323741 0.94964029 0.94265233]
|
|
|
|
mean value: 0.9489414919677162
|
|
|
|
key: test_roc_auc
|
|
value: [0.9516129 0.80645161 0.85483871 0.90322581 0.88709677 0.88709677
|
|
0.80645161 0.90322581 0.78602151 0.8516129 ]
|
|
|
|
mean value: 0.8637634408602151
|
|
|
|
key: train_roc_auc
|
|
value: [0.9352518 0.93165468 0.92086331 0.91906475 0.92625899 0.92625899
|
|
0.92805755 0.91906475 0.93180939 0.92995926]
|
|
|
|
mean value: 0.9268243469740336
|
|
|
|
key: test_jcc
|
|
value: [0.90322581 0.69230769 0.75675676 0.82857143 0.80555556 0.8
|
|
0.69230769 0.82352941 0.66666667 0.72727273]
|
|
|
|
mean value: 0.7696193737654838
|
|
|
|
key: train_jcc
|
|
value: [0.88 0.87417219 0.85714286 0.8534202 0.86513158 0.86601307
|
|
0.86885246 0.85483871 0.87417219 0.87086093]
|
|
|
|
mean value: 0.8664604170132448
|
|
|
|
MCC on Blind test: 0.22
|
|
|
|
Accuracy on Blind test: 0.49
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.26798725 0.26752782 0.26489639 0.26649332 0.25984526 0.26162148
|
|
0.2612102 0.26817322 0.26268578 0.26635146]
|
|
|
|
mean value: 0.26467921733856203
|
|
|
|
key: score_time
|
|
value: [0.00845337 0.00842595 0.00839472 0.0083878 0.00851393 0.00833416
|
|
0.00913382 0.00835061 0.00875974 0.00896358]
|
|
|
|
mean value: 0.008571767807006836
|
|
|
|
key: test_mcc
|
|
value: [1. 0.90369611 1. 0.93743687 0.93743687 0.90748521
|
|
0.96824584 0.96824584 0.96770777 0.8688172 ]
|
|
|
|
mean value: 0.9459071710309553
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.9516129 1. 0.96774194 0.96774194 0.9516129
|
|
0.98387097 0.98387097 0.98360656 0.93442623]
|
|
|
|
mean value: 0.9724484399788472
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.95081967 1. 0.96875 0.96875 0.94915254
|
|
0.98412698 0.98360656 0.98412698 0.93333333]
|
|
|
|
mean value: 0.972266607346838
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.96666667 1. 0.93939394 0.93939394 1.
|
|
0.96875 1. 0.96875 0.93333333]
|
|
|
|
mean value: 0.9716287878787879
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.93548387 1. 1. 1. 0.90322581
|
|
1. 0.96774194 1. 0.93333333]
|
|
|
|
mean value: 0.9739784946236559
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.9516129 1. 0.96774194 0.96774194 0.9516129
|
|
0.98387097 0.98387097 0.98333333 0.9344086 ]
|
|
|
|
mean value: 0.9724193548387097
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.90625 1. 0.93939394 0.93939394 0.90322581
|
|
0.96875 0.96774194 0.96875 0.875 ]
|
|
|
|
mean value: 0.9468505620723363
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.14
|
|
|
|
Accuracy on Blind test: 0.63
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
List of models:
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.01149559 0.01360273 0.01408195 0.01396298 0.0143764 0.01411986
|
|
0.01366615 0.01368761 0.01428699 0.01439714]
|
|
|
|
mean value: 0.013767743110656738
|
|
|
|
key: score_time
|
|
value: [0.01090598 0.01095629 0.01090288 0.01166439 0.01109648 0.01160717
|
|
0.01097107 0.01158404 0.01094365 0.01160645]
|
|
|
|
mean value: 0.011223840713500976
|
|
|
|
key: test_mcc
|
|
value: [0.3799803 0.51119863 0.54006172 0.74161985 0.56853524 0.56493268
|
|
0.50083542 0.43852901 0.72318666 0.76533557]
|
|
|
|
mean value: 0.5734215093600435
|
|
|
|
key: train_mcc
|
|
value: [0.4932785 0.76196204 0.69278522 0.72409686 0.56120987 0.54686874
|
|
0.76885315 0.49611447 0.76738608 0.73356387]
|
|
|
|
mean value: 0.6546118797369623
|
|
|
|
key: test_accuracy
|
|
value: [0.64516129 0.74193548 0.72580645 0.85483871 0.75806452 0.74193548
|
|
0.74193548 0.66129032 0.85245902 0.86885246]
|
|
|
|
mean value: 0.759227921734532
|
|
|
|
key: train_accuracy
|
|
value: [0.69784173 0.87230216 0.82553957 0.8471223 0.75359712 0.73021583
|
|
0.87410072 0.69964029 0.87791741 0.85098743]
|
|
|
|
mean value: 0.8029264559626984
|
|
|
|
key: test_fscore
|
|
value: [0.73170732 0.77777778 0.78481013 0.87323944 0.69387755 0.79487179
|
|
0.77142857 0.74698795 0.86956522 0.88235294]
|
|
|
|
mean value: 0.7926618685748723
|
|
|
|
key: train_fscore
|
|
value: [0.76731302 0.88455285 0.85099846 0.86614173 0.68649886 0.78753541
|
|
0.88709677 0.76837725 0.88741722 0.87010955]
|
|
|
|
mean value: 0.8256041120420929
|
|
|
|
key: test_precision
|
|
value: [0.58823529 0.68292683 0.64583333 0.775 0.94444444 0.65957447
|
|
0.69230769 0.59615385 0.78947368 0.78947368]
|
|
|
|
mean value: 0.7163423276131415
|
|
|
|
key: train_precision
|
|
value: [0.62387387 0.80712166 0.74262735 0.77030812 0.94339623 0.64953271
|
|
0.80409357 0.62528217 0.82208589 0.77222222]
|
|
|
|
mean value: 0.756054378747134
|
|
|
|
key: test_recall
|
|
value: [0.96774194 0.90322581 1. 1. 0.5483871 1.
|
|
0.87096774 1. 0.96774194 1. ]
|
|
|
|
mean value: 0.9258064516129032
|
|
|
|
key: train_recall
|
|
value: [0.99640288 0.97841727 0.99640288 0.98920863 0.53956835 1.
|
|
0.98920863 0.99640288 0.96402878 0.99641577]
|
|
|
|
mean value: 0.9446056058379103
|
|
|
|
key: test_roc_auc
|
|
value: [0.64516129 0.74193548 0.72580645 0.85483871 0.75806452 0.74193548
|
|
0.74193548 0.66129032 0.85053763 0.87096774]
|
|
|
|
mean value: 0.759247311827957
|
|
|
|
key: train_roc_auc
|
|
value: [0.69784173 0.87230216 0.82553957 0.8471223 0.75359712 0.73021583
|
|
0.87410072 0.69964029 0.87807174 0.85072587]
|
|
|
|
mean value: 0.8029157319305846
|
|
|
|
key: test_jcc
|
|
value: [0.57692308 0.63636364 0.64583333 0.775 0.53125 0.65957447
|
|
0.62790698 0.59615385 0.76923077 0.78947368]
|
|
|
|
mean value: 0.660770979104448
|
|
|
|
key: train_jcc
|
|
value: [0.62247191 0.79300292 0.74064171 0.76388889 0.52264808 0.64953271
|
|
0.79710145 0.62387387 0.79761905 0.7700831 ]
|
|
|
|
mean value: 0.7080863692848516
|
|
|
|
MCC on Blind test: 0.18
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02095389 0.03017879 0.03077292 0.03034782 0.03044152 0.03048611
|
|
0.03019238 0.03031898 0.0302968 0.03035188]
|
|
|
|
mean value: 0.02943410873413086
|
|
|
|
key: score_time
|
|
value: [0.0190351 0.02024627 0.02113628 0.01070428 0.01898575 0.01937699
|
|
0.02058935 0.01084757 0.0107224 0.01985407]
|
|
|
|
mean value: 0.017149806022644043
|
|
|
|
key: test_mcc
|
|
value: [0.96824584 0.81325006 0.83914639 0.87831007 0.96824584 0.93548387
|
|
0.90369611 0.93743687 0.80516731 0.8688172 ]
|
|
|
|
mean value: 0.8917799559713326
|
|
|
|
key: train_mcc
|
|
value: [0.93900081 0.93890359 0.91007783 0.9352518 0.92088714 0.92808157
|
|
0.91007783 0.92805755 0.92820949 0.93182991]
|
|
|
|
mean value: 0.9270377524969889
|
|
|
|
key: test_accuracy
|
|
value: [0.98387097 0.90322581 0.91935484 0.93548387 0.98387097 0.96774194
|
|
0.9516129 0.96774194 0.90163934 0.93442623]
|
|
|
|
mean value: 0.9448968799576943
|
|
|
|
key: train_accuracy
|
|
value: [0.96942446 0.96942446 0.95503597 0.9676259 0.96043165 0.96402878
|
|
0.95503597 0.96402878 0.96409336 0.96588869]
|
|
|
|
mean value: 0.9635018017901656
|
|
|
|
key: test_fscore
|
|
value: [0.98412698 0.90909091 0.92063492 0.93939394 0.98412698 0.96774194
|
|
0.95081967 0.96666667 0.9 0.93333333]
|
|
|
|
mean value: 0.9455935344988755
|
|
|
|
key: train_fscore
|
|
value: [0.96969697 0.96958855 0.95495495 0.9676259 0.96057348 0.96415771
|
|
0.95495495 0.96402878 0.96389892 0.96613191]
|
|
|
|
mean value: 0.9635612113921358
|
|
|
|
key: test_precision
|
|
value: [0.96875 0.85714286 0.90625 0.88571429 0.96875 0.96774194
|
|
0.96666667 1. 0.93103448 0.93333333]
|
|
|
|
mean value: 0.9385383561099634
|
|
|
|
key: train_precision
|
|
value: [0.96113074 0.96441281 0.9566787 0.9676259 0.95714286 0.96071429
|
|
0.9566787 0.96402878 0.9673913 0.96099291]
|
|
|
|
mean value: 0.9616796985424773
|
|
|
|
key: test_recall
|
|
value: [1. 0.96774194 0.93548387 1. 1. 0.96774194
|
|
0.93548387 0.93548387 0.87096774 0.93333333]
|
|
|
|
mean value: 0.9546236559139785
|
|
|
|
key: train_recall
|
|
value: [0.97841727 0.97482014 0.95323741 0.9676259 0.96402878 0.9676259
|
|
0.95323741 0.96402878 0.96043165 0.97132616]
|
|
|
|
mean value: 0.9654779402284623
|
|
|
|
key: test_roc_auc
|
|
value: [0.98387097 0.90322581 0.91935484 0.93548387 0.98387097 0.96774194
|
|
0.9516129 0.96774194 0.90215054 0.9344086 ]
|
|
|
|
mean value: 0.9449462365591399
|
|
|
|
key: train_roc_auc
|
|
value: [0.96942446 0.96942446 0.95503597 0.9676259 0.96043165 0.96402878
|
|
0.95503597 0.96402878 0.9640868 0.96587891]
|
|
|
|
mean value: 0.9635001676078492
|
|
|
|
key: test_jcc
|
|
value: [0.96875 0.83333333 0.85294118 0.88571429 0.96875 0.9375
|
|
0.90625 0.93548387 0.81818182 0.875 ]
|
|
|
|
mean value: 0.8981904484667768
|
|
|
|
key: train_jcc
|
|
value: [0.94117647 0.94097222 0.9137931 0.93728223 0.92413793 0.93079585
|
|
0.9137931 0.93055556 0.93031359 0.93448276]
|
|
|
|
mean value: 0.9297302811483933
|
|
|
|
MCC on Blind test: 0.23
|
|
|
|
Accuracy on Blind test: 0.47
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_config.py:143: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
smnc_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_config.py:146: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
smnc_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.20507646 0.19756031 0.19703841 0.19701099 0.20378065 0.19817948
|
|
0.19686937 0.19685602 0.19804025 0.20452499]
|
|
|
|
mean value: 0.1994936943054199
|
|
|
|
key: score_time
|
|
value: [0.01948881 0.02093601 0.01906753 0.02151203 0.02097845 0.01082182
|
|
0.02040362 0.01091933 0.02004528 0.01085353]
|
|
|
|
mean value: 0.017502641677856444
|
|
|
|
key: test_mcc
|
|
value: [0.96824584 0.84266484 0.90369611 0.90748521 0.96824584 0.96824584
|
|
0.96824584 0.93743687 0.87082935 0.83655914]
|
|
|
|
mean value: 0.9171654872995563
|
|
|
|
key: train_mcc
|
|
value: [0.94254361 0.94619622 0.94609826 0.94966486 0.94609826 0.94966486
|
|
0.93890359 0.95339163 0.95691189 0.9534734 ]
|
|
|
|
mean value: 0.948294657254694
|
|
|
|
key: test_accuracy
|
|
value: [0.98387097 0.91935484 0.9516129 0.9516129 0.98387097 0.98387097
|
|
0.98387097 0.96774194 0.93442623 0.91803279]
|
|
|
|
mean value: 0.9578265468006346
|
|
|
|
key: train_accuracy
|
|
value: [0.97122302 0.97302158 0.97302158 0.97482014 0.97302158 0.97482014
|
|
0.96942446 0.97661871 0.97845601 0.97666068]
|
|
|
|
mean value: 0.9741087919610452
|
|
|
|
key: test_fscore
|
|
value: [0.98412698 0.92307692 0.95238095 0.95384615 0.98412698 0.98360656
|
|
0.98412698 0.96666667 0.93333333 0.91803279]
|
|
|
|
mean value: 0.9583324325947277
|
|
|
|
key: train_fscore
|
|
value: [0.97142857 0.97326203 0.97316637 0.97491039 0.97316637 0.97491039
|
|
0.96958855 0.97682709 0.97841727 0.97690941]
|
|
|
|
mean value: 0.9742586454574466
|
|
|
|
key: test_precision
|
|
value: [0.96875 0.88235294 0.9375 0.91176471 0.96875 1.
|
|
0.96875 1. 0.96551724 0.90322581]
|
|
|
|
mean value: 0.9506610694889747
|
|
|
|
key: train_precision
|
|
value: [0.96453901 0.96466431 0.96797153 0.97142857 0.96797153 0.97142857
|
|
0.96441281 0.96819788 0.97841727 0.96830986]
|
|
|
|
mean value: 0.9687341337990163
|
|
|
|
key: test_recall
|
|
value: [1. 0.96774194 0.96774194 1. 1. 0.96774194
|
|
1. 0.93548387 0.90322581 0.93333333]
|
|
|
|
mean value: 0.9675268817204301
|
|
|
|
key: train_recall
|
|
value: [0.97841727 0.98201439 0.97841727 0.97841727 0.97841727 0.97841727
|
|
0.97482014 0.98561151 0.97841727 0.98566308]
|
|
|
|
mean value: 0.9798612722725046
|
|
|
|
key: test_roc_auc
|
|
value: [0.98387097 0.91935484 0.9516129 0.9516129 0.98387097 0.98387097
|
|
0.98387097 0.96774194 0.93494624 0.91827957]
|
|
|
|
mean value: 0.9579032258064516
|
|
|
|
key: train_roc_auc
|
|
value: [0.97122302 0.97302158 0.97302158 0.97482014 0.97302158 0.97482014
|
|
0.96942446 0.97661871 0.97845594 0.97664449]
|
|
|
|
mean value: 0.9741071658801991
|
|
|
|
key: test_jcc
|
|
value: [0.96875 0.85714286 0.90909091 0.91176471 0.96875 0.96774194
|
|
0.96875 0.93548387 0.875 0.84848485]
|
|
|
|
mean value: 0.921095912705258
|
|
|
|
key: train_jcc
|
|
value: [0.94444444 0.94791667 0.94773519 0.95104895 0.94773519 0.95104895
|
|
0.94097222 0.95470383 0.95774648 0.95486111]
|
|
|
|
mean value: 0.9498213041443461
|
|
|
|
MCC on Blind test: 0.2
|
|
|
|
Accuracy on Blind test: 0.44
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.04743552 0.02349114 0.02685213 0.02750683 0.0252192 0.0279355
|
|
0.03556037 0.04037642 0.03911209 0.03533268]
|
|
|
|
mean value: 0.032882189750671385
|
|
|
|
key: score_time
|
|
value: [0.01078486 0.01099777 0.01306605 0.01077437 0.01067662 0.01066399
|
|
0.01071048 0.01073122 0.01087689 0.01084948]
|
|
|
|
mean value: 0.011013174057006836
|
|
|
|
key: test_mcc
|
|
value: [0.96824584 0.7130241 0.83914639 0.90748521 0.79471941 0.93548387
|
|
0.71004695 0.80813523 0.77096774 0.87082935]
|
|
|
|
mean value: 0.8318084093587729
|
|
|
|
key: train_mcc
|
|
value: [0.87424213 0.85278837 0.83904739 0.84537297 0.85265591 0.84192273
|
|
0.83904739 0.85646981 0.84627216 0.84586123]
|
|
|
|
mean value: 0.8493680080976538
|
|
|
|
key: test_accuracy
|
|
value: [0.98387097 0.85483871 0.91935484 0.9516129 0.88709677 0.96774194
|
|
0.85483871 0.90322581 0.8852459 0.93442623]
|
|
|
|
mean value: 0.9142252776308831
|
|
|
|
key: train_accuracy
|
|
value: [0.93705036 0.92625899 0.91906475 0.92266187 0.92625899 0.92086331
|
|
0.91906475 0.92805755 0.92280072 0.92280072]
|
|
|
|
mean value: 0.9244882011805278
|
|
|
|
key: test_fscore
|
|
value: [0.98412698 0.86153846 0.92063492 0.95384615 0.89855072 0.96774194
|
|
0.85714286 0.9 0.8852459 0.93548387]
|
|
|
|
mean value: 0.9164311810018015
|
|
|
|
key: train_fscore
|
|
value: [0.93761141 0.92717584 0.92091388 0.92307692 0.92691622 0.92170819
|
|
0.92091388 0.92907801 0.92416226 0.92389381]
|
|
|
|
mean value: 0.9255450426062092
|
|
|
|
key: test_precision
|
|
value: [0.96875 0.82352941 0.90625 0.91176471 0.81578947 0.96774194
|
|
0.84375 0.93103448 0.9 0.90625 ]
|
|
|
|
mean value: 0.8974860009573761
|
|
|
|
key: train_precision
|
|
value: [0.92932862 0.91578947 0.90034364 0.91814947 0.91872792 0.91197183
|
|
0.90034364 0.91608392 0.90657439 0.91258741]
|
|
|
|
mean value: 0.9129900316323134
|
|
|
|
key: test_recall
|
|
value: [1. 0.90322581 0.93548387 1. 1. 0.96774194
|
|
0.87096774 0.87096774 0.87096774 0.96666667]
|
|
|
|
mean value: 0.9386021505376344
|
|
|
|
key: train_recall
|
|
value: [0.94604317 0.93884892 0.94244604 0.92805755 0.9352518 0.93165468
|
|
0.94244604 0.94244604 0.94244604 0.93548387]
|
|
|
|
mean value: 0.9385124158737526
|
|
|
|
key: test_roc_auc
|
|
value: [0.98387097 0.85483871 0.91935484 0.9516129 0.88709677 0.96774194
|
|
0.85483871 0.90322581 0.88548387 0.93494624]
|
|
|
|
mean value: 0.9143010752688172
|
|
|
|
key: train_roc_auc
|
|
value: [0.93705036 0.92625899 0.91906475 0.92266187 0.92625899 0.92086331
|
|
0.91906475 0.92805755 0.92283592 0.92277791]
|
|
|
|
mean value: 0.9244894407055001
|
|
|
|
key: test_jcc
|
|
value: [0.96875 0.75675676 0.85294118 0.91176471 0.81578947 0.9375
|
|
0.75 0.81818182 0.79411765 0.87878788]
|
|
|
|
mean value: 0.8484589456822429
|
|
|
|
key: train_jcc
|
|
value: [0.88255034 0.86423841 0.8534202 0.85714286 0.86378738 0.85478548
|
|
0.8534202 0.86754967 0.85901639 0.85855263]
|
|
|
|
mean value: 0.8614463542047712
|
|
|
|
MCC on Blind test: 0.21
|
|
|
|
Accuracy on Blind test: 0.53
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
       'electro_rr', 'electro_mm', 'electro_sm', 'electr...
       'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
       'logorI', 'lineage_proportion', 'dist_lineage_proportion',
       'lineage_count_all', 'lineage_count_unique'],
      dtype='object')),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', LogisticRegressionCV(random_state=42))])

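The pipeline repr above and the `fit_time`, `score_time`, `test_*` and `train_*` keys reported for it have the shape that scikit-learn's `cross_validate` returns when it is given a scoring dictionary and `return_train_score=True`. The sketch below is one way such output could be produced; it is an assumption, not the script behind this log. The toy DataFrame, the class names in `ss_class`, the target `y`, the `StratifiedKFold` settings and the plain `LogisticRegression` model are all hypothetical, and only four of the logged column names are reused.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, matthews_corrcoef
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Toy data (hypothetical values; column names borrowed from the log).
rng = np.random.default_rng(42)
n = 200
X = pd.DataFrame({
    "ligand_distance": rng.uniform(0, 30, n),
    "ddg_foldx": rng.normal(0, 2, n),
    "ss_class": rng.choice(["helix", "strand", "loop"], n),
    "active_site": rng.choice([0, 1], n),
})
y = (X["ddg_foldx"] + rng.normal(0, 1, n) > 0).astype(int)

num_cols = ["ligand_distance", "ddg_foldx"]
cat_cols = ["ss_class", "active_site"]

# Scale numeric columns, one-hot encode categorical ones, pass the rest through.
prep = ColumnTransformer(
    remainder="passthrough",
    transformers=[
        ("num", MinMaxScaler(), num_cols),
        ("cat", OneHotEncoder(), cat_cols),
    ],
)
pipe = Pipeline(steps=[
    ("prep", prep),
    ("model", LogisticRegression(max_iter=1000, random_state=42)),
])

# Scorer names chosen to mirror the keys seen in the log (assumed mapping).
scoring = {
    "mcc": make_scorer(matthews_corrcoef),
    "accuracy": "accuracy",
    "fscore": "f1",
    "jcc": "jaccard",
}
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_validate(pipe, X, y, cv=cv, scoring=scoring, return_train_score=True)

for key, value in scores.items():
    print("key:", key)
    print("value:", value)
    print("mean value:", np.mean(value))
```

Each key maps to one value per fold, which is why every `value:` array in this log holds ten numbers.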
key: fit_time
value: [0.72030234 0.71688986 0.82023787 0.6914432 0.76275826 0.84668159
0.7217288 0.70235157 0.860708 0.69467163]

mean value: 0.7537773132324219

key: score_time
value: [0.01084113 0.01207328 0.01228642 0.01954889 0.0122242 0.01225781
0.01232028 0.0122869 0.01229262 0.01234746]

mean value: 0.012847900390625

key: test_mcc
value: [0.93743687 0.90369611 1. 0.90369611 0.87096774 0.93548387
0.90369611 0.87278605 0.93649139 0.87082935]

mean value: 0.9135083615653431

key: train_mcc
value: [0.95329292 0.94634322 0.95685929 0.95685929 0.97482645 0.96043787
0.93195016 0.96048758 0.96050901 0.98205307]

mean value: 0.9583618868215811

key: test_accuracy
value: [0.96774194 0.9516129 1. 0.9516129 0.93548387 0.96774194
0.9516129 0.93548387 0.96721311 0.93442623]

mean value: 0.9562929666842941

key: train_accuracy
value: [0.97661871 0.97302158 0.97841727 0.97841727 0.98741007 0.98021583
0.96582734 0.98021583 0.98025135 0.99102334]

mean value: 0.9791418570708963

key: test_fscore
value: [0.96875 0.95238095 1. 0.95238095 0.93548387 0.96774194
0.95238095 0.93333333 0.96666667 0.93548387]

mean value: 0.9564602534562212

key: train_fscore
value: [0.97674419 0.97335702 0.97833935 0.97849462 0.98738739 0.98025135
0.96625222 0.980322 0.98025135 0.99102334]

mean value: 0.9792422819398573

key: test_precision
value: [0.93939394 0.9375 1. 0.9375 0.93548387 0.96774194
0.9375 0.96551724 1. 0.90625 ]

mean value: 0.9526886987224863

key: train_precision
value: [0.97153025 0.96140351 0.98188406 0.975 0.98916968 0.97849462
0.95438596 0.97508897 0.97849462 0.99280576]

mean value: 0.975825742653484

key: test_recall
value: [1. 0.96774194 1. 0.96774194 0.93548387 0.96774194
0.96774194 0.90322581 0.93548387 0.96666667]

mean value: 0.9611827956989247

key: train_recall
value: [0.98201439 0.98561151 0.97482014 0.98201439 0.98561151 0.98201439
0.97841727 0.98561151 0.98201439 0.98924731]

mean value: 0.9827376808230834

key: test_roc_auc
value: [0.96774194 0.9516129 1. 0.9516129 0.93548387 0.96774194
0.9516129 0.93548387 0.96774194 0.93494624]

mean value: 0.9563978494623656

key: train_roc_auc
value: [0.97661871 0.97302158 0.97841727 0.97841727 0.98741007 0.98021583
0.96582734 0.98021583 0.98025451 0.99102653]

mean value: 0.9791424924576468

key: test_jcc
value: [0.93939394 0.90909091 1. 0.90909091 0.87878788 0.9375
0.90909091 0.875 0.93548387 0.87878788]

mean value: 0.9172226295210166

key: train_jcc
value: [0.95454545 0.94809689 0.95759717 0.95789474 0.97508897 0.96126761
0.9347079 0.96140351 0.96126761 0.98220641]

mean value: 0.959407624783067

MCC on Blind test: 0.14

Accuracy on Blind test: 0.35
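The fold-wise cross-validation scores for LogisticRegressionCV above can be cross-checked against one another, because they all derive from the same confusion matrix. The sketch below assumes counts of TP=31, FP=2, FN=0, TN=29 for the first test fold. These counts are not logged; they are inferred from that fold's precision (0.93939394), recall (1.0) and accuracy (0.96774194), which imply a 62-sample fold. The function `fold_metrics` is a hypothetical helper.

```python
from math import sqrt

# Assumed counts for the first test fold, inferred from the logged scores
# (not printed anywhere in the log itself).
tp, fp, fn, tn = 31, 2, 0, 29


def fold_metrics(tp, fp, fn, tn):
    """Derive the logged score types from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "fscore": 2 * precision * recall / (precision + recall),
        "jcc": tp / (tp + fp + fn),  # Jaccard index of the positive class
        "mcc": (tp * tn - fp * fn)
        / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        "balanced_accuracy": (recall + specificity) / 2,
    }


metrics = fold_metrics(tp, fp, fn, tn)
for name, value in metrics.items():
    print(f"{name}: {value:.8f}")
```

Under these assumed counts the derived MCC, F-score and Jaccard values agree with the first entries of `test_mcc`, `test_fscore` and `test_jcc`. The balanced accuracy also equals the first `test_roc_auc` entry, which would be consistent with ROC AUC having been computed on hard class predictions rather than on probabilities.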
Model_name: Gaussian NB
Model func: GaussianNB()
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
       'electro_rr', 'electro_mm', 'electro_sm', 'electr...
       'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
       'logorI', 'lineage_proportion', 'dist_lineage_proportion',
       'lineage_count_all', 'lineage_count_unique'],
      dtype='object')),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', GaussianNB())])

key: fit_time
value: [0.01091671 0.01010871 0.00867534 0.00845599 0.0082767 0.00828934
0.00830674 0.00770044 0.00745106 0.00766039]

mean value: 0.008584141731262207

key: score_time
value: [0.01369882 0.00904274 0.0089016 0.00860572 0.00860953 0.00857306
0.00860238 0.00795913 0.00795174 0.00807285]

mean value: 0.009001755714416504

key: test_mcc
value: [0.78446454 0.51856298 0.71004695 0.84266484 0.7190925 0.67883359
0.51639778 0.84266484 0.67204301 0.73763441]

mean value: 0.7022405434817621

key: train_mcc
value: [0.70405758 0.72340077 0.71605437 0.70505422 0.73033396 0.71230395
0.70505422 0.70180672 0.72391206 0.73070576]

mean value: 0.7152683609552583

key: test_accuracy
value: [0.88709677 0.75806452 0.85483871 0.91935484 0.85483871 0.83870968
0.75806452 0.91935484 0.83606557 0.86885246]

mean value: 0.8495240613432047

key: train_accuracy
value: [0.84532374 0.86151079 0.85791367 0.85251799 0.86510791 0.85611511
0.85251799 0.85071942 0.86175943 0.86535009]

mean value: 0.8568836133965358

key: test_fscore
value: [0.89552239 0.76923077 0.85714286 0.92307692 0.86567164 0.84375
0.75409836 0.91525424 0.83870968 0.86666667]

mean value: 0.852912352133119

key: train_fscore
value: [0.85901639 0.86371681 0.85968028 0.85304659 0.86631016 0.85714286
0.85304659 0.85309735 0.86371681 0.86535009]

mean value: 0.8594123948387209

key: test_precision
value: [0.83333333 0.73529412 0.84375 0.88235294 0.80555556 0.81818182
0.76666667 0.96428571 0.83870968 0.86666667]

mean value: 0.8354796490932639

key: train_precision
value: [0.78915663 0.85017422 0.84912281 0.85 0.85865724 0.85106383
0.85 0.83972125 0.85017422 0.86690647]

mean value: 0.845497666835835

key: test_recall
value: [0.96774194 0.80645161 0.87096774 0.96774194 0.93548387 0.87096774
0.74193548 0.87096774 0.83870968 0.86666667]

mean value: 0.8737634408602151

key: train_recall
value: [0.94244604 0.87769784 0.8705036 0.85611511 0.87410072 0.86330935
0.85611511 0.86690647 0.87769784 0.86379928]

mean value: 0.8748691369485058

key: test_roc_auc
value: [0.88709677 0.75806452 0.85483871 0.91935484 0.85483871 0.83870968
0.75806452 0.91935484 0.83602151 0.8688172 ]

mean value: 0.8495161290322581

key: train_roc_auc
value: [0.84532374 0.86151079 0.85791367 0.85251799 0.86510791 0.85611511
0.85251799 0.85071942 0.86178799 0.86535288]

mean value: 0.8568867486655837

key: test_jcc
value: [0.81081081 0.625 0.75 0.85714286 0.76315789 0.72972973
0.60526316 0.84375 0.72222222 0.76470588]

mean value: 0.747178255489014

key: train_jcc
value: [0.75287356 0.76012461 0.75389408 0.74375 0.76415094 0.75
0.74375 0.74382716 0.76012461 0.76265823]

mean value: 0.7535153197137231

MCC on Blind test: 0.21

Accuracy on Blind test: 0.57
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00817347 0.00792408 0.00842571 0.00832367 0.00831985 0.00922894
|
|
0.00865865 0.00853586 0.00858712 0.00871468]
|
|
|
|
mean value: 0.008489203453063966
|
|
|
|
key: score_time
|
|
value: [0.0080893 0.00804472 0.00845718 0.00870252 0.00849962 0.00911498
|
|
0.00868034 0.00858021 0.00858331 0.00869846]
|
|
|
|
mean value: 0.00854506492614746
|
|
|
|
key: test_mcc
|
|
value: [0.61807005 0.65372045 0.45374261 0.71004695 0.51856298 0.71004695
|
|
0.42023032 0.74193548 0.54251915 0.57419355]
|
|
|
|
mean value: 0.5943068479116385
|
|
|
|
key: train_mcc
|
|
value: [0.61176415 0.63718965 0.62604511 0.60075441 0.62596408 0.60075441
|
|
0.65528703 0.62262853 0.64839945 0.64106733]
|
|
|
|
mean value: 0.6269854139141487
|
|
|
|
key: test_accuracy
|
|
value: [0.80645161 0.82258065 0.72580645 0.85483871 0.75806452 0.85483871
|
|
0.70967742 0.87096774 0.7704918 0.78688525]
|
|
|
|
mean value: 0.7960602855631941
|
|
|
|
key: train_accuracy
|
|
value: [0.8057554 0.81834532 0.81294964 0.80035971 0.81294964 0.80035971
|
|
0.82733813 0.81115108 0.82405745 0.82046679]
|
|
|
|
mean value: 0.8133732870077367
|
|
|
|
key: test_fscore
|
|
value: [0.79310345 0.8358209 0.71186441 0.85245902 0.76923077 0.85245902
|
|
0.71875 0.87096774 0.76666667 0.78688525]
|
|
|
|
mean value: 0.7958207207099356
|
|
|
|
key: train_fscore
|
|
value: [0.80851064 0.82186949 0.81090909 0.79927667 0.8115942 0.79927667
|
|
0.83098592 0.81415929 0.82624113 0.82269504]
|
|
|
|
mean value: 0.814551814377158
|
|
|
|
key: test_precision
|
|
value: [0.85185185 0.77777778 0.75 0.86666667 0.73529412 0.86666667
|
|
0.6969697 0.87096774 0.79310345 0.77419355]
|
|
|
|
mean value: 0.7983491516178162
|
|
|
|
key: train_precision
|
|
value: [0.7972028 0.80622837 0.81985294 0.80363636 0.81751825 0.80363636
|
|
0.8137931 0.80139373 0.81468531 0.81403509]
|
|
|
|
mean value: 0.8091982321605484
|
|
|
|
key: test_recall
|
|
value: [0.74193548 0.90322581 0.67741935 0.83870968 0.80645161 0.83870968
|
|
0.74193548 0.87096774 0.74193548 0.8 ]
|
|
|
|
mean value: 0.7961290322580645
|
|
|
|
key: train_recall
|
|
value: [0.82014388 0.8381295 0.80215827 0.79496403 0.8057554 0.79496403
|
|
0.84892086 0.82733813 0.8381295 0.83154122]
|
|
|
|
mean value: 0.8202044815760295
|
|
|
|
key: test_roc_auc
|
|
value: [0.80645161 0.82258065 0.72580645 0.85483871 0.75806452 0.85483871
|
|
0.70967742 0.87096774 0.77096774 0.78709677]
|
|
|
|
mean value: 0.7961290322580645
|
|
|
|
key: train_roc_auc
|
|
value: [0.8057554 0.81834532 0.81294964 0.80035971 0.81294964 0.80035971
|
|
0.82733813 0.81115108 0.82408267 0.82044687]
|
|
|
|
mean value: 0.8133738170753719
|
|
|
|
key: test_jcc
|
|
value: [0.65714286 0.71794872 0.55263158 0.74285714 0.625 0.74285714
|
|
0.56097561 0.77142857 0.62162162 0.64864865]
|
|
|
|
mean value: 0.6641111891208169
|
|
|
|
key: train_jcc
|
|
value: [0.67857143 0.69760479 0.68195719 0.66566265 0.68292683 0.66566265
|
|
0.71084337 0.68656716 0.70392749 0.69879518]
|
|
|
|
mean value: 0.6872518746851146
|
|
|
|
MCC on Blind test: 0.18
|
|
|
|
Accuracy on Blind test: 0.52
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.00818872 0.00765157 0.00800991 0.00798917 0.00798535 0.00793529
|
|
0.00807238 0.00833344 0.00816226 0.00822878]
|
|
|
|
mean value: 0.008055686950683594
|
|
|
|
key: score_time
|
|
value: [0.01362538 0.01161528 0.01154208 0.01181722 0.01149821 0.01183558
|
|
0.01181483 0.01180124 0.01576948 0.01188374]
|
|
|
|
mean value: 0.012320303916931152
|
|
|
|
key: test_mcc
|
|
value: [0.7130241 0.61418277 0.5483871 0.77459667 0.51856298 0.74348441
|
|
0.58834841 0.61807005 0.60818119 0.57576971]
|
|
|
|
mean value: 0.6302607385125394
|
|
|
|
key: train_mcc
|
|
value: [0.7014797 0.74464768 0.73388892 0.71949894 0.75180343 0.71341277
|
|
0.73033396 0.70918848 0.73474672 0.73420349]
|
|
|
|
mean value: 0.7273204091578028
|
|
|
|
key: test_accuracy
|
|
value: [0.85483871 0.80645161 0.77419355 0.88709677 0.75806452 0.87096774
|
|
0.79032258 0.80645161 0.80327869 0.78688525]
|
|
|
|
mean value: 0.8138551031200423
|
|
|
|
key: train_accuracy
|
|
value: [0.85071942 0.87230216 0.86690647 0.85971223 0.87589928 0.85611511
|
|
0.86510791 0.85431655 0.86714542 0.86535009]
|
|
|
|
mean value: 0.8633574648360306
|
|
|
|
key: test_fscore
|
|
value: [0.84745763 0.8125 0.77419355 0.8852459 0.76923077 0.86666667
|
|
0.80597015 0.79310345 0.8 0.77192982]
|
|
|
|
mean value: 0.8126297935133517
|
|
|
|
key: train_fscore
|
|
value: [0.84990958 0.8716094 0.86594203 0.85869565 0.87567568 0.85185185
|
|
0.86388385 0.85137615 0.86446886 0.85875706]
|
|
|
|
mean value: 0.8612170116983378
|
|
|
|
key: test_precision
|
|
value: [0.89285714 0.78787879 0.77419355 0.9 0.73529412 0.89655172
|
|
0.75 0.85185185 0.82758621 0.81481481]
|
|
|
|
mean value: 0.8231028194471236
|
|
|
|
key: train_precision
|
|
value: [0.85454545 0.87636364 0.87226277 0.8649635 0.87725632 0.8778626
|
|
0.87179487 0.86891386 0.88059701 0.9047619 ]
|
|
|
|
mean value: 0.8749321930550784
|
|
|
|
key: test_recall
|
|
value: [0.80645161 0.83870968 0.77419355 0.87096774 0.80645161 0.83870968
|
|
0.87096774 0.74193548 0.77419355 0.73333333]
|
|
|
|
mean value: 0.8055913978494623
|
|
|
|
key: train_recall
|
|
value: [0.84532374 0.86690647 0.85971223 0.85251799 0.87410072 0.82733813
|
|
0.85611511 0.83453237 0.84892086 0.8172043 ]
|
|
|
|
mean value: 0.848267192697455
|
|
|
|
key: test_roc_auc
|
|
value: [0.85483871 0.80645161 0.77419355 0.88709677 0.75806452 0.87096774
|
|
0.79032258 0.80645161 0.80376344 0.78602151]
|
|
|
|
mean value: 0.8138172043010753
|
|
|
|
key: train_roc_auc
|
|
value: [0.85071942 0.87230216 0.86690647 0.85971223 0.87589928 0.85611511
|
|
0.86510791 0.85431655 0.86711276 0.86543668]
|
|
|
|
mean value: 0.8633628581006163
|
|
|
|
key: test_jcc
|
|
value: [0.73529412 0.68421053 0.63157895 0.79411765 0.625 0.76470588
|
|
0.675 0.65714286 0.66666667 0.62857143]
|
|
|
|
mean value: 0.6862288073123987
|
|
|
|
key: train_jcc
|
|
value: [0.73899371 0.7724359 0.76357827 0.75238095 0.77884615 0.74193548
|
|
0.76038339 0.74121406 0.76129032 0.75247525]
|
|
|
|
mean value: 0.7563533487181033
|
|
|
|
MCC on Blind test: 0.16
|
|
|
|
Accuracy on Blind test: 0.57
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01835275 0.01664305 0.01745725 0.01704788 0.01895404 0.01934195
|
|
0.01825833 0.01938081 0.01911664 0.01921248]
|
|
|
|
mean value: 0.0183765172958374
|
|
|
|
key: score_time
|
|
value: [0.00940108 0.01006126 0.00929928 0.00980639 0.01032066 0.01062155
|
|
0.01041937 0.01055121 0.01047111 0.01046944]
|
|
|
|
mean value: 0.010142135620117187
|
|
|
|
key: test_mcc
|
|
value: [0.96824584 0.66226618 0.62471615 0.7190925 0.7284928 0.80813523
|
|
0.50083542 0.80645161 0.63939757 0.81978229]
|
|
|
|
mean value: 0.7277415590359753
|
|
|
|
key: train_mcc
|
|
value: [0.85345163 0.77632088 0.79541168 0.777078 0.76906554 0.75930753
|
|
0.79995316 0.76580581 0.77932355 0.78519796]
|
|
|
|
mean value: 0.78609157351081
|
|
|
|
key: test_accuracy
|
|
value: [0.98387097 0.82258065 0.80645161 0.85483871 0.85483871 0.90322581
|
|
0.74193548 0.90322581 0.81967213 0.90163934]
|
|
|
|
mean value: 0.859227921734532
|
|
|
|
key: train_accuracy
|
|
value: [0.92625899 0.88489209 0.89568345 0.88489209 0.88129496 0.87589928
|
|
0.89748201 0.8794964 0.88689408 0.89048474]
|
|
|
|
mean value: 0.890327809565633
|
|
|
|
key: test_fscore
|
|
value: [0.98412698 0.84057971 0.82352941 0.86567164 0.86956522 0.90625
|
|
0.77142857 0.90322581 0.82539683 0.90909091]
|
|
|
|
mean value: 0.8698865077586886
|
|
|
|
key: train_fscore
|
|
value: [0.92794376 0.89189189 0.90068493 0.89225589 0.88851351 0.88403361
|
|
0.90289608 0.88701518 0.89303905 0.89608177]
|
|
|
|
mean value: 0.8964355683391803
|
|
|
|
key: test_precision
|
|
value: [0.96875 0.76315789 0.75675676 0.80555556 0.78947368 0.87878788
|
|
0.69230769 0.90322581 0.8125 0.83333333]
|
|
|
|
mean value: 0.8203848602140198
|
|
|
|
key: train_precision
|
|
value: [0.90721649 0.84076433 0.85947712 0.83860759 0.83757962 0.829653
|
|
0.85760518 0.83492063 0.84565916 0.8538961 ]
|
|
|
|
mean value: 0.8505379240652493
|
|
|
|
key: test_recall
|
|
value: [1. 0.93548387 0.90322581 0.93548387 0.96774194 0.93548387
|
|
0.87096774 0.90322581 0.83870968 1. ]
|
|
|
|
mean value: 0.9290322580645161
|
|
|
|
key: train_recall
|
|
value: [0.94964029 0.94964029 0.94604317 0.95323741 0.94604317 0.94604317
|
|
0.95323741 0.94604317 0.94604317 0.94265233]
|
|
|
|
mean value: 0.9478623552770686
|
|
|
|
key: test_roc_auc
|
|
value: [0.98387097 0.82258065 0.80645161 0.85483871 0.85483871 0.90322581
|
|
0.74193548 0.90322581 0.81935484 0.90322581]
|
|
|
|
mean value: 0.8593548387096774
|
|
|
|
key: train_roc_auc
|
|
value: [0.92625899 0.88489209 0.89568345 0.88489209 0.88129496 0.87589928
|
|
0.89748201 0.8794964 0.88700008 0.89039091]
|
|
|
|
mean value: 0.8903290271008999
|
|
|
|
key: test_jcc
|
|
value: [0.96875 0.725 0.7 0.76315789 0.76923077 0.82857143
|
|
0.62790698 0.82352941 0.7027027 0.83333333]
|
|
|
|
mean value: 0.7742182517083968
|
|
|
|
key: train_jcc
|
|
value: [0.86557377 0.80487805 0.81931464 0.80547112 0.7993921 0.79216867
|
|
0.82298137 0.7969697 0.80674847 0.8117284 ]
|
|
|
|
mean value: 0.8125226282348854
|
|
|
|
MCC on Blind test: 0.26
|
|
|
|
Accuracy on Blind test: 0.5
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.65558004 1.55461335 1.68518972 1.59982467 1.59840369 1.78556776
|
|
1.98289418 1.71361303 1.71010733 1.56034899]
|
|
|
|
mean value: 1.6846142768859864
|
|
|
|
key: score_time
|
|
value: [0.01405716 0.02408385 0.01391459 0.01108027 0.01359916 0.01913881
|
|
0.01201797 0.01144147 0.01147699 0.01196384]
|
|
|
|
mean value: 0.014277410507202149
|
|
|
|
key: test_mcc
|
|
value: [1. 0.90369611 0.93548387 0.96824584 0.93743687 0.90369611
|
|
0.93548387 0.93743687 0.87082935 0.90215054]
|
|
|
|
mean value: 0.9294459430210258
|
|
|
|
key: train_mcc
|
|
value: [0.99283145 0.98561151 0.99283145 0.98921503 0.99283145 0.98921503
|
|
0.98202074 0.99640932 0.99284416 0.99641577]
|
|
|
|
mean value: 0.9910225917811445
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.9516129 0.96774194 0.98387097 0.96774194 0.9516129
|
|
0.96774194 0.96774194 0.93442623 0.95081967]
|
|
|
|
mean value: 0.9643310417768377
|
|
|
|
key: train_accuracy
|
|
value: [0.99640288 0.99280576 0.99640288 0.99460432 0.99640288 0.99460432
|
|
0.99100719 0.99820144 0.99640934 0.99820467]
|
|
|
|
mean value: 0.9955045658266923
|
|
|
|
key: test_fscore
|
|
value: [1. 0.95238095 0.96774194 0.98412698 0.96875 0.95081967
|
|
0.96774194 0.96666667 0.93333333 0.95081967]
|
|
|
|
mean value: 0.9642381151737973
|
|
|
|
key: train_fscore
|
|
value: [0.99638989 0.99280576 0.99638989 0.99459459 0.99638989 0.99459459
|
|
0.99102334 0.9981982 0.99638989 0.99820467]
|
|
|
|
mean value: 0.9954980716751404
|
|
|
|
key: test_precision
|
|
value: [1. 0.9375 0.96774194 0.96875 0.93939394 0.96666667
|
|
0.96774194 1. 0.96551724 0.93548387]
|
|
|
|
mean value: 0.9648795589375401
|
|
|
|
key: train_precision
|
|
value: [1. 0.99280576 1. 0.99638989 1. 0.99638989
|
|
0.98924731 1. 1. 1. ]
|
|
|
|
mean value: 0.9974832850617142
|
|
|
|
key: test_recall
|
|
value: [1. 0.96774194 0.96774194 1. 1. 0.93548387
|
|
0.96774194 0.93548387 0.90322581 0.96666667]
|
|
|
|
mean value: 0.9644086021505376
|
|
|
|
key: train_recall
|
|
value: [0.99280576 0.99280576 0.99280576 0.99280576 0.99280576 0.99280576
|
|
0.99280576 0.99640288 0.99280576 0.99641577]
|
|
|
|
mean value: 0.9935264691472628
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.9516129 0.96774194 0.98387097 0.96774194 0.9516129
|
|
0.96774194 0.96774194 0.93494624 0.95107527]
|
|
|
|
mean value: 0.9644086021505377
|
|
|
|
key: train_roc_auc
|
|
value: [0.99640288 0.99280576 0.99640288 0.99460432 0.99640288 0.99460432
|
|
0.99100719 0.99820144 0.99640288 0.99820789]
|
|
|
|
mean value: 0.995504241767876
|
|
|
|
key: test_jcc
|
|
value: [1. 0.90909091 0.9375 0.96875 0.93939394 0.90625
|
|
0.9375 0.93548387 0.875 0.90625 ]
|
|
|
|
mean value: 0.931521871945259
|
|
|
|
key: train_jcc
|
|
value: [0.99280576 0.98571429 0.99280576 0.98924731 0.99280576 0.98924731
|
|
0.98220641 0.99640288 0.99280576 0.99641577]
|
|
|
|
mean value: 0.9910456984954045
|
|
|
|
MCC on Blind test: 0.15
|
|
|
|
Accuracy on Blind test: 0.35
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01399827 0.01331663 0.01154613 0.01056981 0.01088071 0.00967002
|
|
0.00976062 0.00975561 0.0097971 0.00941205]
|
|
|
|
mean value: 0.010870695114135742
|
|
|
|
key: score_time
|
|
value: [0.01116037 0.00968766 0.00949836 0.00883269 0.00868964 0.00786233
|
|
0.00786996 0.00779343 0.0077889 0.0078299 ]
|
|
|
|
mean value: 0.008701324462890625
|
|
|
|
key: test_mcc
|
|
value: [1. 0.87096774 1. 0.96824584 0.90369611 0.87831007
|
|
0.87831007 0.96824584 0.96774194 0.90215054]
|
|
|
|
mean value: 0.9337668133579895
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.93548387 1. 0.98387097 0.9516129 0.93548387
|
|
0.93548387 0.98387097 0.98360656 0.95081967]
|
|
|
|
mean value: 0.9660232681121099
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.93548387 1. 0.98412698 0.95238095 0.93103448
|
|
0.93103448 0.98360656 0.98360656 0.95081967]
|
|
|
|
mean value: 0.9652093559878165
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.93548387 1. 0.96875 0.9375 1.
|
|
1. 1. 1. 0.93548387]
|
|
|
|
mean value: 0.9777217741935483
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.93548387 1. 1. 0.96774194 0.87096774
|
|
0.87096774 0.96774194 0.96774194 0.96666667]
|
|
|
|
mean value: 0.9547311827956989
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.93548387 1. 0.98387097 0.9516129 0.93548387
|
|
0.93548387 0.98387097 0.98387097 0.95107527]
|
|
|
|
mean value: 0.9660752688172043
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.87878788 1. 0.96875 0.90909091 0.87096774
|
|
0.87096774 0.96774194 0.96774194 0.90625 ]
|
|
|
|
mean value: 0.9340298142717498
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.01
|
|
|
|
Accuracy on Blind test: 0.2
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.10142112 0.10165668 0.10126591 0.10151219 0.11299324 0.11322856
|
|
0.108289 0.1019156 0.10539865 0.1026175 ]
|
|
|
|
mean value: 0.10502984523773193
|
|
|
|
key: score_time
|
|
value: [0.0171802 0.0173862 0.01719642 0.01748347 0.01896811 0.01896906
|
|
0.01711893 0.01859283 0.01831841 0.01735854]
|
|
|
|
mean value: 0.01785721778869629
|
|
|
|
key: test_mcc
|
|
value: [1. 0.90369611 0.93548387 0.93548387 0.93743687 0.93548387
|
|
0.93743687 0.96824584 0.96770777 0.90215054]
|
|
|
|
mean value: 0.9423125607021228
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.9516129 0.96774194 0.96774194 0.96774194 0.96774194
|
|
0.96774194 0.98387097 0.98360656 0.95081967]
|
|
|
|
mean value: 0.9708619777895293
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.95238095 0.96774194 0.96774194 0.96875 0.96774194
|
|
0.96666667 0.98360656 0.98412698 0.95081967]
|
|
|
|
mean value: 0.9709576639134413
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.9375 0.96774194 0.96774194 0.93939394 0.96774194
|
|
1. 1. 0.96875 0.93548387]
|
|
|
|
mean value: 0.9684353616813295
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.96774194 0.96774194 0.96774194 1. 0.96774194
|
|
0.93548387 0.96774194 1. 0.96666667]
|
|
|
|
mean value: 0.9740860215053764
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.9516129 0.96774194 0.96774194 0.96774194 0.96774194
|
|
0.96774194 0.98387097 0.98333333 0.95107527]
|
|
|
|
mean value: 0.9708602150537635
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.90909091 0.9375 0.9375 0.93939394 0.9375
|
|
0.93548387 0.96774194 0.96875 0.90625 ]
|
|
|
|
mean value: 0.9439210654936462
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.2
|
|
|
|
Accuracy on Blind test: 0.36
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00819397 0.00782299 0.00878263 0.00824928 0.007725 0.00857353
|
|
0.00780129 0.00821066 0.008883 0.00848746]
|
|
|
|
mean value: 0.008272981643676758
|
|
|
|
key: score_time
|
|
value: [0.00791669 0.00839043 0.00856495 0.00861716 0.00864053 0.00859261
|
|
0.00863576 0.00865197 0.00859213 0.00803781]
|
|
|
|
mean value: 0.00846400260925293
|
|
|
|
key: test_mcc
|
|
value: [0.81325006 0.82199494 0.83914639 0.90369611 0.87096774 0.90369611
|
|
0.81325006 0.7284928 0.74460444 0.80475071]
|
|
|
|
mean value: 0.8243849367718851
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.90322581 0.90322581 0.91935484 0.9516129 0.93548387 0.9516129
|
|
0.90322581 0.85483871 0.86885246 0.90163934]
|
|
|
|
mean value: 0.9093072448439978
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.89655172 0.89285714 0.91803279 0.95238095 0.93548387 0.95081967
|
|
0.89655172 0.83636364 0.86206897 0.89655172]
|
|
|
|
mean value: 0.9037662199516902
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96296296 1. 0.93333333 0.9375 0.93548387 0.96666667
|
|
0.96296296 0.95833333 0.92592593 0.92857143]
|
|
|
|
mean value: 0.9511740484724356
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.83870968 0.80645161 0.90322581 0.96774194 0.93548387 0.93548387
|
|
0.83870968 0.74193548 0.80645161 0.86666667]
|
|
|
|
mean value: 0.8640860215053763
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.90322581 0.90322581 0.91935484 0.9516129 0.93548387 0.9516129
|
|
0.90322581 0.85483871 0.86989247 0.90107527]
|
|
|
|
mean value: 0.9093548387096775
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.8125 0.80645161 0.84848485 0.90909091 0.87878788 0.90625
|
|
0.8125 0.71875 0.75757576 0.8125 ]
|
|
|
|
mean value: 0.826289100684262
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.1
|
|
|
|
Accuracy on Blind test: 0.26
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.33802533 1.34034443 1.34689403 1.33327985 1.32815957 1.34741807
|
|
1.36258483 1.35150194 1.35220432 1.37553906]
|
|
|
|
mean value: 1.3475951433181763
|
|
|
|
key: score_time
|
|
value: [0.09532094 0.15330195 0.09112287 0.0915432 0.09900188 0.09554839
|
|
0.09749842 0.0989244 0.09722352 0.09352469]
|
|
|
|
mean value: 0.10130102634429931
|
|
|
|
key: test_mcc
|
|
value: [1. 0.90369611 0.96824584 0.96824584 0.93743687 0.96824584
|
|
1. 0.96824584 0.96770777 0.8688172 ]
|
|
|
|
mean value: 0.9550641303879139
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.9516129 0.98387097 0.98387097 0.96774194 0.98387097
|
|
1. 0.98387097 0.98360656 0.93442623]
|
|
|
|
mean value: 0.9772871496562665
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.95238095 0.98412698 0.98412698 0.96875 0.98360656
|
|
1. 0.98360656 0.98412698 0.93333333]
|
|
|
|
mean value: 0.9774058352849336
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.9375 0.96875 0.96875 0.93939394 1.
|
|
1. 1. 0.96875 0.93333333]
|
|
|
|
mean value: 0.9716477272727273
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.96774194 1. 1. 1. 0.96774194
|
|
1. 0.96774194 1. 0.93333333]
|
|
|
|
mean value: 0.9836559139784946
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.9516129 0.98387097 0.98387097 0.96774194 0.98387097
|
|
1. 0.98387097 0.98333333 0.9344086 ]
|
|
|
|
mean value: 0.9772580645161291
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.90909091 0.96875 0.96875 0.93939394 0.96774194
|
|
1. 0.96774194 0.96875 0.875 ]
|
|
|
|
mean value: 0.9565218719452591
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.08
|
|
|
|
Accuracy on Blind test: 0.19
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.89337206 0.89832807 0.98853326 0.97143817 0.95260191 0.94139814
|
|
0.88369918 0.9011116 0.89748955 0.93211508]
|
|
|
|
mean value: 0.9260087013244629
|
|
|
|
key: score_time
|
|
value: [0.21353126 0.18774438 0.24319863 0.28244352 0.24419403 0.22724342
|
|
0.2297473 0.25225329 0.2684927 0.27329707]
|
|
|
|
mean value: 0.24221456050872803
|
|
|
|
key: test_mcc
|
|
value: [0.96824584 0.84266484 0.96824584 0.93743687 0.93743687 0.96824584
|
|
1. 0.93548387 0.96770777 0.8688172 ]
|
|
|
|
mean value: 0.9394284931358869
|
|
|
|
key: train_mcc
|
|
value: [0.96073627 0.95025527 0.97124816 0.96058703 0.96768225 0.95693359
|
|
0.96412858 0.96778244 0.95713569 0.97137405]
|
|
|
|
mean value: 0.9627863336198357
|
|
|
|
key: test_accuracy
|
|
value: [0.98387097 0.91935484 0.98387097 0.96774194 0.96774194 0.98387097
|
|
1. 0.96774194 0.98360656 0.93442623]
|
|
|
|
mean value: 0.9692226335272343
|
|
|
|
key: train_accuracy
|
|
value: [0.98021583 0.97482014 0.98561151 0.98021583 0.98381295 0.97841727
|
|
0.98201439 0.98381295 0.97845601 0.98563734]
|
|
|
|
mean value: 0.9813014220580447
|
|
|
|
key: test_fscore
|
|
value: [0.98412698 0.92307692 0.98412698 0.96875 0.96875 0.98360656
|
|
1. 0.96774194 0.98412698 0.93333333]
|
|
|
|
mean value: 0.9697639701652129
|
|
|
|
key: train_fscore
|
|
value: [0.98046181 0.97526502 0.98566308 0.98039216 0.98389982 0.97857143
|
|
0.98214286 0.98395722 0.97864769 0.98576512]
|
|
|
|
mean value: 0.9814766206153425
|
|
|
|
key: test_precision
|
|
value: [0.96875 0.88235294 0.96875 0.93939394 0.93939394 1.
|
|
1. 0.96774194 0.96875 0.93333333]
|
|
|
|
mean value: 0.9568466088781554
|
|
|
|
key: train_precision
|
|
value: [0.96842105 0.95833333 0.98214286 0.97173145 0.97864769 0.97163121
|
|
0.9751773 0.97526502 0.96830986 0.97879859]
|
|
|
|
mean value: 0.972845835273727
|
|
|
|
key: test_recall
|
|
value: [1. 0.96774194 1. 1. 1. 0.96774194
|
|
1. 0.96774194 1. 0.93333333]
|
|
|
|
mean value: 0.9836559139784946
|
|
|
|
key: train_recall
|
|
value: [0.99280576 0.99280576 0.98920863 0.98920863 0.98920863 0.98561151
|
|
0.98920863 0.99280576 0.98920863 0.99283154]
|
|
|
|
mean value: 0.9902903483664681
|
|
|
|
key: test_roc_auc
|
|
value: [0.98387097 0.91935484 0.98387097 0.96774194 0.96774194 0.98387097
|
|
1. 0.96774194 0.98333333 0.9344086 ]
|
|
|
|
mean value: 0.9691935483870968
|
|
|
|
key: train_roc_auc
|
|
value: [0.98021583 0.97482014 0.98561151 0.98021583 0.98381295 0.97841727
|
|
0.98201439 0.98381295 0.97847528 0.9856244 ]
|
|
|
|
mean value: 0.9813020551300895
|
|
|
|
key: test_jcc
|
|
value: [0.96875 0.85714286 0.96875 0.93939394 0.93939394 0.96774194
|
|
1. 0.9375 0.96875 0.875 ]
|
|
|
|
mean value: 0.9422422671414608
|
|
|
|
key: train_jcc
|
|
value: [0.96167247 0.95172414 0.97173145 0.96153846 0.96830986 0.95804196
|
|
0.96491228 0.96842105 0.95818815 0.97192982]
|
|
|
|
mean value: 0.9636469650502072
|
|
|
|
MCC on Blind test: 0.09
|
|
|
|
Accuracy on Blind test: 0.2
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', BernoulliNB())])

key: fit_time
value: [0.02149725 0.00856233 0.00855422 0.00861883 0.00868273 0.00835061
0.00852489 0.00826049 0.00861621 0.008636 ]

mean value: 0.009830355644226074

key: score_time
value: [0.0092957 0.00866055 0.00866151 0.00840211 0.00860667 0.00837517
0.00868773 0.00860476 0.00860882 0.00852346]

mean value: 0.00864264965057373

key: test_mcc
value: [0.61807005 0.65372045 0.45374261 0.71004695 0.51856298 0.71004695
0.42023032 0.74193548 0.54251915 0.57419355]

mean value: 0.5943068479116385

key: train_mcc
value: [0.61176415 0.63718965 0.62604511 0.60075441 0.62596408 0.60075441
0.65528703 0.62262853 0.64839945 0.64106733]

mean value: 0.6269854139141487

key: test_accuracy
value: [0.80645161 0.82258065 0.72580645 0.85483871 0.75806452 0.85483871
0.70967742 0.87096774 0.7704918 0.78688525]

mean value: 0.7960602855631941

key: train_accuracy
value: [0.8057554 0.81834532 0.81294964 0.80035971 0.81294964 0.80035971
0.82733813 0.81115108 0.82405745 0.82046679]

mean value: 0.8133732870077367

key: test_fscore
value: [0.79310345 0.8358209 0.71186441 0.85245902 0.76923077 0.85245902
0.71875 0.87096774 0.76666667 0.78688525]

mean value: 0.7958207207099356

key: train_fscore
value: [0.80851064 0.82186949 0.81090909 0.79927667 0.8115942 0.79927667
0.83098592 0.81415929 0.82624113 0.82269504]

mean value: 0.814551814377158

key: test_precision
value: [0.85185185 0.77777778 0.75 0.86666667 0.73529412 0.86666667
0.6969697 0.87096774 0.79310345 0.77419355]

mean value: 0.7983491516178162

key: train_precision
value: [0.7972028 0.80622837 0.81985294 0.80363636 0.81751825 0.80363636
0.8137931 0.80139373 0.81468531 0.81403509]

mean value: 0.8091982321605484

key: test_recall
value: [0.74193548 0.90322581 0.67741935 0.83870968 0.80645161 0.83870968
0.74193548 0.87096774 0.74193548 0.8 ]

mean value: 0.7961290322580645

key: train_recall
value: [0.82014388 0.8381295 0.80215827 0.79496403 0.8057554 0.79496403
0.84892086 0.82733813 0.8381295 0.83154122]

mean value: 0.8202044815760295

key: test_roc_auc
value: [0.80645161 0.82258065 0.72580645 0.85483871 0.75806452 0.85483871
0.70967742 0.87096774 0.77096774 0.78709677]

mean value: 0.7961290322580645

key: train_roc_auc
value: [0.8057554 0.81834532 0.81294964 0.80035971 0.81294964 0.80035971
0.82733813 0.81115108 0.82408267 0.82044687]

mean value: 0.8133738170753719

key: test_jcc
value: [0.65714286 0.71794872 0.55263158 0.74285714 0.625 0.74285714
0.56097561 0.77142857 0.62162162 0.64864865]

mean value: 0.6641111891208169

key: train_jcc
value: [0.67857143 0.69760479 0.68195719 0.66566265 0.68292683 0.66566265
0.71084337 0.68656716 0.70392749 0.69879518]

mean value: 0.6872518746851146

MCC on Blind test: 0.18

Accuracy on Blind test: 0.52

Model_name: XGBoost
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
interaction_constraints=None, learning_rate=None,
max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan,
monotone_constraints=None, n_estimators=100,
n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None,
reg_lambda=None, scale_pos_weight=None,
subsample=None, tree_method=None,
use_label_encoder=False,
validate_parameters=None, verbosity=0))])

key: fit_time
value: [0.09717035 0.05194712 0.05566955 0.05653906 0.06008887 0.0614419
0.06166148 0.06023955 0.06417036 0.05456114]

mean value: 0.06234893798828125

key: score_time
value: [0.01015568 0.00965595 0.00964165 0.00960851 0.00993562 0.00997877
0.01027107 0.00972724 0.00962043 0.00961185]

mean value: 0.009820675849914551

key: test_mcc
value: [1. 0.90369611 0.93548387 0.96824584 0.93743687 0.93743687
1. 0.96824584 0.90586325 0.8688172 ]

mean value: 0.942522584980111

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [1. 0.9516129 0.96774194 0.98387097 0.96774194 0.96774194
1. 0.98387097 0.95081967 0.93442623]

mean value: 0.9707826546800635

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [1. 0.95238095 0.96774194 0.98412698 0.96875 0.96666667
1. 0.98360656 0.95384615 0.93333333]

mean value: 0.971045258321501

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [1. 0.9375 0.96774194 0.96875 0.93939394 1.
1. 1. 0.91176471 0.93333333]

mean value: 0.9658483914093496

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 0.96774194 0.96774194 1. 1. 0.93548387
1. 0.96774194 1. 0.93333333]

mean value: 0.9772043010752688

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [1. 0.9516129 0.96774194 0.98387097 0.96774194 0.96774194
1. 0.98387097 0.95 0.9344086 ]

mean value: 0.9706989247311828

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [1. 0.90909091 0.9375 0.96875 0.93939394 0.93548387
1. 0.96774194 0.91176471 0.875 ]

mean value: 0.9444725360818814

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.06

Accuracy on Blind test: 0.2

Model_name: LDA
Model func: LinearDiscriminantAnalysis()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LinearDiscriminantAnalysis())])

key: fit_time
value: [0.01567197 0.04181266 0.04561234 0.0498898 0.04169583 0.04362798
0.04267979 0.04328203 0.0497613 0.0470922 ]

mean value: 0.04211258888244629

key: score_time
value: [0.01037931 0.02170181 0.01888394 0.01488948 0.01078224 0.02180099
0.02077007 0.0193584 0.0108037 0.01081157]

mean value: 0.016018152236938477

key: test_mcc
value: [0.93548387 0.84266484 0.93548387 0.93743687 0.87278605 1.
0.96824584 0.87278605 0.9344086 0.8688172 ]

mean value: 0.9168113188472994

key: train_mcc
value: [0.93914669 0.94653932 0.93563929 0.93914669 0.94266562 0.93195016
0.93238486 0.94283651 0.93575728 0.9427658 ]

mean value: 0.9388832217176918

key: test_accuracy
value: [0.96774194 0.91935484 0.96774194 0.96774194 0.93548387 1.
0.98387097 0.93548387 0.96721311 0.93442623]

mean value: 0.9579058699101005

key: train_accuracy
value: [0.96942446 0.97302158 0.9676259 0.96942446 0.97122302 0.96582734
0.96582734 0.97122302 0.96768402 0.97127469]

mean value: 0.969255582966302

key: test_fscore
value: [0.96774194 0.92307692 0.96774194 0.96875 0.9375 1.
0.98412698 0.93333333 0.96774194 0.93333333]

mean value: 0.9583346380322186

key: train_fscore
value: [0.96980462 0.97345133 0.96808511 0.96980462 0.97153025 0.96625222
0.9664903 0.97163121 0.96808511 0.97163121]

mean value: 0.9696765956964183

key: test_precision
value: [0.96774194 0.88235294 0.96774194 0.93939394 0.90909091 1.
0.96875 0.96551724 0.96774194 0.93333333]

mean value: 0.9501664170825576

key: train_precision
value: [0.95789474 0.95818815 0.95454545 0.95789474 0.96126761 0.95438596
0.94809689 0.95804196 0.95454545 0.96140351]

mean value: 0.9566264459258345

key: test_recall
value: [0.96774194 0.96774194 0.96774194 1. 0.96774194 1.
1. 0.90322581 0.96774194 0.93333333]

mean value: 0.9675268817204301

key: train_recall
value: [0.98201439 0.98920863 0.98201439 0.98201439 0.98201439 0.97841727
0.98561151 0.98561151 0.98201439 0.98207885]

mean value: 0.9830999716355947

key: test_roc_auc
value: [0.96774194 0.91935484 0.96774194 0.96774194 0.93548387 1.
0.98387097 0.93548387 0.9672043 0.9344086 ]

mean value: 0.9579032258064516

key: train_roc_auc
value: [0.96942446 0.97302158 0.9676259 0.96942446 0.97122302 0.96582734
0.96582734 0.97122302 0.9677097 0.97125525]

mean value: 0.9692562079368764

key: test_jcc
value: [0.9375 0.85714286 0.9375 0.93939394 0.88235294 1.
0.96875 0.875 0.9375 0.875 ]

mean value: 0.9210139737713268

key: train_jcc
value: [0.94137931 0.94827586 0.93814433 0.94137931 0.94463668 0.9347079
0.93515358 0.94482759 0.93814433 0.94482759]

mean value: 0.9411476480564737

MCC on Blind test: 0.14

Accuracy on Blind test: 0.35

Model_name: Multinomial
Model func: MultinomialNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MultinomialNB())])

key: fit_time
value: [0.02284217 0.00781918 0.00835109 0.00833607 0.00749111 0.00754762
0.00822759 0.00809884 0.00830936 0.00828028]

mean value: 0.009530329704284668

key: score_time
value: [0.00877237 0.00818181 0.00863433 0.00789332 0.00792742 0.00775814
0.00851727 0.00839686 0.00838804 0.00852823]

mean value: 0.008299779891967774

key: test_mcc
value: [0.74193548 0.55301004 0.55895656 0.69047575 0.60677988 0.80813523
0.46358632 0.77459667 0.57576971 0.75310667]

mean value: 0.6526352311777236

key: train_mcc
value: [0.67282515 0.67609995 0.67144111 0.65172831 0.66087942 0.64772254
0.68595876 0.65901019 0.68263871 0.65745214]

mean value: 0.6665756264215859

key: test_accuracy
value: [0.87096774 0.77419355 0.77419355 0.83870968 0.79032258 0.90322581
0.72580645 0.88709677 0.78688525 0.86885246]

mean value: 0.8220253833950291

key: train_accuracy
value: [0.83273381 0.83453237 0.83273381 0.82194245 0.82733813 0.82014388
0.83992806 0.82553957 0.83842011 0.82585278]

mean value: 0.8299164976815675

key: test_fscore
value: [0.87096774 0.78787879 0.79411765 0.85294118 0.81690141 0.90625
0.75362319 0.88888889 0.8 0.87878788]

mean value: 0.8350356717876952

key: train_fscore
value: [0.84422111 0.84563758 0.84317032 0.83472454 0.83838384 0.83277592
0.84991568 0.83806344 0.84797297 0.83697479]

mean value: 0.8411840193764768

key: test_precision
value: [0.87096774 0.74285714 0.72972973 0.78378378 0.725 0.87878788
0.68421053 0.875 0.76470588 0.80555556]

mean value: 0.7860598241318305

key: train_precision
value: [0.78996865 0.79245283 0.79365079 0.7788162 0.78797468 0.778125
0.8 0.78193146 0.79936306 0.78797468]

mean value: 0.789025736384194

key: test_recall
value: [0.87096774 0.83870968 0.87096774 0.93548387 0.93548387 0.93548387
0.83870968 0.90322581 0.83870968 0.96666667]

mean value: 0.8934408602150538

key: train_recall
value: [0.90647482 0.90647482 0.89928058 0.89928058 0.89568345 0.89568345
0.90647482 0.9028777 0.9028777 0.89247312]

mean value: 0.9007581031948635

key: test_roc_auc
value: [0.87096774 0.77419355 0.77419355 0.83870968 0.79032258 0.90322581
0.72580645 0.88709677 0.78602151 0.87043011]

mean value: 0.8220967741935484

key: train_roc_auc
value: [0.83273381 0.83453237 0.83273381 0.82194245 0.82733813 0.82014388
0.83992806 0.82553957 0.83853562 0.82573296]

mean value: 0.829916067146283

key: test_jcc
value: [0.77142857 0.65 0.65853659 0.74358974 0.69047619 0.82857143
0.60465116 0.8 0.66666667 0.78378378]

mean value: 0.7197704132672936

key: train_jcc
value: [0.73043478 0.73255814 0.72886297 0.71633238 0.72173913 0.71346705
0.73900293 0.72126437 0.73607038 0.71965318]

mean value: 0.7259385314063227

MCC on Blind test: 0.21

Accuracy on Blind test: 0.5

Model_name: Passive Aggresive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01066303 0.01422668 0.01254988 0.01286054 0.01497436 0.01457095
0.01441455 0.01421928 0.01397729 0.01655555]

mean value: 0.013901209831237793

key: score_time
value: [0.00798249 0.01007485 0.01002264 0.01035452 0.01045942 0.01085925
0.01050234 0.01053119 0.01049948 0.01047754]

mean value: 0.010176372528076173

key: test_mcc
value: [0.84983659 0.90369611 0.96824584 0.93743687 0.83914639 1.
0.93743687 0.84266484 0.9344086 0.90215054]

mean value: 0.9115022641468933

key: train_mcc
value: [0.85210391 0.96048758 0.90882979 0.95324358 0.8782527 0.93534863
0.935276 0.91827075 0.93969601 0.97130001]

mean value: 0.9252808948765114

key: test_accuracy
value: [0.91935484 0.9516129 0.98387097 0.96774194 0.91935484 1.
0.96774194 0.91935484 0.96721311 0.95081967]

mean value: 0.9547065044949762

key: train_accuracy
value: [0.92266187 0.98021583 0.95323741 0.97661871 0.93705036 0.9676259
0.9676259 0.95863309 0.96947935 0.98563734]

mean value: 0.9618785761337071

key: test_fscore
value: [0.9122807 0.95238095 0.98412698 0.96875 0.91803279 1.
0.96666667 0.91525424 0.96774194 0.95081967]

mean value: 0.9536053936717389

key: train_fscore
value: [0.91746641 0.980322 0.95486111 0.97666068 0.93383743 0.96785714
0.96750903 0.95764273 0.97001764 0.98561151]

mean value: 0.961178567797733

key: test_precision
value: [1. 0.9375 0.96875 0.93939394 0.93333333 1.
1. 0.96428571 0.96774194 0.93548387]

mean value: 0.96464887934646

key: train_precision
value: [0.98353909 0.97508897 0.92281879 0.97491039 0.98406375 0.96099291
0.97101449 0.98113208 0.95155709 0.98916968]

mean value: 0.9694287238395796

key: test_recall
value: [0.83870968 0.96774194 1. 1. 0.90322581 1.
0.93548387 0.87096774 0.96774194 0.96666667]

mean value: 0.9450537634408602

key: train_recall
value: [0.85971223 0.98561151 0.98920863 0.97841727 0.88848921 0.97482014
0.96402878 0.9352518 0.98920863 0.98207885]

mean value: 0.9546827054485444

key: test_roc_auc
value: [0.91935484 0.9516129 0.98387097 0.96774194 0.91935484 1.
0.96774194 0.91935484 0.9672043 0.95107527]

mean value: 0.954731182795699

key: train_roc_auc
value: [0.92266187 0.98021583 0.95323741 0.97661871 0.93705036 0.9676259
0.9676259 0.95863309 0.96951471 0.98564374]

mean value: 0.9618827518630257

key: test_jcc
value: [0.83870968 0.90909091 0.96875 0.93939394 0.84848485 1.
0.93548387 0.84375 0.9375 0.90625 ]

mean value: 0.9127413245356794

key: train_jcc
value: [0.84751773 0.96140351 0.91362126 0.95438596 0.87588652 0.93771626
0.93706294 0.91872792 0.94178082 0.97163121]

mean value: 0.925973413428646

MCC on Blind test: 0.13

Accuracy on Blind test: 0.39

Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01376939 0.01323795 0.01437616 0.01475048 0.01206875 0.01291966
0.01519465 0.0118556 0.01220989 0.01324606]

mean value: 0.013362860679626465

key: score_time
value: [0.01042581 0.01045299 0.01043582 0.01047158 0.01042938 0.01043868
0.0104847 0.01037741 0.0103879 0.01042461]

mean value: 0.010432887077331542

key: test_mcc
value: [0.90748521 0.87278605 0.93548387 0.87096774 0.90369611 0.90369611
0.84983659 0.84983659 0.84710837 0.83638369]

mean value: 0.8777280337009071

key: train_mcc
value: [0.91267965 0.91482985 0.90302377 0.91827075 0.91106862 0.91267965
0.89008997 0.90161686 0.7528037 0.94982722]

mean value: 0.8966890034959883

key: test_accuracy
value: [0.9516129 0.93548387 0.96774194 0.93548387 0.9516129 0.9516129
0.91935484 0.91935484 0.91803279 0.91803279]

mean value: 0.9368323638286621

key: train_accuracy
value: [0.95503597 0.95683453 0.95143885 0.95863309 0.95503597 0.95503597
0.94244604 0.94964029 0.86355476 0.97486535]

mean value: 0.9462520827144388

key: test_fscore
value: [0.95384615 0.9375 0.96774194 0.93548387 0.95238095 0.95238095
0.9122807 0.9122807 0.92537313 0.91525424]

mean value: 0.9364522640184937

key: train_fscore
value: [0.95667244 0.95789474 0.95099819 0.95764273 0.95395948 0.95667244
0.9391635 0.94776119 0.87898089 0.97508897]

mean value: 0.9474834571073163

key: test_precision
value: [0.91176471 0.90909091 0.96774194 0.93548387 0.9375 0.9375
1. 1. 0.86111111 0.93103448]

mean value: 0.9391227015294606

key: train_precision
value: [0.92307692 0.93493151 0.95970696 0.98113208 0.97735849 0.92307692
0.99596774 0.98449612 0.78857143 0.96819788]

mean value: 0.9436516053144435

key: test_recall
value: [1. 0.96774194 0.96774194 0.93548387 0.96774194 0.96774194
0.83870968 0.83870968 1. 0.9 ]

mean value: 0.9383870967741935

key: train_recall
value: [0.99280576 0.98201439 0.94244604 0.9352518 0.93165468 0.99280576
0.88848921 0.91366906 0.99280576 0.98207885]

mean value: 0.9554021299089761

key: test_roc_auc
value: [0.9516129 0.93548387 0.96774194 0.93548387 0.9516129 0.9516129
0.91935484 0.91935484 0.91666667 0.91774194]

mean value: 0.9366666666666668

key: train_roc_auc
value: [0.95503597 0.95683453 0.95143885 0.95863309 0.95503597 0.95503597
0.94244604 0.94964029 0.86378639 0.97485238]

mean value: 0.946273948583069

key: test_jcc
value: [0.91176471 0.88235294 0.9375 0.87878788 0.90909091 0.90909091
0.83870968 0.83870968 0.86111111 0.84375 ]

mean value: 0.8810867809978341

key: train_jcc
value: [0.91694352 0.91919192 0.90657439 0.91872792 0.91197183 0.91694352
0.88530466 0.90070922 0.78409091 0.95138889]

mean value: 0.9011846780361379

MCC on Blind test: 0.13

Accuracy on Blind test: 0.33

Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.10950232 0.09739041 0.09704351 0.09424162 0.09430504 0.09652781
|
|
0.09616399 0.10172677 0.10454583 0.09388137]
|
|
|
|
mean value: 0.09853286743164062
|
|
|
|
key: score_time
|
|
value: [0.01543546 0.014148 0.01423931 0.01424742 0.0141356 0.0143621
|
|
0.01467228 0.01546311 0.01425433 0.01426816]
|
|
|
|
mean value: 0.014522576332092285
|
|
|
|
key: test_mcc
|
|
value: [0.96824584 0.96824584 0.93548387 0.96824584 0.93743687 0.96824584
|
|
1. 1. 0.90586325 0.93649139]
|
|
|
|
mean value: 0.958825873085774
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98387097 0.98387097 0.96774194 0.98387097 0.96774194 0.98387097
|
|
1. 1. 0.95081967 0.96721311]
|
|
|
|
mean value: 0.9789000528820729
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98360656 0.98360656 0.96774194 0.98412698 0.96875 0.98360656
|
|
1. 1. 0.95384615 0.96774194]
|
|
|
|
mean value: 0.9793026681072028
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.96774194 0.96875 0.93939394 1.
|
|
1. 1. 0.91176471 0.9375 ]
|
|
|
|
mean value: 0.9725150580760163
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96774194 0.96774194 0.96774194 1. 1. 0.96774194
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9870967741935484
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98387097 0.98387097 0.96774194 0.98387097 0.96774194 0.98387097
|
|
1. 1. 0.95 0.96774194]
|
|
|
|
mean value: 0.9788709677419355
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96774194 0.96774194 0.9375 0.96875 0.93939394 0.96774194
|
|
1. 1. 0.91176471 0.9375 ]
|
|
|
|
mean value: 0.9598134451727905
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.09
|
|
|
|
Accuracy on Blind test: 0.21
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
  warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
  oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03574371 0.03871679 0.05276203 0.05378747 0.05436444 0.04580855
|
|
0.04448795 0.03396726 0.04315591 0.03275442]
|
|
|
|
mean value: 0.04355485439300537
|
|
|
|
key: score_time
|
|
value: [0.02251959 0.03061891 0.03506684 0.03417039 0.03329325 0.02393937
|
|
0.03167629 0.02193832 0.03418398 0.01938605]
|
|
|
|
mean value: 0.028679299354553222
|
|
|
|
key: test_mcc
|
|
value: [1. 0.87096774 1. 0.96824584 0.90369611 0.90748521
|
|
0.93743687 0.96824584 1. 0.8688172 ]
|
|
|
|
mean value: 0.9424894812989454
|
|
|
|
key: train_mcc
|
|
value: [0.99640932 0.99640932 0.99280576 0.99283145 0.98561151 0.99280576
|
|
0.99640932 0.99640932 0.99641572 0.99641577]
|
|
|
|
mean value: 0.9942523261997296
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.93548387 1. 0.98387097 0.9516129 0.9516129
|
|
0.96774194 0.98387097 1. 0.93442623]
|
|
|
|
mean value: 0.9708619777895293
|
|
|
|
key: train_accuracy
|
|
value: [0.99820144 0.99820144 0.99640288 0.99640288 0.99280576 0.99640288
|
|
0.99820144 0.99820144 0.99820467 0.99820467]
|
|
|
|
mean value: 0.9971229479612002
|
|
|
|
key: test_fscore
|
|
value: [1. 0.93548387 1. 0.98412698 0.95238095 0.94915254
|
|
0.96666667 0.98360656 1. 0.93333333]
|
|
|
|
mean value: 0.9704750907225609
|
|
|
|
key: train_fscore
|
|
value: [0.9981982 0.9981982 0.99640288 0.99638989 0.99280576 0.99640288
|
|
0.99820467 0.9981982 0.9981982 0.99820467]
|
|
|
|
mean value: 0.997120353100802
|
|
|
|
key: test_precision
|
|
value: [1. 0.93548387 1. 0.96875 0.9375 1.
|
|
1. 1. 1. 0.93333333]
|
|
|
|
mean value: 0.9775067204301076
|
|
|
|
key: train_precision
|
|
value: [1. 1. 0.99640288 1. 0.99280576 0.99640288
|
|
0.99641577 1. 1. 1. ]
|
|
|
|
mean value: 0.9982027281400686
|
|
|
|
key: test_recall
|
|
value: [1. 0.93548387 1. 1. 0.96774194 0.90322581
|
|
0.93548387 0.96774194 1. 0.93333333]
|
|
|
|
mean value: 0.9643010752688173
|
|
|
|
key: train_recall
|
|
value: [0.99640288 0.99640288 0.99640288 0.99280576 0.99280576 0.99640288
|
|
1. 0.99640288 0.99640288 0.99641577]
|
|
|
|
mean value: 0.9960444547587737
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.93548387 1. 0.98387097 0.9516129 0.9516129
|
|
0.96774194 0.98387097 1. 0.9344086 ]
|
|
|
|
mean value: 0.9708602150537635
|
|
|
|
key: train_roc_auc
|
|
value: [0.99820144 0.99820144 0.99640288 0.99640288 0.99280576 0.99640288
|
|
0.99820144 0.99820144 0.99820144 0.99820789]
|
|
|
|
mean value: 0.9971229468038473
|
|
|
|
key: test_jcc
|
|
value: [1. 0.87878788 1. 0.96875 0.90909091 0.90322581
|
|
0.93548387 0.96774194 1. 0.875 ]
|
|
|
|
mean value: 0.9438080400782014
|
|
|
|
key: train_jcc
|
|
value: [0.99640288 0.99640288 0.99283154 0.99280576 0.98571429 0.99283154
|
|
0.99641577 0.99640288 0.99640288 0.99641577]
|
|
|
|
mean value: 0.994262617555725
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.21
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.18538761 0.21855521 0.19818258 0.16929436 0.21186471 0.22627425
|
|
0.19866776 0.15631485 0.20869946 0.18409514]
|
|
|
|
mean value: 0.19573359489440917
|
|
|
|
key: score_time
|
|
value: [0.02069068 0.01292896 0.04060698 0.02179265 0.02101707 0.0241735
|
|
0.01276922 0.02045441 0.02064705 0.03370333]
|
|
|
|
mean value: 0.022878384590148924
|
|
|
|
key: test_mcc
|
|
value: [0.90748521 0.62471615 0.77459667 0.83914639 0.7190925 0.80813523
|
|
0.64820372 0.83914639 0.63939757 0.77096774]
|
|
|
|
mean value: 0.7570887579844512
|
|
|
|
key: train_mcc
|
|
value: [0.88143754 0.84999939 0.88509826 0.86366703 0.87437795 0.88157448
|
|
0.87826623 0.87806148 0.8713058 0.88511972]
|
|
|
|
mean value: 0.8748907880078497
|
|
|
|
key: test_accuracy
|
|
value: [0.9516129 0.80645161 0.88709677 0.91935484 0.85483871 0.90322581
|
|
0.82258065 0.91935484 0.81967213 0.8852459 ]
|
|
|
|
mean value: 0.8769434161819143
|
|
|
|
key: train_accuracy
|
|
value: [0.94064748 0.92446043 0.94244604 0.93165468 0.93705036 0.94064748
|
|
0.93884892 0.93884892 0.93536804 0.94254937]
|
|
|
|
mean value: 0.9372521731268486
|
|
|
|
key: test_fscore
|
|
value: [0.94915254 0.82352941 0.88888889 0.92063492 0.86567164 0.90625
|
|
0.83076923 0.91803279 0.82539683 0.8852459 ]
|
|
|
|
mean value: 0.8813572150143087
|
|
|
|
key: train_fscore
|
|
value: [0.94117647 0.92631579 0.9430605 0.93262411 0.93783304 0.94138544
|
|
0.93992933 0.93971631 0.93639576 0.94285714]
|
|
|
|
mean value: 0.9381293887479757
|
|
|
|
key: test_precision
|
|
value: [1. 0.75675676 0.875 0.90625 0.80555556 0.87878788
|
|
0.79411765 0.93333333 0.8125 0.87096774]
|
|
|
|
mean value: 0.8633268913427832
|
|
|
|
key: train_precision
|
|
value: [0.93286219 0.90410959 0.93309859 0.91958042 0.92631579 0.92982456
|
|
0.92361111 0.92657343 0.92013889 0.93950178]
|
|
|
|
mean value: 0.9255616347793583
|
|
|
|
key: test_recall
|
|
value: [0.90322581 0.90322581 0.90322581 0.93548387 0.93548387 0.93548387
|
|
0.87096774 0.90322581 0.83870968 0.9 ]
|
|
|
|
mean value: 0.9029032258064515
|
|
|
|
key: train_recall
|
|
value: [0.94964029 0.94964029 0.95323741 0.94604317 0.94964029 0.95323741
|
|
0.95683453 0.95323741 0.95323741 0.94623656]
|
|
|
|
mean value: 0.9510984760578634
|
|
|
|
key: test_roc_auc
|
|
value: [0.9516129 0.80645161 0.88709677 0.91935484 0.85483871 0.90322581
|
|
0.82258065 0.91935484 0.81935484 0.88548387]
|
|
|
|
mean value: 0.8769354838709678
|
|
|
|
key: train_roc_auc
|
|
value: [0.94064748 0.92446043 0.94244604 0.93165468 0.93705036 0.94064748
|
|
0.93884892 0.93884892 0.93540007 0.94254274]
|
|
|
|
mean value: 0.9372547123591449
|
|
|
|
key: test_jcc
|
|
value: [0.90322581 0.7 0.8 0.85294118 0.76315789 0.82857143
|
|
0.71052632 0.84848485 0.7027027 0.79411765]
|
|
|
|
mean value: 0.790372782026632
|
|
|
|
key: train_jcc
|
|
value: [0.88888889 0.8627451 0.89225589 0.87375415 0.88294314 0.88926174
|
|
0.88666667 0.88628763 0.88039867 0.89189189]
|
|
|
|
mean value: 0.8835093775860033
|
|
|
|
MCC on Blind test: 0.22
|
|
|
|
Accuracy on Blind test: 0.49
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.24915671 0.24478769 0.25460291 0.25357485 0.25452995 0.24537802
|
|
0.2443285 0.24822903 0.24816132 0.25430918]
|
|
|
|
mean value: 0.24970581531524658
|
|
|
|
key: score_time
|
|
value: [0.00863647 0.0090971 0.00848818 0.00925422 0.00943804 0.00865912
|
|
0.00864434 0.00895667 0.00889111 0.00870037]
|
|
|
|
mean value: 0.008876562118530273
|
|
|
|
key: test_mcc
|
|
value: [1. 0.87096774 1. 0.96824584 0.93743687 0.90748521
|
|
0.96824584 0.96824584 1. 0.8688172 ]
|
|
|
|
mean value: 0.9489444535426244
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.93548387 1. 0.98387097 0.96774194 0.9516129
|
|
0.98387097 0.98387097 1. 0.93442623]
|
|
|
|
mean value: 0.9740877842411423
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.93548387 1. 0.98412698 0.96875 0.94915254
|
|
0.98360656 0.98360656 1. 0.93333333]
|
|
|
|
mean value: 0.9738059845555039
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.93548387 1. 0.96875 0.93939394 1.
|
|
1. 1. 1. 0.93333333]
|
|
|
|
mean value: 0.9776961143695014
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.93548387 1. 1. 1. 0.90322581
|
|
0.96774194 0.96774194 1. 0.93333333]
|
|
|
|
mean value: 0.970752688172043
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.93548387 1. 0.98387097 0.96774194 0.9516129
|
|
0.98387097 0.98387097 1. 0.9344086 ]
|
|
|
|
mean value: 0.9740860215053764
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.87878788 1. 0.96875 0.93939394 0.90322581
|
|
0.96774194 0.96774194 1. 0.875 ]
|
|
|
|
mean value: 0.9500641495601173
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.05
|
|
|
|
Accuracy on Blind test: 0.19
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
  warnings.warn("Variables are collinear")
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.01151943 0.01374769 0.01429176 0.01392269 0.01409912 0.01630855
|
|
0.01395249 0.01366401 0.01401973 0.01421189]
|
|
|
|
mean value: 0.013973736763000488
|
|
|
|
key: score_time
|
|
value: [0.01094151 0.01091838 0.01081514 0.01111507 0.01110363 0.01155281
|
|
0.01108122 0.01173496 0.0118649 0.0110836 ]
|
|
|
|
mean value: 0.01122112274169922
|
|
|
|
key: test_mcc
|
|
value: [0.75623534 0.7130241 0.67419986 0.87831007 0.35659298 0.7284928
|
|
0.61807005 0.87278605 0.70874158 0.47128445]
|
|
|
|
mean value: 0.6777737268610616
|
|
|
|
key: train_mcc
|
|
value: [0.7898587 0.84192273 0.79323895 0.88226013 0.52711711 0.8046478
|
|
0.84911865 0.84598626 0.839052 0.5797551 ]
|
|
|
|
mean value: 0.7752957434463477
|
|
|
|
key: test_accuracy
|
|
value: [0.87096774 0.85483871 0.82258065 0.93548387 0.64516129 0.85483871
|
|
0.80645161 0.93548387 0.85245902 0.72131148]
|
|
|
|
mean value: 0.8299576943416181
|
|
|
|
key: train_accuracy
|
|
value: [0.88848921 0.92086331 0.88848921 0.94064748 0.71942446 0.89748201
|
|
0.92446043 0.92266187 0.91741472 0.76481149]
|
|
|
|
mean value: 0.8784744197460703
|
|
|
|
key: test_fscore
|
|
value: [0.88235294 0.86153846 0.79245283 0.93939394 0.5 0.86956522
|
|
0.79310345 0.9375 0.84745763 0.65306122]
|
|
|
|
mean value: 0.8076425689573157
|
|
|
|
key: train_fscore
|
|
value: [0.89768977 0.92 0.876 0.94200351 0.6119403 0.9048414
|
|
0.92363636 0.92416226 0.91287879 0.70561798]
|
|
|
|
mean value: 0.861877037129891
|
|
|
|
key: test_precision
|
|
value: [0.81081081 0.82352941 0.95454545 0.88571429 0.84615385 0.78947368
|
|
0.85185185 0.90909091 0.89285714 0.84210526]
|
|
|
|
mean value: 0.8606132660157428
|
|
|
|
key: train_precision
|
|
value: [0.82926829 0.93014706 0.98648649 0.9209622 0.99193548 0.84423676
|
|
0.93382353 0.90657439 0.964 0.94578313]
|
|
|
|
mean value: 0.9253217337706788
|
|
|
|
key: test_recall
|
|
value: [0.96774194 0.90322581 0.67741935 1. 0.35483871 0.96774194
|
|
0.74193548 0.96774194 0.80645161 0.53333333]
|
|
|
|
mean value: 0.7920430107526881
|
|
|
|
key: train_recall
|
|
value: [0.97841727 0.91007194 0.78776978 0.96402878 0.44244604 0.97482014
|
|
0.91366906 0.94244604 0.86690647 0.56272401]
|
|
|
|
mean value: 0.8343299553905263
|
|
|
|
key: test_roc_auc
|
|
value: [0.87096774 0.85483871 0.82258065 0.93548387 0.64516129 0.85483871
|
|
0.80645161 0.93548387 0.85322581 0.71827957]
|
|
|
|
mean value: 0.8297311827956989
|
|
|
|
key: train_roc_auc
|
|
value: [0.88848921 0.92086331 0.88848921 0.94064748 0.71942446 0.89748201
|
|
0.92446043 0.92266187 0.91732421 0.76517496]
|
|
|
|
mean value: 0.8785017147572265
|
|
|
|
key: test_jcc
|
|
value: [0.78947368 0.75675676 0.65625 0.88571429 0.33333333 0.76923077
|
|
0.65714286 0.88235294 0.73529412 0.48484848]
|
|
|
|
mean value: 0.6950397230060543
|
|
|
|
key: train_jcc
|
|
value: [0.81437126 0.85185185 0.77935943 0.89036545 0.44086022 0.82621951
|
|
0.85810811 0.85901639 0.83972125 0.54513889]
|
|
|
|
mean value: 0.7705012360490753
|
|
|
|
MCC on Blind test: 0.12
|
|
|
|
Accuracy on Blind test: 0.77
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03200245 0.03018045 0.02014041 0.03684139 0.03164029 0.0324831
|
|
0.03526735 0.02932453 0.02801824 0.02917194]
|
|
|
|
mean value: 0.0305070161819458
|
|
|
|
key: score_time
|
|
value: [0.0254848 0.03329682 0.01865816 0.02000928 0.02121401 0.01249504
|
|
0.01821375 0.02222514 0.02327013 0.02214599]
|
|
|
|
mean value: 0.02170131206512451
|
|
|
|
key: test_mcc
|
|
value: [0.96824584 0.84266484 0.90369611 0.93743687 0.90748521 0.93548387
|
|
0.90369611 0.90369611 0.90215054 0.80322581]
|
|
|
|
mean value: 0.9007781314102745
|
|
|
|
key: train_mcc
|
|
value: [0.94283651 0.93585746 0.92124484 0.9354697 0.93563929 0.92494527
|
|
0.91054923 0.93563929 0.92138939 0.92878086]
|
|
|
|
mean value: 0.929235183258956
|
|
|
|
key: test_accuracy
|
|
value: [0.98387097 0.91935484 0.9516129 0.96774194 0.9516129 0.96774194
|
|
0.9516129 0.9516129 0.95081967 0.90163934]
|
|
|
|
mean value: 0.9497620306716024
|
|
|
|
key: train_accuracy
|
|
value: [0.97122302 0.9676259 0.96043165 0.9676259 0.9676259 0.96223022
|
|
0.95503597 0.9676259 0.96050269 0.96409336]
|
|
|
|
mean value: 0.9644020510700955
|
|
|
|
key: test_fscore
|
|
value: [0.98412698 0.92307692 0.95081967 0.96875 0.95384615 0.96774194
|
|
0.95081967 0.95081967 0.95081967 0.9 ]
|
|
|
|
mean value: 0.9500820685058522
|
|
|
|
key: train_fscore
|
|
value: [0.97163121 0.96819788 0.96099291 0.96797153 0.96808511 0.96283186
|
|
0.95575221 0.96808511 0.96099291 0.96478873]
|
|
|
|
mean value: 0.9649329447341147
|
|
|
|
key: test_precision
|
|
value: [0.96875 0.88235294 0.96666667 0.93939394 0.91176471 0.96774194
|
|
0.96666667 0.96666667 0.96666667 0.9 ]
|
|
|
|
mean value: 0.9436670188603301
|
|
|
|
key: train_precision
|
|
value: [0.95804196 0.95138889 0.94755245 0.95774648 0.95454545 0.94773519
|
|
0.94076655 0.95454545 0.94755245 0.94809689]
|
|
|
|
mean value: 0.9507971757973318
|
|
|
|
key: test_recall
|
|
value: [1. 0.96774194 0.93548387 1. 1. 0.96774194
|
|
0.93548387 0.93548387 0.93548387 0.9 ]
|
|
|
|
mean value: 0.957741935483871
|
|
|
|
key: train_recall
|
|
value: [0.98561151 0.98561151 0.97482014 0.97841727 0.98201439 0.97841727
|
|
0.97122302 0.98201439 0.97482014 0.98207885]
|
|
|
|
mean value: 0.9795028493334365
|
|
|
|
key: test_roc_auc
|
|
value: [0.98387097 0.91935484 0.9516129 0.96774194 0.9516129 0.96774194
|
|
0.9516129 0.9516129 0.95107527 0.9016129 ]
|
|
|
|
mean value: 0.9497849462365592
|
|
|
|
key: train_roc_auc
|
|
value: [0.97122302 0.9676259 0.96043165 0.9676259 0.9676259 0.96223022
|
|
0.95503597 0.9676259 0.96052835 0.96406101]
|
|
|
|
mean value: 0.9644013821201104
|
|
|
|
key: test_jcc
|
|
value: [0.96875 0.85714286 0.90625 0.93939394 0.91176471 0.9375
|
|
0.90625 0.90625 0.90625 0.81818182]
|
|
|
|
mean value: 0.9057733320600968
|
|
|
|
key: train_jcc
|
|
value: [0.94482759 0.93835616 0.92491468 0.93793103 0.93814433 0.92832765
|
|
0.91525424 0.93814433 0.92491468 0.93197279]
|
|
|
|
mean value: 0.9322787467857844
|
|
|
|
MCC on Blind test: 0.19
|
|
|
|
Accuracy on Blind test: 0.44
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_config.py:163: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
ros_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_config.py:166: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
ros_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.25432348 0.2824018 0.20859909 0.19930434 0.2196362 0.19928908
|
|
0.19986963 0.19727373 0.24380755 0.21062398]
|
|
|
|
mean value: 0.22151288986206055
|
|
|
|
key: score_time
|
|
value: [0.02141023 0.0218761 0.02016091 0.01457238 0.01933861 0.01388955
|
|
0.01085019 0.01082468 0.0215745 0.02148724]
|
|
|
|
mean value: 0.017598438262939452
|
|
|
|
key: test_mcc
|
|
value: [0.96824584 0.84266484 0.90369611 0.93743687 0.87278605 0.96824584
|
|
0.93548387 0.90369611 0.9344086 0.83638369]
|
|
|
|
mean value: 0.9103047822115174
|
|
|
|
key: train_mcc
|
|
value: [0.94283651 0.94283651 0.93563929 0.9354697 0.93900081 0.92844206
|
|
0.93238486 0.9393413 0.93207468 0.9355825 ]
|
|
|
|
mean value: 0.9363608223697346
|
|
|
|
key: test_accuracy
|
|
value: [0.98387097 0.91935484 0.9516129 0.96774194 0.93548387 0.98387097
|
|
0.96774194 0.9516129 0.96721311 0.91803279]
|
|
|
|
mean value: 0.9546536224219989
|
|
|
|
key: train_accuracy
|
|
value: [0.97122302 0.97122302 0.9676259 0.9676259 0.96942446 0.96402878
|
|
0.96582734 0.96942446 0.96588869 0.96768402]
|
|
|
|
mean value: 0.9679975588649368
|
|
|
|
key: test_fscore
|
|
value: [0.98412698 0.92307692 0.95081967 0.96875 0.9375 0.98360656
|
|
0.96774194 0.95081967 0.96774194 0.91525424]
|
|
|
|
mean value: 0.9549437917099128
|
|
|
|
key: train_fscore
|
|
value: [0.97163121 0.97163121 0.96808511 0.96797153 0.96969697 0.96453901
|
|
0.9664903 0.9699115 0.96625222 0.96808511]
|
|
|
|
mean value: 0.9684294155648834
|
|
|
|
key: test_precision
|
|
value: [0.96875 0.88235294 0.96666667 0.93939394 0.90909091 1.
|
|
0.96774194 0.96666667 0.96774194 0.93103448]
|
|
|
|
mean value: 0.9499439476721016
|
|
|
|
key: train_precision
|
|
value: [0.95804196 0.95804196 0.95454545 0.95774648 0.96113074 0.95104895
|
|
0.94809689 0.95470383 0.95438596 0.95789474]
|
|
|
|
mean value: 0.9555636962921179
|
|
|
|
key: test_recall
|
|
value: [1. 0.96774194 0.93548387 1. 0.96774194 0.96774194
|
|
0.96774194 0.93548387 0.96774194 0.9 ]
|
|
|
|
mean value: 0.9609677419354838
|
|
|
|
key: train_recall
|
|
value: [0.98561151 0.98561151 0.98201439 0.97841727 0.97841727 0.97841727
|
|
0.98561151 0.98561151 0.97841727 0.97849462]
|
|
|
|
mean value: 0.9816624120058792
|
|
|
|
key: test_roc_auc
|
|
value: [0.98387097 0.91935484 0.9516129 0.96774194 0.93548387 0.98387097
|
|
0.96774194 0.9516129 0.9672043 0.91774194]
|
|
|
|
mean value: 0.9546236559139786
|
|
|
|
key: train_roc_auc
|
|
value: [0.97122302 0.97122302 0.9676259 0.9676259 0.96942446 0.96402878
|
|
0.96582734 0.96942446 0.96591114 0.96766458]
|
|
|
|
mean value: 0.9679978597766948
|
|
|
|
key: test_jcc
|
|
value: [0.96875 0.85714286 0.90625 0.93939394 0.88235294 0.96774194
|
|
0.9375 0.90625 0.9375 0.84375 ]
|
|
|
|
mean value: 0.9146631673197139
|
|
|
|
key: train_jcc
|
|
value: [0.94482759 0.94482759 0.93814433 0.93793103 0.94117647 0.93150685
|
|
0.93515358 0.94158076 0.9347079 0.93814433]
|
|
|
|
mean value: 0.9388000430005232
|
|
|
|
MCC on Blind test: 0.15
|
|
|
|
Accuracy on Blind test: 0.38
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02441096 0.02007389 0.02128315 0.01857615 0.01912689 0.01964045
|
|
0.02089906 0.01830864 0.02194476 0.0219276 ]
|
|
|
|
mean value: 0.02061915397644043
|
|
|
|
key: score_time
|
|
value: [0.01061249 0.01058674 0.01089358 0.01047206 0.01052213 0.01050425
|
|
0.01066399 0.01052094 0.01055193 0.01059175]
|
|
|
|
mean value: 0.010591983795166016
|
|
|
|
key: test_mcc
|
|
value: [0.56360186 0.56360186 0.75 0.68884672 0.8819171 0.82717019
|
|
0.9375 0.87083333 0.80753845 0.82078268]
|
|
|
|
mean value: 0.7711792204154371
|
|
|
|
key: train_mcc
|
|
value: [0.83904826 0.83305418 0.804094 0.83230783 0.81084496 0.79737782
|
|
0.84634011 0.79137125 0.81153605 0.79748625]
|
|
|
|
mean value: 0.8163460715624157
|
|
|
|
key: test_accuracy
|
|
value: [0.78125 0.78125 0.875 0.84375 0.9375 0.90625
|
|
0.96774194 0.93548387 0.90322581 0.90322581]
|
|
|
|
mean value: 0.8834677419354838
|
|
|
|
key: train_accuracy
|
|
value: [0.91901408 0.91549296 0.90140845 0.91549296 0.90492958 0.89788732
|
|
0.92280702 0.89473684 0.90526316 0.89824561]
|
|
|
|
mean value: 0.9075277983691623
|
|
|
|
key: test_fscore
|
|
value: [0.78787879 0.77419355 0.875 0.84848485 0.94117647 0.91428571
|
|
0.96774194 0.93333333 0.90909091 0.91428571]
|
|
|
|
mean value: 0.886547126181851
|
|
|
|
key: train_fscore
|
|
value: [0.9209622 0.91836735 0.90410959 0.91780822 0.90721649 0.90102389
|
|
0.92465753 0.89864865 0.90721649 0.90034364]
|
|
|
|
mean value: 0.9100354060453281
|
|
|
|
key: test_precision
|
|
value: [0.76470588 0.8 0.875 0.82352941 0.88888889 0.84210526
|
|
0.9375 0.93333333 0.88235294 0.84210526]
|
|
|
|
mean value: 0.858952098383213
|
|
|
|
key: train_precision
|
|
value: [0.89932886 0.88815789 0.88 0.89333333 0.88590604 0.87417219
|
|
0.90604027 0.86928105 0.88590604 0.87919463]
|
|
|
|
mean value: 0.8861320298178448
|
|
|
|
key: test_recall
|
|
value: [0.8125 0.75 0.875 0.875 1. 1.
|
|
1. 0.93333333 0.9375 1. ]
|
|
|
|
mean value: 0.9183333333333333
|
|
|
|
key: train_recall
|
|
value: [0.94366197 0.95070423 0.92957746 0.94366197 0.92957746 0.92957746
|
|
0.94405594 0.93006993 0.92957746 0.92253521]
|
|
|
|
mean value: 0.9352999113562493
|
|
|
|
key: test_roc_auc
|
|
value: [0.78125 0.78125 0.875 0.84375 0.9375 0.90625
|
|
0.96875 0.93541667 0.90208333 0.9 ]
|
|
|
|
mean value: 0.883125
|
|
|
|
key: train_roc_auc
|
|
value: [0.91901408 0.91549296 0.90140845 0.91549296 0.90492958 0.89788732
|
|
0.9227322 0.89461243 0.90534817 0.89833054]
|
|
|
|
mean value: 0.9075248694967005
|
|
|
|
key: test_jcc
|
|
value: [0.65 0.63157895 0.77777778 0.73684211 0.88888889 0.84210526
|
|
0.9375 0.875 0.83333333 0.84210526]
|
|
|
|
mean value: 0.8015131578947369
|
|
|
|
key: train_jcc
|
|
value: [0.85350318 0.8490566 0.825 0.84810127 0.83018868 0.81987578
|
|
0.85987261 0.81595092 0.83018868 0.81875 ]
|
|
|
|
mean value: 0.8350487720908194
|
|
|
|
MCC on Blind test: 0.22
|
|
|
|
Accuracy on Blind test: 0.54
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
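The 'prep' step printed above applies MinMaxScaler to the numerical columns and OneHotEncoder to the categorical ones. As a rough illustration of what those two transforms compute, here is a minimal plain-Python sketch. It is not scikit-learn's implementation; the helper names (`min_max_scale`, `one_hot`) and all toy values, including the category labels, are hypothetical and not taken from the dataset.

```python
# Plain-Python sketch of the two transforms named in the 'prep' step.
# Helper names and toy values are hypothetical, for illustration only.

def min_max_scale(values):
    """Rescale numbers to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(labels):
    """Encode each label as a 0/1 vector over the sorted unique categories."""
    categories = sorted(set(labels))
    rows = [[1 if lab == cat else 0 for cat in categories] for lab in labels]
    return categories, rows


ligand_distance = [2.0, 4.0, 10.0]      # hypothetical numerical column
ss_class = ["helix", "loop", "helix"]   # hypothetical categorical column

scaled = min_max_scale(ligand_distance)
categories, encoded = one_hot(ss_class)

print(scaled)
print(categories, encoded)
```

Scaling every numerical feature to a common range is also one of the two remedies the ConvergenceWarning suggests, the other being a larger max_iter.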
|
|
|
|
key: fit_time
|
|
value: [0.582335 0.77262783 0.64726782 0.610358 0.68636823 0.73911905
|
|
0.64701462 0.70605779 0.70633364 0.64915323]
|
|
|
|
mean value: 0.6746635198593139
|
|
|
|
key: score_time
|
|
value: [0.02002954 0.01183629 0.01188588 0.01181579 0.01440883 0.01112986
|
|
0.01124287 0.01208949 0.01121664 0.01208615]
|
|
|
|
mean value: 0.012774133682250976
|
|
|
|
key: test_mcc
|
|
value: [0.68884672 0.68884672 0.81409158 0.93933644 0.93933644 0.93933644
|
|
0.87866878 1. 0.87083333 0.87770745]
|
|
|
|
mean value: 0.8637003892680549
|
|
|
|
key: train_mcc
|
|
value: [1. 0.99298237 0.94375558 0.93720088 0.97192739 0.95129413
|
|
0.96512319 0.93704438 0.9720266 0.9582759 ]
|
|
|
|
mean value: 0.9629630433458233
|
|
|
|
key: test_accuracy
|
|
value: [0.84375 0.84375 0.90625 0.96875 0.96875 0.96875
|
|
0.93548387 1. 0.93548387 0.93548387]
|
|
|
|
mean value: 0.9306451612903226
|
|
|
|
key: train_accuracy
|
|
value: [1. 0.99647887 0.97183099 0.96830986 0.98591549 0.97535211
|
|
0.98245614 0.96842105 0.98596491 0.97894737]
|
|
|
|
mean value: 0.9813676797627873
|
|
|
|
key: test_fscore
|
|
value: [0.83870968 0.83870968 0.90322581 0.96774194 0.96969697 0.96774194
|
|
0.9375 1. 0.9375 0.94117647]
|
|
|
|
mean value: 0.9302002472543269
|
|
|
|
key: train_fscore
|
|
value: [1. 0.99646643 0.97202797 0.96885813 0.98601399 0.97577855
|
|
0.98269896 0.96885813 0.98601399 0.97916667]
|
|
|
|
mean value: 0.9815882813444314
|
|
|
|
key: test_precision
|
|
value: [0.86666667 0.86666667 0.93333333 1. 0.94117647 1.
|
|
0.88235294 1. 0.9375 0.88888889]
|
|
|
|
mean value: 0.9316584967320262
|
|
|
|
key: train_precision
|
|
value: [1. 1. 0.96527778 0.95238095 0.97916667 0.95918367
|
|
0.97260274 0.95890411 0.97916667 0.96575342]
|
|
|
|
mean value: 0.9732436010934054
|
|
|
|
key: test_recall
|
|
value: [0.8125 0.8125 0.875 0.9375 1. 0.9375 1. 1. 0.9375 1. ]
|
|
|
|
mean value: 0.93125
|
|
|
|
key: train_recall
|
|
value: [1. 0.99295775 0.97887324 0.98591549 0.99295775 0.99295775
|
|
0.99300699 0.97902098 0.99295775 0.99295775]
|
|
|
|
mean value: 0.9901605436816705
|
|
|
|
key: test_roc_auc
|
|
value: [0.84375 0.84375 0.90625 0.96875 0.96875 0.96875
|
|
0.9375 1. 0.93541667 0.93333333]
|
|
|
|
mean value: 0.930625
|
|
|
|
key: train_roc_auc
|
|
value: [1. 0.99647887 0.97183099 0.96830986 0.98591549 0.97535211
|
|
0.98241899 0.96838373 0.98598936 0.97899636]
|
|
|
|
mean value: 0.981367576085886
|
|
|
|
key: test_jcc
|
|
value: [0.72222222 0.72222222 0.82352941 0.9375 0.94117647 0.9375
|
|
0.88235294 1. 0.88235294 0.88888889]
|
|
|
|
mean value: 0.8737745098039216
|
|
|
|
key: train_jcc
|
|
value: [1. 0.99295775 0.94557823 0.93959732 0.97241379 0.9527027
|
|
0.96598639 0.93959732 0.97241379 0.95918367]
|
|
|
|
mean value: 0.9640430965580683
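The per-fold scores above all derive from one confusion matrix per fold. As a worked check, the first test fold's values (accuracy 0.84375, precision 0.86666667, recall 0.8125, fscore 0.83870968, jcc 0.72222222, mcc 0.68884672) are consistent with the counts TP=13, FP=2, FN=3, TN=14. These counts are an inference from the reported numbers, not something the log prints. The sketch also assumes roc_auc was computed from hard class predictions, in which case it reduces to balanced accuracy. The function name `metrics_from_counts` is hypothetical.

```python
import math


def metrics_from_counts(tp, fp, fn, tn):
    """Compute the metrics reported in the log from confusion-matrix counts."""
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": tp / (tp + fp),
        "recall": recall,
        "fscore": 2 * tp / (2 * tp + fp + fn),
        "jcc": tp / (tp + fp + fn),  # Jaccard index of the positive class
        # With hard 0/1 predictions, ROC AUC equals balanced accuracy.
        "roc_auc": (recall + specificity) / 2,
        "mcc": (tp * tn - fp * fn)
        / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }


# Counts inferred from the fold-1 test scores (an assumption, see lead-in).
fold1 = metrics_from_counts(tp=13, fp=2, fn=3, tn=14)
for name, value in fold1.items():
    print(f"{name}: {value:.8f}")
```

The sketch does not guard against a zero denominator, which would arise if a fold had no predicted or no actual members of one class.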
|
|
|
|
MCC on Blind test: 0.17
|
|
|
|
Accuracy on Blind test: 0.43
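The blind-test MCC of 0.17 is far below the cross-validated test_mcc reported earlier for this model. A small sketch of that comparison follows, using the ten test_mcc fold values and the blind-test MCC exactly as printed in this log. The variable names are hypothetical, and the fold values are rounded to eight decimals as shown, so the mean agrees with the logged mean only to about that precision.

```python
from statistics import mean

# test_mcc values for LogisticRegressionCV, copied from the log above.
test_mcc = [
    0.68884672, 0.68884672, 0.81409158, 0.93933644, 0.93933644,
    0.93933644, 0.87866878, 1.0, 0.87083333, 0.87770745,
]
blind_mcc = 0.17  # "MCC on Blind test" as printed in the log

cv_mean = mean(test_mcc)
gap = cv_mean - blind_mcc

print(f"mean CV test MCC: {cv_mean:.4f}")
print(f"blind test MCC:   {blind_mcc:.2f}")
print(f"gap:              {gap:.4f}")
```

A gap this large suggests the cross-validation estimate is optimistic for the blind set. The log alone does not say why.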
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00992465 0.00951338 0.00734925 0.007375 0.00736427 0.00711226
|
|
0.007195 0.00721049 0.00700927 0.00702572]
|
|
|
|
mean value: 0.007707929611206055
|
|
|
|
key: score_time
|
|
value: [0.0107615 0.00938344 0.00825024 0.0080514 0.00798678 0.00803542
|
|
0.00796485 0.00782204 0.00787354 0.00787258]
|
|
|
|
mean value: 0.008400177955627442
|
|
|
|
key: test_mcc
|
|
value: [0.625 0.62994079 0.62994079 0.68884672 0.75592895 0.75592895
|
|
0.69203857 0.6125 0.69203857 0.82078268]
|
|
|
|
mean value: 0.6902946005610465
|
|
|
|
key: train_mcc
|
|
value: [0.74714613 0.73268511 0.71170894 0.74714613 0.71859502 0.73355944
|
|
0.73273302 0.71308876 0.7285593 0.7124563 ]
|
|
|
|
mean value: 0.7277678155822408
|
|
|
|
key: test_accuracy
|
|
value: [0.8125 0.8125 0.8125 0.84375 0.875 0.875
|
|
0.83870968 0.80645161 0.83870968 0.90322581]
|
|
|
|
mean value: 0.8418346774193548
|
|
|
|
key: train_accuracy
|
|
value: [0.87323944 0.86619718 0.8556338 0.87323944 0.85915493 0.86619718
|
|
0.86315789 0.85614035 0.86315789 0.85614035]
|
|
|
|
mean value: 0.8632258463059056
|
|
|
|
key: test_fscore
|
|
value: [0.8125 0.82352941 0.82352941 0.84848485 0.88235294 0.88235294
|
|
0.84848485 0.8 0.82758621 0.91428571]
|
|
|
|
mean value: 0.8463106324034316
|
|
|
|
key: train_fscore
|
|
value: [0.87586207 0.86805556 0.85813149 0.87586207 0.86111111 0.86986301
|
|
0.87213115 0.86006826 0.86779661 0.85714286]
|
|
|
|
mean value: 0.8666024180424603
|
|
|
|
key: test_precision
|
|
value: [0.8125 0.77777778 0.77777778 0.82352941 0.83333333 0.83333333
|
|
0.77777778 0.8 0.92307692 0.84210526]
|
|
|
|
mean value: 0.8201211597999524
|
|
|
|
key: train_precision
|
|
value: [0.85810811 0.85616438 0.84353741 0.85810811 0.84931507 0.84666667
|
|
0.82098765 0.84 0.83660131 0.84827586]
|
|
|
|
mean value: 0.845776457348316
|
|
|
|
key: test_recall
|
|
value: [0.8125 0.875 0.875 0.875 0.9375 0.9375
|
|
0.93333333 0.8 0.75 1. ]
|
|
|
|
mean value: 0.8795833333333334
|
|
|
|
key: train_recall
|
|
value: [0.8943662 0.88028169 0.87323944 0.8943662 0.87323944 0.8943662
|
|
0.93006993 0.88111888 0.90140845 0.86619718]
|
|
|
|
mean value: 0.8888653599921206
|
|
|
|
key: test_roc_auc
|
|
value: [0.8125 0.8125 0.8125 0.84375 0.875 0.875
|
|
0.84166667 0.80625 0.84166667 0.9 ]
|
|
|
|
mean value: 0.8420833333333333
|
|
|
|
key: train_roc_auc
|
|
value: [0.87323944 0.86619718 0.8556338 0.87323944 0.85915493 0.86619718
|
|
0.86292229 0.8560524 0.86329164 0.85617551]
|
|
|
|
mean value: 0.8632103811681276
|
|
|
|
key: test_jcc
|
|
value: [0.68421053 0.7 0.7 0.73684211 0.78947368 0.78947368
|
|
0.73684211 0.66666667 0.70588235 0.84210526]
|
|
|
|
mean value: 0.7351496388028895
|
|
|
|
key: train_jcc
|
|
value: [0.7791411 0.76687117 0.75151515 0.7791411 0.75609756 0.76969697
|
|
0.77325581 0.75449102 0.76646707 0.75 ]
|
|
|
|
mean value: 0.7646676954206684
|
|
|
|
MCC on Blind test: 0.22
|
|
|
|
Accuracy on Blind test: 0.59
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.0074749 0.00728703 0.00717139 0.00743604 0.00727081 0.00725842
|
|
0.00724769 0.00720358 0.00726557 0.00720382]
|
|
|
|
mean value: 0.0072819232940673825
|
|
|
|
key: score_time
|
|
value: [0.00798202 0.00782013 0.00794554 0.00785279 0.00791287 0.00786495
|
|
0.0078876 0.00792551 0.00783515 0.00804877]
|
|
|
|
mean value: 0.007907533645629882
|
|
|
|
key: test_mcc
|
|
value: [0.68884672 0.56360186 0.68884672 0.625 0.438357 0.68884672
|
|
0.48954403 0.48333333 0.55573827 0.55573827]
|
|
|
|
mean value: 0.5777852941864914
|
|
|
|
key: train_mcc
|
|
value: [0.64814452 0.64814452 0.6479516 0.63405443 0.65572679 0.62714946
|
|
0.62393794 0.65616074 0.64212548 0.6494089 ]
|
|
|
|
mean value: 0.6432804381067745
|
|
|
|
key: test_accuracy
|
|
value: [0.84375 0.78125 0.84375 0.8125 0.71875 0.84375
|
|
0.74193548 0.74193548 0.77419355 0.77419355]
|
|
|
|
mean value: 0.7876008064516129
|
|
|
|
key: train_accuracy
|
|
value: [0.82394366 0.82394366 0.82394366 0.81690141 0.82746479 0.81338028
|
|
0.81052632 0.82807018 0.82105263 0.8245614 ]
|
|
|
|
mean value: 0.8213787991104522
|
|
|
|
key: test_fscore
|
|
value: [0.83870968 0.77419355 0.84848485 0.8125 0.70967742 0.84848485
|
|
0.75 0.73333333 0.8 0.8 ]
|
|
|
|
mean value: 0.791538367546432
|
|
|
|
key: train_fscore
|
|
value: [0.82638889 0.82638889 0.82269504 0.81944444 0.83161512 0.816609
|
|
0.82 0.82807018 0.82105263 0.82638889]
|
|
|
|
mean value: 0.8238653070404355
|
|
|
|
key: test_precision
|
|
value: [0.86666667 0.8 0.82352941 0.8125 0.73333333 0.82352941
|
|
0.70588235 0.73333333 0.73684211 0.73684211]
|
|
|
|
mean value: 0.7772458720330238
|
|
|
|
key: train_precision
|
|
value: [0.81506849 0.81506849 0.82857143 0.80821918 0.81208054 0.80272109
|
|
0.78343949 0.83098592 0.81818182 0.81506849]
|
|
|
|
mean value: 0.8129404935574437
|
|
|
|
key: test_recall
|
|
value: [0.8125 0.75 0.875 0.8125 0.6875 0.875
|
|
0.8 0.73333333 0.875 0.875 ]
|
|
|
|
mean value: 0.8095833333333333
|
|
|
|
key: train_recall
|
|
value: [0.83802817 0.83802817 0.81690141 0.83098592 0.85211268 0.83098592
|
|
0.86013986 0.82517483 0.82394366 0.83802817]
|
|
|
|
mean value: 0.8354328769821727
|
|
|
|
key: test_roc_auc
|
|
value: [0.84375 0.78125 0.84375 0.8125 0.71875 0.84375
|
|
0.74375 0.74166667 0.77083333 0.77083333]
|
|
|
|
mean value: 0.7870833333333334
|
|
|
|
key: train_roc_auc
|
|
value: [0.82394366 0.82394366 0.82394366 0.81690141 0.82746479 0.81338028
|
|
0.81035162 0.82808037 0.82106274 0.82460849]
|
|
|
|
mean value: 0.8213680685511672
|
|
|
|
key: test_jcc
|
|
value: [0.72222222 0.63157895 0.73684211 0.68421053 0.55 0.73684211
|
|
0.6 0.57894737 0.66666667 0.66666667]
|
|
|
|
mean value: 0.6573976608187134
|
|
|
|
key: train_jcc
|
|
value: [0.70414201 0.70414201 0.69879518 0.69411765 0.71176471 0.69005848
|
|
0.69491525 0.70658683 0.69642857 0.70414201]
|
|
|
|
mean value: 0.7005092700712355
|
|
|
|
MCC on Blind test: 0.19
|
|
|
|
Accuracy on Blind test: 0.54
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.00720549 0.00690293 0.00749421 0.0067699 0.00747395 0.00748181
|
|
0.00723648 0.00759244 0.0074594 0.00673008]
|
|
|
|
mean value: 0.0072346687316894535
|
|
|
|
key: score_time
|
|
value: [0.01040697 0.01126409 0.01092076 0.01008987 0.01062059 0.01053739
|
|
0.01394534 0.01144624 0.01064205 0.01186824]
|
|
|
|
mean value: 0.011174154281616212
|
|
|
|
key: test_mcc
|
|
value: [0.62994079 0.31311215 0.56360186 0.56360186 0.31814238 0.82717019
|
|
0.82285074 0.67916667 0.57461167 0.68826048]
|
|
|
|
mean value: 0.5980458781591939
|
|
|
|
key: train_mcc
|
|
value: [0.71838112 0.74655293 0.7253701 0.74655293 0.71142639 0.68311553
|
|
0.69826652 0.67718901 0.70556653 0.67774254]
|
|
|
|
mean value: 0.7090163590202463
|
|
|
|
key: test_accuracy
|
|
value: [0.8125 0.65625 0.78125 0.78125 0.65625 0.90625
|
|
0.90322581 0.83870968 0.77419355 0.83870968]
|
|
|
|
mean value: 0.794858870967742
|
|
|
|
key: train_accuracy
|
|
value: [0.85915493 0.87323944 0.86267606 0.87323944 0.8556338 0.8415493
|
|
0.84912281 0.83859649 0.85263158 0.83859649]
|
|
|
|
mean value: 0.8544440326167532
|
|
|
|
key: test_fscore
|
|
value: [0.8 0.66666667 0.78787879 0.77419355 0.62068966 0.91428571
|
|
0.90909091 0.83870968 0.81081081 0.85714286]
|
|
|
|
mean value: 0.7979468626854611
|
|
|
|
key: train_fscore
|
|
value: [0.85815603 0.87412587 0.86315789 0.87412587 0.85714286 0.8409894
|
|
0.84912281 0.83916084 0.85416667 0.83453237]
|
|
|
|
mean value: 0.8544680614739297
|
|
|
|
key: test_precision
|
|
value: [0.85714286 0.64705882 0.76470588 0.8 0.69230769 0.84210526
|
|
0.83333333 0.8125 0.71428571 0.78947368]
|
|
|
|
mean value: 0.7752913250320371
|
|
|
|
key: train_precision
|
|
value: [0.86428571 0.86805556 0.86013986 0.86805556 0.84827586 0.84397163
|
|
0.85211268 0.83916084 0.84246575 0.85294118]
|
|
|
|
mean value: 0.8539464623923748
|
|
|
|
key: test_recall
|
|
value: [0.75 0.6875 0.8125 0.75 0.5625 1.
|
|
1. 0.86666667 0.9375 0.9375 ]
|
|
|
|
mean value: 0.8304166666666667
|
|
|
|
key: train_recall
|
|
value: [0.85211268 0.88028169 0.86619718 0.88028169 0.86619718 0.83802817
|
|
0.84615385 0.83916084 0.86619718 0.81690141]
|
|
|
|
mean value: 0.8551511868413277
|
|
|
|
key: test_roc_auc
|
|
value: [0.8125 0.65625 0.78125 0.78125 0.65625 0.90625
|
|
0.90625 0.83958333 0.76875 0.83541667]
|
|
|
|
mean value: 0.794375
|
|
|
|
key: train_roc_auc
|
|
value: [0.85915493 0.87323944 0.86267606 0.87323944 0.8556338 0.8415493
|
|
0.84913326 0.8385945 0.85267901 0.83852063]
|
|
|
|
mean value: 0.854442036836403
|
|
|
|
key: test_jcc
|
|
value: [0.66666667 0.5 0.65 0.63157895 0.45 0.84210526
|
|
0.83333333 0.72222222 0.68181818 0.75 ]
|
|
|
|
mean value: 0.672772461456672
|
|
|
|
key: train_jcc
|
|
value: [0.7515528 0.77639752 0.75925926 0.77639752 0.75 0.72560976
|
|
0.73780488 0.72289157 0.74545455 0.71604938]
|
|
|
|
mean value: 0.7461417213928212
|
|
|
|
MCC on Blind test: 0.18
|
|
|
|
Accuracy on Blind test: 0.56
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01138067 0.01109099 0.01127625 0.01128221 0.01073098 0.0110662
|
|
0.01142335 0.01140451 0.01070118 0.01133704]
|
|
|
|
mean value: 0.011169338226318359
|
|
|
|
key: score_time
|
|
value: [0.00933075 0.00924778 0.00916719 0.00923133 0.00920081 0.00921702
|
|
0.00934625 0.00917053 0.00837874 0.0092051 ]
|
|
|
|
mean value: 0.009149551391601562
|
|
|
|
key: test_mcc
|
|
value: [0.625 0.50395263 0.57265629 0.64549722 0.81409158 0.77459667
|
|
0.76948376 0.80833333 0.6310315 0.76594169]
|
|
|
|
mean value: 0.6910584675818924
|
|
|
|
key: train_mcc
|
|
value: [0.7618988 0.7476577 0.76035829 0.75897979 0.73060671 0.72554232
|
|
0.7375982 0.72956319 0.72987459 0.71397006]
|
|
|
|
mean value: 0.7396049640194965
|
|
|
|
key: test_accuracy
|
|
value: [0.8125 0.75 0.78125 0.8125 0.90625 0.875
|
|
0.87096774 0.90322581 0.80645161 0.87096774]
|
|
|
|
mean value: 0.8389112903225806
|
|
|
|
key: train_accuracy
|
|
value: [0.87676056 0.86971831 0.87676056 0.87676056 0.86267606 0.85915493
|
|
0.86315789 0.85964912 0.85964912 0.85263158]
|
|
|
|
mean value: 0.8656918705213739
|
|
|
|
key: test_fscore
|
|
value: [0.8125 0.76470588 0.8 0.83333333 0.90909091 0.88888889
|
|
0.88235294 0.90322581 0.83333333 0.88888889]
|
|
|
|
mean value: 0.8516319983516378
|
|
|
|
key: train_fscore
|
|
value: [0.8852459 0.87868852 0.88448845 0.88372093 0.87043189 0.86842105
|
|
0.87459807 0.87096774 0.87012987 0.8627451 ]
|
|
|
|
mean value: 0.8749437532470357
|
|
|
|
key: test_precision
|
|
value: [0.8125 0.72222222 0.73684211 0.75 0.88235294 0.8
|
|
0.78947368 0.875 0.75 0.8 ]
|
|
|
|
mean value: 0.7918390952872377
|
|
|
|
key: train_precision
|
|
value: [0.82822086 0.82208589 0.83229814 0.83647799 0.82389937 0.81481481
|
|
0.80952381 0.80838323 0.80722892 0.80487805]
|
|
|
|
mean value: 0.8187811065917483
|
|
|
|
key: test_recall
|
|
value: [0.8125 0.8125 0.875 0.9375 0.9375 1.
|
|
1. 0.93333333 0.9375 1. ]
|
|
|
|
mean value: 0.9245833333333333
|
|
|
|
key: train_recall
|
|
value: [0.95070423 0.94366197 0.94366197 0.93661972 0.92253521 0.92957746
|
|
0.95104895 0.94405594 0.94366197 0.92957746]
|
|
|
|
mean value: 0.9395104895104895
|
|
|
|
key: test_roc_auc
|
|
value: [0.8125 0.75 0.78125 0.8125 0.90625 0.875
|
|
0.875 0.90416667 0.80208333 0.86666667]
|
|
|
|
mean value: 0.8385416666666666
|
|
|
|
key: train_roc_auc
|
|
value: [0.87676056 0.86971831 0.87676056 0.87676056 0.86267606 0.85915493
|
|
0.86284842 0.85935192 0.85994287 0.85290062]
|
|
|
|
mean value: 0.865687481532552
|
|
|
|
key: test_jcc
|
|
value: [0.68421053 0.61904762 0.66666667 0.71428571 0.83333333 0.8
|
|
0.78947368 0.82352941 0.71428571 0.8 ]
|
|
|
|
mean value: 0.744483266991007
|
|
|
|
key: train_jcc
|
|
value: [0.79411765 0.78362573 0.79289941 0.79166667 0.77058824 0.76744186
|
|
0.77714286 0.77142857 0.77011494 0.75862069]
|
|
|
|
mean value: 0.7777646609518236
|
|
|
|
MCC on Blind test: 0.23
|
|
|
|
Accuracy on Blind test: 0.48
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
  warnings.warn(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.94244218 1.01735759 0.8850472 1.05389714 0.87144995 0.99552441
|
|
0.90028095 0.87230182 1.03594947 0.87329125]
|
|
|
|
mean value: 0.9447541952133178
|
|
|
|
key: score_time
|
|
value: [0.01177907 0.013484 0.01336789 0.01362443 0.01371074 0.01331043
|
|
0.01345372 0.01344275 0.01364231 0.01379061]
|
|
|
|
mean value: 0.013360595703125
|
|
|
|
key: test_mcc
|
|
value: [0.68884672 0.68884672 0.69991324 0.875 0.8819171 0.875
|
|
0.80833333 0.9375 0.74166667 0.82078268]
|
|
|
|
mean value: 0.8017806465017004
|
|
|
|
key: train_mcc
|
|
value: [1. 0.99298237 0.99298237 0.99298237 0.98591549 0.99298237
|
|
0.98596474 0.9789707 0.99300699 0.99300665]
|
|
|
|
mean value: 0.9908794051074042
|
|
|
|
key: test_accuracy
|
|
value: [0.84375 0.84375 0.84375 0.9375 0.9375 0.9375
|
|
0.90322581 0.96774194 0.87096774 0.90322581]
|
|
|
|
mean value: 0.8988911290322581
|
|
|
|
key: train_accuracy
|
|
value: [1. 0.99647887 0.99647887 0.99647887 0.99295775 0.99647887
|
|
0.99298246 0.98947368 0.99649123 0.99649123]
|
|
|
|
mean value: 0.9954311835927848
|
|
|
|
key: test_fscore
|
|
value: [0.84848485 0.83870968 0.85714286 0.9375 0.94117647 0.9375
|
|
0.90322581 0.96774194 0.875 0.91428571]
|
|
|
|
mean value: 0.9020767309856494
|
|
|
|
key: train_fscore
|
|
value: [1. 0.99646643 0.99646643 0.99646643 0.99295775 0.99646643
|
|
0.99300699 0.98954704 0.99649123 0.99646643]
|
|
|
|
mean value: 0.99543351613606
|
|
|
|
key: test_precision
|
|
value: [0.82352941 0.86666667 0.78947368 0.9375 0.88888889 0.9375
|
|
0.875 0.9375 0.875 0.84210526]
|
|
|
|
mean value: 0.8773163914688682
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 0.99295775 1.
|
|
0.99300699 0.98611111 0.99300699 1. ]
|
|
|
|
mean value: 0.9965082843603971
|
|
|
|
key: test_recall
|
|
value: [0.875 0.8125 0.9375 0.9375 1. 0.9375
|
|
0.93333333 1. 0.875 1. ]
|
|
|
|
mean value: 0.9308333333333333
|
|
|
|
key: train_recall
|
|
value: [1. 0.99295775 0.99295775 0.99295775 0.99295775 0.99295775
|
|
0.99300699 0.99300699 1. 0.99295775]
|
|
|
|
mean value: 0.9943760464887226
|
|
|
|
key: test_roc_auc
|
|
value: [0.84375 0.84375 0.84375 0.9375 0.9375 0.9375
|
|
0.90416667 0.96875 0.87083333 0.9 ]
|
|
|
|
mean value: 0.89875
|
|
|
|
key: train_roc_auc
|
|
value: [1. 0.99647887 0.99647887 0.99647887 0.99295775 0.99647887
|
|
0.99298237 0.98946124 0.9965035 0.99647887]
|
|
|
|
mean value: 0.9954299221904855
|
|
|
|
key: test_jcc
|
|
value: [0.73684211 0.72222222 0.75 0.88235294 0.88888889 0.88235294
|
|
0.82352941 0.9375 0.77777778 0.84210526]
|
|
|
|
mean value: 0.8243571551427589
|
|
|
|
key: train_jcc
|
|
value: [1. 0.99295775 0.99295775 0.99295775 0.98601399 0.99295775
|
|
0.98611111 0.97931034 0.99300699 0.99295775]
|
|
|
|
mean value: 0.9909231167354042
|
|
|
|
MCC on Blind test: 0.18
|
|
|
|
Accuracy on Blind test: 0.45
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01128316 0.01111746 0.00982285 0.009413 0.00901008 0.00897694
|
|
0.00881934 0.00902772 0.00959873 0.00934815]
|
|
|
|
mean value: 0.009641742706298828
|
|
|
|
key: score_time
|
|
value: [0.01056886 0.00906277 0.00891089 0.00860476 0.0085628 0.00862837
|
|
0.00838184 0.00824451 0.00853562 0.00857925]
|
|
|
|
mean value: 0.008807969093322755
|
|
|
|
key: test_mcc
|
|
value: [0.81409158 0.68884672 0.875 1. 0.8819171 0.93933644
|
|
0.9375 1. 0.80833333 0.80753845]
|
|
|
|
mean value: 0.8752563621702886
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.90625 0.84375 0.9375 1. 0.9375 0.96875
|
|
0.96774194 1. 0.90322581 0.90322581]
|
|
|
|
mean value: 0.9367943548387097
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.90909091 0.83870968 0.9375 1. 0.94117647 0.96774194
|
|
0.96774194 1. 0.90322581 0.90909091]
|
|
|
|
mean value: 0.9374277643608763
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.88235294 0.86666667 0.9375 1. 0.88888889 1.
|
|
0.9375 1. 0.93333333 0.88235294]
|
|
|
|
mean value: 0.932859477124183
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.9375 0.8125 0.9375 1. 1. 0.9375 1. 1. 0.875 0.9375]
|
|
|
|
mean value: 0.94375
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.90625 0.84375 0.9375 1. 0.9375 0.96875
|
|
0.96875 1. 0.90416667 0.90208333]
|
|
|
|
mean value: 0.936875
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.83333333 0.72222222 0.88235294 1. 0.88888889 0.9375
|
|
0.9375 1. 0.82352941 0.83333333]
|
|
|
|
mean value: 0.8858660130718954
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.02
|
|
|
|
Accuracy on Blind test: 0.22
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.09702897 0.09689069 0.096277 0.09499073 0.09550571 0.09818435
|
|
0.09888387 0.09795642 0.09792018 0.09375334]
|
|
|
|
mean value: 0.09673912525177002
|
|
|
|
key: score_time
|
|
value: [0.01839042 0.01852298 0.0182128 0.01794076 0.01818895 0.01855779
|
|
0.01845098 0.01811409 0.01863813 0.01832128]
|
|
|
|
mean value: 0.018333816528320314
|
|
|
|
key: test_mcc
|
|
value: [0.68884672 0.68884672 0.68884672 0.62994079 0.81409158 0.93933644
|
|
0.9375 1. 0.87083333 0.87770745]
|
|
|
|
mean value: 0.8135949748773968
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.84375 0.84375 0.84375 0.8125 0.90625 0.96875
|
|
0.96774194 1. 0.93548387 0.93548387]
|
|
|
|
mean value: 0.9057459677419355
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.84848485 0.84848485 0.84848485 0.82352941 0.90909091 0.96969697
|
|
0.96774194 1. 0.9375 0.94117647]
|
|
|
|
mean value: 0.9094190242079236
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.82352941 0.82352941 0.82352941 0.77777778 0.88235294 0.94117647
|
|
0.9375 1. 0.9375 0.88888889]
|
|
|
|
mean value: 0.883578431372549
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.875 0.875 0.875 0.875 0.9375 1. 1. 1. 0.9375 1. ]
|
|
|
|
mean value: 0.9375
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.84375 0.84375 0.84375 0.8125 0.90625 0.96875
|
|
0.96875 1. 0.93541667 0.93333333]
|
|
|
|
mean value: 0.905625
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.73684211 0.73684211 0.73684211 0.7 0.83333333 0.94117647
|
|
0.9375 1. 0.88235294 0.88888889]
|
|
|
|
mean value: 0.8393777949776402
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.23
|
|
|
|
Accuracy on Blind test: 0.45
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00812912 0.00792956 0.00760221 0.00785184 0.00799227 0.0079267
|
|
0.00804663 0.00806618 0.00824046 0.00799036]
|
|
|
|
mean value: 0.007977533340454101
|
|
|
|
key: score_time
|
|
value: [0.00857091 0.00843978 0.00853491 0.00855112 0.0085063 0.00849175
|
|
0.00858927 0.00863695 0.00858855 0.00860786]
|
|
|
|
mean value: 0.008551740646362304
|
|
|
|
key: test_mcc
|
|
value: [0.5 0.69991324 0.50395263 0.77459667 0.82717019 0.82717019
|
|
0.74689528 0.82078268 0.35983579 0.6125 ]
|
|
|
|
mean value: 0.6672816673588119
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.75 0.84375 0.75 0.875 0.90625 0.90625
|
|
0.87096774 0.90322581 0.67741935 0.80645161]
|
|
|
|
mean value: 0.8289314516129032
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.75 0.85714286 0.73333333 0.88888889 0.91428571 0.89655172
|
|
0.85714286 0.88888889 0.66666667 0.8125 ]
|
|
|
|
mean value: 0.8265400930487138
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.75 0.78947368 0.78571429 0.8 0.84210526 1.
|
|
0.92307692 1. 0.71428571 0.8125 ]
|
|
|
|
mean value: 0.8417155870445344
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.75 0.9375 0.6875 1. 1. 0.8125 0.8 0.8 0.625 0.8125]
|
|
|
|
mean value: 0.8225
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.75 0.84375 0.75 0.875 0.90625 0.90625
|
|
0.86875 0.9 0.67916667 0.80625 ]
|
|
|
|
mean value: 0.8285416666666667
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.6 0.75 0.57894737 0.8 0.84210526 0.8125
|
|
0.75 0.8 0.5 0.68421053]
|
|
|
|
mean value: 0.7117763157894736
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.12
|
|
|
|
Accuracy on Blind test: 0.49
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.19962478 1.20865035 1.21551681 1.2219255 1.21645331 1.22310948
|
|
1.23031688 1.22133994 1.21362448 1.22138143]
|
|
|
|
mean value: 1.2171942949295045
|
|
|
|
key: score_time
|
|
value: [0.15371752 0.09660053 0.09662104 0.09716916 0.09705114 0.09664798
|
|
0.09718585 0.09721947 0.09733677 0.09707975]
|
|
|
|
mean value: 0.10266292095184326
|
|
|
|
key: test_mcc
|
|
value: [0.81409158 0.875 0.875 0.8819171 0.8819171 1.
|
|
0.9375 1. 1. 0.9372467 ]
|
|
|
|
mean value: 0.9202672483593498
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.90625 0.9375 0.9375 0.9375 0.9375 1.
|
|
0.96774194 1. 1. 0.96774194]
|
|
|
|
mean value: 0.9591733870967742
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.90909091 0.9375 0.9375 0.94117647 0.94117647 1.
|
|
0.96774194 1. 1. 0.96969697]
|
|
|
|
mean value: 0.9603882755448221
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.88235294 0.9375 0.9375 0.88888889 0.88888889 1.
|
|
0.9375 1. 1. 0.94117647]
|
|
|
|
mean value: 0.9413807189542484
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.9375 0.9375 0.9375 1. 1. 1. 1. 1. 1. 1. ]
|
|
|
|
mean value: 0.98125
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.90625 0.9375 0.9375 0.9375 0.9375 1.
|
|
0.96875 1. 1. 0.96666667]
|
|
|
|
mean value: 0.9591666666666667
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.83333333 0.88235294 0.88235294 0.88888889 0.88888889 1.
|
|
0.9375 1. 1. 0.94117647]
|
|
|
|
mean value: 0.9254493464052287
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.09
|
|
|
|
Accuracy on Blind test: 0.21
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.87549925 0.98825598 0.90560436 0.87923479 0.87618375 0.90693331
|
|
0.90976977 0.86167288 0.91743851 0.89844847]
|
|
|
|
mean value: 0.9019041061401367
|
|
|
|
key: score_time
|
|
value: [0.26297545 0.16814804 0.23696327 0.21888471 0.23352385 0.23716521
|
|
0.25041318 0.24322152 0.20568323 0.21444941]
|
|
|
|
mean value: 0.2271427869796753
|
|
|
|
key: test_mcc
|
|
value: [0.68884672 0.875 0.81409158 0.8819171 0.8819171 0.93933644
|
|
0.9375 1. 1. 0.9372467 ]
|
|
|
|
mean value: 0.8955855640414887
|
|
|
|
key: train_mcc
|
|
value: [0.96500412 0.95812669 0.95091647 0.93775982 0.94403659 0.94403659
|
|
0.95108379 0.94422558 0.94423649 0.94423649]
|
|
|
|
mean value: 0.9483662624447999
|
|
|
|
key: test_accuracy
|
|
value: [0.84375 0.9375 0.90625 0.9375 0.9375 0.96875
|
|
0.96774194 1. 1. 0.96774194]
|
|
|
|
mean value: 0.9466733870967742
|
|
|
|
key: train_accuracy
|
|
value: [0.98239437 0.97887324 0.97535211 0.96830986 0.97183099 0.97183099
|
|
0.9754386 0.97192982 0.97192982 0.97192982]
|
|
|
|
mean value: 0.9739819619471214
|
|
|
|
key: test_fscore
|
|
value: [0.84848485 0.9375 0.90909091 0.94117647 0.94117647 0.96969697
|
|
0.96774194 1. 1. 0.96969697]
|
|
|
|
mean value: 0.9484564573630039
|
|
|
|
key: train_fscore
|
|
value: [0.9825784 0.97916667 0.97560976 0.96907216 0.97222222 0.97222222
|
|
0.97577855 0.97241379 0.97222222 0.97222222]
|
|
|
|
mean value: 0.9743508213630365
|
|
|
|
key: test_precision
|
|
value: [0.82352941 0.9375 0.88235294 0.88888889 0.88888889 0.94117647
|
|
0.9375 1. 1. 0.94117647]
|
|
|
|
mean value: 0.9241013071895424
|
|
|
|
key: train_precision
|
|
value: [0.97241379 0.96575342 0.96551724 0.94630872 0.95890411 0.95890411
|
|
0.96575342 0.95918367 0.95890411 0.95890411]
|
|
|
|
mean value: 0.9610546720455594
|
|
|
|
key: test_recall
|
|
value: [0.875 0.9375 0.9375 1. 1. 1. 1. 1. 1. 1. ]
|
|
|
|
mean value: 0.975
|
|
|
|
key: train_recall
|
|
value: [0.99295775 0.99295775 0.98591549 0.99295775 0.98591549 0.98591549
|
|
0.98601399 0.98601399 0.98591549 0.98591549]
|
|
|
|
mean value: 0.9880478676253325
|
|
|
|
key: test_roc_auc
|
|
value: [0.84375 0.9375 0.90625 0.9375 0.9375 0.96875
|
|
0.96875 1. 1. 0.96666667]
|
|
|
|
mean value: 0.9466666666666667
|
|
|
|
key: train_roc_auc
|
|
value: [0.98239437 0.97887324 0.97535211 0.96830986 0.97183099 0.97183099
|
|
0.97540136 0.97188023 0.97197873 0.97197873]
|
|
|
|
mean value: 0.9739830591943268
|
|
|
|
key: test_jcc
|
|
value: [0.73684211 0.88235294 0.83333333 0.88888889 0.88888889 0.94117647
|
|
0.9375 1. 1. 0.94117647]
|
|
|
|
mean value: 0.905015909872721
|
|
|
|
key: train_jcc
|
|
value: [0.96575342 0.95918367 0.95238095 0.94 0.94594595 0.94594595
|
|
0.9527027 0.94630872 0.94594595 0.94594595]
|
|
|
|
mean value: 0.9500113261826575
|
|
|
|
MCC on Blind test: 0.15
|
|
|
|
Accuracy on Blind test: 0.3
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', BernoulliNB())])

key: fit_time
value: [0.01972842 0.00707984 0.00703955 0.00708413 0.00708127 0.00711823
 0.0071497 0.00713921 0.00711536 0.00711989]

mean value: 0.008365559577941894

key: score_time
value: [0.00945282 0.00773811 0.00782251 0.00773239 0.00773835 0.00778341
 0.00787854 0.00775337 0.00779319 0.00774002]

mean value: 0.007943272590637207

key: test_mcc
value: [0.68884672 0.56360186 0.68884672 0.625 0.438357 0.68884672
 0.48954403 0.48333333 0.55573827 0.55573827]

mean value: 0.5777852941864914

key: train_mcc
value: [0.64814452 0.64814452 0.6479516 0.63405443 0.65572679 0.62714946
 0.62393794 0.65616074 0.64212548 0.6494089 ]

mean value: 0.6432804381067745

key: test_accuracy
value: [0.84375 0.78125 0.84375 0.8125 0.71875 0.84375
 0.74193548 0.74193548 0.77419355 0.77419355]

mean value: 0.7876008064516129

key: train_accuracy
value: [0.82394366 0.82394366 0.82394366 0.81690141 0.82746479 0.81338028
 0.81052632 0.82807018 0.82105263 0.8245614 ]

mean value: 0.8213787991104522

key: test_fscore
value: [0.83870968 0.77419355 0.84848485 0.8125 0.70967742 0.84848485
 0.75 0.73333333 0.8 0.8 ]

mean value: 0.791538367546432

key: train_fscore
value: [0.82638889 0.82638889 0.82269504 0.81944444 0.83161512 0.816609
 0.82 0.82807018 0.82105263 0.82638889]

mean value: 0.8238653070404355

key: test_precision
value: [0.86666667 0.8 0.82352941 0.8125 0.73333333 0.82352941
 0.70588235 0.73333333 0.73684211 0.73684211]

mean value: 0.7772458720330238

key: train_precision
value: [0.81506849 0.81506849 0.82857143 0.80821918 0.81208054 0.80272109
 0.78343949 0.83098592 0.81818182 0.81506849]

mean value: 0.8129404935574437

key: test_recall
value: [0.8125 0.75 0.875 0.8125 0.6875 0.875
 0.8 0.73333333 0.875 0.875 ]

mean value: 0.8095833333333333

key: train_recall
value: [0.83802817 0.83802817 0.81690141 0.83098592 0.85211268 0.83098592
 0.86013986 0.82517483 0.82394366 0.83802817]

mean value: 0.8354328769821727

key: test_roc_auc
value: [0.84375 0.78125 0.84375 0.8125 0.71875 0.84375
 0.74375 0.74166667 0.77083333 0.77083333]

mean value: 0.7870833333333334

key: train_roc_auc
value: [0.82394366 0.82394366 0.82394366 0.81690141 0.82746479 0.81338028
 0.81035162 0.82808037 0.82106274 0.82460849]

mean value: 0.8213680685511672

key: test_jcc
value: [0.72222222 0.63157895 0.73684211 0.68421053 0.55 0.73684211
 0.6 0.57894737 0.66666667 0.66666667]

mean value: 0.6573976608187134

key: train_jcc
value: [0.70414201 0.70414201 0.69879518 0.69411765 0.71176471 0.69005848
 0.69491525 0.70658683 0.69642857 0.70414201]

mean value: 0.7005092700712355

MCC on Blind test: 0.19

Accuracy on Blind test: 0.54

Model_name: XGBoost
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
interaction_constraints=None, learning_rate=None,
max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan,
monotone_constraints=None, n_estimators=100,
n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None,
reg_lambda=None, scale_pos_weight=None,
subsample=None, tree_method=None,
use_label_encoder=False,
validate_parameters=None, verbosity=0))])

key: fit_time
value: [0.10866404 0.04417276 0.08032727 0.0377512 0.03836942 0.03934383
 0.0412488 0.73144245 0.03698397 0.03865409]

mean value: 0.11969578266143799

key: score_time
value: [0.0095489 0.00957394 0.00984144 0.00939536 0.0093596 0.00946164
 0.00942516 0.00999594 0.01063395 0.00950313]

mean value: 0.00967390537261963

key: test_mcc
value: [0.81409158 0.81409158 0.875 0.93933644 0.8819171 1.
 0.9375 1. 0.9375 0.87770745]

mean value: 0.9077144148609821

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.90625 0.90625 0.9375 0.96875 0.9375 1.
 0.96774194 1. 0.96774194 0.93548387]

mean value: 0.9527217741935484

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.90909091 0.90322581 0.9375 0.96969697 0.94117647 1.
 0.96774194 1. 0.96774194 0.94117647]

mean value: 0.9537350497383704

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.88235294 0.93333333 0.9375 0.94117647 0.88888889 1.
 0.9375 1. 1. 0.88888889]

mean value: 0.9409640522875817

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.9375 0.875 0.9375 1. 1. 1. 1. 1. 0.9375 1. ]

mean value: 0.96875

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.90625 0.90625 0.9375 0.96875 0.9375 1.
 0.96875 1. 0.96875 0.93333333]

mean value: 0.9527083333333334

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.83333333 0.82352941 0.88235294 0.94117647 0.88888889 1.
 0.9375 1. 0.9375 0.88888889]

mean value: 0.9133169934640523

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.06

Accuracy on Blind test: 0.2

Model_name: LDA
Model func: LinearDiscriminantAnalysis()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LinearDiscriminantAnalysis())])

key: fit_time
value: [0.01153302 0.0145793 0.014395 0.0144124 0.01459241 0.0143168
 0.01439118 0.01458287 0.01461196 0.0144515 ]

mean value: 0.014186644554138183

key: score_time
value: [0.01013279 0.01050425 0.0104897 0.0105176 0.01051116 0.01054525
 0.01043344 0.01050496 0.01060581 0.01054263]

mean value: 0.010478758811950683

key: test_mcc
value: [0.81409158 0.81409158 0.93933644 1. 0.8819171 1.
 0.87083333 1. 1. 0.9372467 ]

mean value: 0.9257516728277053

key: train_mcc
value: [0.95812669 0.95812669 0.94403659 0.93720088 0.94403659 0.93720088
 0.95108379 0.95145657 0.94470481 0.9582759 ]

mean value: 0.948424939171215

key: test_accuracy
value: [0.90625 0.90625 0.96875 1. 0.9375 1.
 0.93548387 1. 1. 0.96774194]

mean value: 0.9621975806451613

key: train_accuracy
value: [0.97887324 0.97887324 0.97183099 0.96830986 0.97183099 0.96830986
 0.9754386 0.9754386 0.97192982 0.97894737]

mean value: 0.9739782554978997

key: test_fscore
value: [0.90909091 0.90322581 0.96969697 1. 0.94117647 1.
 0.93333333 1. 1. 0.96969697]

mean value: 0.962622045885803

key: train_fscore
value: [0.97916667 0.97916667 0.97222222 0.96885813 0.97222222 0.96885813
 0.97577855 0.97594502 0.97241379 0.97916667]

mean value: 0.9743798064418605

key: test_precision
value: [0.88235294 0.93333333 0.94117647 1. 0.88888889 1.
 0.93333333 1. 1. 0.94117647]

mean value: 0.9520261437908497

key: train_precision
value: [0.96575342 0.96575342 0.95890411 0.95238095 0.95890411 0.95238095
 0.96575342 0.95945946 0.9527027 0.96575342]

mean value: 0.9597745984732285

key: test_recall
value: [0.9375 0.875 1. 1. 1. 1.
 0.93333333 1. 1. 1. ]

mean value: 0.9745833333333334

key: train_recall
value: [0.99295775 0.99295775 0.98591549 0.98591549 0.98591549 0.98591549
 0.98601399 0.99300699 0.99295775 0.99295775]

mean value: 0.9894513936767458

key: test_roc_auc
value: [0.90625 0.90625 0.96875 1. 0.9375 1.
 0.93541667 1. 1. 0.96666667]

mean value: 0.9620833333333333

key: train_roc_auc
value: [0.97887324 0.97887324 0.97183099 0.96830986 0.97183099 0.96830986
 0.97540136 0.97537674 0.97200335 0.97899636]

mean value: 0.9739805968679208

key: test_jcc
value: [0.83333333 0.82352941 0.94117647 1. 0.88888889 1.
 0.875 1. 1. 0.94117647]

mean value: 0.9303104575163399

key: train_jcc
value: [0.95918367 0.95918367 0.94594595 0.93959732 0.94594595 0.93959732
 0.9527027 0.95302013 0.94630872 0.95918367]

mean value: 0.9500669104935644

MCC on Blind test: 0.16

Accuracy on Blind test: 0.36

Model_name: Multinomial
Model func: MultinomialNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MultinomialNB())])

key: fit_time
value: [0.00940704 0.00748181 0.00721669 0.00730562 0.00781703 0.00796103
 0.0077672 0.00788808 0.00784159 0.00789857]

mean value: 0.007858467102050782

key: score_time
value: [0.00908256 0.00800729 0.00791621 0.00769114 0.00820541 0.00846505
 0.00846457 0.00859737 0.00852084 0.00845647]

mean value: 0.008340692520141602

key: test_mcc
value: [0.62994079 0.50395263 0.62994079 0.68884672 0.62994079 0.75592895
 0.67916667 0.61925228 0.74689528 0.66057826]

mean value: 0.6544443153383147

key: train_mcc
value: [0.67386056 0.69575325 0.68038921 0.67508446 0.67277821 0.66621443
 0.66189073 0.68037155 0.67635913 0.66649204]

mean value: 0.67491935676675

key: test_accuracy
value: [0.8125 0.75 0.8125 0.84375 0.8125 0.875
 0.83870968 0.80645161 0.87096774 0.80645161]

mean value: 0.822883064516129

key: train_accuracy
value: [0.83450704 0.84507042 0.83802817 0.83450704 0.83450704 0.83098592
 0.82807018 0.83859649 0.83508772 0.83157895]

mean value: 0.835093896713615

key: test_fscore
value: [0.8 0.76470588 0.82352941 0.84848485 0.82352941 0.88235294
 0.83870968 0.8125 0.88235294 0.84210526]

mean value: 0.8318270377297392

key: train_fscore
value: [0.84385382 0.85430464 0.84666667 0.84488449 0.84280936 0.84
 0.83934426 0.84666667 0.84488449 0.83892617]

mean value: 0.844234056793084

key: test_precision
value: [0.85714286 0.72222222 0.77777778 0.82352941 0.77777778 0.83333333
 0.8125 0.76470588 0.83333333 0.72727273]

mean value: 0.7929595322977676

key: train_precision
value: [0.79874214 0.80625 0.80379747 0.79503106 0.80254777 0.79746835
 0.79012346 0.8089172 0.79503106 0.80128205]

mean value: 0.7999190549175873

key: test_recall
value: [0.75 0.8125 0.875 0.875 0.875 0.9375
 0.86666667 0.86666667 0.9375 1. ]

mean value: 0.8795833333333334

key: train_recall
value: [0.8943662 0.9084507 0.8943662 0.90140845 0.88732394 0.88732394
 0.8951049 0.88811189 0.90140845 0.88028169]

mean value: 0.8938146360681573

key: test_roc_auc
value: [0.8125 0.75 0.8125 0.84375 0.8125 0.875
 0.83958333 0.80833333 0.86875 0.8 ]

mean value: 0.8222916666666666

key: train_roc_auc
value: [0.83450704 0.84507042 0.83802817 0.83450704 0.83450704 0.83098592
 0.82783414 0.83842214 0.83531961 0.83174924]

mean value: 0.8350930759381464

key: test_jcc
value: [0.66666667 0.61904762 0.7 0.73684211 0.7 0.78947368
 0.72222222 0.68421053 0.78947368 0.72727273]

mean value: 0.7135209235209236

key: train_jcc
value: [0.72988506 0.74566474 0.73410405 0.73142857 0.7283237 0.72413793
 0.72316384 0.73410405 0.73142857 0.72254335]

mean value: 0.7304783857563864

MCC on Blind test: 0.22

Accuracy on Blind test: 0.54

Model_name: Passive Aggresive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.00985122 0.01013541 0.01139951 0.01236129 0.01133943 0.01101375
 0.01197648 0.01175308 0.01180387 0.0115931 ]

mean value: 0.011322712898254395

key: score_time
value: [0.00835967 0.01045895 0.01057839 0.01042914 0.01039219 0.01044703
 0.01043487 0.01063824 0.01045942 0.01041865]

mean value: 0.0102616548538208

key: test_mcc
value: [0.75592895 0.68884672 0.8819171 0.67419986 0.8819171 0.81409158
 0.87866878 0.9375 0.87083333 0.9372467 ]

mean value: 0.8321150124795701

key: train_mcc
value: [0.97183099 0.92966968 0.93775982 0.8661418 0.92365817 0.90901439
 0.95798651 0.9114673 0.78397114 0.94395469]

mean value: 0.9135454491091561

key: test_accuracy
value: [0.875 0.84375 0.9375 0.8125 0.9375 0.90625
 0.93548387 0.96774194 0.93548387 0.96774194]

mean value: 0.9118951612903226

key: train_accuracy
value: [0.98591549 0.96478873 0.96830986 0.92957746 0.96126761 0.95422535
 0.97894737 0.95438596 0.89122807 0.97192982]

mean value: 0.9560575735112429

key: test_fscore
value: [0.88235294 0.83870968 0.94117647 0.76923077 0.94117647 0.90909091
 0.9375 0.96774194 0.9375 0.96969697]

mean value: 0.9094176143274815

key: train_fscore
value: [0.98591549 0.96503497 0.96907216 0.92481203 0.96219931 0.9550173
 0.97916667 0.95622896 0.88727273 0.97202797]

mean value: 0.9556747588965514

key: test_precision
value: [0.83333333 0.86666667 0.88888889 1. 0.88888889 0.88235294
 0.88235294 0.9375 0.9375 0.94117647]

mean value: 0.9058660130718954

key: train_precision
value: [0.98591549 0.95833333 0.94630872 0.99193548 0.93959732 0.93877551
 0.97241379 0.92207792 0.91729323 0.96527778]

mean value: 0.9537928586676441

key: test_recall
value: [0.9375 0.8125 1. 0.625 1. 0.9375 1. 1. 0.9375 1. ]

mean value: 0.925

key: train_recall
value: [0.98591549 0.97183099 0.99295775 0.86619718 0.98591549 0.97183099
 0.98601399 0.99300699 0.85915493 0.97887324]

mean value: 0.9591697035359007

key: test_roc_auc
value: [0.875 0.84375 0.9375 0.8125 0.9375 0.90625
 0.9375 0.96875 0.93541667 0.96666667]

mean value: 0.9120833333333334

key: train_roc_auc
value: [0.98591549 0.96478873 0.96830986 0.92957746 0.96126761 0.95422535
 0.97892249 0.95424998 0.89111593 0.9719541 ]

mean value: 0.9560326996946715

key: test_jcc
value: [0.78947368 0.72222222 0.88888889 0.625 0.88888889 0.83333333
 0.88235294 0.9375 0.88235294 0.94117647]

mean value: 0.8391189370485036

key: train_jcc
value: [0.97222222 0.93243243 0.94 0.86013986 0.92715232 0.91390728
 0.95918367 0.91612903 0.79738562 0.94557823]

mean value: 0.9164130675378523

MCC on Blind test: 0.18

Accuracy on Blind test: 0.49

Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01099372 0.01085377 0.01167774 0.011415 0.01157284 0.01085854
 0.01104522 0.01073241 0.01113367 0.01111031]

mean value: 0.011139321327209472

key: score_time
value: [0.0103929 0.0103898 0.01037788 0.0103898 0.01040602 0.01039386
 0.01040792 0.01042295 0.01051378 0.01045513]

mean value: 0.010415005683898925

key: test_mcc
value: [0.44539933 0.32025631 0.81409158 0.57735027 0.77459667 0.75592895
 0.87866878 0.9375 0.87083333 0.76594169]

mean value: 0.714056690328539

key: train_mcc
value: [0.87107074 0.62077843 0.83774371 0.57207859 0.80452795 0.84114227
 0.89199759 0.83981496 0.95090121 0.86664533]

mean value: 0.8096700777785382

key: test_accuracy
value: [0.71875 0.625 0.90625 0.75 0.875 0.875
 0.93548387 0.96774194 0.93548387 0.87096774]

mean value: 0.8459677419354839

key: train_accuracy
value: [0.93309859 0.77816901 0.91549296 0.75 0.8943662 0.91549296
 0.94385965 0.91578947 0.9754386 0.92982456]

mean value: 0.8951531999011614

key: test_fscore
value: [0.68965517 0.45454545 0.90909091 0.66666667 0.88888889 0.88235294
 0.9375 0.96774194 0.9375 0.88888889]

mean value: 0.8222830857154942

key: train_fscore
value: [0.92936803 0.71493213 0.9205298 0.66976744 0.90384615 0.92156863
 0.94666667 0.92156863 0.9754386 0.93377483]

mean value: 0.8837460905964674

key: test_precision
value: [0.76923077 0.83333333 0.88235294 1. 0.8 0.83333333
 0.88235294 0.9375 0.9375 0.8 ]

mean value: 0.8675603318250378

key: train_precision
value: [0.98425197 1. 0.86875 0.98630137 0.82941176 0.8597561
 0.9044586 0.86503067 0.97202797 0.88125 ]

mean value: 0.9151238446234521

key: test_recall
value: [0.625 0.3125 0.9375 0.5 1. 0.9375 1. 1. 0.9375 1. ]

mean value: 0.825

key: train_recall
value: [0.88028169 0.55633803 0.97887324 0.50704225 0.99295775 0.99295775
 0.99300699 0.98601399 0.97887324 0.99295775]

mean value: 0.8859302669161824

key: test_roc_auc
value: [0.71875 0.625 0.90625 0.75 0.875 0.875
 0.9375 0.96875 0.93541667 0.86666667]

mean value: 0.8458333333333333

key: train_roc_auc
value: [0.93309859 0.77816901 0.91549296 0.75 0.8943662 0.91549296
 0.9436866 0.9155422 0.97545061 0.93004531]

mean value: 0.8951344430217669

key: test_jcc
value: [0.52631579 0.29411765 0.83333333 0.5 0.8 0.78947368
 0.88235294 0.9375 0.88235294 0.8 ]

mean value: 0.7245446336429309

key: train_jcc
value: [0.86805556 0.55633803 0.85276074 0.5034965 0.8245614 0.85454545
 0.89873418 0.85454545 0.95205479 0.8757764 ]

mean value: 0.8040868505268339

MCC on Blind test: 0.08

Accuracy on Blind test: 0.19

Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.09240365 0.08133793 0.08125806 0.08046818 0.08067393 0.08131313
|
|
0.08116984 0.0813272 0.08116603 0.08120728]
|
|
|
|
mean value: 0.08223252296447754
|
|
|
|
key: score_time
|
|
value: [0.01535177 0.0154326 0.01515222 0.01522565 0.01519728 0.0153811
|
|
0.01536131 0.01532435 0.01529288 0.01531577]
|
|
|
|
mean value: 0.015303492546081543
|
|
|
|
key: test_mcc
|
|
value: [0.81409158 0.875 0.93933644 0.81409158 0.93933644 1.
|
|
0.9375 1. 1. 0.87770745]
|
|
|
|
mean value: 0.9197063481549348
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.90625 0.9375 0.96875 0.90625 0.96875 1.
|
|
0.96774194 1. 1. 0.93548387]
|
|
|
|
mean value: 0.9590725806451613
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.90909091 0.9375 0.96969697 0.90322581 0.96969697 1.
|
|
0.96774194 1. 1. 0.94117647]
|
|
|
|
mean value: 0.9598129061008568
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.88235294 0.9375 0.94117647 0.93333333 0.94117647 1.
|
|
0.9375 1. 1. 0.88888889]
|
|
|
|
mean value: 0.9461928104575164
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.9375 0.9375 1. 0.875 1. 1. 1. 1. 1. 1. ]
|
|
|
|
mean value: 0.975
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.90625 0.9375 0.96875 0.90625 0.96875 1.
|
|
0.96875 1. 1. 0.93333333]
|
|
|
|
mean value: 0.9589583333333334
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.83333333 0.88235294 0.94117647 0.82352941 0.94117647 1.
|
|
0.9375 1. 1. 0.88888889]
|
|
|
|
mean value: 0.9247957516339869
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.05
|
|
|
|
Accuracy on Blind test: 0.19
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03232431 0.02845907 0.02876759 0.02964163 0.04149294 0.0316689
|
|
0.04340553 0.03582311 0.04470134 0.04033637]
|
|
|
|
mean value: 0.035662078857421876
|
|
|
|
key: score_time
|
|
value: [0.01759839 0.02218199 0.01889658 0.01943088 0.02917433 0.03231716
|
|
0.03496408 0.03411865 0.02071142 0.01735568]
|
|
|
|
mean value: 0.02467491626739502
|
|
|
|
key: test_mcc
|
|
value: [0.81409158 0.81409158 0.875 0.93933644 1. 1.
|
|
0.87866878 1. 0.87866878 0.9372467 ]
|
|
|
|
mean value: 0.913710384964254
|
|
|
|
key: train_mcc
|
|
value: [0.99298237 1. 0.99298237 1. 0.99298237 0.98591549
|
|
1. 0.98596474 0.99300665 0.98596474]
|
|
|
|
mean value: 0.9929798730055359
|
|
|
|
key: test_accuracy
|
|
value: [0.90625 0.90625 0.9375 0.96875 1. 1.
|
|
0.93548387 1. 0.93548387 0.96774194]
|
|
|
|
mean value: 0.9557459677419354
|
|
|
|
key: train_accuracy
|
|
value: [0.99647887 1. 0.99647887 1. 0.99647887 0.99295775
|
|
1. 0.99298246 0.99649123 0.99298246]
|
|
|
|
mean value: 0.996485050654806
|
|
|
|
key: test_fscore
|
|
value: [0.90909091 0.90322581 0.9375 0.96774194 1. 1.
|
|
0.9375 1. 0.93333333 0.96969697]
|
|
|
|
mean value: 0.9558088954056696
|
|
|
|
key: train_fscore
|
|
value: [0.99646643 1. 0.99646643 1. 0.99646643 0.99295775
|
|
1. 0.99300699 0.99646643 0.99295775]
|
|
|
|
mean value: 0.9964788210346365
|
|
|
|
key: test_precision
|
|
value: [0.88235294 0.93333333 0.9375 1. 1. 1.
|
|
0.88235294 1. 1. 0.94117647]
|
|
|
|
mean value: 0.957671568627451
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 0.99295775
|
|
1. 0.99300699 1. 0.99295775]
|
|
|
|
mean value: 0.997892248596474
|
|
|
|
key: test_recall
|
|
value: [0.9375 0.875 0.9375 0.9375 1. 1. 1. 1. 0.875 1. ]
|
|
|
|
mean value: 0.95625
|
|
|
|
key: train_recall
|
|
value: [0.99295775 1. 0.99295775 1. 0.99295775 0.99295775
|
|
1. 0.99300699 0.99295775 0.99295775]
|
|
|
|
mean value: 0.9950753471880233
|
|
|
|
key: test_roc_auc
|
|
value: [0.90625 0.90625 0.9375 0.96875 1. 1.
|
|
0.9375 1. 0.9375 0.96666667]
|
|
|
|
mean value: 0.9560416666666667
|
|
|
|
key: train_roc_auc
|
|
value: [0.99647887 1. 0.99647887 1. 0.99647887 0.99295775
|
|
1. 0.99298237 0.99647887 0.99298237]
|
|
|
|
mean value: 0.9964837978922486
|
|
|
|
key: test_jcc
|
|
value: [0.83333333 0.82352941 0.88235294 0.9375 1. 1.
|
|
0.88235294 1. 0.875 0.94117647]
|
|
|
|
mean value: 0.9175245098039215
|
|
|
|
key: train_jcc
|
|
value: [0.99295775 1. 0.99295775 1. 0.99295775 0.98601399
|
|
1. 0.98611111 0.99295775 0.98601399]
|
|
|
|
mean value: 0.9929970069054577
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.2
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.05620861 0.0974803 0.04944372 0.04494715 0.06924796 0.05304265
|
|
0.03282356 0.03296423 0.03626871 0.06697369]
|
|
|
|
mean value: 0.0539400577545166
|
|
|
|
key: score_time
|
|
value: [0.02177811 0.02000475 0.01141953 0.01396704 0.02782083 0.01147771
|
|
0.011482 0.01143765 0.01138997 0.02080035]
|
|
|
|
mean value: 0.01615779399871826
|
|
|
|
key: test_mcc
|
|
value: [0.62994079 0.438357 0.56360186 0.68884672 0.75 0.68884672
|
|
0.80833333 0.74166667 0.68826048 0.76594169]
|
|
|
|
mean value: 0.6763795258534475
|
|
|
|
key: train_mcc
|
|
value: [0.8612933 0.86052165 0.83971646 0.85382934 0.85314992 0.86794223
|
|
0.84766497 0.84023701 0.85436741 0.84697783]
|
|
|
|
mean value: 0.8525700111060143
|
|
|
|
key: test_accuracy
|
|
value: [0.8125 0.71875 0.78125 0.84375 0.875 0.84375
|
|
0.90322581 0.87096774 0.83870968 0.87096774]
|
|
|
|
mean value: 0.8358870967741936
|
|
|
|
key: train_accuracy
|
|
value: [0.92957746 0.92957746 0.91901408 0.92605634 0.92605634 0.93309859
|
|
0.92280702 0.91929825 0.92631579 0.92280702]
|
|
|
|
mean value: 0.925460835186558
|
|
|
|
key: test_fscore
|
|
value: [0.8 0.70967742 0.78787879 0.84848485 0.875 0.84848485
|
|
0.90322581 0.86666667 0.85714286 0.88888889]
|
|
|
|
mean value: 0.838545012335335
|
|
|
|
key: train_fscore
|
|
value: [0.93197279 0.93150685 0.92150171 0.92832765 0.92783505 0.93515358
|
|
0.92567568 0.9220339 0.92832765 0.92465753]
|
|
|
|
mean value: 0.9276992378409221
|
|
|
|
key: test_precision
|
|
value: [0.85714286 0.73333333 0.76470588 0.82352941 0.875 0.82352941
|
|
0.875 0.86666667 0.78947368 0.8 ]
|
|
|
|
mean value: 0.8208381247235736
|
|
|
|
key: train_precision
|
|
value: [0.90131579 0.90666667 0.89403974 0.90066225 0.90604027 0.90728477
|
|
0.89542484 0.89473684 0.90066225 0.9 ]
|
|
|
|
mean value: 0.9006833409925814
|
|
|
|
key: test_recall
|
|
value: [0.75 0.6875 0.8125 0.875 0.875 0.875
|
|
0.93333333 0.86666667 0.9375 1. ]
|
|
|
|
mean value: 0.86125
|
|
|
|
key: train_recall
|
|
value: [0.96478873 0.95774648 0.95070423 0.95774648 0.95070423 0.96478873
|
|
0.95804196 0.95104895 0.95774648 0.95070423]
|
|
|
|
mean value: 0.9564020486555698
|
|
|
|
key: test_roc_auc
|
|
value: [0.8125 0.71875 0.78125 0.84375 0.875 0.84375
|
|
0.90416667 0.87083333 0.83541667 0.86666667]
|
|
|
|
mean value: 0.8352083333333333
|
|
|
|
key: train_roc_auc
|
|
value: [0.92957746 0.92957746 0.91901408 0.92605634 0.92605634 0.93309859
|
|
0.92268295 0.91918645 0.92642569 0.92290456]
|
|
|
|
mean value: 0.9254579927115139
|
|
|
|
key: test_jcc
|
|
value: [0.66666667 0.55 0.65 0.73684211 0.77777778 0.73684211
|
|
0.82352941 0.76470588 0.75 0.8 ]
|
|
|
|
mean value: 0.7256363949088407
|
|
|
|
key: train_jcc
|
|
value: [0.87261146 0.87179487 0.85443038 0.86624204 0.86538462 0.87820513
|
|
0.86163522 0.85534591 0.86624204 0.85987261]
|
|
|
|
mean value: 0.8651764280073164
|
|
|
|
MCC on Blind test: 0.18
|
|
|
|
Accuracy on Blind test: 0.54
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.16305089 0.16052961 0.15664053 0.15771556 0.15495181 0.15255976
|
|
0.15471911 0.15581322 0.15795827 0.15890527]
|
|
|
|
mean value: 0.15728440284729003
|
|
|
|
key: score_time
|
|
value: [0.00907922 0.00902605 0.00912547 0.0093677 0.00861001 0.00851989
|
|
0.00923562 0.00842047 0.00907159 0.00920391]
|
|
|
|
mean value: 0.00896599292755127
|
|
|
|
key: test_mcc
|
|
value: [0.81409158 0.875 0.875 1. 1. 1.
|
|
0.9375 1. 1. 0.9372467 ]
|
|
|
|
mean value: 0.9438838276217104
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.90625 0.9375 0.9375 1. 1. 1.
|
|
0.96774194 1. 1. 0.96774194]
|
|
|
|
mean value: 0.9716733870967742
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.90909091 0.9375 0.9375 1. 1. 1.
|
|
0.96774194 1. 1. 0.96969697]
|
|
|
|
mean value: 0.972152981427175
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.88235294 0.9375 0.9375 1. 1. 1.
|
|
0.9375 1. 1. 0.94117647]
|
|
|
|
mean value: 0.9636029411764706
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.9375 0.9375 0.9375 1. 1. 1. 1. 1. 1. 1. ]
|
|
|
|
mean value: 0.98125
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.90625 0.9375 0.9375 1. 1. 1.
|
|
0.96875 1. 1. 0.96666667]
|
|
|
|
mean value: 0.9716666666666667
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.83333333 0.88235294 0.88235294 1. 1. 1.
|
|
0.9375 1. 1. 0.94117647]
|
|
|
|
mean value: 0.9476715686274509
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.05
|
|
|
|
Accuracy on Blind test: 0.19
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.01113367 0.01251006 0.01261091 0.01772285 0.01263809 0.01343751
|
|
0.0127852 0.01266503 0.01297426 0.01285744]
|
|
|
|
mean value: 0.013133502006530762
|
|
|
|
key: score_time
|
|
value: [0.01069093 0.01078391 0.0107305 0.01083326 0.01099324 0.01084566
|
|
0.01136661 0.01079345 0.01084757 0.01162291]
|
|
|
|
mean value: 0.010950803756713867
|
|
|
|
key: test_mcc
|
|
value: [0.68884672 0.59215653 0.81409158 0.56360186 0.77459667 0.75
|
|
0.74896053 0.54812195 0.53006813 0.82078268]
|
|
|
|
mean value: 0.6831226650318738
|
|
|
|
key: train_mcc
|
|
value: [0.8145351 0.86223926 0.87332606 0.85924016 0.86725157 0.87541287
|
|
0.7742616 0.84773912 0.81144956 0.88848951]
|
|
|
|
mean value: 0.8473944811490282
|
|
|
|
key: test_accuracy
|
|
value: [0.84375 0.78125 0.90625 0.78125 0.875 0.875
|
|
0.87096774 0.77419355 0.74193548 0.90322581]
|
|
|
|
mean value: 0.8352822580645162
|
|
|
|
key: train_accuracy
|
|
value: [0.90140845 0.92957746 0.93661972 0.92957746 0.93309859 0.93661972
|
|
0.88070175 0.92280702 0.90175439 0.94385965]
|
|
|
|
mean value: 0.9216024215468248
|
|
|
|
key: test_fscore
|
|
value: [0.83870968 0.81081081 0.90322581 0.78787879 0.88888889 0.875
|
|
0.875 0.75862069 0.69230769 0.91428571]
|
|
|
|
mean value: 0.8344728067698034
|
|
|
|
key: train_fscore
|
|
value: [0.89230769 0.92647059 0.93706294 0.92907801 0.9347079 0.93430657
|
|
0.89102564 0.92028986 0.89393939 0.94244604]
|
|
|
|
mean value: 0.9201634638116422
|
|
|
|
key: test_precision
|
|
value: [0.86666667 0.71428571 0.93333333 0.76470588 0.8 0.875
|
|
0.82352941 0.78571429 0.9 0.84210526]
|
|
|
|
mean value: 0.8305340557275542
|
|
|
|
key: train_precision
|
|
value: [0.98305085 0.96923077 0.93055556 0.93571429 0.91275168 0.96969697
|
|
0.82248521 0.95488722 0.96721311 0.96323529]
|
|
|
|
mean value: 0.9408820939525007
|
|
|
|
key: test_recall
|
|
value: [0.8125 0.9375 0.875 0.8125 1. 0.875
|
|
0.93333333 0.73333333 0.5625 1. ]
|
|
|
|
mean value: 0.8541666666666666
|
|
|
|
key: train_recall
|
|
value: [0.81690141 0.88732394 0.94366197 0.92253521 0.95774648 0.90140845
|
|
0.97202797 0.88811189 0.83098592 0.92253521]
|
|
|
|
mean value: 0.9043238451689156
|
|
|
|
key: test_roc_auc
|
|
value: [0.84375 0.78125 0.90625 0.78125 0.875 0.875
|
|
0.87291667 0.77291667 0.74791667 0.9 ]
|
|
|
|
mean value: 0.8356250000000001
|
|
|
|
key: train_roc_auc
|
|
value: [0.90140845 0.92957746 0.93661972 0.92957746 0.93309859 0.93661972
|
|
0.88038018 0.92292918 0.90150694 0.94378509]
|
|
|
|
mean value: 0.9215502807052103
|
|
|
|
key: test_jcc
|
|
value: [0.72222222 0.68181818 0.82352941 0.65 0.8 0.77777778
|
|
0.77777778 0.61111111 0.52941176 0.84210526]
|
|
|
|
mean value: 0.7215753510335554
|
|
|
|
key: train_jcc
|
|
value: [0.80555556 0.8630137 0.88157895 0.86754967 0.87741935 0.87671233
|
|
0.80346821 0.85234899 0.80821918 0.89115646]
|
|
|
|
mean value: 0.8527022396082421
|
|
|
|
MCC on Blind test: 0.2
|
|
|
|
Accuracy on Blind test: 0.58
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', RidgeClassifier(random_state=42))])
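
The pipeline printed above applies MinMaxScaler() to the numerical columns and OneHotEncoder() to the categorical ones. As a rough illustration of what those two steps compute, here is a dependency-free sketch on toy data. The column names ligand_distance and ss_class come from the log; the values, the category labels and the helper names min_max_scale and one_hot are hypothetical, and this is not scikit-learn's implementation.

```python
def min_max_scale(values):
    """Rescale a numeric column to [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    span = hi - lo
    # A constant column has zero span; this sketch maps it to 0.0.
    return [(v - lo) / span if span else 0.0 for v in values]


def one_hot(values):
    """Encode a categorical column as one 0/1 indicator per category."""
    categories = sorted(set(values))
    rows = [[1.0 if v == c else 0.0 for c in categories] for v in values]
    return categories, rows


# Toy values (hypothetical) for three mutations.
ligand_distance = [2.0, 6.0, 10.0]
ss_class = ["helix", "loop", "helix"]

scaled = min_max_scale(ligand_distance)
categories, encoded = one_hot(ss_class)

# Numeric block first, then the one-hot block, following the
# ('num', ...), ('cat', ...) order of the transformers in the log.
features = [[s] + row for s, row in zip(scaled, encoded)]
print(categories)
print(features)
```

Each row of features is what the downstream model would see for one mutation in this toy setting: one scaled numeric value followed by one indicator per category.
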
key: fit_time
value: [0.01892829 0.02864075 0.02735972 0.02840042 0.0284605 0.02859259
0.02847409 0.02875304 0.02716875 0.01797819]

mean value: 0.026275634765625

key: score_time
value: [0.01286125 0.01076007 0.01073647 0.01065683 0.01062083 0.01068902
0.01063824 0.03288555 0.0109849 0.02092481]

mean value: 0.014175796508789062

key: test_mcc
value: [0.68884672 0.75 0.81409158 0.93933644 0.8819171 1.
0.87083333 0.9372467 0.87083333 0.9372467 ]

mean value: 0.8690351901199767

key: train_mcc
value: [0.92994649 0.90955652 0.90901439 0.89492115 0.91585639 0.90955652
0.93741093 0.90253931 0.90988464 0.90897898]

mean value: 0.9127665317222325

key: test_accuracy
value: [0.84375 0.875 0.90625 0.96875 0.9375 1.
0.93548387 0.96774194 0.93548387 0.96774194]

mean value: 0.9337701612903225

key: train_accuracy
value: [0.96478873 0.95422535 0.95422535 0.9471831 0.95774648 0.95422535
0.96842105 0.95087719 0.95438596 0.95438596]

mean value: 0.9560464541635779

key: test_fscore
value: [0.84848485 0.875 0.90322581 0.96969697 0.94117647 1.
0.93333333 0.96551724 0.9375 0.96969697]

mean value: 0.934363163963128

key: train_fscore
value: [0.96527778 0.95532646 0.9550173 0.94809689 0.95833333 0.95532646
0.96907216 0.95205479 0.95532646 0.95470383]

mean value: 0.9568535471627235

key: test_precision
value: [0.82352941 0.875 0.93333333 0.94117647 0.88888889 1.
0.93333333 1. 0.9375 0.94117647]

mean value: 0.9273937908496732

key: train_precision
value: [0.95205479 0.93288591 0.93877551 0.93197279 0.94520548 0.93288591
0.9527027 0.93288591 0.93288591 0.94482759]

mean value: 0.9397082486363004

key: test_recall
value: [0.875 0.875 0.875 1. 1. 1.
0.93333333 0.93333333 0.9375 1. ]

mean value: 0.9429166666666666

key: train_recall
value: [0.97887324 0.97887324 0.97183099 0.96478873 0.97183099 0.97887324
0.98601399 0.97202797 0.97887324 0.96478873]

mean value: 0.9746774352408155

key: test_roc_auc
value: [0.84375 0.875 0.90625 0.96875 0.9375 1.
0.93541667 0.96666667 0.93541667 0.96666667]

mean value: 0.9335416666666667

key: train_roc_auc
value: [0.96478873 0.95422535 0.95422535 0.9471831 0.95774648 0.95422535
0.96835911 0.95080272 0.95447158 0.95442234]

mean value: 0.9560450113267015

key: test_jcc
value: [0.73684211 0.77777778 0.82352941 0.94117647 0.88888889 1.
0.875 0.93333333 0.88235294 0.94117647]

mean value: 0.8800077399380805

key: train_jcc
value: [0.93288591 0.91447368 0.91390728 0.90131579 0.92 0.91447368
0.94 0.90849673 0.91447368 0.91333333]

mean value: 0.9173360098273221

MCC on Blind test: 0.22
Accuracy on Blind test: 0.49
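
The per-fold scores above are all functions of one confusion matrix per fold. As a sanity check, the first test fold of the Ridge Classifier is consistent with TP=14, FP=3, FN=2, TN=13 (32 samples). These counts are inferred from the logged accuracy, precision and recall; the script does not print them, and the function name below is hypothetical. The sketch recomputes the logged first-fold values from those counts.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "fscore": 2 * precision * recall / (precision + recall),
        # Jaccard index of the positive class.
        "jcc": tp / (tp + fp + fn),
        # Matthews correlation coefficient.
        "mcc": (tp * tn - fp * fn)
        / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        "balanced_accuracy": (recall + specificity) / 2,
    }


# Counts inferred (not logged) for the first Ridge Classifier test fold.
m = binary_metrics(tp=14, fp=3, fn=2, tn=13)
for name, value in m.items():
    print(f"{name}: {value:.8f}")
```

The balanced accuracy from these counts equals the logged test_roc_auc for this fold, which is what one would expect if the AUC were computed from hard class labels; that is an inference from the numbers, not something the log states.
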

Model_name: Ridge ClassifierCV
Model func: RidgeClassifierCV(cv=10)
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_config.py:183: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
rus_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_config.py:186: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
rus_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
List of models:
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', RidgeClassifierCV(cv=10))])

key: fit_time
value: [0.16565156 0.08813143 0.17730904 0.20824528 0.18379951 0.1740315
0.17967129 0.17941952 0.18022633 0.19038606]

mean value: 0.17268714904785157

key: score_time
value: [0.0107646 0.01237965 0.01942182 0.01081586 0.01998901 0.01066494
0.01124215 0.01264286 0.01966715 0.02074218]

mean value: 0.014833021163940429

key: test_mcc
value: [0.81409158 0.75 0.81409158 1. 0.8819171 1.
0.87083333 1. 1. 0.87770745]

mean value: 0.900864104531543

key: train_mcc
value: [0.95129413 0.94450549 0.94403659 0.93720088 0.94403659 0.93720088
0.93741093 0.93741093 0.93130575 0.95146839]

mean value: 0.9415870567033926

key: test_accuracy
value: [0.90625 0.875 0.90625 1. 0.9375 1.
0.93548387 1. 1. 0.93548387]

mean value: 0.9495967741935484

key: train_accuracy
value: [0.97535211 0.97183099 0.97183099 0.96830986 0.97183099 0.96830986
0.96842105 0.96842105 0.96491228 0.9754386 ]

mean value: 0.9704657771188535

key: test_fscore
value: [0.90909091 0.875 0.90322581 1. 0.94117647 1.
0.93333333 1. 1. 0.94117647]

mean value: 0.9503002990052326

key: train_fscore
value: [0.97577855 0.97241379 0.97222222 0.96885813 0.97222222 0.96885813
0.96907216 0.96907216 0.96575342 0.97577855]

mean value: 0.9710029348503718

key: test_precision
value: [0.88235294 0.875 0.93333333 1. 0.88888889 1.
0.93333333 1. 1. 0.88888889]

mean value: 0.9401797385620915

key: train_precision
value: [0.95918367 0.9527027 0.95890411 0.95238095 0.95890411 0.95238095
0.9527027 0.9527027 0.94 0.95918367]

mean value: 0.953904557898687

key: test_recall
value: [0.9375 0.875 0.875 1. 1. 1.
0.93333333 1. 1. 1. ]

mean value: 0.9620833333333333

key: train_recall
value: [0.99295775 0.99295775 0.98591549 0.98591549 0.98591549 0.98591549
0.98601399 0.98601399 0.99295775 0.99295775]

mean value: 0.9887520929774452

key: test_roc_auc
value: [0.90625 0.875 0.90625 1. 0.9375 1.
0.93541667 1. 1. 0.93333333]

mean value: 0.949375

key: train_roc_auc
value: [0.97535211 0.97183099 0.97183099 0.96830986 0.97183099 0.96830986
0.96835911 0.96835911 0.96501034 0.97549985]

mean value: 0.9704693194129814

key: test_jcc
value: [0.83333333 0.77777778 0.82352941 1. 0.88888889 1.
0.875 1. 1. 0.88888889]

mean value: 0.9087418300653595

key: train_jcc
value: [0.9527027 0.94630872 0.94594595 0.93959732 0.94594595 0.93959732
0.94 0.94 0.93377483 0.9527027 ]

mean value: 0.9436575487439082

MCC on Blind test: 0.2
Accuracy on Blind test: 0.43
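
For Ridge ClassifierCV the cross-validated MCC is far above the blind-test MCC. A minimal sketch of that comparison follows, using the ten test_mcc fold values and the blind-test MCC logged above. The variable names are hypothetical, and the script's own aggregation may differ (for example, it may use NumPy rather than the standard library).

```python
from statistics import fmean, pstdev

# test_mcc fold values and blind-test MCC copied from the log above.
test_mcc = [0.81409158, 0.75, 0.81409158, 1.0, 0.8819171, 1.0,
            0.87083333, 1.0, 1.0, 0.87770745]
blind_mcc = 0.2

cv_mean = fmean(test_mcc)   # corresponds to the logged "mean value"
cv_sd = pstdev(test_mcc)    # spread across the ten folds
gap = cv_mean - blind_mcc   # how much CV overstates blind performance

print(f"CV mean MCC: {cv_mean:.4f}")
print(f"CV SD:       {cv_sd:.4f}")
print(f"CV - blind:  {gap:.4f}")
```

A gap of this size is worth checking against how the blind set was drawn, since the cross-validation folds and the blind set may not come from the same distribution.
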

Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegression(random_state=42))])

key: fit_time
value: [0.02582741 0.02434468 0.02545023 0.02682304 0.02640653 0.02720022
0.02612591 0.02765298 0.02734327 0.04823351]

mean value: 0.028540778160095214

key: score_time
value: [0.01104569 0.01089406 0.01136374 0.01076293 0.01096463 0.01084781
0.01096702 0.01096272 0.01116037 0.01098251]

mean value: 0.010995149612426758

key: test_mcc
value: [0.81325006 0.87831007 0.80813523 0.78446454 0.77459667 0.83914639
0.80813523 0.90748521 0.73763441 0.77382584]

mean value: 0.8124983647487063

key: train_mcc
value: [0.83119879 0.83472681 0.83507281 0.87790234 0.85985131 0.84227171
0.84207536 0.83472681 0.85645761 0.83886705]

mean value: 0.8453150611021845

key: test_accuracy
value: [0.90322581 0.93548387 0.90322581 0.88709677 0.88709677 0.91935484
0.90322581 0.9516129 0.86885246 0.8852459 ]

mean value: 0.9044420941300899

key: train_accuracy
value: [0.91546763 0.91726619 0.91726619 0.93884892 0.92985612 0.92086331
0.92086331 0.91726619 0.92818671 0.91921005]

mean value: 0.9225094610128773

key: test_fscore
value: [0.90909091 0.93939394 0.90625 0.87719298 0.88888889 0.92063492
0.90625 0.95384615 0.86666667 0.89230769]

mean value: 0.9060522153285311

key: train_fscore
value: [0.91651865 0.91814947 0.91872792 0.93950178 0.93048128 0.92226148
0.92198582 0.91814947 0.92882562 0.92035398]

mean value: 0.9234955465227851

key: test_precision
value: [0.85714286 0.88571429 0.87878788 0.96153846 0.875 0.90625
0.87878788 0.91176471 0.86666667 0.85294118]

mean value: 0.887459391099097

key: train_precision
value: [0.90526316 0.9084507 0.90277778 0.92957746 0.92226148 0.90625
0.90909091 0.9084507 0.92226148 0.90592334]

mean value: 0.9120307031148476

key: test_recall
value: [0.96774194 1. 0.93548387 0.80645161 0.90322581 0.93548387
0.93548387 1. 0.86666667 0.93548387]

mean value: 0.9286021505376344

key: train_recall
value: [0.92805755 0.92805755 0.9352518 0.94964029 0.93884892 0.93884892
0.9352518 0.92805755 0.93548387 0.9352518 ]

mean value: 0.9352750058018101

key: test_roc_auc
value: [0.90322581 0.93548387 0.90322581 0.88709677 0.88709677 0.91935484
0.90322581 0.9516129 0.8688172 0.8844086 ]

mean value: 0.9043548387096774

key: train_roc_auc
value: [0.91546763 0.91726619 0.91726619 0.93884892 0.92985612 0.92086331
0.92086331 0.91726619 0.92817359 0.9192388 ]

mean value: 0.922511023439313

key: test_jcc
value: [0.83333333 0.88571429 0.82857143 0.78125 0.8 0.85294118
0.82857143 0.91176471 0.76470588 0.80555556]

mean value: 0.8292407796451914

key: train_jcc
value: [0.84590164 0.84868421 0.8496732 0.88590604 0.87 0.8557377
0.85526316 0.84868421 0.86710963 0.85245902]

mean value: 0.8579418817037436

MCC on Blind test: 0.21

Accuracy on Blind test: 0.53

Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(random_state=42)
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
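
The warning above means the solver hit its iteration cap before its stopping criterion was met. The sketch below reproduces that situation in miniature with plain gradient descent on a one-parameter logistic model. It is not L-BFGS and not scikit-learn code; the toy data, the function name fit_logistic_1d and its lr and tol parameters are all hypothetical. In scikit-learn, the corresponding setting is the max_iter parameter named in the warning.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_1d(xs, ys, max_iter, lr=0.1, tol=1e-6):
    """Gradient descent on the mean logistic loss for p = sigmoid(w * x).

    Returns (w, iterations_used, converged).
    """
    w = 0.0
    for n_iter in range(1, max_iter + 1):
        grad = sum((sigmoid(w * x) - y) * x for x, y in zip(xs, ys)) / len(xs)
        if abs(grad) < tol:
            return w, n_iter, True      # stopping criterion met
        w -= lr * grad
    return w, max_iter, False           # iteration cap reached first


# Overlapping (non-separable) toy data, so the optimum is finite.
xs = [-2.0, -1.0, 1.0, 2.0, -1.0, 1.0]
ys = [0, 0, 1, 1, 1, 0]

w_short, n_short, converged_short = fit_logistic_1d(xs, ys, max_iter=5)
w_long, n_long, converged_long = fit_logistic_1d(xs, ys, max_iter=10_000)

print(f"max_iter=5:     w={w_short:.4f}, converged={converged_short}")
print(f"max_iter=10000: w={w_long:.4f}, iterations={n_long}, converged={converged_long}")
```

With the small cap the loop returns an unconverged estimate, which is the condition the warning reports; with a generous cap the same routine reaches its tolerance well before the limit.
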
[The ConvergenceWarning above (lbfgs failed to converge, status=1) was emitted 28 more times at this point in the log.]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.77656746 0.70605206 0.85919523 0.69673634 0.68120766 0.78336358
|
|
0.78208661 0.70059681 0.83346748 0.76766825]
|
|
|
|
mean value: 0.7586941480636596
|
|
|
|
key: score_time
|
|
value: [0.01191044 0.01261759 0.01475716 0.01256537 0.0127914 0.01143336
|
|
0.01280951 0.01418829 0.01239324 0.01240849]
|
|
|
|
mean value: 0.012787485122680664
|
|
|
|
key: test_mcc
|
|
value: [0.90369611 0.93743687 0.90369611 0.82199494 0.84266484 0.93743687
|
|
0.90369611 0.87278605 0.87055472 0.96770777]
|
|
|
|
mean value: 0.8961670394093372
|
|
|
|
key: train_mcc
|
|
value: [0.97124816 0.96043787 0.96402878 0.94966486 0.9497386 0.96402878
|
|
0.94986154 0.96405373 0.97487139 0.96768995]
|
|
|
|
mean value: 0.9615623654982854
|
|
|
|
key: test_accuracy
|
|
value: [0.9516129 0.96774194 0.9516129 0.90322581 0.91935484 0.96774194
|
|
0.9516129 0.93548387 0.93442623 0.98360656]
|
|
|
|
mean value: 0.9466419883659439
|
|
|
|
key: train_accuracy
|
|
value: [0.98561151 0.98021583 0.98201439 0.97482014 0.97482014 0.98201439
|
|
0.97482014 0.98201439 0.98743268 0.98384201]
|
|
|
|
mean value: 0.9807605621068675
|
|
|
|
key: test_fscore
|
|
value: [0.95081967 0.96666667 0.95081967 0.89285714 0.92307692 0.96875
|
|
0.95238095 0.93333333 0.93103448 0.98412698]
|
|
|
|
mean value: 0.9453865829462917
|
|
|
|
key: train_fscore
|
|
value: [0.98566308 0.98018018 0.98201439 0.97491039 0.975 0.98201439
|
|
0.97508897 0.98207885 0.98747764 0.98378378]
|
|
|
|
mean value: 0.9808211677303444
|
|
|
|
key: test_precision
|
|
value: [0.96666667 1. 0.96666667 1. 0.88235294 0.93939394
|
|
0.9375 0.96551724 0.96428571 0.96875 ]
|
|
|
|
mean value: 0.9591133169568768
|
|
|
|
key: train_precision
|
|
value: [0.98214286 0.98194946 0.98201439 0.97142857 0.96808511 0.98201439
|
|
0.96478873 0.97857143 0.98571429 0.98555957]
|
|
|
|
mean value: 0.9782268783883663
|
|
|
|
key: test_recall
|
|
value: [0.93548387 0.93548387 0.93548387 0.80645161 0.96774194 1.
|
|
0.96774194 0.90322581 0.9 1. ]
|
|
|
|
mean value: 0.9351612903225807
|
|
|
|
key: train_recall
|
|
value: [0.98920863 0.97841727 0.98201439 0.97841727 0.98201439 0.98201439
|
|
0.98561151 0.98561151 0.98924731 0.98201439]
|
|
|
|
mean value: 0.9834571052835152
|
|
|
|
key: test_roc_auc
|
|
value: [0.9516129 0.96774194 0.9516129 0.90322581 0.91935484 0.96774194
|
|
0.9516129 0.93548387 0.93387097 0.98333333]
|
|
|
|
mean value: 0.9465591397849463
|
|
|
|
key: train_roc_auc
|
|
value: [0.98561151 0.98021583 0.98201439 0.97482014 0.97482014 0.98201439
|
|
0.97482014 0.98201439 0.98742941 0.98383874]
|
|
|
|
mean value: 0.9807599082024703
|
|
|
|
key: test_jcc
|
|
value: [0.90625 0.93548387 0.90625 0.80645161 0.85714286 0.93939394
|
|
0.90909091 0.875 0.87096774 0.96875 ]
|
|
|
|
mean value: 0.8974780931434158
|
|
|
|
key: train_jcc
|
|
value: [0.97173145 0.96113074 0.96466431 0.95104895 0.95121951 0.96466431
|
|
0.95138889 0.96478873 0.97526502 0.96808511]
|
|
|
|
mean value: 0.9623987021299
|
|
|
|
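Each `mean value` line above appears to be the arithmetic mean of the ten per-fold values printed just before it. A small check of that reading, using the `test_mcc` values for LogisticRegressionCV copied from this log (the helper `fold_mean` is a hypothetical name, not something from the original script):

```python
from statistics import fmean

# Per-fold test_mcc values copied from the log above (10 folds).
test_mcc = [
    0.90369611, 0.93743687, 0.90369611, 0.82199494, 0.84266484,
    0.93743687, 0.90369611, 0.87278605, 0.87055472, 0.96770777,
]


def fold_mean(scores):
    """Return the arithmetic mean of a list of per-fold scores."""
    return fmean(scores)


mean_mcc = fold_mean(test_mcc)
print(f"mean value: {mean_mcc:.6f}")
```

The result agrees with the logged mean of 0.8961670394093372 up to the rounding of the printed fold values.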
MCC on Blind test: 0.14
|
|
|
|
Accuracy on Blind test: 0.35
|
|
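For reference when reading the blind-test lines, here is a sketch of how MCC and accuracy follow from binary confusion-matrix counts. The counts used below are hypothetical and are not taken from this run; the function names are likewise introduced only for illustration. Returning 0.0 when the denominator is zero is a convention assumed here.

```python
from math import sqrt


def mcc_from_counts(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0.0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0


def accuracy_from_counts(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts for illustration only.
tp, tn, fp, fn = 40, 45, 5, 10
mcc = mcc_from_counts(tp, tn, fp, fn)
acc = accuracy_from_counts(tp, tn, fp, fn)
print(f"MCC: {mcc:.2f}  Accuracy: {acc:.2f}")
```

Because MCC uses all four cells of the confusion matrix, it can be low even when cross-validated scores are high, which is one reason to compare the cross-validation means above with the blind-test figures.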
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01096439 0.01291323 0.00820684 0.00848889 0.00764108 0.0079782
|
|
0.00760031 0.00791764 0.00761533 0.00789833]
|
|
|
|
mean value: 0.008722424507141113
|
|
|
|
key: score_time
|
|
value: [0.01091075 0.00868392 0.00824642 0.00876021 0.00801086 0.00799608
|
|
0.00807261 0.00803852 0.00840473 0.00831747]
|
|
|
|
mean value: 0.008544158935546876
|
|
|
|
key: test_mcc
|
|
value: [0.67883359 0.64549722 0.7130241 0.52981294 0.74193548 0.7130241
|
|
0.80813523 0.81325006 0.50860215 0.77072165]
|
|
|
|
mean value: 0.6922836529141403
|
|
|
|
key: train_mcc
|
|
value: [0.71239616 0.71972253 0.72313855 0.6419512 0.73033396 0.70874774
|
|
0.69849277 0.6908084 0.72712387 0.72023891]
|
|
|
|
mean value: 0.7072954079422489
|
|
|
|
key: test_accuracy
|
|
value: [0.83870968 0.82258065 0.85483871 0.75806452 0.87096774 0.85483871
|
|
0.90322581 0.90322581 0.75409836 0.8852459 ]
|
|
|
|
mean value: 0.8445795875198308
|
|
|
|
key: train_accuracy
|
|
value: [0.85611511 0.85971223 0.86151079 0.82014388 0.86510791 0.85431655
|
|
0.84892086 0.84532374 0.86355476 0.85996409]
|
|
|
|
mean value: 0.8534669930124124
|
|
|
|
key: test_fscore
|
|
value: [0.84375 0.82539683 0.86153846 0.72727273 0.87096774 0.86153846
|
|
0.90625 0.90909091 0.75409836 0.88888889]
|
|
|
|
mean value: 0.8448792376317495
|
|
|
|
key: train_fscore
|
|
value: [0.85765125 0.86170213 0.8627451 0.81343284 0.86631016 0.85561497
|
|
0.85211268 0.84697509 0.86428571 0.86170213]
|
|
|
|
mean value: 0.8542532047730724
|
|
|
|
key: test_precision
|
|
value: [0.81818182 0.8125 0.82352941 0.83333333 0.87096774 0.82352941
|
|
0.87878788 0.85714286 0.74193548 0.875 ]
|
|
|
|
mean value: 0.833490793678175
|
|
|
|
key: train_precision
|
|
value: [0.84859155 0.84965035 0.85512367 0.84496124 0.85865724 0.84805654
|
|
0.83448276 0.83802817 0.86120996 0.84965035]
|
|
|
|
mean value: 0.8488411836784526
|
|
|
|
key: test_recall
|
|
value: [0.87096774 0.83870968 0.90322581 0.64516129 0.87096774 0.90322581
|
|
0.93548387 0.96774194 0.76666667 0.90322581]
|
|
|
|
mean value: 0.8605376344086022
|
|
|
|
key: train_recall
|
|
value: [0.86690647 0.87410072 0.8705036 0.78417266 0.87410072 0.86330935
|
|
0.8705036 0.85611511 0.86738351 0.87410072]
|
|
|
|
mean value: 0.8601196462185091
|
|
|
|
key: test_roc_auc
|
|
value: [0.83870968 0.82258065 0.85483871 0.75806452 0.87096774 0.85483871
|
|
0.90322581 0.90322581 0.75430108 0.88494624]
|
|
|
|
mean value: 0.8445698924731183
|
|
|
|
key: train_roc_auc
|
|
value: [0.85611511 0.85971223 0.86151079 0.82014388 0.86510791 0.85431655
|
|
0.84892086 0.84532374 0.86354787 0.85998943]
|
|
|
|
mean value: 0.8534688378329595
|
|
|
|
key: test_jcc
|
|
value: [0.72972973 0.7027027 0.75675676 0.57142857 0.77142857 0.75675676
|
|
0.82857143 0.83333333 0.60526316 0.8 ]
|
|
|
|
mean value: 0.7355971008602588
|
|
|
|
key: train_jcc
|
|
value: [0.75077882 0.75700935 0.75862069 0.68553459 0.76415094 0.74766355
|
|
0.74233129 0.7345679 0.76100629 0.75700935]
|
|
|
|
mean value: 0.7458672762322701
|
|
|
|
MCC on Blind test: 0.21
|
|
|
|
Accuracy on Blind test: 0.57
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00830388 0.00819135 0.00802517 0.00794339 0.00828314 0.00875449
|
|
0.00812817 0.0083952 0.0085578 0.00874829]
|
|
|
|
mean value: 0.008333086967468262
|
|
|
|
key: score_time
|
|
value: [0.00846505 0.00840521 0.008214 0.00820637 0.00816274 0.00835466
|
|
0.00821352 0.00871086 0.0093646 0.00880384]
|
|
|
|
mean value: 0.00849008560180664
|
|
|
|
key: test_mcc
|
|
value: [0.51639778 0.56761348 0.61290323 0.65372045 0.74348441 0.5809475
|
|
0.58834841 0.7130241 0.58264312 0.54086022]
|
|
|
|
mean value: 0.6099942679846233
|
|
|
|
key: train_mcc
|
|
value: [0.62249953 0.6079176 0.63414469 0.60794907 0.59713776 0.61543051
|
|
0.64482423 0.62249953 0.6375268 0.6122178 ]
|
|
|
|
mean value: 0.620214750789007
|
|
|
|
key: test_accuracy
|
|
value: [0.75806452 0.77419355 0.80645161 0.82258065 0.87096774 0.79032258
|
|
0.79032258 0.85483871 0.78688525 0.7704918 ]
|
|
|
|
mean value: 0.8025118984664199
|
|
|
|
key: train_accuracy
|
|
value: [0.81115108 0.80395683 0.81654676 0.80395683 0.79856115 0.80755396
|
|
0.82194245 0.81115108 0.81867145 0.80610413]
|
|
|
|
mean value: 0.8099595727367837
|
|
|
|
key: test_fscore
|
|
value: [0.75409836 0.74074074 0.80645161 0.80701754 0.875 0.79365079
|
|
0.80597015 0.86153846 0.8 0.77419355]
|
|
|
|
mean value: 0.8018661210989436
|
|
|
|
key: train_fscore
|
|
value: [0.80874317 0.8036036 0.82167832 0.80500894 0.79928315 0.80438757
|
|
0.82661996 0.81349911 0.82123894 0.80505415]
|
|
|
|
mean value: 0.8109116928454192
|
|
|
|
key: test_precision
|
|
value: [0.76666667 0.86956522 0.80645161 0.88461538 0.84848485 0.78125
|
|
0.75 0.82352941 0.74285714 0.77419355]
|
|
|
|
mean value: 0.8047613833070375
|
|
|
|
key: train_precision
|
|
value: [0.81918819 0.80505415 0.79931973 0.80071174 0.79642857 0.81784387
|
|
0.80546075 0.80350877 0.81118881 0.80797101]
|
|
|
|
mean value: 0.8066675601234072
|
|
|
|
key: test_recall
|
|
value: [0.74193548 0.64516129 0.80645161 0.74193548 0.90322581 0.80645161
|
|
0.87096774 0.90322581 0.86666667 0.77419355]
|
|
|
|
mean value: 0.8060215053763441
|
|
|
|
key: train_recall
|
|
value: [0.79856115 0.80215827 0.84532374 0.80935252 0.80215827 0.79136691
|
|
0.84892086 0.82374101 0.83154122 0.80215827]
|
|
|
|
mean value: 0.8155282225832238
|
|
|
|
key: test_roc_auc
|
|
value: [0.75806452 0.77419355 0.80645161 0.82258065 0.87096774 0.79032258
|
|
0.79032258 0.85483871 0.78817204 0.77043011]
|
|
|
|
mean value: 0.8026344086021505
|
|
|
|
key: train_roc_auc
|
|
value: [0.81115108 0.80395683 0.81654676 0.80395683 0.79856115 0.80755396
|
|
0.82194245 0.81115108 0.81864831 0.80609706]
|
|
|
|
mean value: 0.8099565508883215
|
|
|
|
key: test_jcc
|
|
value: [0.60526316 0.58823529 0.67567568 0.67647059 0.77777778 0.65789474
|
|
0.675 0.75675676 0.66666667 0.63157895]
|
|
|
|
mean value: 0.6711319601335082
|
|
|
|
key: train_jcc
|
|
value: [0.67889908 0.67168675 0.69732938 0.67365269 0.66567164 0.67278287
|
|
0.70447761 0.68562874 0.6966967 0.67371601]
|
|
|
|
mean value: 0.6820541480667476
|
|
|
|
MCC on Blind test: 0.18
|
|
|
|
Accuracy on Blind test: 0.52
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.00809216 0.00795221 0.00791621 0.00790453 0.00721383 0.00737166
|
|
0.00789046 0.00771284 0.00766468 0.00788069]
|
|
|
|
mean value: 0.0077599287033081055
|
|
|
|
key: score_time
|
|
value: [0.01314855 0.01184964 0.0114398 0.0146842 0.01099205 0.01099849
|
|
0.01176476 0.0116837 0.01157475 0.01158309]
|
|
|
|
mean value: 0.01197190284729004
|
|
|
|
key: test_mcc
|
|
value: [0.45760432 0.48488114 0.67883359 0.55301004 0.67883359 0.67883359
|
|
0.54953196 0.74348441 0.40967742 0.70780713]
|
|
|
|
mean value: 0.5942497191157756
|
|
|
|
key: train_mcc
|
|
value: [0.7125253 0.73779681 0.71605437 0.74499483 0.7125253 0.71313508
|
|
0.726788 0.73745301 0.75237261 0.72554668]
|
|
|
|
mean value: 0.7279191995608599
|
|
|
|
key: test_accuracy
|
|
value: [0.72580645 0.74193548 0.83870968 0.77419355 0.83870968 0.83870968
|
|
0.77419355 0.87096774 0.70491803 0.85245902]
|
|
|
|
mean value: 0.7960602855631941
|
|
|
|
key: train_accuracy
|
|
value: [0.85611511 0.86870504 0.85791367 0.87230216 0.85611511 0.85611511
|
|
0.86330935 0.86870504 0.87612208 0.86175943]
|
|
|
|
mean value: 0.8637162083618563
|
|
|
|
key: test_fscore
|
|
value: [0.70175439 0.75 0.83333333 0.75862069 0.83333333 0.84375
|
|
0.78125 0.86666667 0.7 0.86153846]
|
|
|
|
mean value: 0.7930246870491879
|
|
|
|
key: train_fscore
|
|
value: [0.8540146 0.86654479 0.856102 0.8702011 0.8540146 0.85239852
|
|
0.86181818 0.86799277 0.87522604 0.85607477]
|
|
|
|
mean value: 0.8614387366046266
|
|
|
|
key: test_precision
|
|
value: [0.76923077 0.72727273 0.86206897 0.81481481 0.86206897 0.81818182
|
|
0.75757576 0.89655172 0.7 0.82352941]
|
|
|
|
mean value: 0.8031294954013006
|
|
|
|
key: train_precision
|
|
value: [0.86666667 0.88104089 0.86715867 0.88475836 0.86666667 0.875
|
|
0.87132353 0.87272727 0.88321168 0.89105058]
|
|
|
|
mean value: 0.8759604326054368
|
|
|
|
key: test_recall
|
|
value: [0.64516129 0.77419355 0.80645161 0.70967742 0.80645161 0.87096774
|
|
0.80645161 0.83870968 0.7 0.90322581]
|
|
|
|
mean value: 0.7861290322580645
|
|
|
|
key: train_recall
|
|
value: [0.84172662 0.85251799 0.84532374 0.85611511 0.84172662 0.83093525
|
|
0.85251799 0.86330935 0.86738351 0.82374101]
|
|
|
|
mean value: 0.8475297181609551
|
|
|
|
key: test_roc_auc
|
|
value: [0.72580645 0.74193548 0.83870968 0.77419355 0.83870968 0.83870968
|
|
0.77419355 0.87096774 0.70483871 0.8516129 ]
|
|
|
|
mean value: 0.7959677419354838
|
|
|
|
key: train_roc_auc
|
|
value: [0.85611511 0.86870504 0.85791367 0.87230216 0.85611511 0.85611511
|
|
0.86330935 0.86870504 0.8761378 0.86169129]
|
|
|
|
mean value: 0.8637109667105026
|
|
|
|
key: test_jcc
|
|
value: [0.54054054 0.6 0.71428571 0.61111111 0.71428571 0.72972973
|
|
0.64102564 0.76470588 0.53846154 0.75675676]
|
|
|
|
mean value: 0.6610902628549687
|
|
|
|
key: train_jcc
|
|
value: [0.74522293 0.76451613 0.74840764 0.77022654 0.74522293 0.74276527
|
|
0.7571885 0.76677316 0.77813505 0.74836601]
|
|
|
|
mean value: 0.7566824165390956
|
|
|
|
MCC on Blind test: 0.16
|
|
|
|
Accuracy on Blind test: 0.57
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01524782 0.01526904 0.01670051 0.01508474 0.01485252 0.01501393
|
|
0.01506996 0.01522017 0.01478338 0.01487541]
|
|
|
|
mean value: 0.015211749076843261
|
|
|
|
key: score_time
|
|
value: [0.00945497 0.00925422 0.00928378 0.00928211 0.00977159 0.00912595
|
|
0.00927424 0.00921845 0.00913382 0.00917101]
|
|
|
|
mean value: 0.009297013282775879
|
|
|
|
key: test_mcc
|
|
value: [0.64820372 0.75623534 0.80813523 0.71004695 0.74819006 0.7284928
|
|
0.7190925 0.70116959 0.61256703 0.6844511 ]
|
|
|
|
mean value: 0.7116584311085777
|
|
|
|
key: train_mcc
|
|
value: [0.78485761 0.79151169 0.79209132 0.85451608 0.77632088 0.78285538
|
|
0.75529076 0.75529076 0.78851732 0.80529218]
|
|
|
|
mean value: 0.7886543984062245
|
|
|
|
key: test_accuracy
|
|
value: [0.82258065 0.87096774 0.90322581 0.85483871 0.87096774 0.85483871
|
|
0.85483871 0.83870968 0.80327869 0.83606557]
|
|
|
|
mean value: 0.8510312004230566
|
|
|
|
key: train_accuracy
|
|
value: [0.89028777 0.89388489 0.89388489 0.92625899 0.88489209 0.88848921
|
|
0.87410072 0.87410072 0.89228007 0.90125673]
|
|
|
|
mean value: 0.8919436084884337
|
|
|
|
key: test_fscore
|
|
value: [0.83076923 0.88235294 0.90625 0.85245902 0.87878788 0.86956522
|
|
0.86567164 0.85714286 0.8125 0.85294118]
|
|
|
|
mean value: 0.8608439959922817
|
|
|
|
key: train_fscore
|
|
value: [0.8957265 0.89879931 0.8991453 0.92869565 0.89189189 0.89491525
|
|
0.88215488 0.88215488 0.89761092 0.90500864]
|
|
|
|
mean value: 0.8976103228458596
|
|
|
|
key: test_precision
|
|
value: [0.79411765 0.81081081 0.87878788 0.86666667 0.82857143 0.78947368
|
|
0.80555556 0.76923077 0.76470588 0.78378378]
|
|
|
|
mean value: 0.8091704107029185
|
|
|
|
key: train_precision
|
|
value: [0.8534202 0.85901639 0.85667752 0.8989899 0.84076433 0.84615385
|
|
0.82911392 0.82911392 0.85667752 0.87043189]
|
|
|
|
mean value: 0.8540359455885207
|
|
|
|
key: test_recall
|
|
value: [0.87096774 0.96774194 0.93548387 0.83870968 0.93548387 0.96774194
|
|
0.93548387 0.96774194 0.86666667 0.93548387]
|
|
|
|
mean value: 0.9221505376344086
|
|
|
|
key: train_recall
|
|
value: [0.94244604 0.94244604 0.94604317 0.96043165 0.94964029 0.94964029
|
|
0.94244604 0.94244604 0.94265233 0.94244604]
|
|
|
|
mean value: 0.9460637941259895
|
|
|
|
key: test_roc_auc
|
|
value: [0.82258065 0.87096774 0.90322581 0.85483871 0.87096774 0.85483871
|
|
0.85483871 0.83870968 0.80430108 0.8344086 ]
|
|
|
|
mean value: 0.8509677419354839
|
|
|
|
key: train_roc_auc
|
|
value: [0.89028777 0.89388489 0.89388489 0.92625899 0.88489209 0.88848921
|
|
0.87410072 0.87410072 0.89218947 0.90133055]
|
|
|
|
mean value: 0.8919419303267063
|
|
|
|
key: test_jcc
|
|
value: [0.71052632 0.78947368 0.82857143 0.74285714 0.78378378 0.76923077
|
|
0.76315789 0.75 0.68421053 0.74358974]
|
|
|
|
mean value: 0.7565401289085499
|
|
|
|
key: train_jcc
|
|
value: [0.81114551 0.81619938 0.81677019 0.86688312 0.80487805 0.80981595
|
|
0.78915663 0.78915663 0.81424149 0.82649842]
|
|
|
|
mean value: 0.8144745352495302
|
|
|
|
MCC on Blind test: 0.26
|
|
|
|
Accuracy on Blind test: 0.5
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
  warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
  warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
  warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
  warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
  warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
  warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
  warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
  warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
  warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
  warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
  warnings.warn(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.6481297 1.49946976 1.67077136 1.65997696 1.63008213 1.52724123
|
|
1.67578554 1.65596056 1.49459696 1.68944907]
|
|
|
|
mean value: 1.6151463270187378
|
|
|
|
key: score_time
|
|
value: [0.01430917 0.01388526 0.01319432 0.01351166 0.01167202 0.01358342
|
|
0.01357841 0.01354527 0.01401711 0.01371384]
|
|
|
|
mean value: 0.01350104808807373
|
|
|
|
key: test_mcc
|
|
value: [0.96824584 0.96824584 0.93548387 0.7190925 0.90369611 0.93743687
|
|
1. 1. 0.83655914 1. ]
|
|
|
|
mean value: 0.9268760160039228
|
|
|
|
key: train_mcc
|
|
value: [0.99280576 0.99283145 0.99640932 1. 0.99283145 0.99283145
|
|
0.99283145 0.99283145 0.99284434 0.98923428]
|
|
|
|
mean value: 0.9935450945650737
|
|
|
|
key: test_accuracy
|
|
value: [0.98387097 0.98387097 0.96774194 0.85483871 0.9516129 0.96774194
|
|
1. 1. 0.91803279 1. ]
|
|
|
|
mean value: 0.9627710206240084
|
|
|
|
key: train_accuracy
|
|
value: [0.99640288 0.99640288 0.99820144 1. 0.99640288 0.99640288
|
|
0.99640288 0.99640288 0.99640934 0.994614 ]
|
|
|
|
mean value: 0.9967642044353745
|
|
|
|
key: test_fscore
|
|
value: [0.98360656 0.98360656 0.96774194 0.84210526 0.95081967 0.96875
|
|
1. 1. 0.91803279 1. ]
|
|
|
|
mean value: 0.9614662772412258
|
|
|
|
key: train_fscore
|
|
value: [0.99640288 0.99638989 0.9981982 1. 0.99638989 0.99638989
|
|
0.99638989 0.99638989 0.99640288 0.99459459]
|
|
|
|
mean value: 0.9967548006672231
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.96774194 0.92307692 0.96666667 0.93939394
|
|
1. 1. 0.90322581 1. ]
|
|
|
|
mean value: 0.9700105271073013
|
|
|
|
key: train_precision
|
|
value: [0.99640288 1. 1. 1. 1. 1.
|
|
1. 1. 1. 0.99638989]
|
|
|
|
mean value: 0.9992792769394593
|
|
|
|
key: test_recall
|
|
value: [0.96774194 0.96774194 0.96774194 0.77419355 0.93548387 1.
|
|
1. 1. 0.93333333 1. ]
|
|
|
|
mean value: 0.9546236559139785
|
|
|
|
key: train_recall
|
|
value: [0.99640288 0.99280576 0.99640288 1. 0.99280576 0.99280576
|
|
0.99280576 0.99280576 0.99283154 0.99280576]
|
|
|
|
mean value: 0.9942471828988423
|
|
|
|
key: test_roc_auc
|
|
value: [0.98387097 0.98387097 0.96774194 0.85483871 0.9516129 0.96774194
|
|
1. 1. 0.91827957 1. ]
|
|
|
|
mean value: 0.9627956989247312
|
|
|
|
key: train_roc_auc
|
|
value: [0.99640288 0.99640288 0.99820144 1. 0.99640288 0.99640288
|
|
0.99640288 0.99640288 0.99641577 0.99461076]
|
|
|
|
mean value: 0.9967645238647792
|
|
|
|
key: test_jcc
|
|
value: [0.96774194 0.96774194 0.9375 0.72727273 0.90625 0.93939394
|
|
1. 1. 0.84848485 1. ]
|
|
|
|
mean value: 0.9294385386119257
|
|
|
|
key: train_jcc
|
|
value: [0.99283154 0.99280576 0.99640288 1. 0.99280576 0.99280576
|
|
0.99280576 0.99280576 0.99283154 0.98924731]
|
|
|
|
mean value: 0.9935342048941492
|
|
|
|
MCC on Blind test: 0.09
|
|
|
|
Accuracy on Blind test: 0.24
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01340103 0.01229239 0.00977087 0.00982738 0.00973988 0.01050377
|
|
0.00967884 0.01013613 0.01014376 0.01027131]
|
|
|
|
mean value: 0.010576534271240234
|
|
|
|
key: score_time
|
|
value: [0.01074123 0.00902033 0.00799775 0.00793815 0.00800681 0.00789976
|
|
0.00837636 0.00792694 0.00824547 0.00833321]
|
|
|
|
mean value: 0.008448600769042969
|
|
|
|
key: test_mcc
|
|
value: [0.90748521 0.96824584 0.96824584 1. 0.93743687 0.93548387
|
|
0.93743687 0.93743687 0.9344086 0.96774194]
|
|
|
|
mean value: 0.9493921894362165
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9516129 0.98387097 0.98387097 1. 0.96774194 0.96774194
|
|
0.96774194 0.96774194 0.96721311 0.98360656]
|
|
|
|
mean value: 0.9741142252776309
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94915254 0.98360656 0.98412698 1. 0.96666667 0.96774194
|
|
0.96666667 0.96666667 0.96666667 0.98360656]
|
|
|
|
mean value: 0.9734901243404501
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.96875 1. 1. 0.96774194
|
|
1. 1. 0.96666667 1. ]
|
|
|
|
mean value: 0.9903158602150538
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.90322581 0.96774194 1. 1. 0.93548387 0.96774194
|
|
0.93548387 0.93548387 0.96666667 0.96774194]
|
|
|
|
mean value: 0.9579569892473119
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.9516129 0.98387097 0.98387097 1. 0.96774194 0.96774194
|
|
0.96774194 0.96774194 0.9672043 0.98387097]
|
|
|
|
mean value: 0.9741397849462365
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.90322581 0.96774194 0.96875 1. 0.93548387 0.9375
|
|
0.93548387 0.93548387 0.93548387 0.96774194]
|
|
|
|
mean value: 0.9486895161290323
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
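The `key` / `value` / `mean value` blocks above have the shape of the dictionary returned by scikit-learn's `cross_validate` when it is called with a named scoring dictionary, 10 folds, and `return_train_score=True`. The sketch below reproduces that output layout on synthetic data. The scorer definitions are assumptions inferred from the key names; in particular, building `roc_auc` from hard predictions via `make_scorer(roc_auc_score)` is a guess.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import (accuracy_score, f1_score, jaccard_score,
                             make_scorer, matthews_corrcoef,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (assumption, not the logged dataset).
X, y = make_classification(n_samples=200, n_features=10, random_state=42)

# Scorer names chosen to match the logged keys (test_mcc, test_jcc, ...).
scoring = {
    "mcc": make_scorer(matthews_corrcoef),
    "accuracy": make_scorer(accuracy_score),
    "fscore": make_scorer(f1_score),
    "precision": make_scorer(precision_score),
    "recall": make_scorer(recall_score),
    "roc_auc": make_scorer(roc_auc_score),  # computed on predicted labels
    "jcc": make_scorer(jaccard_score),
}

cv_out = cross_validate(
    DecisionTreeClassifier(random_state=42),
    X, y,
    cv=10,
    scoring=scoring,
    return_train_score=True,
)

# Print in the same layout as the log.
for key, value in cv_out.items():
    print("key:", key)
    print("value:", value)
    print("mean value:", value.mean())
```

Each array holds one entry per fold, which is why every `value` line in the log lists ten numbers.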
|
|
|
|
MCC on Blind test: 0.01
|
|
|
|
Accuracy on Blind test: 0.2
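The blind-test lines report MCC and accuracy on a held-out set. As a worked illustration of how the two metrics relate, the sketch below computes both for a small, invented prediction vector and checks the MCC against the confusion-matrix formula. The labels are hypothetical and do not reproduce the numbers in this log.

```python
import math
from sklearn.metrics import accuracy_score, matthews_corrcoef

# Invented example: 2 positives, 8 negatives, model over-predicts positives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]

acc = accuracy_score(y_true, y_pred)
mcc = matthews_corrcoef(y_true, y_pred)

# Same MCC from the confusion-matrix counts.
tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
mcc_manual = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

print("MCC:", round(mcc, 2))
print("Accuracy:", round(acc, 2))
```

Here TP=2, TN=1, FP=7, FN=0, so accuracy is 3/10 and MCC is 2/12. A low accuracy paired with an MCC near zero is the pattern a classifier produces when it labels most samples with a class that is rare in the evaluation set.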
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.10707998 0.10903525 0.10817385 0.10511184 0.10628986 0.10499215
|
|
0.10362315 0.10446763 0.10430741 0.10113478]
|
|
|
|
mean value: 0.10542159080505371
|
|
|
|
key: score_time
|
|
value: [0.01860476 0.01862955 0.01860476 0.01870513 0.01833129 0.01816988
|
|
0.01843429 0.01767302 0.01715016 0.01741219]
|
|
|
|
mean value: 0.01817150115966797
|
|
|
|
key: test_mcc
|
|
value: [0.93548387 1. 0.93548387 0.87831007 0.90369611 0.93743687
|
|
1. 0.96824584 0.90215054 0.93635873]
|
|
|
|
mean value: 0.9397165895399419
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.96774194 1. 0.96774194 0.93548387 0.9516129 0.96774194
|
|
1. 0.98387097 0.95081967 0.96721311]
|
|
|
|
mean value: 0.9692226335272343
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.96774194 1. 0.96774194 0.93103448 0.95081967 0.96875
|
|
1. 0.98412698 0.95081967 0.96875 ]
|
|
|
|
mean value: 0.9689784682115642
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96774194 1. 0.96774194 1. 0.96666667 0.93939394
|
|
1. 0.96875 0.93548387 0.93939394]
|
|
|
|
mean value: 0.9685172287390029
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96774194 1. 0.96774194 0.87096774 0.93548387 1.
|
|
1. 1. 0.96666667 1. ]
|
|
|
|
mean value: 0.9708602150537634
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96774194 1. 0.96774194 0.93548387 0.9516129 0.96774194
|
|
1. 0.98387097 0.95107527 0.96666667]
|
|
|
|
mean value: 0.9691935483870968
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.9375 1. 0.9375 0.87096774 0.90625 0.93939394
|
|
1. 0.96875 0.90625 0.93939394]
|
|
|
|
mean value: 0.9406005620723363
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.2
|
|
|
|
Accuracy on Blind test: 0.36
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00863886 0.00797391 0.0083859 0.00775075 0.00766373 0.0079093
|
|
0.00830865 0.00843334 0.00793123 0.00765133]
|
|
|
|
mean value: 0.008064699172973634
|
|
|
|
key: score_time
|
|
value: [0.00806904 0.00858569 0.00859904 0.00799298 0.00799918 0.00797725
|
|
0.00856709 0.00818801 0.00789118 0.00795794]
|
|
|
|
mean value: 0.008182740211486817
|
|
|
|
key: test_mcc
|
|
value: [0.75623534 0.87831007 0.87278605 0.83914639 0.84266484 0.64820372
|
|
0.74348441 0.90748521 0.77072165 0.83655914]
|
|
|
|
mean value: 0.8095596827565272
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.87096774 0.93548387 0.93548387 0.91935484 0.91935484 0.82258065
|
|
0.87096774 0.9516129 0.8852459 0.91803279]
|
|
|
|
mean value: 0.9029085140137494
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.85714286 0.93103448 0.93333333 0.92063492 0.91525424 0.81355932
|
|
0.86666667 0.94915254 0.88135593 0.91803279]
|
|
|
|
mean value: 0.8986167081319949
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96 1. 0.96551724 0.90625 0.96428571 0.85714286
|
|
0.89655172 1. 0.89655172 0.93333333]
|
|
|
|
mean value: 0.9379632594417078
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.77419355 0.87096774 0.90322581 0.93548387 0.87096774 0.77419355
|
|
0.83870968 0.90322581 0.86666667 0.90322581]
|
|
|
|
mean value: 0.8640860215053763
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.87096774 0.93548387 0.93548387 0.91935484 0.91935484 0.82258065
|
|
0.87096774 0.9516129 0.88494624 0.91827957]
|
|
|
|
mean value: 0.9029032258064517
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.75 0.87096774 0.875 0.85294118 0.84375 0.68571429
|
|
0.76470588 0.90322581 0.78787879 0.84848485]
|
|
|
|
mean value: 0.8182668529288548
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.1
|
|
|
|
Accuracy on Blind test: 0.26
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.34418821 1.34185529 1.3479538 1.36781883 1.42743945 1.3655982
|
|
1.38340139 1.37809682 1.39602447 1.33490944]
|
|
|
|
mean value: 1.3687285900115966
|
|
|
|
key: score_time
|
|
value: [0.09742594 0.09719825 0.09524751 0.09951448 0.09094286 0.0994525
|
|
0.09763288 0.09727025 0.09892535 0.09526753]
|
|
|
|
mean value: 0.09688775539398194
|
|
|
|
key: test_mcc
|
|
value: [0.96824584 0.96824584 0.93548387 0.96824584 0.96824584 0.96824584
|
|
1. 1. 0.90215054 1. ]
|
|
|
|
mean value: 0.9678863591361422
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98387097 0.98387097 0.96774194 0.98387097 0.98387097 0.98387097
|
|
1. 1. 0.95081967 1. ]
|
|
|
|
mean value: 0.9837916446324696
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98360656 0.98360656 0.96774194 0.98360656 0.98412698 0.98412698
|
|
1. 1. 0.95081967 1. ]
|
|
|
|
mean value: 0.9837635248000134
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.96774194 1. 0.96875 0.96875
|
|
1. 1. 0.93548387 1. ]
|
|
|
|
mean value: 0.9840725806451613
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96774194 0.96774194 0.96774194 0.96774194 1. 1.
|
|
1. 1. 0.96666667 1. ]
|
|
|
|
mean value: 0.983763440860215
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98387097 0.98387097 0.96774194 0.98387097 0.98387097 0.98387097
|
|
1. 1. 0.95107527 1. ]
|
|
|
|
mean value: 0.9838172043010753
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96774194 0.96774194 0.9375 0.96774194 0.96875 0.96875
|
|
1. 1. 0.90625 1. ]
|
|
|
|
mean value: 0.9684475806451613
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.08
|
|
|
|
Accuracy on Blind test: 0.19
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
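The FutureWarning above states that `max_features='auto'` is deprecated and that `'sqrt'` keeps the past behaviour for `RandomForestClassifier`. A minimal sketch of the 'Random Forest2' definition with that one substitution follows; all other parameters are copied from the logged model repr, and the variable name `rf2` is hypothetical.

```python
from sklearn.ensemble import RandomForestClassifier

# Same settings as the logged 'Random Forest2', with max_features='sqrt'
# in place of the deprecated 'auto' (equivalent for classifiers, per the warning).
rf2 = RandomForestClassifier(
    max_features="sqrt",
    min_samples_leaf=5,
    n_estimators=1000,
    n_jobs=10,
    oob_score=True,
    random_state=42,
)
print(rf2.get_params()["max_features"])
```

Since the warning also notes that `'sqrt'` is the default for this estimator, dropping the `max_features` argument entirely would have the same effect.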
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.99160314 0.89481354 0.90568662 0.9475925 0.90248585 0.94201088
|
|
0.93975306 0.95513034 0.88768649 0.92357564]
|
|
|
|
mean value: 0.9290338039398194
|
|
|
|
key: score_time
|
|
value: [0.15050101 0.24627447 0.24356008 0.27248359 0.27095199 0.25157189
|
|
0.20301151 0.27629042 0.26423383 0.23688626]
|
|
|
|
mean value: 0.24157650470733644
|
|
|
|
key: test_mcc
|
|
value: [0.93548387 0.96824584 0.93548387 0.96824584 0.90748521 0.96824584
|
|
1. 0.96824584 0.87082935 0.96770777]
|
|
|
|
mean value: 0.9489973426546622
|
|
|
|
key: train_mcc
|
|
value: [0.96425338 0.96058703 0.96425338 0.96058703 0.96412858 0.97132357
|
|
0.95353974 0.96412858 0.96783888 0.96065866]
|
|
|
|
mean value: 0.9631298857914714
|
|
|
|
key: test_accuracy
|
|
value: [0.96774194 0.98387097 0.96774194 0.98387097 0.9516129 0.98387097
|
|
1. 0.98387097 0.93442623 0.98360656]
|
|
|
|
mean value: 0.9740613432046537
|
|
|
|
key: train_accuracy
|
|
value: [0.98201439 0.98021583 0.98201439 0.98021583 0.98201439 0.98561151
|
|
0.97661871 0.98201439 0.98384201 0.98025135]
|
|
|
|
mean value: 0.9814812781731527
|
|
|
|
key: test_fscore
|
|
value: [0.96774194 0.98360656 0.96774194 0.98360656 0.95384615 0.98412698
|
|
1. 0.98360656 0.93548387 0.98412698]
|
|
|
|
mean value: 0.9743887536166753
|
|
|
|
key: train_fscore
|
|
value: [0.98220641 0.98039216 0.98220641 0.98039216 0.98214286 0.98571429
|
|
0.97690941 0.98214286 0.98401421 0.98039216]
|
|
|
|
mean value: 0.9816512905421962
|
|
|
|
key: test_precision
|
|
value: [0.96774194 1. 0.96774194 1. 0.91176471 0.96875
|
|
1. 1. 0.90625 0.96875 ]
|
|
|
|
mean value: 0.9690998576850095
|
|
|
|
key: train_precision
|
|
value: [0.97183099 0.97173145 0.97183099 0.97173145 0.9751773 0.9787234
|
|
0.96491228 0.9751773 0.97535211 0.97173145]
|
|
|
|
mean value: 0.9728198725682946
|
|
|
|
key: test_recall
|
|
value: [0.96774194 0.96774194 0.96774194 0.96774194 1. 1.
|
|
1. 0.96774194 0.96666667 1. ]
|
|
|
|
mean value: 0.9805376344086022
|
|
|
|
key: train_recall
|
|
value: [0.99280576 0.98920863 0.99280576 0.98920863 0.98920863 0.99280576
|
|
0.98920863 0.98920863 0.99283154 0.98920863]
|
|
|
|
mean value: 0.9906500605966839
|
|
|
|
key: test_roc_auc
|
|
value: [0.96774194 0.98387097 0.96774194 0.98387097 0.9516129 0.98387097
|
|
1. 0.98387097 0.93494624 0.98333333]
|
|
|
|
mean value: 0.9740860215053764
|
|
|
|
key: train_roc_auc
|
|
value: [0.98201439 0.98021583 0.98201439 0.98021583 0.98201439 0.98561151
|
|
0.97661871 0.98201439 0.98382584 0.9802674 ]
|
|
|
|
mean value: 0.9814812665996235
|
|
|
|
key: test_jcc
|
|
value: [0.9375 0.96774194 0.9375 0.96774194 0.91176471 0.96875
|
|
1. 0.96774194 0.87878788 0.96875 ]
|
|
|
|
mean value: 0.9506278391121845
|
|
|
|
key: train_jcc
|
|
value: [0.96503497 0.96153846 0.96503497 0.96153846 0.96491228 0.97183099
|
|
0.95486111 0.96491228 0.96853147 0.96153846]
|
|
|
|
mean value: 0.9639733441646896
|
|
|
|
MCC on Blind test: 0.1
|
|
|
|
Accuracy on Blind test: 0.23
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', BernoulliNB())])

key: fit_time
value: [0.01971221 0.00761104 0.00768089 0.00756931 0.00756836 0.00765538
0.00759244 0.00763845 0.00757504 0.00766015]

mean value: 0.008826327323913575

key: score_time
value: [0.01263118 0.00788474 0.00787878 0.00782609 0.00785947 0.00789833
0.00783944 0.00784731 0.00786543 0.00787163]

mean value: 0.008340239524841309

key: test_mcc
value: [0.51639778 0.56761348 0.61290323 0.65372045 0.74348441 0.5809475
0.58834841 0.7130241 0.58264312 0.54086022]

mean value: 0.6099942679846233

key: train_mcc
value: [0.62249953 0.6079176 0.63414469 0.60794907 0.59713776 0.61543051
0.64482423 0.62249953 0.6375268 0.6122178 ]

mean value: 0.620214750789007

key: test_accuracy
value: [0.75806452 0.77419355 0.80645161 0.82258065 0.87096774 0.79032258
0.79032258 0.85483871 0.78688525 0.7704918 ]

mean value: 0.8025118984664199

key: train_accuracy
value: [0.81115108 0.80395683 0.81654676 0.80395683 0.79856115 0.80755396
0.82194245 0.81115108 0.81867145 0.80610413]

mean value: 0.8099595727367837

key: test_fscore
value: [0.75409836 0.74074074 0.80645161 0.80701754 0.875 0.79365079
0.80597015 0.86153846 0.8 0.77419355]

mean value: 0.8018661210989436

key: train_fscore
value: [0.80874317 0.8036036 0.82167832 0.80500894 0.79928315 0.80438757
0.82661996 0.81349911 0.82123894 0.80505415]

mean value: 0.8109116928454192

key: test_precision
value: [0.76666667 0.86956522 0.80645161 0.88461538 0.84848485 0.78125
0.75 0.82352941 0.74285714 0.77419355]

mean value: 0.8047613833070375

key: train_precision
value: [0.81918819 0.80505415 0.79931973 0.80071174 0.79642857 0.81784387
0.80546075 0.80350877 0.81118881 0.80797101]

mean value: 0.8066675601234072

key: test_recall
value: [0.74193548 0.64516129 0.80645161 0.74193548 0.90322581 0.80645161
0.87096774 0.90322581 0.86666667 0.77419355]

mean value: 0.8060215053763441

key: train_recall
value: [0.79856115 0.80215827 0.84532374 0.80935252 0.80215827 0.79136691
0.84892086 0.82374101 0.83154122 0.80215827]

mean value: 0.8155282225832238

key: test_roc_auc
value: [0.75806452 0.77419355 0.80645161 0.82258065 0.87096774 0.79032258
0.79032258 0.85483871 0.78817204 0.77043011]

mean value: 0.8026344086021505

key: train_roc_auc
value: [0.81115108 0.80395683 0.81654676 0.80395683 0.79856115 0.80755396
0.82194245 0.81115108 0.81864831 0.80609706]

mean value: 0.8099565508883215

key: test_jcc
value: [0.60526316 0.58823529 0.67567568 0.67647059 0.77777778 0.65789474
0.675 0.75675676 0.66666667 0.63157895]

mean value: 0.6711319601335082

key: train_jcc
value: [0.67889908 0.67168675 0.69732938 0.67365269 0.66567164 0.67278287
0.70447761 0.68562874 0.6966967 0.67371601]

mean value: 0.6820541480667476

MCC on Blind test: 0.18

Accuracy on Blind test: 0.52

Model_name: XGBoost
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
interaction_constraints=None, learning_rate=None,
max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan,
monotone_constraints=None, n_estimators=100,
n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None,
reg_lambda=None, scale_pos_weight=None,
subsample=None, tree_method=None,
use_label_encoder=False,
validate_parameters=None, verbosity=0))])

key: fit_time
value: [0.21862555 0.04956889 0.04996634 0.05186462 0.05506182 0.06219912
0.06107974 0.06241131 0.05737829 0.05969238]

mean value: 0.07278480529785156

key: score_time
value: [0.01031947 0.00971913 0.00969386 0.00995827 0.01020288 0.00984311
0.0096755 0.00973344 0.0099237 0.00953674]

mean value: 0.009860610961914063

key: test_mcc
value: [0.96824584 0.96824584 0.93548387 0.96824584 0.96824584 0.96824584
0.96824584 0.96824584 0.90215054 1. ]

mean value: 0.9615355264465131

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.98387097 0.98387097 0.96774194 0.98387097 0.98387097 0.98387097
0.98387097 0.98387097 0.95081967 1. ]

mean value: 0.9805658381808567

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.98360656 0.98360656 0.96774194 0.98360656 0.98412698 0.98412698
0.98360656 0.98360656 0.95081967 1. ]

mean value: 0.9804848362754233

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [1. 1. 0.96774194 1. 0.96875 0.96875
1. 1. 0.93548387 1. ]

mean value: 0.9840725806451613

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.96774194 0.96774194 0.96774194 0.96774194 1. 1.
0.96774194 0.96774194 0.96666667 1. ]

mean value: 0.9773118279569892

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.98387097 0.98387097 0.96774194 0.98387097 0.98387097 0.98387097
0.98387097 0.98387097 0.95107527 1. ]

mean value: 0.9805913978494624

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.96774194 0.96774194 0.9375 0.96774194 0.96875 0.96875
0.96774194 0.96774194 0.90625 1. ]

mean value: 0.9619959677419355

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.06

Accuracy on Blind test: 0.2

Model_name: LDA
Model func: LinearDiscriminantAnalysis()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LinearDiscriminantAnalysis())])

key: fit_time
value: [0.01578832 0.04168701 0.05872059 0.01797581 0.01809096 0.03955126
0.04262829 0.01832151 0.01880884 0.0180583 ]

mean value: 0.028963088989257812

key: score_time
value: [0.01038313 0.01973009 0.01196766 0.01065159 0.01061916 0.02021313
0.02139711 0.01056767 0.01115489 0.01077628]

mean value: 0.013746070861816406

key: test_mcc
value: [0.93548387 1. 0.93548387 0.87831007 0.87831007 0.96824584
0.93743687 0.96824584 0.83655914 0.93635873]

mean value: 0.9274434285640426

key: train_mcc
value: [0.94283651 0.9393413 0.94305636 0.93563929 0.95353974 0.9393413
0.93914669 0.93214329 0.94994909 0.93925798]

mean value: 0.941425155755879

key: test_accuracy
value: [0.96774194 1. 0.96774194 0.93548387 0.93548387 0.98387097
0.96774194 0.98387097 0.91803279 0.96721311]

mean value: 0.9627181385510312

key: train_accuracy
value: [0.97122302 0.96942446 0.97122302 0.9676259 0.97661871 0.96942446
0.96942446 0.96582734 0.97486535 0.96947935]

mean value: 0.9705136070676672

key: test_fscore
value: [0.96774194 1. 0.96774194 0.93103448 0.93939394 0.98412698
0.96875 0.98360656 0.91803279 0.96875 ]

mean value: 0.9629178621509581

key: train_fscore
value: [0.97163121 0.9699115 0.97173145 0.96808511 0.97690941 0.9699115
0.96980462 0.96637168 0.9751773 0.96980462]

mean value: 0.9709338406138824

key: test_precision
value: [0.96774194 1. 0.96774194 1. 0.88571429 0.96875
0.93939394 1. 0.90322581 0.93939394]

mean value: 0.9571961841921519

key: train_precision
value: [0.95804196 0.95470383 0.95486111 0.95454545 0.96491228 0.95470383
0.95789474 0.95121951 0.96491228 0.95789474]

mean value: 0.9573689736486591

key: test_recall
value: [0.96774194 1. 0.96774194 0.87096774 1. 1.
1. 0.96774194 0.93333333 1. ]

mean value: 0.970752688172043

key: train_recall
value: [0.98561151 0.98561151 0.98920863 0.98201439 0.98920863 0.98561151
0.98201439 0.98201439 0.98566308 0.98201439]

mean value: 0.9848972434955261

key: test_roc_auc
value: [0.96774194 1. 0.96774194 0.93548387 0.93548387 0.98387097
0.96774194 0.98387097 0.91827957 0.96666667]

mean value: 0.9626881720430107

key: train_roc_auc
value: [0.97122302 0.96942446 0.97122302 0.9676259 0.97661871 0.96942446
0.96942446 0.96582734 0.97484593 0.96950182]

mean value: 0.970513911451484

key: test_jcc
value: [0.9375 1. 0.9375 0.87096774 0.88571429 0.96875
0.93939394 0.96774194 0.84848485 0.93939394]

mean value: 0.9295446690406368

key: train_jcc
value: [0.94482759 0.94158076 0.94501718 0.93814433 0.95486111 0.94158076
0.94137931 0.93493151 0.95155709 0.94137931]

mean value: 0.9435258942337567

MCC on Blind test: 0.14

Accuracy on Blind test: 0.35

Model_name: Multinomial
Model func: MultinomialNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MultinomialNB())])

key: fit_time
value: [0.01899743 0.00779724 0.00781965 0.00752091 0.00765944 0.00745153
0.00752687 0.00762939 0.00754023 0.00756288]

mean value: 0.008750557899475098

key: score_time
value: [0.008394 0.00820088 0.00782681 0.00797677 0.00793958 0.00783062
0.00781059 0.0078342 0.00790739 0.00793242]

mean value: 0.007965326309204102

key: test_mcc
value: [0.61807005 0.74819006 0.67883359 0.64549722 0.67883359 0.63439154
0.63439154 0.67419986 0.54654832 0.64708149]

mean value: 0.6506037256013296

key: train_mcc
value: [0.66814183 0.65361701 0.66955589 0.67282515 0.64923736 0.67144111
0.67540424 0.6622781 0.67590132 0.66881107]

mean value: 0.6667213081476084

key: test_accuracy
value: [0.80645161 0.87096774 0.83870968 0.82258065 0.83870968 0.80645161
0.80645161 0.82258065 0.7704918 0.81967213]

mean value: 0.8203067160232681

key: train_accuracy
value: [0.83093525 0.82374101 0.83093525 0.83273381 0.82014388 0.83273381
0.83453237 0.82733813 0.83482944 0.83123878]

mean value: 0.8299161747801042

key: test_fscore
value: [0.81818182 0.87878788 0.84375 0.82539683 0.84375 0.82857143
0.82857143 0.84507042 0.78125 0.8358209 ]

mean value: 0.8329150697566978

key: train_fscore
value: [0.84175084 0.83501684 0.84280936 0.84422111 0.83388704 0.84317032
0.84511785 0.83946488 0.84563758 0.84175084]

mean value: 0.8412826664142349

key: test_precision
value: [0.77142857 0.82857143 0.81818182 0.8125 0.81818182 0.74358974
0.74358974 0.75 0.73529412 0.77777778]

mean value: 0.779911501896796

key: train_precision
value: [0.79113924 0.78481013 0.7875 0.78996865 0.77469136 0.79365079
0.7943038 0.784375 0.79495268 0.79113924]

mean value: 0.7886530890164406

key: test_recall
value: [0.87096774 0.93548387 0.87096774 0.83870968 0.87096774 0.93548387
0.93548387 0.96774194 0.83333333 0.90322581]

mean value: 0.896236559139785

key: train_recall
value: [0.89928058 0.89208633 0.90647482 0.90647482 0.9028777 0.89928058
0.9028777 0.9028777 0.90322581 0.89928058]

mean value: 0.9014736597818519

key: test_roc_auc
value: [0.80645161 0.87096774 0.83870968 0.82258065 0.83870968 0.80645161
0.80645161 0.82258065 0.77150538 0.81827957]

mean value: 0.8202688172043011

key: train_roc_auc
value: [0.83093525 0.82374101 0.83093525 0.83273381 0.82014388 0.83273381
0.83453237 0.82733813 0.83470643 0.83136072]

mean value: 0.8299160671462831

key: test_jcc
value: [0.69230769 0.78378378 0.72972973 0.7027027 0.72972973 0.70731707
0.70731707 0.73170732 0.64102564 0.71794872]

mean value: 0.7143569460642631

key: train_jcc
value: [0.72674419 0.71676301 0.7283237 0.73043478 0.71509972 0.72886297
0.73177843 0.72334294 0.73255814 0.72674419]

mean value: 0.7260652053436807

MCC on Blind test: 0.21

Accuracy on Blind test: 0.5

Model_name: Passive Aggresive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01070261 0.0129571 0.01364112 0.01318789 0.01269341 0.01534224
0.01468229 0.01412392 0.01440811 0.0143919 ]

mean value: 0.013613057136535645

key: score_time
value: [0.008075 0.01009893 0.00991964 0.01034665 0.01041341 0.01067472
0.0105691 0.01087594 0.01076126 0.01034617]

mean value: 0.01020808219909668

key: test_mcc
value: [0.82199494 0.93743687 0.93548387 0.81325006 0.87831007 0.74161985
0.90748521 0.83914639 0.72318666 0.30374645]

mean value: 0.7901660359762814

key: train_mcc
value: [0.87166214 0.92172241 0.94266562 0.92172241 0.91860435 0.69376766
0.94305636 0.93238486 0.88634645 0.2887174 ]

mean value: 0.8320649673376139

key: test_accuracy
value: [0.90322581 0.96774194 0.96774194 0.90322581 0.93548387 0.85483871
0.9516129 0.91935484 0.85245902 0.59016393]

mean value: 0.8845848757271285

key: train_accuracy
value: [0.93345324 0.96043165 0.97122302 0.96043165 0.95863309 0.82733813
0.97122302 0.96582734 0.94075404 0.57630162]

mean value: 0.9065616806375366

key: test_fscore
value: [0.89285714 0.96875 0.96774194 0.89655172 0.93939394 0.87323944
0.95384615 0.92063492 0.83018868 0.71264368]

mean value: 0.895584761037988

key: train_fscore
value: [0.92979127 0.96126761 0.97153025 0.96126761 0.95971979 0.85185185
0.97173145 0.9664903 0.93761815 0.7020202 ]

mean value: 0.9213288471474509

key: test_precision
value: [1. 0.93939394 0.96774194 0.96296296 0.88571429 0.775
0.91176471 0.90625 0.95652174 0.55357143]

mean value: 0.8858920997139276

key: train_precision
value: [0.98393574 0.94137931 0.96126761 0.94137931 0.93515358 0.74594595
0.95486111 0.94809689 0.992 0.54085603]

mean value: 0.8944875526911704

key: test_recall
value: [0.80645161 1. 0.96774194 0.83870968 1. 1.
1. 0.93548387 0.73333333 1. ]

mean value: 0.9281720430107527

key: train_recall
value: [0.88129496 0.98201439 0.98201439 0.98201439 0.98561151 0.99280576
0.98920863 0.98561151 0.88888889 1. ]

mean value: 0.9669464428457234

key: test_roc_auc
value: [0.90322581 0.96774194 0.96774194 0.90322581 0.93548387 0.85483871
0.9516129 0.91935484 0.85053763 0.58333333]

mean value: 0.8837096774193549

key: train_roc_auc
value: [0.93345324 0.96043165 0.97122302 0.96043165 0.95863309 0.82733813
0.97122302 0.96582734 0.94084732 0.57706093]

mean value: 0.9066469405121065

key: test_jcc
value: [0.80645161 0.93939394 0.9375 0.8125 0.88571429 0.775
0.91176471 0.85294118 0.70967742 0.55357143]

mean value: 0.818451456829066

key: train_jcc
value: [0.86879433 0.92542373 0.94463668 0.92542373 0.92255892 0.74193548
0.94501718 0.93515358 0.88256228 0.54085603]

mean value: 0.8632361942955643

MCC on Blind test: 0.1

Accuracy on Blind test: 0.29

Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
'lineage_count_all', 'lineage_count_unique'],
dtype='object')),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01686311 0.01279736 0.01273036 0.01324439 0.01300955 0.01324821
0.01373839 0.01282573 0.01237702 0.01400542]

mean value: 0.013483953475952149

key: score_time
value: [0.01079369 0.01044965 0.01073813 0.0103972 0.01031804 0.01030827
0.01035023 0.01034307 0.01035166 0.01030016]

mean value: 0.010435009002685547

key: test_mcc
value: [0.87831007 0.74161985 0.78446454 0.71567809 0.79471941 0.93548387
0.96824584 0.84983659 0.77072165 0.90586325]

mean value: 0.8344943153997917

key: train_mcc
value: [0.92518498 0.76865678 0.81406658 0.92923662 0.90265061 0.89965316
0.92844206 0.89154571 0.92828039 0.93998809]

mean value: 0.8927704971400476

key: test_accuracy
value: [0.93548387 0.85483871 0.88709677 0.83870968 0.88709677 0.96774194
0.98387097 0.91935484 0.8852459 0.95081967]

mean value: 0.9110259122157589

key: train_accuracy
value: [0.96223022 0.87230216 0.89928058 0.96402878 0.94964029 0.94964029
0.96402878 0.9442446 0.96409336 0.96947935]

mean value: 0.9438968394404763

key: test_fscore
value: [0.93103448 0.83018868 0.89552239 0.80769231 0.89855072 0.96774194
0.98360656 0.9122807 0.88135593 0.95384615]

mean value: 0.9061819863058443

key: train_fscore
value: [0.96146789 0.85420945 0.90819672 0.96309963 0.95172414 0.94890511
0.96350365 0.94183865 0.96441281 0.97012302]

mean value: 0.9427481068247102

key: test_precision
value: [1. 1. 0.83333333 1. 0.81578947 0.96774194
1. 1. 0.89655172 0.91176471]

mean value: 0.9425181172521699

key: train_precision
value: [0.98127341 0.99521531 0.83433735 0.98863636 0.91390728 0.96296296
0.97777778 0.98431373 0.95759717 0.94845361]

mean value: 0.9544474964669887

key: test_recall
value: [0.87096774 0.70967742 0.96774194 0.67741935 1. 0.96774194
0.96774194 0.83870968 0.86666667 1. ]

mean value: 0.8866666666666667

key: train_recall
value: [0.94244604 0.74820144 0.99640288 0.93884892 0.99280576 0.9352518
0.94964029 0.9028777 0.97132616 0.99280576]

mean value: 0.937060674041412

key: test_roc_auc
value: [0.93548387 0.85483871 0.88709677 0.83870968 0.88709677 0.96774194
0.98387097 0.91935484 0.88494624 0.95 ]

mean value: 0.9109139784946236

key: train_roc_auc
value: [0.96223022 0.87230216 0.89928058 0.96402878 0.94964029 0.94964029
0.96402878 0.9442446 0.96408035 0.96952116]

mean value: 0.9438997189345298

key: test_jcc
value: [0.87096774 0.70967742 0.81081081 0.67741935 0.81578947 0.9375
0.96774194 0.83870968 0.78787879 0.91176471]

mean value: 0.832825990728842

key: train_jcc
value: [0.92579505 0.74551971 0.83183183 0.92882562 0.90789474 0.90277778
0.92957746 0.89007092 0.93127148 0.94197952]

mean value: 0.8935544122114777

MCC on Blind test: 0.1

Accuracy on Blind test: 0.4

Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.10854602 0.09391761 0.09340096 0.09336042 0.09349442 0.0939045
|
|
0.09685636 0.09437943 0.09400725 0.09450531]
|
|
|
|
mean value: 0.09563722610473632
|
|
|
|
key: score_time
|
|
value: [0.01416063 0.01400757 0.01419139 0.0142355 0.01414442 0.01419091
|
|
0.01533508 0.01431847 0.01418138 0.0142591 ]
|
|
|
|
mean value: 0.014302444458007813
|
|
|
|
key: test_mcc
|
|
value: [0.96824584 1. 0.96824584 0.96824584 0.96824584 0.96824584
|
|
1. 0.96824584 0.90215054 1. ]
|
|
|
|
mean value: 0.9711625556945535
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98387097 1. 0.98387097 0.98387097 0.98387097 0.98387097
|
|
1. 0.98387097 0.95081967 1. ]
|
|
|
|
mean value: 0.985404547858276
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98360656 1. 0.98412698 0.98360656 0.98412698 0.98412698
|
|
1. 0.98360656 0.95081967 1. ]
|
|
|
|
mean value: 0.9854020296643248
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.96875 1. 0.96875 0.96875
|
|
1. 1. 0.93548387 1. ]
|
|
|
|
mean value: 0.9841733870967742
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96774194 1. 1. 0.96774194 1. 1.
|
|
1. 0.96774194 0.96666667 1. ]
|
|
|
|
mean value: 0.986989247311828
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98387097 1. 0.98387097 0.98387097 0.98387097 0.98387097
|
|
1. 0.98387097 0.95107527 1. ]
|
|
|
|
mean value: 0.9854301075268818
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96774194 1. 0.96875 0.96774194 0.96875 0.96875
|
|
1. 0.96774194 0.90625 1. ]
|
|
|
|
mean value: 0.9715725806451613
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.09
|
|
|
|
Accuracy on Blind test: 0.21
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03534484 0.0434587 0.05311441 0.03592443 0.03713274 0.05070066
|
|
0.05334473 0.05310988 0.04448533 0.03317356]
|
|
|
|
mean value: 0.04397892951965332
|
|
|
|
key: score_time
|
|
value: [0.022789 0.0229876 0.02233076 0.01710248 0.01946139 0.03598452
|
|
0.02479911 0.02968454 0.01835775 0.03061008]
|
|
|
|
mean value: 0.024410724639892578
|
|
|
|
key: test_mcc
|
|
value: [0.93743687 0.93743687 0.93548387 0.93743687 0.93548387 0.96824584
|
|
0.96824584 0.87831007 0.90215054 0.96774194]
|
|
|
|
mean value: 0.9367972553494428
|
|
|
|
key: train_mcc
|
|
value: [1. 0.99640932 0.99640932 0.99640932 1. 1.
|
|
0.99283145 1. 0.99641572 0.99641572]
|
|
|
|
mean value: 0.9974890870152905
|
|
|
|
key: test_accuracy
|
|
value: [0.96774194 0.96774194 0.96774194 0.96774194 0.96774194 0.98387097
|
|
0.98387097 0.93548387 0.95081967 0.98360656]
|
|
|
|
mean value: 0.9676361713379165
|
|
|
|
key: train_accuracy
|
|
value: [1. 0.99820144 0.99820144 0.99820144 1. 1.
|
|
0.99640288 1. 0.99820467 0.99820467]
|
|
|
|
mean value: 0.9987416529971714
|
|
|
|
key: test_fscore
|
|
value: [0.96666667 0.96666667 0.96774194 0.96666667 0.96774194 0.98412698
|
|
0.98360656 0.93103448 0.95081967 0.98360656]
|
|
|
|
mean value: 0.9668678124738592
|
|
|
|
key: train_fscore
|
|
value: [1. 0.9981982 0.9981982 0.9981982 1. 1.
|
|
0.99638989 1. 0.99821109 0.9981982 ]
|
|
|
|
mean value: 0.9987393775723891
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.96774194 1. 0.96774194 0.96875
|
|
1. 1. 0.93548387 1. ]
|
|
|
|
mean value: 0.9839717741935484
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 1. 0.99642857 1. ]
|
|
|
|
mean value: 0.9996428571428572
|
|
|
|
key: test_recall
|
|
value: [0.93548387 0.93548387 0.96774194 0.93548387 0.96774194 1.
|
|
0.96774194 0.87096774 0.96666667 0.96774194]
|
|
|
|
mean value: 0.951505376344086
|
|
|
|
key: train_recall
|
|
value: [1. 0.99640288 0.99640288 0.99640288 1. 1.
|
|
0.99280576 1. 1. 0.99640288]
|
|
|
|
mean value: 0.9978417266187051
|
|
|
|
key: test_roc_auc
|
|
value: [0.96774194 0.96774194 0.96774194 0.96774194 0.96774194 0.98387097
|
|
0.98387097 0.93548387 0.95107527 0.98387097]
|
|
|
|
mean value: 0.9676881720430108
|
|
|
|
key: train_roc_auc
|
|
value: [1. 0.99820144 0.99820144 0.99820144 1. 1.
|
|
0.99640288 1. 0.99820144 0.99820144]
|
|
|
|
mean value: 0.9987410071942446
|
|
|
|
key: test_jcc
|
|
value: [0.93548387 0.93548387 0.9375 0.93548387 0.9375 0.96875
|
|
0.96774194 0.87096774 0.90625 0.96774194]
|
|
|
|
mean value: 0.9362903225806452
|
|
|
|
key: train_jcc
|
|
value: [1. 0.99640288 0.99640288 0.99640288 1. 1.
|
|
0.99280576 1. 0.99642857 0.99640288]
|
|
|
|
mean value: 0.9974845837615622
|
|
|
|
MCC on Blind test: 0.06
|
|
|
|
Accuracy on Blind test: 0.21
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.18122554 0.21741056 0.19656324 0.19563127 0.21628571 0.1628089
|
|
0.20888186 0.19683623 0.13708878 0.13875246]
|
|
|
|
mean value: 0.18514845371246338
|
|
|
|
key: score_time
|
|
value: [0.02055907 0.02068186 0.02069783 0.02077031 0.02077198 0.01287293
|
|
0.02073574 0.02086687 0.01321125 0.02461028]
|
|
|
|
mean value: 0.019577813148498536
|
|
|
|
key: test_mcc
|
|
value: [0.67741935 0.74819006 0.74348441 0.69047575 0.80813523 0.87278605
|
|
0.81325006 0.81325006 0.54086022 0.74352218]
|
|
|
|
mean value: 0.7451373368256522
|
|
|
|
key: train_mcc
|
|
value: [0.87415162 0.87059372 0.86758591 0.89596753 0.88157448 0.87455914
|
|
0.87086426 0.86758591 0.87459701 0.88883589]
|
|
|
|
mean value: 0.8766315468831808
|
|
|
|
key: test_accuracy
|
|
value: [0.83870968 0.87096774 0.87096774 0.83870968 0.90322581 0.93548387
|
|
0.90322581 0.90322581 0.7704918 0.86885246]
|
|
|
|
mean value: 0.870386039132734
|
|
|
|
key: train_accuracy
|
|
value: [0.93705036 0.9352518 0.93345324 0.94784173 0.94064748 0.93705036
|
|
0.9352518 0.93345324 0.93716338 0.9443447 ]
|
|
|
|
mean value: 0.9381508078994614
|
|
|
|
key: test_fscore
|
|
value: [0.83870968 0.87878788 0.875 0.82142857 0.9 0.9375
|
|
0.90909091 0.90909091 0.76666667 0.87878788]
|
|
|
|
mean value: 0.8715062491272169
|
|
|
|
key: train_fscore
|
|
value: [0.93738819 0.93571429 0.93474427 0.94849023 0.94138544 0.9380531
|
|
0.93617021 0.93474427 0.9380531 0.94474153]
|
|
|
|
mean value: 0.9389484621579285
|
|
|
|
key: test_precision
|
|
value: [0.83870968 0.82857143 0.84848485 0.92 0.93103448 0.90909091
|
|
0.85714286 0.85714286 0.76666667 0.82857143]
|
|
|
|
mean value: 0.8585415155848971
|
|
|
|
key: train_precision
|
|
value: [0.93238434 0.92907801 0.91695502 0.93684211 0.92982456 0.92334495
|
|
0.92307692 0.91695502 0.92657343 0.93639576]
|
|
|
|
mean value: 0.9271430114193007
|
|
|
|
key: test_recall
|
|
value: [0.83870968 0.93548387 0.90322581 0.74193548 0.87096774 0.96774194
|
|
0.96774194 0.96774194 0.76666667 0.93548387]
|
|
|
|
mean value: 0.8895698924731182
|
|
|
|
key: train_recall
|
|
value: [0.94244604 0.94244604 0.95323741 0.96043165 0.95323741 0.95323741
|
|
0.94964029 0.95323741 0.94982079 0.95323741]
|
|
|
|
mean value: 0.9510971867667156
|
|
|
|
key: test_roc_auc
|
|
value: [0.83870968 0.87096774 0.87096774 0.83870968 0.90322581 0.93548387
|
|
0.90322581 0.90322581 0.77043011 0.86774194]
|
|
|
|
mean value: 0.870268817204301
|
|
|
|
key: train_roc_auc
|
|
value: [0.93705036 0.9352518 0.93345324 0.94784173 0.94064748 0.93705036
|
|
0.9352518 0.93345324 0.93714061 0.94436064]
|
|
|
|
mean value: 0.9381501250612413
|
|
|
|
key: test_jcc
|
|
value: [0.72222222 0.78378378 0.77777778 0.6969697 0.81818182 0.88235294
|
|
0.83333333 0.83333333 0.62162162 0.78378378]
|
|
|
|
mean value: 0.7753360312183841
|
|
|
|
key: train_jcc
|
|
value: [0.88215488 0.87919463 0.87748344 0.90202703 0.88926174 0.88333333
|
|
0.88 0.87748344 0.88333333 0.89527027]
|
|
|
|
mean value: 0.884954210937499
|
|
|
|
MCC on Blind test: 0.22
|
|
|
|
Accuracy on Blind test: 0.49
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.25665951 0.24086618 0.24189734 0.23880529 0.24180579 0.24213672
|
|
0.24336982 0.24555063 0.24932742 0.25003719]
|
|
|
|
mean value: 0.2450455904006958
|
|
|
|
key: score_time
|
|
value: [0.00856853 0.0083406 0.00863934 0.00827336 0.00876927 0.00846887
|
|
0.00852108 0.00857925 0.00858474 0.00857282]
|
|
|
|
mean value: 0.008531785011291504
|
|
|
|
key: test_mcc
|
|
value: [0.96824584 0.96824584 0.93548387 1. 1. 0.96824584
|
|
1. 0.96824584 0.9344086 1. ]
|
|
|
|
mean value: 0.9742875819325697
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98387097 0.98387097 0.96774194 1. 1. 0.98387097
|
|
1. 0.98387097 0.96721311 1. ]
|
|
|
|
mean value: 0.9870438921205711
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98360656 0.98360656 0.96774194 1. 1. 0.98412698
|
|
1. 0.98360656 0.96666667 1. ]
|
|
|
|
mean value: 0.986935525840867
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.96774194 1. 1. 0.96875
|
|
1. 1. 0.96666667 1. ]
|
|
|
|
mean value: 0.9903158602150538
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96774194 0.96774194 0.96774194 1. 1. 1.
|
|
1. 0.96774194 0.96666667 1. ]
|
|
|
|
mean value: 0.983763440860215
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98387097 0.98387097 0.96774194 1. 1. 0.98387097
|
|
1. 0.98387097 0.9672043 1. ]
|
|
|
|
mean value: 0.9870430107526882
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96774194 0.96774194 0.9375 1. 1. 0.96875
|
|
1. 0.96774194 0.93548387 1. ]
|
|
|
|
mean value: 0.9744959677419355
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.05
|
|
|
|
Accuracy on Blind test: 0.19
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.01201487 0.01361275 0.01402354 0.01380372 0.01379108 0.02837563
|
|
0.01576805 0.01669693 0.02459884 0.01624608]
|
|
|
|
mean value: 0.01689314842224121
|
|
|
|
key: score_time
|
|
value: [0.0111146 0.01098752 0.01094055 0.01091433 0.01093793 0.01111317
|
|
0.01172638 0.01128578 0.01110101 0.01107979]
|
|
|
|
mean value: 0.011120104789733886
|
|
|
|
key: test_mcc
|
|
value: [0.74193548 0.80813523 0.81325006 0.52297636 0.74819006 0.67419986
|
|
0.67883359 0.81325006 0.72516604 0.71375712]
|
|
|
|
mean value: 0.7239693864680706
|
|
|
|
key: train_mcc
|
|
value: [0.82567165 0.81659431 0.79995316 0.7380124 0.83549358 0.78285538
|
|
0.76623167 0.78683637 0.87297353 0.8490525 ]
|
|
|
|
mean value: 0.8073674571945186
|
|
|
|
key: test_accuracy
|
|
value: [0.87096774 0.90322581 0.90322581 0.75806452 0.87096774 0.82258065
|
|
0.83870968 0.90322581 0.85245902 0.85245902]
|
|
|
|
mean value: 0.8575885774722369
|
|
|
|
key: train_accuracy
|
|
value: [0.9118705 0.90827338 0.89748201 0.85971223 0.91546763 0.88848921
|
|
0.88309353 0.89028777 0.93536804 0.92280072]
|
|
|
|
mean value: 0.9012845020213631
|
|
|
|
key: test_fscore
|
|
value: [0.87096774 0.9 0.89655172 0.73684211 0.87878788 0.84507042
|
|
0.83333333 0.89655172 0.86567164 0.86567164]
|
|
|
|
mean value: 0.8589448213713017
|
|
|
|
key: train_fscore
|
|
value: [0.90875233 0.90876565 0.89142857 0.84210526 0.91965812 0.89491525
|
|
0.88245931 0.88291747 0.93771626 0.92598967]
|
|
|
|
mean value: 0.8994707904383525
|
|
|
|
key: test_precision
|
|
value: [0.87096774 0.93103448 0.96296296 0.80769231 0.82857143 0.75
|
|
0.86206897 0.96296296 0.78378378 0.80555556]
|
|
|
|
mean value: 0.8565600191740348
|
|
|
|
key: train_precision
|
|
value: [0.94208494 0.90391459 0.94736842 0.96296296 0.8762215 0.84615385
|
|
0.88727273 0.94650206 0.90635452 0.88778878]
|
|
|
|
mean value: 0.9106624340187
|
|
|
|
key: test_recall
|
|
value: [0.87096774 0.87096774 0.83870968 0.67741935 0.93548387 0.96774194
|
|
0.80645161 0.83870968 0.96666667 0.93548387]
|
|
|
|
mean value: 0.8708602150537634
|
|
|
|
key: train_recall
|
|
value: [0.87769784 0.91366906 0.84172662 0.74820144 0.9676259 0.94964029
|
|
0.87769784 0.82733813 0.97132616 0.9676259 ]
|
|
|
|
mean value: 0.8942549186457286
|
|
|
|
key: test_roc_auc
|
|
value: [0.87096774 0.90322581 0.90322581 0.75806452 0.87096774 0.82258065
|
|
0.83870968 0.90322581 0.85430108 0.85107527]
|
|
|
|
mean value: 0.8576344086021506
|
|
|
|
key: train_roc_auc
|
|
value: [0.9118705 0.90827338 0.89748201 0.85971223 0.91546763 0.88848921
|
|
0.88309353 0.89028777 0.93530337 0.92288105]
|
|
|
|
mean value: 0.9012860679198577
|
|
|
|
key: test_jcc
|
|
value: [0.77142857 0.81818182 0.8125 0.58333333 0.78378378 0.73170732
|
|
0.71428571 0.8125 0.76315789 0.76315789]
|
|
|
|
mean value: 0.7554036327560076
|
|
|
|
key: train_jcc
|
|
value: [0.83276451 0.83278689 0.80412371 0.72727273 0.85126582 0.80981595
|
|
0.78964401 0.79037801 0.88273616 0.86217949]
|
|
|
|
mean value: 0.8182967266032459
|
|
|
|
MCC on Blind test: 0.15
|
|
|
|
Accuracy on Blind test: 0.77
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01394916 0.01445484 0.01928425 0.01152682 0.01178312 0.01156855
|
|
0.01153874 0.0114634 0.01149082 0.01142311]
|
|
|
|
mean value: 0.012848281860351562
|
|
|
|
key: score_time
|
|
value: [0.01326442 0.01076293 0.01061916 0.01054502 0.01053524 0.01052785
|
|
0.01054406 0.01065135 0.01066136 0.01065755]
|
|
|
|
mean value: 0.010876893997192383
|
|
|
|
key: test_mcc
|
|
value: [0.90369611 1. 0.90369611 0.87831007 0.80813523 0.93743687
|
|
0.90369611 1. 0.77072165 0.90586325]
|
|
|
|
mean value: 0.9011555410657976
|
|
|
|
key: train_mcc
|
|
value: [0.91741458 0.92145965 0.92518498 0.92475364 0.93914669 0.91054923
|
|
0.93214329 0.92475364 0.93206857 0.92840473]
|
|
|
|
mean value: 0.9255878992579923
|
|
|
|
key: test_accuracy
|
|
value: [0.9516129 1. 0.9516129 0.93548387 0.90322581 0.96774194
|
|
0.9516129 1. 0.8852459 0.95081967]
|
|
|
|
mean value: 0.9497355896351137
|
|
|
|
key: train_accuracy
|
|
value: [0.95863309 0.96043165 0.96223022 0.96223022 0.96942446 0.95503597
|
|
0.96582734 0.96223022 0.96588869 0.96409336]
|
|
|
|
mean value: 0.9626025212146261
|
|
|
|
key: test_fscore
|
|
value: [0.95238095 1. 0.95238095 0.93103448 0.90625 0.96875
|
|
0.95238095 1. 0.88135593 0.95384615]
|
|
|
|
mean value: 0.9498379425951021
|
|
|
|
key: train_fscore
|
|
value: [0.95900178 0.96113074 0.96296296 0.96269982 0.96980462 0.95575221
|
|
0.96637168 0.96269982 0.96637168 0.96441281]
|
|
|
|
mean value: 0.9631208137030208
|
|
|
|
key: test_precision
|
|
value: [0.9375 1. 0.9375 1. 0.87878788 0.93939394
|
|
0.9375 1. 0.89655172 0.91176471]
|
|
|
|
mean value: 0.9438998248202102
|
|
|
|
key: train_precision
|
|
value: [0.95053004 0.94444444 0.94463668 0.95087719 0.95789474 0.94076655
|
|
0.95121951 0.95087719 0.95454545 0.95422535]
|
|
|
|
mean value: 0.9500017150163743
|
|
|
|
key: test_recall
|
|
value: [0.96774194 1. 0.96774194 0.87096774 0.93548387 1.
|
|
0.96774194 1. 0.86666667 1. ]
|
|
|
|
mean value: 0.9576344086021505
|
|
|
|
key: train_recall
|
|
value: [0.9676259 0.97841727 0.98201439 0.97482014 0.98201439 0.97122302
|
|
0.98201439 0.97482014 0.97849462 0.97482014]
|
|
|
|
mean value: 0.9766264407828575
|
|
|
|
key: test_roc_auc
|
|
value: [0.9516129 1. 0.9516129 0.93548387 0.90322581 0.96774194
|
|
0.9516129 1. 0.88494624 0.95 ]
|
|
|
|
mean value: 0.9496236559139786
|
|
|
|
key: train_roc_auc
|
|
value: [0.95863309 0.96043165 0.96223022 0.96223022 0.96942446 0.95503597
|
|
0.96582734 0.96223022 0.96586602 0.96411258]
|
|
|
|
mean value: 0.9626021763234573
|
|
|
|
key: test_jcc
|
|
value: [0.90909091 1. 0.90909091 0.87096774 0.82857143 0.93939394
|
|
0.90909091 1. 0.78787879 0.91176471]
|
|
|
|
mean value: 0.906584933093472
|
|
|
|
key: train_jcc
|
|
value: [0.92123288 0.92517007 0.92857143 0.92808219 0.94137931 0.91525424
|
|
0.93493151 0.92808219 0.93493151 0.93127148]
|
|
|
|
mean value: 0.9288906795867435
|
|
|
|
MCC on Blind test: 0.19
|
|
|
|
Accuracy on Blind test: 0.44
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models:
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_config.py:203: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rouC_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_config.py:206: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rouC_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist', 'rsa', 'kd_values', 'rd_values',
|
|
'electro_rr', 'electro_mm', 'electro_sm', 'electr...
|
|
'volumetric_ss', 'consurf_score', 'snap2_score', 'provean_score', 'maf',
|
|
'logorI', 'lineage_proportion', 'dist_lineage_proportion',
|
|
'lineage_count_all', 'lineage_count_unique'],
|
|
dtype='object')),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.11430693 0.1973083 0.16238761 0.19749618 0.09576607 0.1363256
|
|
0.09693003 0.12462544 0.20535517 0.23405504]
|
|
|
|
mean value: 0.1564556360244751
|
|
|
|
key: score_time
|
|
value: [0.01904321 0.02068663 0.02037406 0.02086973 0.01105285 0.01114559
|
|
0.01939201 0.01616836 0.01520658 0.01612258]
|
|
|
|
mean value: 0.01700615882873535
|
|
|
|
key: test_mcc
|
|
value: [0.90369611 1. 0.93548387 0.87831007 0.84266484 0.93743687
|
|
0.93743687 0.96824584 0.77072165 0.93635873]
|
|
|
|
mean value: 0.9110354846088805
|
|
|
|
key: train_mcc
|
|
value: [0.92844206 0.93563929 0.9393413 0.93563929 0.94986154 0.9393413
|
|
0.93214329 0.92844206 0.94264494 0.93558747]
|
|
|
|
mean value: 0.9367082543906752
|
|
|
|
key: test_accuracy
|
|
value: [0.9516129 1. 0.96774194 0.93548387 0.91935484 0.96774194
|
|
0.96774194 0.98387097 0.8852459 0.96721311]
|
|
|
|
mean value: 0.9546007403490216
|
|
|
|
key: train_accuracy
|
|
value: [0.96402878 0.9676259 0.96942446 0.9676259 0.97482014 0.96942446
|
|
0.96582734 0.96402878 0.97127469 0.96768402]
|
|
|
|
mean value: 0.9681764462756545
|
|
|
|
key: test_fscore
|
|
value: [0.95238095 1. 0.96774194 0.93103448 0.92307692 0.96875
|
|
0.96875 0.98360656 0.88135593 0.96875 ]
|
|
|
|
mean value: 0.9545446783280807
|
|
|
|
key: train_fscore
|
|
value: [0.96453901 0.96808511 0.9699115 0.96808511 0.97508897 0.9699115
|
|
0.96637168 0.96453901 0.97153025 0.96797153]
|
|
|
|
mean value: 0.9686033664546803
|
|
|
|
key: test_precision
|
|
value: [0.9375 1. 0.96774194 1. 0.88235294 0.93939394
|
|
0.93939394 1. 0.89655172 0.93939394]
|
|
|
|
mean value: 0.950232841898009
|
|
|
|
key: train_precision
|
|
value: [0.95104895 0.95454545 0.95470383 0.95454545 0.96478873 0.95470383
|
|
0.95121951 0.95104895 0.96466431 0.95774648]
|
|
|
|
mean value: 0.9559015511110829
|
|
|
|
key: test_recall
|
|
value: [0.96774194 1. 0.96774194 0.87096774 0.96774194 1.
|
|
1. 0.96774194 0.86666667 1. ]
|
|
|
|
mean value: 0.9608602150537635
|
|
|
|
key: train_recall
|
|
value: [0.97841727 0.98201439 0.98561151 0.98201439 0.98561151 0.98561151
|
|
0.98201439 0.97841727 0.97849462 0.97841727]
|
|
|
|
mean value: 0.9816624120058792
|
|
|
|
key: test_roc_auc
|
|
value: [0.9516129 1. 0.96774194 0.93548387 0.91935484 0.96774194
|
|
0.96774194 0.98387097 0.88494624 0.96666667]
|
|
|
|
mean value: 0.9545161290322581
|
|
|
|
key: train_roc_auc
|
|
value: [0.96402878 0.9676259 0.96942446 0.9676259 0.97482014 0.96942446
|
|
0.96582734 0.96402878 0.9712617 0.96770326]
|
|
|
|
mean value: 0.9681770712462289
|
|
|
|
key: test_jcc
|
|
value: [0.90909091 1. 0.9375 0.87096774 0.85714286 0.93939394
|
|
0.93939394 0.96774194 0.78787879 0.93939394]
|
|
|
|
mean value: 0.9148504049713727
|
|
|
|
key: train_jcc
|
|
value: [0.93150685 0.93814433 0.94158076 0.93814433 0.95138889 0.94158076
|
|
0.93493151 0.93150685 0.94463668 0.93793103]
|
|
|
|
mean value: 0.9391351978873097
|
|
|
|
MCC on Blind test: 0.15
|
|
|
|
Accuracy on Blind test: 0.38
|