19724 lines
978 KiB
Text
/home/tanu/git/LSHTM_analysis/scripts/ml/ml_data_sl.py:549: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
mask_check.sort_values(by = ['ligand_distance'], ascending = True, inplace = True)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/xgboost/compat.py:36: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
from pandas import MultiIndex, Int64Index
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
1.22.4
1.4.1

aaindex_df contains non-numerical data

Total no. of non-numerical columns: 2

Selecting numerical data only

PASS: successfully selected numerical columns only for aaindex_df

Now checking for NA in the remaining aaindex_cols

Counting aaindex_df cols with NA
ncols with NA: 4 columns
Dropping these...
Original ncols: 127

Revised df ncols: 123

Checking NA in revised df...

PASS: cols with NA successfully dropped from aaindex_df
Proceeding with combining aa_df with other features_df

PASS: ncols match
Expected ncols: 123
Got: 123

Total no. of columns in clean aa_df: 123

Proceeding to merge, expected nrows in merged_df: 1133

PASS: my_features_df and aa_df successfully combined
nrows: 1133
ncols: 274
count of NULL values before imputation

or_mychisq 339
log10_or_mychisq 339
dtype: int64
count of NULL values AFTER imputation

mutationinformation 0
or_rawI 0
logorI 0
dtype: int64

PASS: OR values imputed, data ready for ML

Total no. of features for aaindex: 123

No. of numerical features: 169
No. of categorical features: 7

PASS: x_features has no target variable

No. of columns for x_features: 176

-------------------------------------------------------------
Successfully split data according to scaling law: 1/np.sqrt(x_ncols)
Train data size: (515, 176)
Test data size: 0.07537783614444091 (42, 176)
y_train numbers: Counter({0: 261, 1: 254})
y_train ratio: 1.0275590551181102

y_test_numbers: Counter({0: 21, 1: 21})
y_test ratio: 1.0
-------------------------------------------------------------
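
The split sizes reported above can be checked by hand. The following is a minimal sketch, assuming the test fraction is `1/sqrt(n_features)` as the log message states, that the test set is rounded up, and that the remaining rows go to training. The function name `scaling_law_split` is hypothetical and is not taken from the script that produced this log. The total of 557 rows is the sum of the reported train and test sizes (515 + 42).

```python
import math


def scaling_law_split(n_samples: int, n_features: int):
    """Return (test_fraction, n_train, n_test) for a 1/sqrt(n_features) split.

    Hypothetical helper for illustration only.
    """
    test_fraction = 1 / math.sqrt(n_features)
    # Assumption: the test set size is rounded up, the rest is used for training.
    n_test = math.ceil(n_samples * test_fraction)
    n_train = n_samples - n_test
    return test_fraction, n_train, n_test


# 557 rows and 176 feature columns, as reported in the log
frac, n_train, n_test = scaling_law_split(n_samples=557, n_features=176)
print(frac, n_train, n_test)
```

Under these assumptions the fraction agrees with the logged test size (about 0.0754) and the counts come out as 515 training rows and 42 test rows.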

Simple Random OverSampling
Counter({0: 261, 1: 261})
(522, 176)

Simple Random UnderSampling
Counter({0: 254, 1: 254})
(508, 176)

Simple Combined Over and UnderSampling
Counter({0: 261, 1: 261})
(522, 176)

SMOTE_NC OverSampling
Counter({0: 261, 1: 261})
(522, 176)
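
The class counts for simple random oversampling follow from the training labels (261 of class 0, 254 of class 1). The following is a standard-library sketch of the idea, not the resampler used in the run: it duplicates randomly chosen minority-class rows until every class matches the majority count. The function name `random_oversample` and the seed are assumptions made for illustration.

```python
import random
from collections import Counter


def random_oversample(labels, seed=42):
    """Return row indices after duplicating random minority-class rows.

    Hypothetical helper: every class is topped up to the majority count.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    indices = list(range(len(labels)))
    for cls, n in counts.items():
        pool = [i for i, y in enumerate(labels) if y == cls]
        # Sample with replacement; k is 0 for the majority class
        indices.extend(rng.choices(pool, k=target - n))
    return indices


# Training labels with the class counts reported in the log
y_train = [0] * 261 + [1] * 254
resampled = random_oversample(y_train)
balanced = Counter(y_train[i] for i in resampled)
print(balanced, len(resampled))
```

With these inputs both classes end up with 261 rows, 522 in total, which matches the shape `(522, 176)` logged for the oversampled data.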

#####################################################################

Running ML analysis: scaling law split
Gene name: rpoB
Drug name: rifampicin

Output directory: /home/tanu/git/Data/rifampicin/output/ml/tts_sl/
Sanity checks:
ML source data size: (557, 176)
Total input features: (515, 176)
Target feature numbers: Counter({0: 261, 1: 254})
Target features ratio: 1.0275590551181102

#####################################################################


================================================================

Structural features (n): 37
These are:
Common stability features: ['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity', 'mcsm_ppi2_affinity', 'interface_dist']
FoldX columns: ['electro_rr', 'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr', 'disulfide_mm', 'disulfide_sm', 'disulfide_ss', 'hbonds_rr', 'hbonds_mm', 'hbonds_sm', 'hbonds_ss', 'partcov_rr', 'partcov_mm', 'partcov_sm', 'partcov_ss', 'vdwclashes_rr', 'vdwclashes_mm', 'vdwclashes_sm', 'vdwclashes_ss', 'volumetric_rr', 'volumetric_mm', 'volumetric_ss']
Other struc columns: ['rsa', 'kd_values', 'rd_values']
================================================================

AAindex features (n): 123
================================================================

Evolutionary features (n): 3
These are:
['consurf_score', 'snap2_score', 'provean_score']
================================================================

Genomic features (n): 6
These are:
['maf', 'logorI']
['lineage_proportion', 'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique']
================================================================

Categorical features (n): 7
These are:
['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site']
================================================================


Pass: No. of features match

#####################################################################


Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=169)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegression(random_state=42))])

key: fit_time
value: [0.03457308 0.05314136 0.03903151 0.0377934 0.03705144 0.03332138
0.03554845 0.03472877 0.0331893 0.03619957]

mean value: 0.037457823753356934

key: score_time
value: [0.01273632 0.01220393 0.01418066 0.01225758 0.01433253 0.01218915
0.0123353 0.01226211 0.01226807 0.01440454]

mean value: 0.012917017936706543

key: test_mcc
value: [0.76888889 0.61538462 0.84866842 0.84866842 0.77151675 0.88289781
0.76733527 0.88289781 0.80990051 0.69568237]

mean value: 0.7891840882445919
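
For reference, `test_mcc` is the Matthews correlation coefficient per cross-validation fold, and the mean value is the plain average over the ten folds. The sketch below computes MCC from confusion-matrix counts using the standard formula. The counts used (22 true positives, 24 true negatives, 3 false positives, 3 false negatives) are an assumption: they are one confusion matrix consistent with the first fold's reported MCC and accuracy, and they do not appear in the log.

```python
import math
from statistics import fmean


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 when any marginal total is zero
    return numerator / denominator if denominator else 0.0


# Assumed counts for fold 1 (52 samples); inferred, not logged
fold1_mcc = mcc(tp=22, tn=24, fp=3, fn=3)
fold1_accuracy = (22 + 24) / 52

# Per-fold test_mcc values copied from the log
test_mcc = [0.76888889, 0.61538462, 0.84866842, 0.84866842, 0.77151675,
            0.88289781, 0.76733527, 0.88289781, 0.80990051, 0.69568237]
mean_mcc = fmean(test_mcc)
print(fold1_mcc, fold1_accuracy, mean_mcc)
```

The averaged value agrees with the logged mean to within the rounding of the printed per-fold values.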

key: train_mcc
value: [0.87494868 0.86615908 0.86178968 0.86190423 0.85751876 0.84920893
0.86645175 0.86645175 0.85783034 0.87499419]

mean value: 0.8637257413526429

key: test_accuracy
value: [0.88461538 0.80769231 0.92307692 0.92307692 0.88461538 0.94117647
0.88235294 0.94117647 0.90196078 0.84313725]

mean value: 0.8932880844645551

key: train_accuracy
value: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[0.93736501 0.93304536 0.93088553 0.93088553 0.9287257 0.92456897
|
|
0.93318966 0.93318966 0.92887931 0.9375 ]
|
|
|
|
mean value: 0.9318234713636703
|
|
|
|
key: test_fscore
|
|
value: [0.88 0.80769231 0.92 0.92 0.88888889 0.93877551
|
|
0.88461538 0.93877551 0.90566038 0.85185185]
|
|
|
|
mean value: 0.8936259830815086
|
|
|
|
key: train_fscore
|
|
value: [0.93736501 0.93246187 0.930131 0.93043478 0.92810458 0.92407809
|
|
0.93275488 0.93275488 0.92841649 0.93681917]
|
|
|
|
mean value: 0.931332075708447
|
|
|
|
key: test_precision
|
|
value: [0.88 0.80769231 0.95833333 0.95833333 0.85714286 0.95833333
|
|
0.85185185 0.95833333 0.85714286 0.79310345]
|
|
|
|
mean value: 0.888026665543907
|
|
|
|
key: train_precision
|
|
value: [0.92735043 0.92640693 0.92608696 0.92241379 0.92207792 0.91810345
|
|
0.92672414 0.92672414 0.92241379 0.93478261]
|
|
|
|
mean value: 0.9253084151397495
|
|
|
|
key: test_recall
|
|
value: [0.88 0.80769231 0.88461538 0.88461538 0.92307692 0.92
|
|
0.92 0.92 0.96 0.92 ]
|
|
|
|
mean value: 0.902
|
|
|
|
key: train_recall
|
|
value: [0.94759825 0.93859649 0.93421053 0.93859649 0.93421053 0.930131
|
|
0.93886463 0.93886463 0.93449782 0.93886463]
|
|
|
|
mean value: 0.9374434995786409
|
|
|
|
key: test_roc_auc
|
|
value: [0.88444444 0.80769231 0.92307692 0.92307692 0.88461538 0.94076923
|
|
0.88307692 0.94076923 0.90307692 0.84461538]
|
|
|
|
mean value: 0.8935213675213675
|
|
|
|
key: train_roc_auc
|
|
value: [0.93747434 0.93312803 0.93093505 0.93100037 0.92880739 0.92463997
|
|
0.9332621 0.9332621 0.92895104 0.93751742]
|
|
|
|
mean value: 0.9318977817951397
|
|
|
|
key: test_jcc
|
|
value: [0.78571429 0.67741935 0.85185185 0.85185185 0.8 0.88461538
|
|
0.79310345 0.88461538 0.82758621 0.74193548]
|
|
|
|
mean value: 0.809869325253085
|
|
|
|
key: train_jcc
|
|
value: [0.88211382 0.87346939 0.86938776 0.8699187 0.86585366 0.85887097
|
|
0.87398374 0.87398374 0.86639676 0.88114754]
|
|
|
|
mean value: 0.8715126071252873
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.84475875 0.97533607 0.87528491 1.0226748 0.90170574 0.9565835
|
|
0.97258997 0.88586736 1.06623507 0.90682554]
|
|
|
|
mean value: 0.9407861709594727
|
|
|
|
key: score_time
|
|
value: [0.01480126 0.01481223 0.01491356 0.01508093 0.02434945 0.01536441
|
|
0.01489663 0.01477122 0.01504946 0.0123744 ]
|
|
|
|
mean value: 0.01564135551452637
|
|
|
|
key: test_mcc
|
|
value: [0.80829038 0.65433031 0.84866842 0.88527041 0.73568294 0.88289781
|
|
0.80461538 0.92153846 0.8459178 0.65224812]
|
|
|
|
mean value: 0.803946003970833
|
|
|
|
key: train_mcc
|
|
value: [0.90510935 0.91374613 0.89664633 0.90072034 0.90072034 0.90085939
|
|
0.90549103 0.89669076 0.8968689 0.83622884]
|
|
|
|
mean value: 0.8953081428164044
|
|
|
|
key: test_accuracy
|
|
value: [0.90384615 0.82692308 0.92307692 0.94230769 0.86538462 0.94117647
|
|
0.90196078 0.96078431 0.92156863 0.82352941]
|
|
|
|
mean value: 0.9010558069381599
|
|
|
|
key: train_accuracy
|
|
value: [0.9524838 0.95680346 0.94816415 0.95032397 0.95032397 0.95043103
|
|
0.95258621 0.94827586 0.94827586 0.91810345]
|
|
|
|
mean value: 0.9475771765844939
|
|
|
|
key: test_fscore
|
|
value: [0.90196078 0.82352941 0.92 0.94117647 0.87272727 0.93877551
|
|
0.90196078 0.96 0.92307692 0.83018868]
|
|
|
|
mean value: 0.9013395836233953
|
|
|
|
key: train_fscore
|
|
value: [0.95238095 0.95652174 0.94805195 0.94989107 0.94989107 0.94989107
|
|
0.95258621 0.94805195 0.94827586 0.9173913 ]
|
|
|
|
mean value: 0.9472933163543006
|
|
|
|
key: test_precision
|
|
value: [0.88461538 0.84 0.95833333 0.96 0.82758621 0.95833333
|
|
0.88461538 0.96 0.88888889 0.78571429]
|
|
|
|
mean value: 0.8948086817397162
|
|
|
|
key: train_precision
|
|
value: [0.94420601 0.94827586 0.93589744 0.94372294 0.94372294 0.94782609
|
|
0.94042553 0.93991416 0.93617021 0.91341991]
|
|
|
|
mean value: 0.9393581102143395
|
|
|
|
key: test_recall
|
|
value: [0.92 0.80769231 0.88461538 0.92307692 0.92307692 0.92
|
|
0.92 0.96 0.96 0.88 ]
|
|
|
|
mean value: 0.9098461538461539
|
|
|
|
key: train_recall
|
|
value: [0.96069869 0.96491228 0.96052632 0.95614035 0.95614035 0.95196507
|
|
0.9650655 0.95633188 0.96069869 0.92139738]
|
|
|
|
mean value: 0.9553876503485789
|
|
|
|
key: test_roc_auc
|
|
value: [0.90444444 0.82692308 0.92307692 0.94230769 0.86538462 0.94076923
|
|
0.90230769 0.96076923 0.92230769 0.82461538]
|
|
|
|
mean value: 0.9012905982905982
|
|
|
|
key: train_roc_auc
|
|
value: [0.95257157 0.95692423 0.94834826 0.9504106 0.9504106 0.95045062
|
|
0.95274552 0.9483787 0.94843445 0.9181455 ]
|
|
|
|
mean value: 0.9476820048433202
|
|
|
|
key: test_jcc
|
|
value: [0.82142857 0.7 0.85185185 0.88888889 0.77419355 0.88461538
|
|
0.82142857 0.92307692 0.85714286 0.70967742]
|
|
|
|
mean value: 0.8232304016174984
|
|
|
|
key: train_jcc
|
|
value: [0.90909091 0.91666667 0.90123457 0.90456432 0.90456432 0.90456432
|
|
0.90946502 0.90123457 0.90163934 0.84738956]
|
|
|
|
mean value: 0.9000413580689495
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
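The `key:` / `value:` / `mean value:` blocks above have the shape of the dictionary returned by scikit-learn's `cross_validate` with several scorers and `return_train_score=True`. A minimal sketch of how such output could be produced, assuming synthetic data, a 10-fold `StratifiedKFold`, and scorer definitions chosen to match the logged key names (the original script's scorers are not shown in this log):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, matthews_corrcoef
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in data; the real dataset is not part of this log.
X, y = make_classification(n_samples=400, n_features=15, random_state=42)

# Scorer names are picked so the result keys read test_mcc, train_mcc, test_jcc, ...
# The mapping itself is an assumption, not taken from the original script.
scoring = {
    "mcc": make_scorer(matthews_corrcoef),
    "accuracy": "accuracy",
    "fscore": "f1",
    "precision": "precision",
    "recall": "recall",
    "roc_auc": "roc_auc",
    "jcc": "jaccard",
}

pipe = Pipeline(steps=[
    ("prep", MinMaxScaler()),
    ("model", LogisticRegression(max_iter=1000, random_state=42)),
])

# Ten folds, matching the ten values per key in the log.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_validate(pipe, X, y, scoring=scoring, cv=cv, return_train_score=True)

# Print each metric the same way the log does.
for key, value in scores.items():
    print("key:", key)
    print("value:", value)
    print("mean value:", value.mean())
```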
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01280427 0.01010084 0.00992489 0.00991201 0.01013994 0.01014113
|
|
0.01022482 0.01023531 0.01004434 0.01003003]
|
|
|
|
mean value: 0.010355758666992187
|
|
|
|
key: score_time
|
|
value: [0.00947094 0.00892138 0.00892901 0.0089221 0.00908446 0.00910974
|
|
0.00904751 0.00909019 0.00910616 0.00897598]
|
|
|
|
mean value: 0.009065747261047363
|
|
|
|
key: test_mcc
|
|
value: [0.54156684 0.57735027 0.74466871 0.70064905 0.66628253 0.65064936
|
|
0.68779719 0.57342193 0.72615385 0.72984534]
|
|
|
|
mean value: 0.6598385073077018
|
|
|
|
key: train_mcc
|
|
value: [0.69176702 0.6927847 0.69160663 0.66143964 0.70344863 0.70415149
|
|
0.69511551 0.67751955 0.67041841 0.69062182]
|
|
|
|
mean value: 0.6878873412216182
|
|
|
|
key: test_accuracy
|
|
value: [0.76923077 0.78846154 0.86538462 0.84615385 0.82692308 0.82352941
|
|
0.84313725 0.78431373 0.8627451 0.8627451 ]
|
|
|
|
mean value: 0.827262443438914
|
|
|
|
key: train_accuracy
|
|
value: [0.84449244 0.84449244 0.84449244 0.82937365 0.85097192 0.8512931
|
|
0.84698276 0.8362069 0.83405172 0.84482759]
|
|
|
|
mean value: 0.8427184963133983
|
|
|
|
key: test_fscore
|
|
value: [0.73913043 0.78431373 0.85106383 0.83333333 0.80851064 0.80851064
|
|
0.83333333 0.79245283 0.8627451 0.85106383]
|
|
|
|
mean value: 0.8164457691337579
|
|
|
|
key: train_fscore
|
|
value: [0.83486239 0.83255814 0.83410138 0.81755196 0.8428246 0.84353741
|
|
0.83972912 0.82242991 0.82379863 0.83783784]
|
|
|
|
mean value: 0.83292313777467
|
|
|
|
key: test_precision
|
|
value: [0.80952381 0.8 0.95238095 0.90909091 0.9047619 0.86363636
|
|
0.86956522 0.75 0.84615385 0.90909091]
|
|
|
|
mean value: 0.8614203912029998
|
|
|
|
key: train_precision
|
|
value: [0.87922705 0.88613861 0.87864078 0.86341463 0.87677725 0.87735849
|
|
0.86915888 0.88442211 0.86538462 0.86511628]
|
|
|
|
mean value: 0.8745638703109545
|
|
|
|
key: test_recall
|
|
value: [0.68 0.76923077 0.76923077 0.76923077 0.73076923 0.76
|
|
0.8 0.84 0.88 0.8 ]
|
|
|
|
mean value: 0.7798461538461539
|
|
|
|
key: train_recall
|
|
value: [0.79475983 0.78508772 0.79385965 0.77631579 0.81140351 0.81222707
|
|
0.81222707 0.76855895 0.7860262 0.81222707]
|
|
|
|
mean value: 0.7952692867540029
|
|
|
|
key: test_roc_auc
|
|
value: [0.76592593 0.78846154 0.86538462 0.84615385 0.82692308 0.82230769
|
|
0.84230769 0.78538462 0.86307692 0.86153846]
|
|
|
|
mean value: 0.8267464387464387
|
|
|
|
key: train_roc_auc
|
|
value: [0.84396111 0.84360769 0.84373834 0.82858343 0.85038261 0.85079439
|
|
0.84653907 0.83534331 0.83343863 0.84441141]
|
|
|
|
mean value: 0.8420799970776743
|
|
|
|
key: test_jcc
|
|
value: [0.5862069 0.64516129 0.74074074 0.71428571 0.67857143 0.67857143
|
|
0.71428571 0.65625 0.75862069 0.74074074]
|
|
|
|
mean value: 0.6913434643725245
|
|
|
|
key: train_jcc
|
|
value: [0.71653543 0.71314741 0.71541502 0.69140625 0.72834646 0.72941176
|
|
0.72373541 0.6984127 0.70038911 0.72093023]
|
|
|
|
mean value: 0.7137729779180588
|
|
|
|
MCC on Blind test: 0.63
|
|
|
|
Accuracy on Blind test: 0.81
|
|
|
|
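The "MCC on Blind test" and "Accuracy on Blind test" lines report scores on data held out from cross-validation. A minimal sketch of that kind of evaluation, assuming synthetic data and a 70/30 stratified split (the split used for this log is not shown):

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, matthews_corrcoef
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in data; the real training and blind sets are not in this log.
X, y = make_classification(n_samples=500, n_features=15, random_state=42)

# Hold out a stratified "blind" set that the model never sees during fitting.
X_train, X_blind, y_train, y_blind = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

model = GaussianNB().fit(X_train, y_train)
y_pred = model.predict(X_blind)

# Round to two decimals, as in the logged lines.
blind_mcc = round(float(matthews_corrcoef(y_blind, y_pred)), 2)
blind_acc = round(float(accuracy_score(y_blind, y_pred)), 2)
print("MCC on Blind test:", blind_mcc)
print("Accuracy on Blind test:", blind_acc)
```

MCC ranges from -1 to 1 and accounts for all four confusion-matrix cells, which is why it is reported alongside accuracy.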
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01047707 0.01029992 0.01025939 0.01041651 0.01023507 0.0102303
|
|
0.0112865 0.01102686 0.01036859 0.01120663]
|
|
|
|
mean value: 0.010580682754516601
|
|
|
|
key: score_time
|
|
value: [0.00895643 0.00900006 0.00895834 0.00900507 0.00899935 0.00976944
|
|
0.00984073 0.00931478 0.00945687 0.00984907]
|
|
|
|
mean value: 0.009315013885498047
|
|
|
|
key: test_mcc
|
|
value: [0.57831366 0.4233902 0.69230769 0.71151247 0.73568294 0.72573276
|
|
0.80461538 0.88307692 0.60769231 0.64715023]
|
|
|
|
mean value: 0.6809474562124411
|
|
|
|
key: train_mcc
|
|
value: [0.69835966 0.77538376 0.72791401 0.72780737 0.77538491 0.75470857
|
|
0.75496039 0.72841838 0.75426257 0.74133606]
|
|
|
|
mean value: 0.7438535657906272
|
|
|
|
key: test_accuracy
|
|
value: [0.78846154 0.71153846 0.84615385 0.84615385 0.86538462 0.8627451
|
|
0.90196078 0.94117647 0.80392157 0.82352941]
|
|
|
|
mean value: 0.8391025641025641
|
|
|
|
key: train_accuracy
|
|
value: [0.8488121 0.88768898 0.86393089 0.86393089 0.88768898 0.87715517
|
|
0.87715517 0.86422414 0.87715517 0.87068966]
|
|
|
|
mean value: 0.8718431146197959
|
|
|
|
key: test_fscore
|
|
value: [0.76595745 0.70588235 0.84615385 0.82608696 0.87272727 0.85714286
|
|
0.90196078 0.94117647 0.8 0.81632653]
|
|
|
|
mean value: 0.8333414517809609
|
|
|
|
key: train_fscore
|
|
value: [0.84304933 0.88495575 0.8627451 0.86092715 0.88646288 0.87741935
|
|
0.87794433 0.86153846 0.87527352 0.86899563]
|
|
|
|
mean value: 0.8699311510042489
|
|
|
|
key: test_precision
|
|
value: [0.81818182 0.72 0.84615385 0.95 0.82758621 0.875
|
|
0.88461538 0.92307692 0.8 0.83333333]
|
|
|
|
mean value: 0.8477947512257857
|
|
|
|
key: train_precision
|
|
value: [0.86635945 0.89285714 0.85714286 0.86666667 0.8826087 0.86440678
|
|
0.86134454 0.86725664 0.87719298 0.86899563]
|
|
|
|
mean value: 0.8704831379611647
|
|
|
|
key: test_recall
|
|
value: [0.72 0.69230769 0.84615385 0.73076923 0.92307692 0.84
|
|
0.92 0.96 0.8 0.8 ]
|
|
|
|
mean value: 0.8232307692307692
|
|
|
|
key: train_recall
|
|
value: [0.8209607 0.87719298 0.86842105 0.85526316 0.89035088 0.89082969
|
|
0.89519651 0.8558952 0.87336245 0.86899563]
|
|
|
|
mean value: 0.8696468244847928
|
|
|
|
key: test_roc_auc
|
|
value: [0.78592593 0.71153846 0.84615385 0.84615385 0.86538462 0.86230769
|
|
0.90230769 0.94153846 0.80384615 0.82307692]
|
|
|
|
mean value: 0.8388233618233618
|
|
|
|
key: train_roc_auc
|
|
value: [0.84851454 0.88753266 0.86399776 0.86380179 0.88772863 0.87732974
|
|
0.87738549 0.86411781 0.87710675 0.87066803]
|
|
|
|
mean value: 0.8718183204075174
|
|
|
|
key: test_jcc
|
|
value: [0.62068966 0.54545455 0.73333333 0.7037037 0.77419355 0.75
|
|
0.82142857 0.88888889 0.66666667 0.68965517]
|
|
|
|
mean value: 0.7194014085449013
|
|
|
|
key: train_jcc
|
|
value: [0.72868217 0.79365079 0.75862069 0.75581395 0.79607843 0.7816092
|
|
0.78244275 0.75675676 0.77821012 0.76833977]
|
|
|
|
mean value: 0.7700204624031467
|
|
|
|
MCC on Blind test: 0.76
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
                       n_estimators=1000, n_jobs=10, oob_score=True,
                       random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
              colsample_bynode=None, colsample_bytree=None,
              enable_categorical=False, gamma=None, gpu_id=None,
              importance_type=None, interaction_constraints=None,
              learning_rate=None, max_delta_step=None, max_depth=None,
              min_child_weight=None, missing=nan, monotone_constraints=None,
              n_estimators=100, n_jobs=None, num_parallel_tree=None,
              predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
              scale_pos_weight=None, subsample=None, tree_method=None,
              use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_na_affinity', 'mcsm_ppi2_affinity',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=169)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.01085258 0.01142216 0.01102972 0.010952 0.0106256 0.01055479
|
|
0.01059937 0.0103457 0.01040864 0.01033592]
|
|
|
|
mean value: 0.010712647438049316
|
|
|
|
key: score_time
|
|
value: [0.07376838 0.01394439 0.01408267 0.01353168 0.01317382 0.01507711
|
|
0.01556063 0.01267171 0.01504016 0.01285124]
|
|
|
|
mean value: 0.019970178604125977
|
|
|
|
key: test_mcc
|
|
value: [0.61551019 0.38575837 0.38575837 0.5 0.53846154 0.5372904
|
|
0.65064936 0.72573276 0.60769231 0.45474301]
|
|
|
|
mean value: 0.5401596315962462
|
|
|
|
key: train_mcc
|
|
value: [0.70231538 0.72376727 0.69801004 0.71054252 0.71511629 0.68110244
|
|
0.67701807 0.68110244 0.71576891 0.69138045]
|
|
|
|
mean value: 0.6996123825721399
|
|
|
|
key: test_accuracy
|
|
value: [0.80769231 0.69230769 0.69230769 0.73076923 0.76923077 0.76470588
|
|
0.82352941 0.8627451 0.80392157 0.7254902 ]
|
|
|
|
mean value: 0.7672699849170437
|
|
|
|
key: train_accuracy
|
|
value: [0.85097192 0.86177106 0.8488121 0.85529158 0.8574514 0.84051724
|
|
0.83836207 0.84051724 0.85775862 0.84482759]
|
|
|
|
mean value: 0.8496280814776197
|
|
|
|
key: test_fscore
|
|
value: [0.79166667 0.68 0.7037037 0.66666667 0.76923077 0.77777778
|
|
0.80851064 0.85714286 0.8 0.69565217]
|
|
|
|
mean value: 0.7550351253399358
|
|
|
|
key: train_fscore
|
|
value: [0.84632517 0.85714286 0.84304933 0.85339168 0.85267857 0.83628319
|
|
0.83296214 0.83628319 0.85333333 0.83636364]
|
|
|
|
mean value: 0.8447813087328101
|
|
|
|
key: test_precision
|
|
value: [0.82608696 0.70833333 0.67857143 0.875 0.76923077 0.72413793
|
|
0.86363636 0.875 0.8 0.76190476]
|
|
|
|
mean value: 0.7881901544232879
|
|
|
|
key: train_precision
|
|
value: [0.86363636 0.87272727 0.86238532 0.85152838 0.86818182 0.84753363
|
|
0.85 0.84753363 0.86877828 0.87203791]
|
|
|
|
mean value: 0.8604342619734768
|
|
|
|
key: test_recall
|
|
value: [0.76 0.65384615 0.73076923 0.53846154 0.76923077 0.84
|
|
0.76 0.84 0.8 0.64 ]
|
|
|
|
mean value: 0.7332307692307692
|
|
|
|
key: train_recall
|
|
value: [0.82969432 0.84210526 0.8245614 0.85526316 0.8377193 0.82532751
|
|
0.81659389 0.82532751 0.83842795 0.80349345]
|
|
|
|
mean value: 0.8298513751627978
|
|
|
|
key: test_roc_auc
|
|
value: [0.80592593 0.69230769 0.69230769 0.73076923 0.76923077 0.76615385
|
|
0.82230769 0.86230769 0.80384615 0.72384615]
|
|
|
|
mean value: 0.7669002849002848
|
|
|
|
key: train_roc_auc
|
|
value: [0.8507446 0.86147816 0.84845091 0.85529115 0.85715752 0.84032333
|
|
0.83808418 0.84032333 0.85751185 0.84429992]
|
|
|
|
mean value: 0.8493664950009298
|
|
|
|
key: test_jcc
|
|
value: [0.65517241 0.51515152 0.54285714 0.5 0.625 0.63636364
|
|
0.67857143 0.75 0.66666667 0.53333333]
|
|
|
|
mean value: 0.6103116136736826
|
|
|
|
key: train_jcc
|
|
value: [0.73359073 0.75 0.72868217 0.74427481 0.74319066 0.71863118
|
|
0.71374046 0.71863118 0.74418605 0.71875 ]
|
|
|
|
mean value: 0.7313677236713617
|
|
|
|
MCC on Blind test: 0.33
|
|
|
|
Accuracy on Blind test: 0.67
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
                       n_estimators=1000, n_jobs=10, oob_score=True,
                       random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
              colsample_bynode=None, colsample_bytree=None,
              enable_categorical=False, gamma=None, gpu_id=None,
              importance_type=None, interaction_constraints=None,
              learning_rate=None, max_delta_step=None, max_depth=None,
              min_child_weight=None, missing=nan, monotone_constraints=None,
              n_estimators=100, n_jobs=None, num_parallel_tree=None,
              predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
              scale_pos_weight=None, subsample=None, tree_method=None,
              use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_na_affinity', 'mcsm_ppi2_affinity',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=169)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02325892 0.02017713 0.02105737 0.02158213 0.02060938 0.02063775
|
|
0.0231297 0.02031779 0.02411246 0.02205038]
|
|
|
|
mean value: 0.0216933012008667
|
|
|
|
key: score_time
|
|
value: [0.0124054 0.0114367 0.01170635 0.01159406 0.01113749 0.01143193
|
|
0.01157951 0.01144934 0.01246667 0.0119288 ]
|
|
|
|
mean value: 0.011713624000549316
|
|
|
|
key: test_mcc
|
|
value: [0.80829038 0.65433031 0.84866842 0.84866842 0.80829038 0.84544958
|
|
0.76733527 0.92153846 0.76733527 0.68875274]
|
|
|
|
mean value: 0.7958659239597995
|
|
|
|
key: train_mcc
|
|
value: [0.79696947 0.81423213 0.7927817 0.7927817 0.79695053 0.79323288
|
|
0.79739862 0.78885906 0.80169098 0.81031311]
|
|
|
|
mean value: 0.7985210167824943
|
|
|
|
key: test_accuracy
|
|
value: [0.90384615 0.82692308 0.92307692 0.92307692 0.90384615 0.92156863
|
|
0.88235294 0.96078431 0.88235294 0.84313725]
|
|
|
|
mean value: 0.8970965309200604
|
|
|
|
key: train_accuracy
|
|
value: [0.89848812 0.90712743 0.89632829 0.89632829 0.89848812 0.89655172
|
|
0.8987069 0.89439655 0.90086207 0.90517241]
|
|
|
|
mean value: 0.8992449914351679
|
|
|
|
key: test_fscore
|
|
value: [0.90196078 0.83018868 0.92 0.92 0.90566038 0.91666667
|
|
0.88461538 0.96 0.88461538 0.84615385]
|
|
|
|
mean value: 0.8969861122968781
|
|
|
|
key: train_fscore
|
|
value: [0.89760349 0.9059081 0.89565217 0.89565217 0.89715536 0.8961039
|
|
0.89760349 0.89370933 0.89956332 0.90393013]
|
|
|
|
mean value: 0.8982881450268425
|
|
|
|
key: test_precision
|
|
value: [0.88461538 0.81481481 0.95833333 0.95833333 0.88888889 0.95652174
|
|
0.85185185 0.96 0.85185185 0.81481481]
|
|
|
|
mean value: 0.8940026012634709
|
|
|
|
key: train_precision
|
|
value: [0.89565217 0.90393013 0.88793103 0.88793103 0.89519651 0.88841202
|
|
0.89565217 0.88793103 0.89956332 0.90393013]
|
|
|
|
mean value: 0.894612955577799
|
|
|
|
key: test_recall
|
|
value: [0.92 0.84615385 0.88461538 0.88461538 0.92307692 0.88
|
|
0.92 0.96 0.92 0.88 ]
|
|
|
|
mean value: 0.9018461538461539
|
|
|
|
key: train_recall
|
|
value: [0.89956332 0.90789474 0.90350877 0.90350877 0.89912281 0.90393013
|
|
0.89956332 0.89956332 0.89956332 0.90393013]
|
|
|
|
mean value: 0.9020148624837202
|
|
|
|
key: test_roc_auc
|
|
value: [0.90444444 0.82692308 0.92307692 0.92307692 0.90384615 0.92076923
|
|
0.88307692 0.96076923 0.88307692 0.84384615]
|
|
|
|
mean value: 0.8972905982905983
|
|
|
|
key: train_roc_auc
|
|
value: [0.89849961 0.90713886 0.89643524 0.89643524 0.89849757 0.89664592
|
|
0.89871783 0.89446251 0.90084549 0.90515655]
|
|
|
|
mean value: 0.8992834814328039
|
|
|
|
key: test_jcc
|
|
value: [0.82142857 0.70967742 0.85185185 0.85185185 0.82758621 0.84615385
|
|
0.79310345 0.92307692 0.79310345 0.73333333]
|
|
|
|
mean value: 0.8151166900499492
|
|
|
|
key: train_jcc
|
|
value: [0.81422925 0.828 0.81102362 0.81102362 0.81349206 0.81176471
|
|
0.81422925 0.80784314 0.81746032 0.8247012 ]
|
|
|
|
mean value: 0.8153767161426962
|
|
|
|
MCC on Blind test: 0.72
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
                       n_estimators=1000, n_jobs=10, oob_score=True,
                       random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
              colsample_bynode=None, colsample_bytree=None,
              enable_categorical=False, gamma=None, gpu_id=None,
              importance_type=None, interaction_constraints=None,
              learning_rate=None, max_delta_step=None, max_depth=None,
              min_child_weight=None, missing=nan, monotone_constraints=None,
              n_estimators=100, n_jobs=None, num_parallel_tree=None,
              predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
              scale_pos_weight=None, subsample=None, tree_method=None,
              use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_na_affinity', 'mcsm_ppi2_affinity',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=169)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [2.01321244 1.85925555 2.15802002 2.02537608 2.0681262 2.02303672
|
|
2.0381639 1.67555785 1.83844972 2.44499278]
|
|
|
|
mean value: 2.01441912651062
|
|
|
|
key: score_time
|
|
value: [0.01252031 0.01444769 0.01506567 0.01446342 0.01447082 0.01439071
|
|
0.01457739 0.01256704 0.01462984 0.0286572 ]
|
|
|
|
mean value: 0.015579009056091308
|
|
|
|
key: test_mcc
|
|
value: [0.69185185 0.69436507 0.65433031 0.81312325 0.81312325 0.88307692
|
|
0.84544958 0.88289781 0.76733527 0.68875274]
|
|
|
|
mean value: 0.7734306059974984
|
|
|
|
key: train_mcc
|
|
value: [0.99568893 0.99568893 1. 0.99568893 1. 0.98714723
|
|
1. 0.97003963 1. 0.99137787]
|
|
|
|
mean value: 0.9935631530556671
|
|
|
|
key: test_accuracy
|
|
value: [0.84615385 0.84615385 0.82692308 0.90384615 0.90384615 0.94117647
|
|
0.92156863 0.94117647 0.88235294 0.84313725]
|
|
|
|
mean value: 0.8856334841628959
|
|
|
|
key: train_accuracy
|
|
value: [0.99784017 0.99784017 1. 0.99784017 1. 0.99353448
|
|
1. 0.98491379 1. 0.99568966]
|
|
|
|
mean value: 0.9967658449393014
|
|
|
|
key: test_fscore
|
|
value: [0.84 0.84 0.82352941 0.89795918 0.90909091 0.94117647
|
|
0.91666667 0.93877551 0.88461538 0.84615385]
|
|
|
|
mean value: 0.8837967382757299
|
|
|
|
key: train_fscore
|
|
value: [0.99781182 0.99781182 1. 0.99781182 1. 0.99340659
|
|
1. 0.98454746 1. 0.99563319]
|
|
|
|
mean value: 0.9967022691125853
|
|
|
|
key: test_precision
|
|
value: [0.84 0.875 0.84 0.95652174 0.86206897 0.92307692
|
|
0.95652174 0.95833333 0.85185185 0.81481481]
|
|
|
|
mean value: 0.8878189366855034
|
|
|
|
key: train_precision
|
|
value: [1. 0.99563319 1. 0.99563319 1. 1.
|
|
1. 0.99553571 1. 0.99563319]
|
|
|
|
mean value: 0.9982435277604491
|
|
|
|
key: test_recall
|
|
value: [0.84 0.80769231 0.80769231 0.84615385 0.96153846 0.96
|
|
0.88 0.92 0.92 0.88 ]
|
|
|
|
mean value: 0.8823076923076923
|
|
|
|
key: train_recall
|
|
value: [0.99563319 1. 1. 1. 1. 0.98689956
|
|
1. 0.97379913 1. 0.99563319]
|
|
|
|
mean value: 0.9951965065502183
|
|
|
|
key: test_roc_auc
|
|
value: [0.84592593 0.84615385 0.82692308 0.90384615 0.90384615 0.94153846
|
|
0.92076923 0.94076923 0.88307692 0.84384615]
|
|
|
|
mean value: 0.8856695156695157
|
|
|
|
key: train_roc_auc
|
|
value: [0.99781659 0.99787234 1. 0.99787234 1. 0.99344978
|
|
1. 0.9847719 1. 0.99568893]
|
|
|
|
mean value: 0.9967471894453219
|
|
|
|
key: test_jcc
|
|
value: [0.72413793 0.72413793 0.7 0.81481481 0.83333333 0.88888889
|
|
0.84615385 0.88461538 0.79310345 0.73333333]
|
|
|
|
mean value: 0.7942518911484429
|
|
|
|
key: train_jcc
|
|
value: [0.99563319 0.99563319 1. 0.99563319 1. 0.98689956
|
|
1. 0.96956522 1. 0.99130435]
|
|
|
|
mean value: 0.9934668691854945
|
|
|
|
MCC on Blind test: 0.75
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
                       n_estimators=1000, n_jobs=10, oob_score=True,
                       random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
              colsample_bynode=None, colsample_bytree=None,
              enable_categorical=False, gamma=None, gpu_id=None,
              importance_type=None, interaction_constraints=None,
              learning_rate=None, max_delta_step=None, max_depth=None,
              min_child_weight=None, missing=nan, monotone_constraints=None,
              n_estimators=100, n_jobs=None, num_parallel_tree=None,
              predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
              scale_pos_weight=None, subsample=None, tree_method=None,
              use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_na_affinity', 'mcsm_ppi2_affinity',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=169)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02841353 0.02183843 0.02099395 0.01935482 0.01939464 0.0214653
|
|
0.02150893 0.02194858 0.02061439 0.02167058]
|
|
|
|
mean value: 0.021720314025878908
|
|
|
|
key: score_time
|
|
value: [0.01223016 0.00936103 0.00875926 0.00877237 0.00869417 0.0087049
|
|
0.00894022 0.00874114 0.00898623 0.00875068]
|
|
|
|
mean value: 0.009194016456604004
|
|
|
|
key: test_mcc
|
|
value: [0.80829038 0.81312325 0.92307692 0.88527041 0.89056356 0.96153846
|
|
0.80431528 0.80461538 0.76461538 0.96148034]
|
|
|
|
mean value: 0.8616889371976575
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.90384615 0.90384615 0.96153846 0.94230769 0.94230769 0.98039216
|
|
0.90196078 0.90196078 0.88235294 0.98039216]
|
|
|
|
mean value: 0.9300904977375566
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.90196078 0.89795918 0.96153846 0.94339623 0.94545455 0.98039216
|
|
0.89795918 0.90196078 0.88 0.97959184]
|
|
|
|
mean value: 0.929021316297993
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.88461538 0.95652174 0.96153846 0.92592593 0.89655172 0.96153846
|
|
0.91666667 0.88461538 0.88 1. ]
|
|
|
|
mean value: 0.9267973748168651
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.92 0.84615385 0.96153846 0.96153846 1. 1.
|
|
0.88 0.92 0.88 0.96 ]
|
|
|
|
mean value: 0.932923076923077
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.90444444 0.90384615 0.96153846 0.94230769 0.94230769 0.98076923
|
|
0.90153846 0.90230769 0.88230769 0.98 ]
|
|
|
|
mean value: 0.9301367521367522
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.82142857 0.81481481 0.92592593 0.89285714 0.89655172 0.96153846
|
|
0.81481481 0.82142857 0.78571429 0.96 ]
|
|
|
|
mean value: 0.8695074312660519
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.87
|
|
|
|
Accuracy on Blind test: 0.93
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
                       n_estimators=1000, n_jobs=10, oob_score=True,
                       random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
              colsample_bynode=None, colsample_bytree=None,
              enable_categorical=False, gamma=None, gpu_id=None,
              importance_type=None, interaction_constraints=None,
              learning_rate=None, max_delta_step=None, max_depth=None,
              min_child_weight=None, missing=nan, monotone_constraints=None,
              n_estimators=100, n_jobs=None, num_parallel_tree=None,
              predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
              scale_pos_weight=None, subsample=None, tree_method=None,
              use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_na_affinity', 'mcsm_ppi2_affinity',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=169)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.11908174 0.11891198 0.12085533 0.12205338 0.12058616 0.12348747
|
|
0.12098384 0.12022972 0.11956501 0.12257099]
|
|
|
|
mean value: 0.12083256244659424
|
|
|
|
key: score_time
|
|
value: [0.01758242 0.01875472 0.01774883 0.01888084 0.01758361 0.0192914
|
|
0.01767397 0.0176754 0.01803231 0.0176661 ]
|
|
|
|
mean value: 0.01808896064758301
|
|
|
|
key: test_mcc
|
|
value: [0.7364532 0.69230769 0.88527041 0.77849894 0.88527041 0.84544958
|
|
0.80461538 0.88289781 0.72573276 0.65224812]
|
|
|
|
mean value: 0.7888744323177563
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.86538462 0.84615385 0.94230769 0.88461538 0.94230769 0.92156863
|
|
0.90196078 0.94117647 0.8627451 0.82352941]
|
|
|
|
mean value: 0.8931749622926093
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.86792453 0.84615385 0.94117647 0.875 0.94339623 0.91666667
|
|
0.90196078 0.93877551 0.85714286 0.83018868]
|
|
|
|
mean value: 0.8918385569031677
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.82142857 0.84615385 0.96 0.95454545 0.92592593 0.95652174
|
|
0.88461538 0.95833333 0.875 0.78571429]
|
|
|
|
mean value: 0.8968238540847236
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.92 0.84615385 0.92307692 0.80769231 0.96153846 0.88
|
|
0.92 0.92 0.84 0.88 ]
|
|
|
|
mean value: 0.8898461538461538
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.86740741 0.84615385 0.94230769 0.88461538 0.94230769 0.92076923
|
|
0.90230769 0.94076923 0.86230769 0.82461538]
|
|
|
|
mean value: 0.8933561253561253
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.76666667 0.73333333 0.88888889 0.77777778 0.89285714 0.84615385
|
|
0.82142857 0.88461538 0.75 0.70967742]
|
|
|
|
mean value: 0.807139903107645
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01038766 0.01013541 0.01010156 0.01030636 0.01012969 0.01005673
|
|
0.0101459 0.0112083 0.01133013 0.01134515]
|
|
|
|
mean value: 0.01051468849182129
|
|
|
|
key: score_time
|
|
value: [0.00891232 0.00864291 0.00884056 0.00908256 0.008775 0.00878358
|
|
0.0087533 0.00932074 0.00876355 0.00923038]
|
|
|
|
mean value: 0.008910489082336426
|
|
|
|
key: test_mcc
|
|
value: [0.54074074 0.27104108 0.58080232 0.40422604 0.66628253 0.33282012
|
|
0.5685677 0.64769231 0.49076923 0.61017022]
|
|
|
|
mean value: 0.5113112288991031
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.76923077 0.63461538 0.78846154 0.69230769 0.82692308 0.66666667
|
|
0.78431373 0.82352941 0.74509804 0.80392157]
|
|
|
|
mean value: 0.7535067873303167
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.76923077 0.65454545 0.7755102 0.63636364 0.84210526 0.65306122
|
|
0.7755102 0.82352941 0.74509804 0.80769231]
|
|
|
|
mean value: 0.7482646514623515
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.74074074 0.62068966 0.82608696 0.77777778 0.77419355 0.66666667
|
|
0.79166667 0.80769231 0.73076923 0.77777778]
|
|
|
|
mean value: 0.7514061328172418
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.8 0.69230769 0.73076923 0.53846154 0.92307692 0.64
|
|
0.76 0.84 0.76 0.84 ]
|
|
|
|
mean value: 0.7524615384615385
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.77037037 0.63461538 0.78846154 0.69230769 0.82692308 0.66615385
|
|
0.78384615 0.82384615 0.74538462 0.80461538]
|
|
|
|
mean value: 0.7536524216524216
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.625 0.48648649 0.63333333 0.46666667 0.72727273 0.48484848
|
|
0.63333333 0.7 0.59375 0.67741935]
|
|
|
|
mean value: 0.6028110386779741
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.57
|
|
|
|
Accuracy on Blind test: 0.79
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.78404307 1.84690738 1.85415792 1.84416199 1.86087155 1.84431767
|
|
1.88158655 1.84635663 1.76327705 1.80362248]
|
|
|
|
mean value: 1.8329302310943603
|
|
|
|
key: score_time
|
|
value: [0.09506845 0.09978175 0.10103059 0.09958267 0.0994401 0.10090494
|
|
0.10089231 0.09912086 0.09177494 0.10069799]
|
|
|
|
mean value: 0.09882946014404297
|
|
|
|
key: test_mcc
|
|
value: [0.84888889 0.84866842 0.96225045 0.92307692 0.9258201 1.
|
|
0.92427578 1. 0.88307692 0.88307692]
|
|
|
|
mean value: 0.9199134409597195
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.92307692 0.92307692 0.98076923 0.96153846 0.96153846 1.
|
|
0.96078431 1. 0.94117647 0.94117647]
|
|
|
|
mean value: 0.9593137254901961
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.92307692 0.92 0.98039216 0.96153846 0.96296296 1.
|
|
0.95833333 1. 0.94117647 0.94117647]
|
|
|
|
mean value: 0.9588656778950897
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.88888889 0.95833333 1. 0.96153846 0.92857143 1.
|
|
1. 1. 0.92307692 0.92307692]
|
|
|
|
mean value: 0.9583485958485959
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96 0.88461538 0.96153846 0.96153846 1. 1.
|
|
0.92 1. 0.96 0.96 ]
|
|
|
|
mean value: 0.9607692307692308
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.92444444 0.92307692 0.98076923 0.96153846 0.96153846 1.
|
|
0.96 1. 0.94153846 0.94153846]
|
|
|
|
mean value: 0.9594444444444444
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.85714286 0.85185185 0.96153846 0.92592593 0.92857143 1.
|
|
0.92 1. 0.88888889 0.88888889]
|
|
|
|
mean value: 0.9222808302808303
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.81
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC0...05', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.84273124 0.98533678 1.06106544 0.9535141 1.02706265 0.97413754
|
|
0.97359943 0.98847985 1.03109789 0.98040366]
|
|
|
|
mean value: 1.0817428588867188
|
|
|
|
key: score_time
|
|
value: [0.2416172 0.22115898 0.2508235 0.22502398 0.17808771 0.23984313
|
|
0.2546699 0.28216171 0.22378087 0.21789837]
|
|
|
|
mean value: 0.23350653648376465
|
|
|
|
key: test_mcc
|
|
value: [0.813662 0.76923077 0.96225045 0.92307692 0.88527041 1.
|
|
0.92427578 1. 0.88307692 0.88307692]
|
|
|
|
mean value: 0.9043920181288048
|
|
|
|
key: train_mcc
|
|
value: [0.95683011 0.95247872 0.95679358 0.96112065 0.95682367 0.94826721
|
|
0.95258977 0.9482967 0.95258977 0.95692011]
|
|
|
|
mean value: 0.9542710305666754
|
|
|
|
key: test_accuracy
|
|
value: [0.90384615 0.88461538 0.98076923 0.96153846 0.94230769 1.
|
|
0.96078431 1. 0.94117647 0.94117647]
|
|
|
|
mean value: 0.9516214177978883
|
|
|
|
key: train_accuracy
|
|
value: [0.97840173 0.9762419 0.97840173 0.98056156 0.97840173 0.97413793
|
|
0.9762931 0.97413793 0.9762931 0.97844828]
|
|
|
|
mean value: 0.9771318984136441
|
|
|
|
key: test_fscore
|
|
value: [0.90566038 0.88461538 0.98039216 0.96153846 0.94339623 1.
|
|
0.95833333 1. 0.94117647 0.94117647]
|
|
|
|
mean value: 0.951628888129998
|
|
|
|
key: train_fscore
|
|
value: [0.97807018 0.97582418 0.97807018 0.98021978 0.97797357 0.97379913
|
|
0.97603486 0.97368421 0.97603486 0.97807018]
|
|
|
|
mean value: 0.9767781104581152
|
|
|
|
key: test_precision
|
|
value: [0.85714286 0.88461538 1. 0.96153846 0.92592593 1.
|
|
1. 1. 0.92307692 0.92307692]
|
|
|
|
mean value: 0.9475376475376476
|
|
|
|
key: train_precision
|
|
value: [0.98237885 0.97797357 0.97807018 0.98237885 0.98230088 0.97379913
|
|
0.97391304 0.97797357 0.97391304 0.98237885]
|
|
|
|
mean value: 0.9785079974428954
|
|
|
|
key: test_recall
|
|
value: [0.96 0.88461538 0.96153846 0.96153846 0.96153846 1.
|
|
0.92 1. 0.96 0.96 ]
|
|
|
|
mean value: 0.9569230769230769
|
|
|
|
key: train_recall
|
|
value: [0.97379913 0.97368421 0.97807018 0.97807018 0.97368421 0.97379913
|
|
0.97816594 0.96943231 0.97816594 0.97379913]
|
|
|
|
mean value: 0.9750670343982226
|
|
|
|
key: test_roc_auc
|
|
value: [0.90592593 0.88461538 0.98076923 0.96153846 0.94230769 1.
|
|
0.96 1. 0.94153846 0.94153846]
|
|
|
|
mean value: 0.9518233618233618
|
|
|
|
key: train_roc_auc
|
|
value: [0.97835255 0.97620381 0.97839679 0.98052445 0.97833147 0.97413361
|
|
0.97631701 0.97407786 0.97631701 0.97838893]
|
|
|
|
mean value: 0.9771043482593041
|
|
|
|
key: test_jcc
|
|
value: [0.82758621 0.79310345 0.96153846 0.92592593 0.89285714 1.
|
|
0.92 1. 0.88888889 0.88888889]
|
|
|
|
mean value: 0.9098788963271722
|
|
|
|
key: train_jcc
|
|
value: [0.95708155 0.9527897 0.95708155 0.9612069 0.95689655 0.94893617
|
|
0.95319149 0.94871795 0.95319149 0.95708155]
|
|
|
|
mean value: 0.954617488069393
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02028584 0.01054978 0.01048946 0.01108623 0.01054072 0.01161528
|
|
0.0117743 0.01121378 0.01106954 0.01049757]
|
|
|
|
mean value: 0.011912250518798828
|
|
|
|
key: score_time
|
|
value: [0.01542568 0.00919604 0.00922108 0.00998044 0.00957513 0.00999999
|
|
0.00992775 0.00970435 0.00996804 0.00929213]
|
|
|
|
mean value: 0.010229063034057618
|
|
|
|
key: test_mcc
|
|
value: [0.57831366 0.4233902 0.69230769 0.71151247 0.73568294 0.72573276
|
|
0.80461538 0.88307692 0.60769231 0.64715023]
|
|
|
|
mean value: 0.6809474562124411
|
|
|
|
key: train_mcc
|
|
value: [0.69835966 0.77538376 0.72791401 0.72780737 0.77538491 0.75470857
|
|
0.75496039 0.72841838 0.75426257 0.74133606]
|
|
|
|
mean value: 0.7438535657906272
|
|
|
|
key: test_accuracy
|
|
value: [0.78846154 0.71153846 0.84615385 0.84615385 0.86538462 0.8627451
|
|
0.90196078 0.94117647 0.80392157 0.82352941]
|
|
|
|
mean value: 0.8391025641025641
|
|
|
|
key: train_accuracy
|
|
value: [0.8488121 0.88768898 0.86393089 0.86393089 0.88768898 0.87715517
|
|
0.87715517 0.86422414 0.87715517 0.87068966]
|
|
|
|
mean value: 0.8718431146197959
|
|
|
|
key: test_fscore
|
|
value: [0.76595745 0.70588235 0.84615385 0.82608696 0.87272727 0.85714286
|
|
0.90196078 0.94117647 0.8 0.81632653]
|
|
|
|
mean value: 0.8333414517809609
|
|
|
|
key: train_fscore
|
|
value: [0.84304933 0.88495575 0.8627451 0.86092715 0.88646288 0.87741935
|
|
0.87794433 0.86153846 0.87527352 0.86899563]
|
|
|
|
mean value: 0.8699311510042489
|
|
|
|
key: test_precision
|
|
value: [0.81818182 0.72 0.84615385 0.95 0.82758621 0.875
|
|
0.88461538 0.92307692 0.8 0.83333333]
|
|
|
|
mean value: 0.8477947512257857
|
|
|
|
key: train_precision
|
|
value: [0.86635945 0.89285714 0.85714286 0.86666667 0.8826087 0.86440678
|
|
0.86134454 0.86725664 0.87719298 0.86899563]
|
|
|
|
mean value: 0.8704831379611647
|
|
|
|
key: test_recall
|
|
value: [0.72 0.69230769 0.84615385 0.73076923 0.92307692 0.84
|
|
0.92 0.96 0.8 0.8 ]
|
|
|
|
mean value: 0.8232307692307692
|
|
|
|
key: train_recall
|
|
value: [0.8209607 0.87719298 0.86842105 0.85526316 0.89035088 0.89082969
|
|
0.89519651 0.8558952 0.87336245 0.86899563]
|
|
|
|
mean value: 0.8696468244847928
|
|
|
|
key: test_roc_auc
|
|
value: [0.78592593 0.71153846 0.84615385 0.84615385 0.86538462 0.86230769
|
|
0.90230769 0.94153846 0.80384615 0.82307692]
|
|
|
|
mean value: 0.8388233618233618
|
|
|
|
key: train_roc_auc
|
|
value: [0.84851454 0.88753266 0.86399776 0.86380179 0.88772863 0.87732974
|
|
0.87738549 0.86411781 0.87710675 0.87066803]
|
|
|
|
mean value: 0.8718183204075174
|
|
|
|
key: test_jcc
|
|
value: [0.62068966 0.54545455 0.73333333 0.7037037 0.77419355 0.75
|
|
0.82142857 0.88888889 0.66666667 0.68965517]
|
|
|
|
mean value: 0.7194014085449013
|
|
|
|
key: train_jcc
|
|
value: [0.72868217 0.79365079 0.75862069 0.75581395 0.79607843 0.7816092
|
|
0.78244275 0.75675676 0.77821012 0.76833977]
|
|
|
|
mean value: 0.7700204624031467
|
|
|
|
MCC on Blind test: 0.76
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC0...
interaction_constraints=None, learning_rate=None,
max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan,
monotone_constraints=None, n_estimators=100,
n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None,
reg_lambda=None, scale_pos_weight=None,
subsample=None, tree_method=None,
use_label_encoder=False,
validate_parameters=None, verbosity=0))])

key: fit_time
value: [0.15760708 0.06621742 0.07752442 0.07321143 0.0758338 0.08350158
 0.07533073 0.07601404 0.06880569 0.07532454]

mean value: 0.08293707370758056

key: score_time
value: [0.0113287 0.01082397 0.01109338 0.01086092 0.01102328 0.01106405
 0.0109849 0.01106334 0.01134181 0.01326489]

mean value: 0.011284923553466797

key: test_mcc
value: [0.89087081 0.84866842 0.96225045 0.92307692 0.9258201 1.
 0.92153846 1. 0.88307692 0.92427578]

mean value: 0.9279577865544593

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.94230769 0.92307692 0.98076923 0.96153846 0.96153846 1.
 0.96078431 1. 0.94117647 0.96078431]

mean value: 0.9631975867269985

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.94339623 0.92 0.98039216 0.96153846 0.96296296 1.
 0.96 1. 0.94117647 0.95833333]

mean value: 0.9627799611700832

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.89285714 0.95833333 1. 0.96153846 0.92857143 1.
 0.96 1. 0.92307692 1. ]

mean value: 0.9624377289377289

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 0.88461538 0.96153846 0.96153846 1. 1.
 0.96 1. 0.96 0.92 ]

mean value: 0.9647692307692308

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.94444444 0.92307692 0.98076923 0.96153846 0.96153846 1.
 0.96076923 1. 0.94153846 0.96 ]

mean value: 0.9633675213675214

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.89285714 0.85185185 0.96153846 0.92592593 0.92857143 1.
 0.92307692 1. 0.88888889 0.92 ]

mean value: 0.9292710622710623

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.86

Accuracy on Blind test: 0.93

Model_name: LDA
Model func: LinearDiscriminantAnalysis()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=169)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LinearDiscriminantAnalysis())])

key: fit_time
value: [0.05204272 0.08309412 0.08293486 0.09231877 0.06768751 0.06185079
 0.0759449 0.04235697 0.07767081 0.07520199]

mean value: 0.07111034393310547

key: score_time
value: [0.01879072 0.03160286 0.01222348 0.02677679 0.01260066 0.01890087
 0.01229119 0.01862192 0.02106357 0.01877522]

mean value: 0.019164729118347167

key: test_mcc
value: [0.77185185 0.61538462 0.84866842 0.77151675 0.84866842 0.88289781
 0.64769231 0.80904133 0.73107432 0.6610182 ]

mean value: 0.7587814027095846

key: train_mcc
value: [0.91800556 0.90108249 0.91392793 0.91358716 0.91358716 0.90520077
 0.90129433 0.9009374 0.91411317 0.91393374]

mean value: 0.9095669708390538

key: test_accuracy
value: [0.88461538 0.80769231 0.92307692 0.88461538 0.92307692 0.94117647
 0.82352941 0.90196078 0.8627451 0.82352941]

mean value: 0.8776018099547511

key: train_accuracy
value: [0.95896328 0.95032397 0.95680346 0.95680346 0.95680346 0.95258621
 0.95043103 0.95043103 0.95689655 0.95689655]

mean value: 0.954693900350041

key: test_fscore
value: [0.88461538 0.80769231 0.92 0.88 0.92592593 0.93877551
 0.82352941 0.89361702 0.86792453 0.83636364]

mean value: 0.8778443726144525

key: train_fscore
value: [0.95878525 0.95032397 0.95670996 0.95614035 0.95614035 0.95217391
 0.95053763 0.95010846 0.95689655 0.95670996]

mean value: 0.954452639776014

key: test_precision
value: [0.85185185 0.80769231 0.95833333 0.91666667 0.89285714 0.95833333
 0.80769231 0.95454545 0.82142857 0.76666667]

mean value: 0.8736067636067636

key: train_precision
value: [0.95258621 0.93617021 0.94444444 0.95614035 0.95614035 0.94805195
 0.93644068 0.94396552 0.94468085 0.94849785]

mean value: 0.9467118414261851

key: test_recall
value: [0.92 0.80769231 0.88461538 0.84615385 0.96153846 0.92
 0.84 0.84 0.92 0.92 ]

mean value: 0.886

key: train_recall
value: [0.9650655 0.96491228 0.96929825 0.95614035 0.95614035 0.95633188
 0.9650655 0.95633188 0.96943231 0.9650655 ]

mean value: 0.962378380448939

key: test_roc_auc
value: [0.88592593 0.80769231 0.92307692 0.88461538 0.92307692 0.94076923
 0.82384615 0.90076923 0.86384615 0.82538462]

mean value: 0.8779002849002849

key: train_roc_auc
value: [0.95902848 0.95054125 0.95698955 0.95679358 0.95679358 0.95263402
 0.95061786 0.95050636 0.95705658 0.95700084]

mean value: 0.9547962096825529

key: test_jcc
value: [0.79310345 0.67741935 0.85185185 0.78571429 0.86206897 0.88461538
 0.7 0.80769231 0.76666667 0.71875 ]

mean value: 0.784788226517231

key: train_jcc
value: [0.92083333 0.90534979 0.91701245 0.91596639 0.91596639 0.90871369
 0.9057377 0.90495868 0.91735537 0.91701245]

mean value: 0.9128906244397688

MCC on Blind test: 0.77

Accuracy on Blind test: 0.88

Model_name: Multinomial
Model func: MultinomialNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=169)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MultinomialNB())])

key: fit_time
value: [0.01436949 0.01111484 0.01016617 0.01088428 0.00985456 0.00989771
 0.01070356 0.01117134 0.01106 0.01109123]

mean value: 0.011031317710876464

key: score_time
value: [0.01223636 0.00920248 0.00911593 0.00875282 0.00875449 0.00877905
 0.00953007 0.00965476 0.00958037 0.00875211]

mean value: 0.00943584442138672

key: test_mcc
value: [0.65330526 0.54006172 0.84866842 0.70064905 0.80829038 0.72573276
 0.72573276 0.72615385 0.68875274 0.608971 ]

mean value: 0.7026317935253169

key: train_mcc
value: [0.67238923 0.72423761 0.71496629 0.70646532 0.72361387 0.74138866
 0.69411122 0.70713779 0.70314599 0.71561406]

mean value: 0.7103070036060428

key: test_accuracy
value: [0.82692308 0.76923077 0.92307692 0.84615385 0.90384615 0.8627451
 0.8627451 0.8627451 0.84313725 0.80392157]

mean value: 0.8504524886877828

key: train_accuracy
value: [0.83585313 0.86177106 0.8574514 0.85313175 0.86177106 0.87068966
 0.84698276 0.85344828 0.8512931 0.85775862]

mean value: 0.8550150815520965

key: test_fscore
value: [0.81632653 0.76 0.92 0.83333333 0.90196078 0.85714286
 0.85714286 0.8627451 0.84615385 0.79166667]

mean value: 0.8446471973404747

key: train_fscore
value: [0.82959641 0.85585586 0.85333333 0.84821429 0.85777778 0.86784141
 0.84257206 0.84888889 0.84563758 0.8539823 ]

mean value: 0.8503699910679656

key: test_precision
value: [0.83333333 0.79166667 0.95833333 0.90909091 0.92 0.875
 0.875 0.84615385 0.81481481 0.82608696]

mean value: 0.8649479859914643

key: train_precision
value: [0.85253456 0.87962963 0.86486486 0.86363636 0.86936937 0.87555556
 0.85585586 0.86425339 0.86697248 0.86547085]

mean value: 0.8658142923870936

key: test_recall
value: [0.8 0.73076923 0.88461538 0.76923077 0.88461538 0.84
 0.84 0.88 0.88 0.76 ]

mean value: 0.8269230769230769

key: train_recall
value: [0.80786026 0.83333333 0.84210526 0.83333333 0.84649123 0.86026201
 0.82969432 0.83406114 0.82532751 0.84279476]

mean value: 0.8355263157894737

key: test_roc_auc
value: [0.82592593 0.76923077 0.92307692 0.84615385 0.90384615 0.86230769
 0.86230769 0.86307692 0.84384615 0.80307692]

mean value: 0.8502849002849002

key: train_roc_auc
value: [0.83555406 0.86134752 0.85722284 0.85283688 0.86154349 0.87055654
 0.84676206 0.85320078 0.85096163 0.85756759]

mean value: 0.8547553382911727

key: test_jcc
value: [0.68965517 0.61290323 0.85185185 0.71428571 0.82142857 0.75
 0.75 0.75862069 0.73333333 0.65517241]

mean value: 0.7337250972567991

key: train_jcc
value: [0.70881226 0.7480315 0.74418605 0.73643411 0.75097276 0.76653696
 0.72796935 0.73745174 0.73255814 0.74517375]

mean value: 0.7398126610083979

MCC on Blind test: 0.76

Accuracy on Blind test: 0.88

Model_name: Passive Aggresive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=169)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01407576 0.01649785 0.02283406 0.02057886 0.01880026 0.02140307
 0.02072906 0.02038002 0.02303767 0.01901793]

mean value: 0.01973545551300049

key: score_time
value: [0.01000118 0.01126051 0.01198721 0.01775503 0.01397157 0.01187444
 0.01192546 0.01181602 0.01192856 0.0117836 ]

mean value: 0.012430357933044433

key: test_mcc
value: [0.67524617 0.65433031 0.81312325 0.88527041 0.73131034 0.76662339
 0.76733527 0.73878883 0.74071542 0.65224812]

mean value: 0.7424991517164395

key: train_mcc
value: [0.76102063 0.87912177 0.90936066 0.85585682 0.85829967 0.89258812
 0.86828293 0.81782174 0.8647866 0.8793363 ]

mean value: 0.8586475244728068

key: test_accuracy
value: [0.82692308 0.82692308 0.90384615 0.94230769 0.86538462 0.88235294
 0.88235294 0.8627451 0.8627451 0.82352941]

mean value: 0.8679110105580694

key: train_accuracy
value: [0.86825054 0.93952484 0.95464363 0.92656587 0.92656587 0.94612069
 0.93318966 0.90517241 0.93103448 0.93965517]

mean value: 0.9270723169732629

key: test_fscore
value: [0.79069767 0.82352941 0.89795918 0.94339623 0.8627451 0.875
 0.88461538 0.84444444 0.87272727 0.83018868]

mean value: 0.8625303375343475

key: train_fscore
value: [0.84711779 0.9380531 0.95424837 0.92827004 0.92093023 0.94456763
 0.93446089 0.89671362 0.93277311 0.93913043]

mean value: 0.923626520709015

key: test_precision
value: [0.94444444 0.84 0.95652174 0.92592593 0.88 0.91304348
 0.85185185 0.95 0.8 0.78571429]

mean value: 0.8847501725327812

key: train_precision
value: [0.99411765 0.94642857 0.94805195 0.89430894 0.98019802 0.95945946
 0.9057377 0.96954315 0.89878543 0.93506494]

mean value: 0.9431695801182518

key: test_recall
value: [0.68 0.80769231 0.84615385 0.96153846 0.84615385 0.84
 0.92 0.76 0.96 0.88 ]

mean value: 0.8501538461538461

key: train_recall
value: [0.73799127 0.92982456 0.96052632 0.96491228 0.86842105 0.930131
 0.9650655 0.83406114 0.96943231 0.94323144]

mean value: 0.9103596874281774

key: test_roc_auc
value: [0.82148148 0.82692308 0.90384615 0.94230769 0.86538462 0.88153846
 0.88307692 0.86076923 0.86461538 0.82461538]

mean value: 0.8674558404558405

key: train_roc_auc
value: [0.86685888 0.93938037 0.95473124 0.92713699 0.92569989 0.94591657
 0.93359658 0.90426461 0.93152467 0.93970083]

mean value: 0.9268810621174348

key: test_jcc
value: [0.65384615 0.7 0.81481481 0.89285714 0.75862069 0.77777778
 0.79310345 0.73076923 0.77419355 0.70967742]

mean value: 0.760566022573809

key: train_jcc
value: [0.73478261 0.88333333 0.9125 0.86614173 0.85344828 0.89495798
 0.87698413 0.81276596 0.87401575 0.8852459 ]

mean value: 0.8594175667469572

MCC on Blind test: 0.77

Accuracy on Blind test: 0.88

Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=169)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01722074 0.02114892 0.02169371 0.01978874 0.02115488 0.02141595
 0.01782823 0.02157593 0.02269053 0.02523351]

mean value: 0.020975112915039062

key: score_time
value: [0.01061797 0.01195741 0.01185656 0.01192594 0.01196694 0.01201153
 0.01211405 0.01274276 0.01431131 0.01298666]

mean value: 0.012249112129211426

key: test_mcc
value: [0.81203628 0.66628253 0.72760688 0.88527041 0.72760688 0.84544958
 0.84307692 0.84307692 0.88289781 0.77487835]

mean value: 0.8008182566592692

key: train_mcc
value: [0.84181709 0.87136001 0.8004481 0.90499207 0.71142522 0.89664473
 0.87722401 0.89655622 0.90180046 0.87805565]

mean value: 0.8580323560247036

key: test_accuracy
value: [0.90384615 0.82692308 0.84615385 0.94230769 0.84615385 0.92156863
 0.92156863 0.92156863 0.94117647 0.88235294]

mean value: 0.8953619909502262

key: train_accuracy
value: [0.91792657 0.93304536 0.89416847 0.9524838 0.83801296 0.94827586
 0.9375 0.94827586 0.95043103 0.9375 ]

mean value: 0.9257619907648768

key: test_fscore
value: [0.89361702 0.84210526 0.81818182 0.94117647 0.86666667 0.91666667
 0.92 0.92 0.93877551 0.88888889]

mean value: 0.8946078305630848

key: train_fscore
value: [0.91162791 0.93555094 0.88192771 0.95196507 0.85768501 0.94713656
 0.93424036 0.94736842 0.94854586 0.93920335]

mean value: 0.9255251191697211

key: test_precision
value: [0.95454545 0.77419355 1. 0.96 0.76470588 0.95652174
 0.92 0.92 0.95833333 0.82758621]

mean value: 0.9035886164645812

key: train_precision
value: [0.97512438 0.88932806 0.97860963 0.94782609 0.75585284 0.95555556
 0.97169811 0.95154185 0.97247706 0.90322581]

mean value: 0.9301239386440059

key: test_recall
value: [0.84 0.92307692 0.69230769 0.92307692 1. 0.88
 0.92 0.92 0.92 0.96 ]

mean value: 0.8978461538461538

key: train_recall
value: [0.8558952 0.98684211 0.80263158 0.95614035 0.99122807 0.93886463
 0.89956332 0.94323144 0.92576419 0.97816594]

mean value: 0.9278326821420363

key: test_roc_auc
value: [0.90148148 0.82692308 0.84615385 0.94230769 0.84615385 0.92076923
 0.92153846 0.92153846 0.94076923 0.88384615]

mean value: 0.8951481481481481

key: train_roc_auc
value: [0.91726384 0.93384658 0.89280515 0.95253826 0.84029489 0.94815572
 0.9370157 0.94821147 0.95011614 0.93801914]

mean value: 0.9258266884068974

key: test_jcc
value: [0.80769231 0.72727273 0.69230769 0.88888889 0.76470588 0.84615385
 0.85185185 0.85185185 0.88461538 0.8 ]

mean value: 0.8115340432987492

key: train_jcc
value: [0.83760684 0.87890625 0.7887931 0.90833333 0.75083056 0.89958159
 0.87659574 0.9 0.90212766 0.88537549]

mean value: 0.8628150577457124

MCC on Blind test: 0.81

Accuracy on Blind test: 0.9

Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=169)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', AdaBoostClassifier(random_state=42))])

key: fit_time
value: [0.1831696 0.17794561 0.17552376 0.17649794 0.17604017 0.18365979
 0.17523932 0.17550063 0.17589378 0.1765306 ]

mean value: 0.17760012149810792

key: score_time
value: [0.01525378 0.01530957 0.01576948 0.0153296 0.01532698 0.01565194
 0.01529312 0.01551795 0.01531744 0.01527667]

mean value: 0.015404653549194337

key: test_mcc
value: [0.92592593 0.92307692 0.9258201 0.92307692 0.9258201 1.
 0.88289781 0.96153846 0.84307692 0.92427578]

mean value: 0.9235508946829519

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.96153846 0.96153846 0.96153846 0.96153846 0.96153846 1.
 0.94117647 0.98039216 0.92156863 0.96078431]

mean value: 0.9611613876319759

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.96153846 0.96153846 0.96 0.96153846 0.96296296 1.
 0.93877551 0.98039216 0.92 0.95833333]

mean value: 0.9605079347978508

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.92592593 0.96153846 1. 0.96153846 0.92857143 1.
 0.95833333 0.96153846 0.92 1. ]

mean value: 0.9617446072446073

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 0.96153846 0.92307692 0.96153846 1. 1.
 0.92 1. 0.92 0.92 ]

mean value: 0.9606153846153846

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.96296296 0.96153846 0.96153846 0.96153846 0.96153846 1.
 0.94076923 0.98076923 0.92153846 0.96 ]

mean value: 0.9612193732193732

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.92592593 0.92592593 0.92307692 0.92592593 0.92857143 1.
 0.88461538 0.96153846 0.85185185 0.92 ]

mean value: 0.9247431827431828

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.95

Accuracy on Blind test: 0.98

Model_name: Bagging Classifier
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.06544733 0.06211853 0.06923485 0.07710958 0.07985926 0.07961392
|
|
0.05945921 0.07828259 0.06221557 0.06299257]
|
|
|
|
mean value: 0.0696333408355713
|
|
|
|
key: score_time
|
|
value: [0.02117729 0.03006291 0.02708936 0.02777982 0.02915573 0.03034329
|
|
0.02469993 0.02470422 0.03304958 0.02316904]
|
|
|
|
mean value: 0.027123117446899415
|
|
|
|
key: test_mcc
|
|
value: [0.89087081 0.92307692 0.96225045 0.92307692 0.9258201 1.
|
|
0.80431528 0.96153846 0.88307692 0.92153846]
|
|
|
|
mean value: 0.9195564331630951
|
|
|
|
key: train_mcc
|
|
value: [0.98275766 0.98275637 0.99135872 0.99568837 0.99135872 0.98275574
|
|
0.9870767 0.99569843 0.98290567 0.9870767 ]
|
|
|
|
mean value: 0.9879433054726061
|
|
|
|
key: test_accuracy
|
|
value: [0.94230769 0.96153846 0.98076923 0.96153846 0.96153846 1.
|
|
0.90196078 0.98039216 0.94117647 0.96078431]
|
|
|
|
mean value: 0.9592006033182504
|
|
|
|
key: train_accuracy
|
|
value: [0.99136069 0.99136069 0.99568035 0.99784017 0.99568035 0.99137931
|
|
0.99353448 0.99784483 0.99137931 0.99353448]
|
|
|
|
mean value: 0.9939594660013406
|
|
|
|
key: test_fscore
|
|
value: [0.94339623 0.96153846 0.98039216 0.96153846 0.96296296 1.
|
|
0.89795918 0.98039216 0.94117647 0.96 ]
|
|
|
|
mean value: 0.9589356080442175
|
|
|
|
key: train_fscore
|
|
value: [0.99130435 0.99126638 0.99561404 0.9978022 0.99561404 0.99126638
|
|
0.99346405 0.99782135 0.99134199 0.99346405]
|
|
|
|
mean value: 0.9938958813575108
|
|
|
|
key: test_precision
|
|
value: [0.89285714 0.96153846 1. 0.96153846 0.92857143 1.
|
|
0.91666667 0.96153846 0.92307692 0.96 ]
|
|
|
|
mean value: 0.9505787545787546
|
|
|
|
key: train_precision
|
|
value: [0.98701299 0.98695652 0.99561404 1. 0.99561404 0.99126638
|
|
0.99130435 0.99565217 0.98283262 0.99130435]
|
|
|
|
mean value: 0.9917557442064376
|
|
|
|
key: test_recall
|
|
value: [1. 0.96153846 0.96153846 0.96153846 1. 1.
|
|
0.88 1. 0.96 0.96 ]
|
|
|
|
mean value: 0.9684615384615385
|
|
|
|
key: train_recall
|
|
value: [0.99563319 0.99561404 0.99561404 0.99561404 0.99561404 0.99126638
|
|
0.99563319 1. 1. 0.99563319]
|
|
|
|
mean value: 0.9960622079215506
|
|
|
|
key: test_roc_auc
|
|
value: [0.94444444 0.96153846 0.98076923 0.96153846 0.96153846 1.
|
|
0.90153846 0.98076923 0.94153846 0.96076923]
|
|
|
|
mean value: 0.9594444444444444
|
|
|
|
key: train_roc_auc
|
|
value: [0.99140634 0.99142404 0.99567936 0.99780702 0.99567936 0.99137787
|
|
0.99356127 0.99787234 0.99148936 0.99356127]
|
|
|
|
mean value: 0.9939858230006007
|
|
|
|
key: test_jcc
|
|
value: [0.89285714 0.92592593 0.96153846 0.92592593 0.92857143 1.
|
|
0.81481481 0.96153846 0.88888889 0.92307692]
|
|
|
|
mean value: 0.9223137973137974
|
|
|
|
key: train_jcc
|
|
value: [0.98275862 0.98268398 0.99126638 0.99561404 0.99126638 0.98268398
|
|
0.98701299 0.99565217 0.98283262 0.98701299]
|
|
|
|
mean value: 0.9878784138201812
|
|
|
|
MCC on Blind test: 0.9
|
|
|
|
Accuracy on Blind test: 0.95
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.09743285 0.13634515 0.15849376 0.17319727 0.15302992 0.16544819
|
|
0.15362334 0.14878941 0.14881253 0.15468121]
|
|
|
|
mean value: 0.14898536205291749
|
|
|
|
key: score_time
|
|
value: [0.0148344 0.01481652 0.02447915 0.02398372 0.02386975 0.02401042
|
|
0.02402806 0.02415848 0.02408814 0.024122 ]
|
|
|
|
mean value: 0.022239065170288085
|
|
|
|
key: test_mcc
|
|
value: [0.65330526 0.53846154 0.6172134 0.466924 0.62279916 0.68875274
|
|
0.68615385 0.84307692 0.68615385 0.61017022]
|
|
|
|
mean value: 0.6413010922501776
|
|
|
|
key: train_mcc
|
|
value: [0.98712064 0.99568837 0.98711849 0.98711849 0.98711849 0.98714723
|
|
0.99141377 0.98714723 0.98714723 0.98714723]
|
|
|
|
mean value: 0.9884167178261011
|
|
|
|
key: test_accuracy
|
|
value: [0.82692308 0.76923077 0.80769231 0.71153846 0.80769231 0.84313725
|
|
0.84313725 0.92156863 0.84313725 0.80392157]
|
|
|
|
mean value: 0.8177978883861237
|
|
|
|
key: train_accuracy
|
|
value: [0.99352052 0.99784017 0.99352052 0.99352052 0.99352052 0.99353448
|
|
0.99568966 0.99353448 0.99353448 0.99353448]
|
|
|
|
mean value: 0.9941749832427199
|
|
|
|
key: test_fscore
|
|
value: [0.81632653 0.76923077 0.8 0.63414634 0.82142857 0.84615385
|
|
0.84 0.92 0.84 0.80769231]
|
|
|
|
mean value: 0.8094978366581154
|
|
|
|
key: train_fscore
|
|
value: [0.99340659 0.9978022 0.99337748 0.99337748 0.99337748 0.99340659
|
|
0.99561404 0.99340659 0.99340659 0.99340659]
|
|
|
|
mean value: 0.994058165025401
|
|
|
|
key: test_precision
|
|
value: [0.83333333 0.76923077 0.83333333 0.86666667 0.76666667 0.81481481
|
|
0.84 0.92 0.84 0.77777778]
|
|
|
|
mean value: 0.8261823361823362
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.8 0.76923077 0.76923077 0.5 0.88461538 0.88
|
|
0.84 0.92 0.84 0.84 ]
|
|
|
|
mean value: 0.8043076923076923
|
|
|
|
key: train_recall
|
|
value: [0.98689956 0.99561404 0.98684211 0.98684211 0.98684211 0.98689956
|
|
0.99126638 0.98689956 0.98689956 0.98689956]
|
|
|
|
mean value: 0.988190454301693
|
|
|
|
key: test_roc_auc
|
|
value: [0.82592593 0.76923077 0.80769231 0.71153846 0.80769231 0.84384615
|
|
0.84307692 0.92153846 0.84307692 0.80461538]
|
|
|
|
mean value: 0.8178233618233618
|
|
|
|
key: train_roc_auc
|
|
value: [0.99344978 0.99780702 0.99342105 0.99342105 0.99342105 0.99344978
|
|
0.99563319 0.99344978 0.99344978 0.99344978]
|
|
|
|
mean value: 0.9940952271508465
|
|
|
|
key: test_jcc
|
|
value: [0.68965517 0.625 0.66666667 0.46428571 0.6969697 0.73333333
|
|
0.72413793 0.85185185 0.72413793 0.67741935]
|
|
|
|
mean value: 0.6853457652428732
|
|
|
|
key: train_jcc
|
|
value: [0.98689956 0.99561404 0.98684211 0.98684211 0.98684211 0.98689956
|
|
0.99126638 0.98689956 0.98689956 0.98689956]
|
|
|
|
mean value: 0.988190454301693
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.79
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.71794391 0.7203269 0.72270894 0.72166705 0.71613026 0.71105385
|
|
0.71652889 0.72433043 0.72593617 0.72219372]
|
|
|
|
mean value: 0.7198820114135742
|
|
|
|
key: score_time
|
|
value: [0.00964975 0.00945163 0.00943828 0.00946069 0.00968385 0.00951886
|
|
0.00939965 0.00939608 0.00974894 0.00926399]
|
|
|
|
mean value: 0.009501171112060548
|
|
|
|
key: test_mcc
|
|
value: [0.89087081 0.88527041 0.96225045 0.92307692 0.9258201 1.
|
|
0.92153846 0.96153846 0.8459178 0.92153846]
|
|
|
|
mean value: 0.9237821874489884
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.94230769 0.94230769 0.98076923 0.96153846 0.96153846 1.
|
|
0.96078431 0.98039216 0.92156863 0.96078431]
|
|
|
|
mean value: 0.9611990950226245
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94339623 0.94117647 0.98039216 0.96153846 0.96296296 1.
|
|
0.96 0.98039216 0.92307692 0.96 ]
|
|
|
|
mean value: 0.9612935358307167
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.89285714 0.96 1. 0.96153846 0.92857143 1.
|
|
0.96 0.96153846 0.88888889 0.96 ]
|
|
|
|
mean value: 0.9513394383394383
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.92307692 0.96153846 0.96153846 1. 1.
|
|
0.96 1. 0.96 0.96 ]
|
|
|
|
mean value: 0.9726153846153847
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.94444444 0.94230769 0.98076923 0.96153846 0.96153846 1.
|
|
0.96076923 0.98076923 0.92230769 0.96076923]
|
|
|
|
mean value: 0.9615213675213675
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.89285714 0.88888889 0.96153846 0.92592593 0.92857143 1.
|
|
0.92307692 0.96153846 0.85714286 0.92307692]
|
|
|
|
mean value: 0.9262617012617013
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.86
|
|
|
|
Accuracy on Blind test: 0.93
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.03045988 0.03296375 0.04668927 0.0325253 0.03056479 0.03066826
|
|
0.03425074 0.03044724 0.03053427 0.03095007]
|
|
|
|
mean value: 0.033005356788635254
|
|
|
|
key: score_time
|
|
value: [0.01308393 0.01344943 0.01736689 0.01323104 0.01809931 0.01370311
|
|
0.01521468 0.01508665 0.01611853 0.01422572]
|
|
|
|
mean value: 0.014957928657531738
|
|
|
|
key: test_mcc
|
|
value: [0.4637037 0.32338083 0.09128709 0.28697202 0.50037023 0.25161197
|
|
0.42192651 0.43108293 0.54660922 0.31510143]
|
|
|
|
mean value: 0.3632045934004423
|
|
|
|
key: train_mcc
|
|
value: [0.87029251 0.82908577 0.52028331 0.6055719 0.97411589 0.83676363
|
|
0.92126558 0.96566269 0.94140567 0.59943068]
|
|
|
|
mean value: 0.8063877621767472
|
|
|
|
key: test_accuracy
|
|
value: [0.73076923 0.65384615 0.53846154 0.63461538 0.75 0.60784314
|
|
0.70588235 0.70588235 0.76470588 0.60784314]
|
|
|
|
mean value: 0.6699849170437405
|
|
|
|
key: train_accuracy
|
|
value: [0.93088553 0.90712743 0.71058315 0.76673866 0.98704104 0.91163793
|
|
0.95905172 0.98275862 0.96982759 0.76293103]
|
|
|
|
mean value: 0.8888582706486929
|
|
|
|
key: test_fscore
|
|
value: [0.73076923 0.7 0.63636364 0.68852459 0.75471698 0.67741935
|
|
0.72727273 0.73684211 0.78571429 0.70588235]
|
|
|
|
mean value: 0.7143505264458934
|
|
|
|
key: train_fscore
|
|
value: [0.93469388 0.91382766 0.77288136 0.80851064 0.98689956 0.91783567
|
|
0.96016771 0.98268398 0.97033898 0.80633803]
|
|
|
|
mean value: 0.905417747054172
|
|
|
|
key: test_precision
|
|
value: [0.7037037 0.61764706 0.525 0.6 0.74074074 0.56756757
|
|
0.66666667 0.65625 0.70967742 0.55813953]
|
|
|
|
mean value: 0.6345392691740768
|
|
|
|
key: train_precision
|
|
value: [0.87739464 0.84132841 0.62983425 0.67857143 0.9826087 0.84814815
|
|
0.9233871 0.97424893 0.94238683 0.67551622]
|
|
|
|
mean value: 0.8373424655092186
|
|
|
|
key: test_recall
|
|
value: [0.76 0.80769231 0.80769231 0.80769231 0.76923077 0.84
|
|
0.8 0.84 0.88 0.96 ]
|
|
|
|
mean value: 0.8272307692307692
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 0.99122807 1.
|
|
1. 0.99126638 1. 1. ]
|
|
|
|
mean value: 0.998249444572129
|
|
|
|
key: test_roc_auc
|
|
value: [0.73185185 0.65384615 0.53846154 0.63461538 0.75 0.61230769
|
|
0.70769231 0.70846154 0.76692308 0.61461538]
|
|
|
|
mean value: 0.6718774928774929
|
|
|
|
key: train_roc_auc
|
|
value: [0.93162393 0.90851064 0.71489362 0.77021277 0.9871034 0.91276596
|
|
0.95957447 0.98286723 0.97021277 0.76595745]
|
|
|
|
mean value: 0.8903722218314364
|
|
|
|
key: test_jcc
|
|
value: [0.57575758 0.53846154 0.46666667 0.525 0.60606061 0.51219512
|
|
0.57142857 0.58333333 0.64705882 0.54545455]
|
|
|
|
mean value: 0.5571416782643468
|
|
|
|
key: train_jcc
|
|
value: [0.87739464 0.84132841 0.62983425 0.67857143 0.97413793 0.84814815
|
|
0.9233871 0.96595745 0.94238683 0.67551622]
|
|
|
|
mean value: 0.8356662410244379
|
|
|
|
MCC on Blind test: 0.45
|
|
|
|
Accuracy on Blind test: 0.69
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02886796 0.03506136 0.04856014 0.03749013 0.03853464 0.03879905
|
|
0.03333855 0.03817081 0.0379436 0.03178 ]
|
|
|
|
mean value: 0.03685462474822998
|
|
|
|
key: score_time
|
|
value: [0.0193758 0.01948166 0.02359152 0.02530694 0.02442861 0.02386785
|
|
0.01905274 0.02539372 0.02083659 0.01895976]
|
|
|
|
mean value: 0.022029519081115723
|
|
|
|
key: test_mcc
|
|
value: [0.80829038 0.65433031 0.84866842 0.88527041 0.80829038 0.88289781
|
|
0.76733527 0.88289781 0.80990051 0.73107432]
|
|
|
|
mean value: 0.8078955626786625
|
|
|
|
key: train_mcc
|
|
value: [0.86208312 0.86630587 0.85815088 0.84902492 0.84879533 0.85375825
|
|
0.85375825 0.86645175 0.85797371 0.84920893]
|
|
|
|
mean value: 0.856551100724223
|
|
|
|
key: test_accuracy
|
|
value: [0.90384615 0.82692308 0.92307692 0.94230769 0.90384615 0.94117647
|
|
0.88235294 0.94117647 0.90196078 0.8627451 ]
|
|
|
|
mean value: 0.9029411764705882
|
|
|
|
key: train_accuracy
|
|
value: [0.93088553 0.93304536 0.9287257 0.92440605 0.92440605 0.92672414
|
|
0.92672414 0.93318966 0.92887931 0.92456897]
|
|
|
|
mean value: 0.9281554889401952
|
|
|
|
key: test_fscore
|
|
value: [0.90196078 0.83018868 0.92 0.94117647 0.90566038 0.93877551
|
|
0.88461538 0.93877551 0.90566038 0.86792453]
|
|
|
|
mean value: 0.9034737622189659
|
|
|
|
key: train_fscore
|
|
value: [0.93103448 0.93275488 0.92903226 0.92407809 0.92341357 0.92672414
|
|
0.92672414 0.93275488 0.9287257 0.92407809]
|
|
|
|
mean value: 0.9279320228969524
|
|
|
|
key: test_precision
|
|
value: [0.88461538 0.81481481 0.95833333 0.96 0.88888889 0.95833333
|
|
0.85185185 0.95833333 0.85714286 0.82142857]
|
|
|
|
mean value: 0.8953742368742369
|
|
|
|
key: train_precision
|
|
value: [0.91914894 0.92274678 0.91139241 0.91416309 0.92139738 0.91489362
|
|
0.91489362 0.92672414 0.91880342 0.91810345]
|
|
|
|
mean value: 0.9182266831443672
|
|
|
|
key: test_recall
|
|
value: [0.92 0.84615385 0.88461538 0.92307692 0.92307692 0.92
|
|
0.92 0.92 0.96 0.92 ]
|
|
|
|
mean value: 0.9136923076923077
|
|
|
|
key: train_recall
|
|
value: [0.94323144 0.94298246 0.94736842 0.93421053 0.9254386 0.93886463
|
|
0.93886463 0.93886463 0.93886463 0.930131 ]
|
|
|
|
mean value: 0.9378820960698689
|
|
|
|
key: test_roc_auc
|
|
value: [0.90444444 0.82692308 0.92307692 0.94230769 0.90384615 0.94076923
|
|
0.88307692 0.94076923 0.90307692 0.86384615]
|
|
|
|
mean value: 0.9032136752136752
|
|
|
|
key: train_roc_auc
|
|
value: [0.93101743 0.93319336 0.92900336 0.92455207 0.92442143 0.92687912
|
|
0.92687912 0.9332621 0.92900678 0.92463997]
|
|
|
|
mean value: 0.9282854742942543
|
|
|
|
key: test_jcc
|
|
value: [0.82142857 0.70967742 0.85185185 0.88888889 0.82758621 0.88461538
|
|
0.79310345 0.88461538 0.82758621 0.76666667]
|
|
|
|
mean value: 0.8256020029490552
|
|
|
|
key: train_jcc
|
|
value: [0.87096774 0.87398374 0.86746988 0.85887097 0.85772358 0.86345382
|
|
0.86345382 0.87398374 0.86693548 0.85887097]
|
|
|
|
mean value: 0.8655713728241052
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.19555092 0.28002501 0.27410674 0.29807663 0.32655668 0.30876517
|
|
0.29639506 0.28637075 0.28587222 0.27678657]
|
|
|
|
mean value: 0.28285057544708253
|
|
|
|
key: score_time
|
|
value: [0.01900291 0.01891041 0.01891088 0.02085233 0.0235498 0.01884913
|
|
0.01897097 0.02552199 0.02648234 0.02427053]
|
|
|
|
mean value: 0.021532130241394044
|
|
|
|
key: test_mcc
|
|
value: [0.80829038 0.65433031 0.84866842 0.88527041 0.80829038 0.88289781
|
|
0.76733527 0.88289781 0.80990051 0.73107432]
|
|
|
|
mean value: 0.8078955626786625
|
|
|
|
key: train_mcc
|
|
value: [0.86208312 0.86630587 0.80159752 0.84902492 0.84879533 0.85375825
|
|
0.85375825 0.86645175 0.90108236 0.84920893]
|
|
|
|
mean value: 0.8552066307077918
|
|
|
|
key: test_accuracy
|
|
value: [0.90384615 0.82692308 0.92307692 0.94230769 0.90384615 0.94117647
|
|
0.88235294 0.94117647 0.90196078 0.8627451 ]
|
|
|
|
mean value: 0.9029411764705882
|
|
|
|
key: train_accuracy
|
|
value: [0.93088553 0.93304536 0.90064795 0.92440605 0.92440605 0.92672414
|
|
0.92672414 0.93318966 0.95043103 0.92456897]
|
|
|
|
mean value: 0.9275028859760185
|
|
|
|
key: test_fscore
|
|
value: [0.90196078 0.83018868 0.92 0.94117647 0.90566038 0.93877551
|
|
0.88461538 0.93877551 0.90566038 0.86792453]
|
|
|
|
mean value: 0.9034737622189659
|
|
|
|
key: train_fscore
|
|
value: [0.93103448 0.93275488 0.9004329 0.92407809 0.92341357 0.92672414
|
|
0.92672414 0.93275488 0.95032397 0.92407809]
|
|
|
|
mean value: 0.9272319143476138
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_sl.py:107: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
baseline_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_sl.py:110: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
baseline_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
|
|
key: test_precision
|
|
value: [0.88461538 0.81481481 0.95833333 0.96 0.88888889 0.95833333
|
|
0.85185185 0.95833333 0.85714286 0.82142857]
|
|
|
|
mean value: 0.8953742368742369
|
|
|
|
key: train_precision
|
|
value: [0.91914894 0.92274678 0.88888889 0.91416309 0.92139738 0.91489362
|
|
0.91489362 0.92672414 0.94017094 0.91810345]
|
|
|
|
mean value: 0.918113083663679
|
|
|
|
key: test_recall
|
|
value: [0.92 0.84615385 0.88461538 0.92307692 0.92307692 0.92
|
|
0.92 0.92 0.96 0.92 ]
|
|
|
|
mean value: 0.9136923076923077
|
|
|
|
key: train_recall
|
|
value: [0.94323144 0.94298246 0.9122807 0.93421053 0.9254386 0.93886463
|
|
0.93886463 0.93886463 0.96069869 0.930131 ]
|
|
|
|
mean value: 0.9365567302535815
|
|
|
|
key: test_roc_auc
|
|
value: [0.90444444 0.82692308 0.92307692 0.94230769 0.90384615 0.94076923
|
|
0.88307692 0.94076923 0.90307692 0.86384615]
|
|
|
|
mean value: 0.9032136752136752
|
|
|
|
key: train_roc_auc
|
|
value: [0.93101743 0.93319336 0.9008212 0.92455207 0.92442143 0.92687912
|
|
0.92687912 0.9332621 0.95056211 0.92463997]
|
|
|
|
mean value: 0.9276227913861107
|
|
|
|
key: test_jcc
|
|
value: [0.82142857 0.70967742 0.85185185 0.88888889 0.82758621 0.88461538
|
|
0.79310345 0.88461538 0.82758621 0.76666667]
|
|
|
|
mean value: 0.8256020029490552
|
|
|
|
key: train_jcc
|
|
value: [0.87096774 0.87398374 0.81889764 0.85887097 0.85772358 0.86345382
|
|
0.86345382 0.87398374 0.90534979 0.85887097]
|
|
|
|
mean value: 0.8645555796885971
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03841734 0.03949714 0.03965116 0.03697515 0.03645062 0.03646588
|
|
0.03761983 0.09846282 0.04134083 0.04195118]
|
|
|
|
mean value: 0.044683194160461424
|
|
|
|
key: score_time
|
|
value: [0.0146718 0.01451349 0.01439214 0.0156827 0.01446724 0.01457667
|
|
0.01458859 0.01217008 0.01210904 0.01474524]
|
|
|
|
mean value: 0.014191699028015137
|
|
|
|
key: test_mcc
|
|
value: [0.85164138 0.73997003 0.77849894 0.96225045 0.89056356 0.74466871
|
|
0.76923077 0.88527041 0.81312325 0.79056942]
|
|
|
|
mean value: 0.8225786910541291
|
|
|
|
key: train_mcc
|
|
value: [0.85946342 0.88089135 0.87262489 0.86395495 0.86815585 0.87246682
|
|
0.88113831 0.86411148 0.86411148 0.87660368]
|
|
|
|
mean value: 0.8703522241153047
|
|
|
|
key: test_accuracy
|
|
value: [0.9245283 0.86792453 0.88461538 0.98076923 0.94230769 0.86538462
|
|
0.88461538 0.94230769 0.90384615 0.88461538]
|
|
|
|
mean value: 0.9080914368650218
|
|
|
|
key: train_accuracy
|
|
value: [0.92963753 0.94029851 0.93617021 0.93191489 0.93404255 0.93617021
|
|
0.94042553 0.93191489 0.93191489 0.93829787]
|
|
|
|
mean value: 0.9350787097944926
|
|
|
|
key: test_fscore
|
|
value: [0.92592593 0.87719298 0.875 0.98113208 0.94545455 0.85106383
|
|
0.88461538 0.94117647 0.90909091 0.89655172]
|
|
|
|
mean value: 0.9087203847528004
|
|
|
|
key: train_fscore
|
|
value: [0.93052632 0.94092827 0.93697479 0.93248945 0.93446089 0.93670886
|
|
0.94117647 0.93277311 0.93277311 0.93842887]
|
|
|
|
mean value: 0.9357240139743418
|
|
|
|
key: test_precision
|
|
value: [0.89285714 0.83333333 0.95454545 0.96296296 0.89655172 0.95238095
|
|
0.88461538 0.96 0.86206897 0.8125 ]
|
|
|
|
mean value: 0.9011815920350403
|
|
|
|
key: train_precision
|
|
value: [0.92083333 0.92916667 0.9253112 0.92468619 0.92857143 0.92887029
|
|
0.92946058 0.92116183 0.92116183 0.93644068]
|
|
|
|
mean value: 0.9265664027577826
|
|
|
|
key: test_recall
|
|
value: [0.96153846 0.92592593 0.80769231 1. 1. 0.76923077
|
|
0.88461538 0.92307692 0.96153846 1. ]
|
|
|
|
mean value: 0.9233618233618234
|
|
|
|
key: train_recall
|
|
value: [0.94042553 0.95299145 0.94893617 0.94042553 0.94042553 0.94468085
|
|
0.95319149 0.94468085 0.94468085 0.94042553]
|
|
|
|
mean value: 0.9450863793416985
|
|
|
|
key: test_roc_auc
|
|
value: [0.92521368 0.86680912 0.88461538 0.98076923 0.94230769 0.86538462
|
|
0.88461538 0.94230769 0.90384615 0.88461538]
|
|
|
|
mean value: 0.9080484330484331
|
|
|
|
key: train_roc_auc
|
|
value: [0.92961448 0.94032551 0.93617021 0.93191489 0.93404255 0.93617021
|
|
0.94042553 0.93191489 0.93191489 0.93829787]
|
|
|
|
mean value: 0.9350791052918713
|
|
|
|
key: test_jcc
|
|
value: [0.86206897 0.78125 0.77777778 0.96296296 0.89655172 0.74074074
|
|
0.79310345 0.88888889 0.83333333 0.8125 ]
|
|
|
|
mean value: 0.8349177841634738
|
|
|
|
key: train_jcc
|
|
value: [0.87007874 0.88844622 0.88142292 0.87351779 0.87698413 0.88095238
|
|
0.88888889 0.87401575 0.87401575 0.884 ]
|
|
|
|
mean value: 0.8792322559647762
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=169)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegressionCV(random_state=42))])
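The pipeline printed above scales the numeric columns with `MinMaxScaler`, one-hot encodes the categorical columns, and passes the result to `LogisticRegressionCV`. The sketch below builds a pipeline of the same shape and scores it with cross-validation. It is not the script that produced this log: the data are synthetic, only three of the logged column names are reused, and the 5-fold splitter, the scorer dictionary and the final print loop are assumptions chosen to mimic the `key:` / `value:` / `mean value:` layout that follows.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import make_scorer, matthews_corrcoef
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Synthetic stand-in data; column names are borrowed from the log,
# the values are made up for illustration.
rng = np.random.default_rng(42)
y = np.tile([0, 1], 30)
X = pd.DataFrame({
    "ligand_distance": y + rng.normal(0.0, 0.5, size=60),
    "ddg_foldx": rng.normal(size=60),
    "ss_class": np.array(["helix", "sheet", "coil"])[np.arange(60) % 3],
})
num_cols = ["ligand_distance", "ddg_foldx"]
cat_cols = ["ss_class"]

# Same structure as the logged repr: scale numeric, one-hot encode categorical.
prep = ColumnTransformer(
    remainder="passthrough",
    transformers=[("num", MinMaxScaler(), num_cols),
                  ("cat", OneHotEncoder(), cat_cols)],
)
pipe = Pipeline(steps=[("prep", prep),
                       ("model", LogisticRegressionCV(random_state=42))])

# Named scorers produce the test_<name> / train_<name> keys seen in the log.
scoring = {"mcc": make_scorer(matthews_corrcoef), "accuracy": "accuracy"}
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_validate(pipe, X, y, cv=cv, scoring=scoring,
                        return_train_score=True)

for key, value in scores.items():
    print("key:", key)
    print("value:", value)
    print("mean value:", value.mean())
```

Keeping the preprocessing inside the `Pipeline` means the scaler and encoder are refit on each training fold, so no statistics from the held-out fold leak into training.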

key: fit_time
value: [0.88060451 1.01606941 0.91109967 1.10975718 0.93197393 0.99338555
0.97769165 0.90921044 0.9921577 0.92200208]

mean value: 0.9643952131271363

key: score_time
value: [0.01480174 0.04356098 0.01479554 0.01655483 0.02265716 0.04054356
0.01495075 0.0150485 0.01497436 0.01514244]

mean value: 0.02130298614501953

key: test_mcc
value: [0.81196581 0.8116984 0.77849894 0.96225045 0.89056356 0.81312325
0.84615385 0.84866842 0.84866842 0.82305489]

mean value: 0.8434645996919574
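
Each metric in this log is printed as a `key:` line, a ten-element `value:` array (one entry per fold), and a `mean value:` line. The following sketch parses such blocks and recomputes the mean as a sanity check. The regular expression and the function name `parse_cv_blocks` are illustrative assumptions, not part of the original script. Because the array is printed to eight decimals, the recomputed mean is only expected to agree with the reported one to within rounding.

```python
import re
from statistics import fmean

# Excerpt copied from the test_mcc block of this log.
LOG_EXCERPT = """\
key: test_mcc
value: [0.81196581 0.8116984 0.77849894 0.96225045 0.89056356 0.81312325
0.84615385 0.84866842 0.84866842 0.82305489]

mean value: 0.8434645996919574
"""

BLOCK = re.compile(
    r"key:\s*(?P<key>\S+)\s*"
    r"value:\s*\[(?P<values>[^\]]*)\]\s*"
    r"mean value:\s*(?P<mean>\S+)"
)


def parse_cv_blocks(text):
    """Return {metric: (fold_values, reported_mean)} for each key/value/mean block."""
    # Drop separator-only lines ("|") in case the log still contains them.
    text = "\n".join(line for line in text.splitlines() if line.strip() != "|")
    parsed = {}
    for match in BLOCK.finditer(text):
        folds = [float(tok) for tok in match.group("values").split()]
        parsed[match.group("key")] = (folds, float(match.group("mean")))
    return parsed


blocks = parse_cv_blocks(LOG_EXCERPT)
folds, reported = blocks["test_mcc"]
recomputed = fmean(folds)
print(len(folds), recomputed, abs(recomputed - reported) < 1e-6)
```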

key: train_mcc
value: [0.91474349 0.91484796 0.90667855 0.90220118 0.91064654 0.90233192
0.90667855 0.90233192 0.88965172 0.90220118]

mean value: 0.9052313012071413

key: test_accuracy
value: [0.90566038 0.90566038 0.88461538 0.98076923 0.94230769 0.90384615
0.92307692 0.92307692 0.92307692 0.90384615]

mean value: 0.9195936139332366

key: train_accuracy
value: [0.95735608 0.95735608 0.95319149 0.95106383 0.95531915 0.95106383
0.95319149 0.95106383 0.94468085 0.95106383]

mean value: 0.9525350451390464

key: test_fscore
value: [0.90566038 0.90909091 0.875 0.98113208 0.94545455 0.89795918
0.92307692 0.92 0.92592593 0.9122807 ]

mean value: 0.9195580641806348

key: train_fscore
value: [0.95762712 0.95762712 0.95378151 0.95137421 0.95541401 0.95157895
0.95378151 0.95157895 0.94537815 0.95137421]

mean value: 0.9529515735610741

key: test_precision
value: [0.88888889 0.89285714 0.95454545 0.96296296 0.89655172 0.95652174
0.92307692 0.95833333 0.89285714 0.83870968]

mean value: 0.916530498920957

key: train_precision
value: [0.9535865 0.94957983 0.94190871 0.94537815 0.95338983 0.94166667
0.94190871 0.94166667 0.93360996 0.94537815]

mean value: 0.9448073182078001

key: test_recall
value: [0.92307692 0.92592593 0.80769231 1. 1. 0.84615385
0.92307692 0.88461538 0.96153846 1. ]

mean value: 0.9272079772079772

key: train_recall
value: [0.96170213 0.96581197 0.96595745 0.95744681 0.95744681 0.96170213
0.96595745 0.96170213 0.95744681 0.95744681]

mean value: 0.9612620476450263

key: test_roc_auc
value: [0.90598291 0.90527066 0.88461538 0.98076923 0.94230769 0.90384615
0.92307692 0.92307692 0.92307692 0.90384615]

mean value: 0.9195868945868946

key: train_roc_auc
value: [0.95734679 0.95737407 0.95319149 0.95106383 0.95531915 0.95106383
0.95319149 0.95106383 0.94468085 0.95106383]

mean value: 0.952535915621022

key: test_jcc
value: [0.82758621 0.83333333 0.77777778 0.96296296 0.89655172 0.81481481
0.85714286 0.85185185 0.86206897 0.83870968]

mean value: 0.8522800171854676

key: train_jcc
value: [0.91869919 0.91869919 0.91164659 0.90725806 0.91463415 0.90763052
0.91164659 0.90763052 0.89641434 0.90725806]

mean value: 0.9101517208854413
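
The `test_jcc` and `test_fscore` values above are not independent. For a binary problem, the Jaccard index J = TP/(TP+FP+FN) and the F1 score F = 2TP/(2TP+FP+FN) satisfy J = F / (2 - F). The sketch below checks that identity against the per-fold numbers from this log. It assumes that `fscore` is the binary F1 score and `jcc` the binary Jaccard score of the same class, which the log does not state explicitly; the helper name `jaccard_from_f1` is hypothetical.

```python
from math import isclose
from statistics import fmean

# Per-fold values copied from the test_fscore and test_jcc blocks of this log.
test_fscore = [0.90566038, 0.90909091, 0.875, 0.98113208, 0.94545455,
               0.89795918, 0.92307692, 0.92, 0.92592593, 0.9122807]
test_jcc = [0.82758621, 0.83333333, 0.77777778, 0.96296296, 0.89655172,
            0.81481481, 0.85714286, 0.85185185, 0.86206897, 0.83870968]


def jaccard_from_f1(f1):
    """Binary-class identity: J = TP / (TP + FP + FN) = F1 / (2 - F1)."""
    return f1 / (2.0 - f1)


derived = [jaccard_from_f1(f) for f in test_fscore]
# Values are printed to 8 decimals, so compare with a small absolute tolerance.
all_match = all(isclose(d, j, abs_tol=1e-6) for d, j in zip(derived, test_jcc))
print(all_match, fmean(derived))
```

Since one metric is a monotone function of the other, ranking models by either one gives the same order within a fold.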

MCC on Blind test: 0.77

Accuracy on Blind test: 0.88

Model_name: Gaussian NB
Model func: GaussianNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=169)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01405215 0.01066756 0.01140666 0.012398 0.01129866 0.01105595
|
|
0.01146221 0.01126456 0.01135373 0.01132464]
|
|
|
|
mean value: 0.011628413200378418
|
|
|
|
key: score_time
|
|
value: [0.01219368 0.00984573 0.00990057 0.00974965 0.00986099 0.00926423
|
|
0.00986862 0.00995731 0.0097661 0.00981498]
|
|
|
|
mean value: 0.010022187232971191
|
|
|
|
key: test_mcc
|
|
value: [0.66048569 0.40912228 0.71151247 0.77151675 0.80829038 0.65824263
|
|
0.57735027 0.50037023 0.54006172 0.73568294]
|
|
|
|
mean value: 0.6372635364356517
|
|
|
|
key: train_mcc
|
|
value: [0.66698754 0.68740344 0.69667663 0.67751905 0.69117257 0.71834239
|
|
0.67337154 0.67751905 0.67558392 0.69117257]
|
|
|
|
mean value: 0.6855748724120782
|
|
|
|
key: test_accuracy
|
|
value: [0.83018868 0.69811321 0.84615385 0.88461538 0.90384615 0.82692308
|
|
0.78846154 0.75 0.76923077 0.86538462]
|
|
|
|
mean value: 0.8162917271407837
|
|
|
|
key: train_accuracy
|
|
value: [0.8315565 0.84221748 0.84680851 0.83617021 0.84468085 0.85744681
|
|
0.83617021 0.83617021 0.83617021 0.84468085]
|
|
|
|
mean value: 0.8412071859547249
|
|
|
|
key: test_fscore
|
|
value: [0.82352941 0.66666667 0.82608696 0.88 0.90196078 0.81632653
|
|
0.78431373 0.75471698 0.76 0.85714286]
|
|
|
|
mean value: 0.807074391364421
|
|
|
|
key: train_fscore
|
|
value: [0.82247191 0.83408072 0.83928571 0.82539683 0.8388521 0.85011186
|
|
0.84057971 0.82539683 0.82774049 0.8388521 ]
|
|
|
|
mean value: 0.8342768246079215
|
|
|
|
key: test_precision
|
|
value: [0.84 0.76190476 0.95 0.91666667 0.92 0.86956522
|
|
0.8 0.74074074 0.79166667 0.91304348]
|
|
|
|
mean value: 0.8503587531631009
|
|
|
|
key: train_precision
|
|
value: [0.87142857 0.87735849 0.88262911 0.88349515 0.87155963 0.89622642
|
|
0.81854839 0.88349515 0.87264151 0.87155963]
|
|
|
|
mean value: 0.8728942038918088
|
|
|
|
key: test_recall
|
|
value: [0.80769231 0.59259259 0.73076923 0.84615385 0.88461538 0.76923077
|
|
0.76923077 0.76923077 0.73076923 0.80769231]
|
|
|
|
mean value: 0.7707977207977208
|
|
|
|
key: train_recall
|
|
value: [0.7787234 0.79487179 0.8 0.77446809 0.80851064 0.80851064
|
|
0.86382979 0.77446809 0.78723404 0.80851064]
|
|
|
|
mean value: 0.7999127114020731
|
|
|
|
key: test_roc_auc
|
|
value: [0.82977208 0.70014245 0.84615385 0.88461538 0.90384615 0.82692308
|
|
0.78846154 0.75 0.76923077 0.86538462]
|
|
|
|
mean value: 0.8164529914529915
|
|
|
|
key: train_roc_auc
|
|
value: [0.83166939 0.84211675 0.84680851 0.83617021 0.84468085 0.85744681
|
|
0.83617021 0.83617021 0.83617021 0.84468085]
|
|
|
|
mean value: 0.8412084015275505
|
|
|
|
key: test_jcc
|
|
value: [0.7 0.5 0.7037037 0.78571429 0.82142857 0.68965517
|
|
0.64516129 0.60606061 0.61290323 0.75 ]
|
|
|
|
mean value: 0.6814626855449992
|
|
|
|
key: train_jcc
|
|
value: [0.69847328 0.71538462 0.72307692 0.7027027 0.72243346 0.73929961
|
|
0.725 0.7027027 0.70610687 0.72243346]
|
|
|
|
mean value: 0.7157613627585733
|
|
|
|
MCC on Blind test: 0.63
|
|
|
|
Accuracy on Blind test: 0.81
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01250458 0.01152492 0.01148963 0.01054597 0.01109338 0.01149631
|
|
0.01151705 0.0116148 0.01161146 0.01170826]
|
|
|
|
mean value: 0.011510634422302246
|
|
|
|
key: score_time
|
|
value: [0.01028419 0.00985003 0.00978994 0.00903082 0.00983238 0.00987172
|
|
0.01005745 0.01006413 0.00986838 0.00981331]
|
|
|
|
mean value: 0.009846234321594238
|
|
|
|
key: test_mcc
|
|
value: [0.73646724 0.50997151 0.6172134 0.88527041 0.69436507 0.69230769
|
|
0.69436507 0.77151675 0.69436507 0.69230769]
|
|
|
|
mean value: 0.6988149917958625
|
|
|
|
key: train_mcc
|
|
value: [0.74840423 0.76129503 0.71066404 0.74894295 0.75778307 0.77046393
|
|
0.71925314 0.68550371 0.74075423 0.75330062]
|
|
|
|
mean value: 0.739636494142086
|
|
|
|
key: test_accuracy
|
|
value: [0.86792453 0.75471698 0.80769231 0.94230769 0.84615385 0.84615385
|
|
0.84615385 0.88461538 0.84615385 0.84615385]
|
|
|
|
mean value: 0.8488026124818577
|
|
|
|
key: train_accuracy
|
|
value: [0.87420043 0.88059701 0.85531915 0.87446809 0.8787234 0.88510638
|
|
0.85957447 0.84255319 0.87021277 0.87659574]
|
|
|
|
mean value: 0.8697350632853967
|
|
|
|
key: test_fscore
|
|
value: [0.86792453 0.75471698 0.8 0.94117647 0.85185185 0.84615385
|
|
0.84 0.88 0.85185185 0.84615385]
|
|
|
|
mean value: 0.8479829376033594
|
|
|
|
key: train_fscore
|
|
value: [0.87473461 0.87931034 0.85470085 0.87473461 0.88050314 0.88655462
|
|
0.8583691 0.83982684 0.86825054 0.87553648]
|
|
|
|
mean value: 0.869252113965142
|
|
|
|
key: test_precision
|
|
value: [0.85185185 0.76923077 0.83333333 0.96 0.82142857 0.84615385
|
|
0.875 0.91666667 0.82142857 0.84615385]
|
|
|
|
mean value: 0.8541247456247456
|
|
|
|
key: train_precision
|
|
value: [0.87288136 0.88695652 0.8583691 0.87288136 0.8677686 0.87551867
|
|
0.86580087 0.85462555 0.88157895 0.88311688]
|
|
|
|
mean value: 0.8719497846503439
|
|
|
|
key: test_recall
|
|
value: [0.88461538 0.74074074 0.76923077 0.92307692 0.88461538 0.84615385
|
|
0.80769231 0.84615385 0.88461538 0.84615385]
|
|
|
|
mean value: 0.8433048433048433
|
|
|
|
key: train_recall
|
|
value: [0.87659574 0.87179487 0.85106383 0.87659574 0.89361702 0.89787234
|
|
0.85106383 0.82553191 0.85531915 0.86808511]
|
|
|
|
mean value: 0.8667539552645935
|
|
|
|
key: test_roc_auc
|
|
value: [0.86823362 0.75498575 0.80769231 0.94230769 0.84615385 0.84615385
|
|
0.84615385 0.88461538 0.84615385 0.84615385]
|
|
|
|
mean value: 0.8488603988603989
|
|
|
|
key: train_roc_auc
|
|
value: [0.87419531 0.88057829 0.85531915 0.87446809 0.8787234 0.88510638
|
|
0.85957447 0.84255319 0.87021277 0.87659574]
|
|
|
|
mean value: 0.8697326786688488
|
|
|
|
key: test_jcc
|
|
value: [0.76666667 0.60606061 0.66666667 0.88888889 0.74193548 0.73333333
|
|
0.72413793 0.78571429 0.74193548 0.73333333]
|
|
|
|
mean value: 0.7388672679440199
|
|
|
|
key: train_jcc
|
|
value: [0.77735849 0.78461538 0.74626866 0.77735849 0.78651685 0.79622642
|
|
0.7518797 0.7238806 0.76717557 0.77862595]
|
|
|
|
mean value: 0.7689906114471405
|
|
|
|
MCC on Blind test: 0.76
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.00976706 0.01079893 0.01038599 0.01070499 0.01065278 0.01069951
|
|
0.0106945 0.0093689 0.01063561 0.00939131]
|
|
|
|
mean value: 0.010309958457946777
|
|
|
|
key: score_time
|
|
value: [0.01736188 0.01571083 0.01535082 0.01273298 0.01242304 0.01235628
|
|
0.01241827 0.01493406 0.01261854 0.01222396]
|
|
|
|
mean value: 0.013813066482543945
|
|
|
|
key: test_mcc
|
|
value: [0.58487934 0.36194897 0.30769231 0.5990423 0.66628253 0.4233902
|
|
0.43929769 0.73568294 0.6172134 0.31139958]
|
|
|
|
mean value: 0.504682924498233
|
|
|
|
key: train_mcc
|
|
value: [0.72752093 0.72748132 0.73197454 0.69792921 0.71495188 0.71490009
|
|
0.72768593 0.69364214 0.70823856 0.73223982]
|
|
|
|
mean value: 0.7176564421606096
|
|
|
|
key: test_accuracy
|
|
value: [0.79245283 0.67924528 0.65384615 0.78846154 0.82692308 0.71153846
|
|
0.71153846 0.86538462 0.80769231 0.65384615]
|
|
|
|
mean value: 0.7490928882438317
|
|
|
|
key: train_accuracy
|
|
value: [0.86353945 0.86353945 0.86595745 0.84893617 0.85744681 0.85744681
|
|
0.86382979 0.84680851 0.85319149 0.86595745]
|
|
|
|
mean value: 0.8586653359343102
|
|
|
|
key: test_fscore
|
|
value: [0.78431373 0.66666667 0.65384615 0.75555556 0.84210526 0.71698113
|
|
0.66666667 0.87272727 0.8 0.625 ]
|
|
|
|
mean value: 0.7383862436185877
|
|
|
|
key: train_fscore
|
|
value: [0.86147186 0.86086957 0.86509636 0.84796574 0.85653105 0.85714286
|
|
0.86324786 0.84615385 0.84768212 0.86393089]
|
|
|
|
mean value: 0.8570092145719881
|
|
|
|
key: test_precision
|
|
value: [0.8 0.70833333 0.65384615 0.89473684 0.77419355 0.7037037
|
|
0.78947368 0.82758621 0.83333333 0.68181818]
|
|
|
|
mean value: 0.7667024987634145
|
|
|
|
key: train_precision
|
|
value: [0.87665198 0.87610619 0.87068966 0.85344828 0.86206897 0.85897436
|
|
0.86695279 0.84978541 0.88073394 0.87719298]
|
|
|
|
mean value: 0.8672604557430365
|
|
|
|
key: test_recall
|
|
value: [0.76923077 0.62962963 0.65384615 0.65384615 0.92307692 0.73076923
|
|
0.57692308 0.92307692 0.76923077 0.57692308]
|
|
|
|
mean value: 0.7206552706552707
|
|
|
|
key: train_recall
|
|
value: [0.84680851 0.84615385 0.85957447 0.84255319 0.85106383 0.85531915
|
|
0.85957447 0.84255319 0.81702128 0.85106383]
|
|
|
|
mean value: 0.8471685761047463
|
|
|
|
key: test_roc_auc
|
|
value: [0.79202279 0.68019943 0.65384615 0.78846154 0.82692308 0.71153846
|
|
0.71153846 0.86538462 0.80769231 0.65384615]
|
|
|
|
mean value: 0.7491452991452991
|
|
|
|
key: train_roc_auc
|
|
value: [0.8635752 0.86350245 0.86595745 0.84893617 0.85744681 0.85744681
|
|
0.86382979 0.84680851 0.85319149 0.86595745]
|
|
|
|
mean value: 0.8586652118567013
|
|
|
|
key: test_jcc
|
|
value: [0.64516129 0.5 0.48571429 0.60714286 0.72727273 0.55882353
|
|
0.5 0.77419355 0.66666667 0.45454545]
|
|
|
|
mean value: 0.5919520359463434
|
|
|
|
key: train_jcc
|
|
value: [0.75665399 0.75572519 0.76226415 0.73605948 0.74906367 0.75
|
|
0.7593985 0.73333333 0.73563218 0.76045627]
|
|
|
|
mean value: 0.7498586771390656
|
|
|
|
MCC on Blind test: 0.33
|
|
|
|
Accuracy on Blind test: 0.67
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02233267 0.02127671 0.02358603 0.02476525 0.02091074 0.02076507
|
|
0.02088213 0.02093339 0.02088308 0.02073622]
|
|
|
|
mean value: 0.02170712947845459
|
|
|
|
key: score_time
|
|
value: [0.01467752 0.01178527 0.01262712 0.01558471 0.01146078 0.01158738
|
|
0.01152992 0.01151848 0.01147294 0.01148391]
|
|
|
|
mean value: 0.012372803688049317
|
|
|
|
key: test_mcc
|
|
value: [0.81196581 0.69957726 0.74466871 0.92307692 0.84866842 0.77849894
|
|
0.73131034 0.88527041 0.81312325 0.73568294]
|
|
|
|
mean value: 0.7971843013202676
|
|
|
|
key: train_mcc
|
|
value: [0.79530824 0.80810708 0.80451759 0.78298581 0.79149653 0.80018114
|
|
0.80000724 0.78726255 0.7957735 0.80428445]
|
|
|
|
mean value: 0.7969924124303265
|
|
|
|
key: test_accuracy
|
|
value: [0.90566038 0.8490566 0.86538462 0.96153846 0.92307692 0.88461538
|
|
0.86538462 0.94230769 0.90384615 0.86538462]
|
|
|
|
mean value: 0.8966255442670538
|
|
|
|
key: train_accuracy
|
|
value: [0.89765458 0.90405117 0.90212766 0.89148936 0.89574468 0.9
|
|
0.9 0.89361702 0.89787234 0.90212766]
|
|
|
|
mean value: 0.8984684480333893
|
|
|
|
key: test_fscore
|
|
value: [0.90566038 0.85714286 0.85106383 0.96153846 0.92592593 0.875
|
|
0.86792453 0.94117647 0.90909091 0.87272727]
|
|
|
|
mean value: 0.8967250632461273
|
|
|
|
key: train_fscore
|
|
value: [0.89787234 0.90364026 0.90336134 0.89171975 0.89552239 0.90105263
|
|
0.90021231 0.8940678 0.8974359 0.9017094 ]
|
|
|
|
mean value: 0.8986594116764762
|
|
|
|
key: test_precision
|
|
value: [0.88888889 0.82758621 0.95238095 0.96153846 0.89285714 0.95454545
|
|
0.85185185 0.96 0.86206897 0.82758621]
|
|
|
|
mean value: 0.8979304131373097
|
|
|
|
key: train_precision
|
|
value: [0.89787234 0.9055794 0.89211618 0.88983051 0.8974359 0.89166667
|
|
0.89830508 0.89029536 0.90128755 0.9055794 ]
|
|
|
|
mean value: 0.8969968390902169
|
|
|
|
key: test_recall
|
|
value: [0.92307692 0.88888889 0.76923077 0.96153846 0.96153846 0.80769231
|
|
0.88461538 0.92307692 0.96153846 0.92307692]
|
|
|
|
mean value: 0.9004273504273504
|
|
|
|
key: train_recall
|
|
value: [0.89787234 0.9017094 0.91489362 0.89361702 0.89361702 0.9106383
|
|
0.90212766 0.89787234 0.89361702 0.89787234]
|
|
|
|
mean value: 0.9003837061283869
|
|
|
|
key: test_roc_auc
|
|
value: [0.90598291 0.8482906 0.86538462 0.96153846 0.92307692 0.88461538
|
|
0.86538462 0.94230769 0.90384615 0.86538462]
|
|
|
|
mean value: 0.8965811965811966
|
|
|
|
key: train_roc_auc
|
|
value: [0.89765412 0.90404619 0.90212766 0.89148936 0.89574468 0.9
|
|
0.9 0.89361702 0.89787234 0.90212766]
|
|
|
|
mean value: 0.8984679032551373
|
|
|
|
key: test_jcc
|
|
value: [0.82758621 0.75 0.74074074 0.92592593 0.86206897 0.77777778
|
|
0.76666667 0.88888889 0.83333333 0.77419355]
|
|
|
|
mean value: 0.8147182054134223
|
|
|
|
key: train_jcc
|
|
value: [0.81467181 0.82421875 0.82375479 0.8045977 0.81081081 0.81992337
|
|
0.81853282 0.80842912 0.81395349 0.82101167]
|
|
|
|
mean value: 0.81599043363822
|
|
|
|
MCC on Blind test: 0.67
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.43118334 1.92359424 1.93823981 1.90809631 1.79002547 0.74983549
|
|
1.94170189 1.90422225 1.8327477 1.80478406]
|
|
|
|
mean value: 1.7224430561065673
|
|
|
|
key: score_time
|
|
value: [0.0124011 0.01241326 0.01531434 0.02261472 0.01246166 0.01324511
|
|
0.01439619 0.02267289 0.02315784 0.0149169 ]
|
|
|
|
mean value: 0.016359400749206544
|
|
|
|
key: test_mcc
|
|
value: [0.73646724 0.8116984 0.77849894 0.96225045 0.85634884 0.77849894
|
|
0.80829038 0.84615385 0.84615385 0.82305489]
|
|
|
|
mean value: 0.8247415773739302
|
|
|
|
key: train_mcc
|
|
value: [0.98297841 0.99147118 0.99148936 0.99152527 0.9873145 0.91084449
|
|
0.9957537 0.99148936 0.97894501 0.9957537 ]
|
|
|
|
mean value: 0.981756497283519
|
|
|
|
key: test_accuracy
|
|
value: [0.86792453 0.90566038 0.88461538 0.98076923 0.92307692 0.88461538
|
|
0.90384615 0.92307692 0.92307692 0.90384615]
|
|
|
|
mean value: 0.9100507982583455
|
|
|
|
key: train_accuracy
|
|
value: [0.99147122 0.99573561 0.99574468 0.99574468 0.99361702 0.95531915
|
|
0.99787234 0.99574468 0.9893617 0.99787234]
|
|
|
|
mean value: 0.9908483418772399
|
|
|
|
key: test_fscore
|
|
value: [0.86792453 0.90909091 0.875 0.98113208 0.92857143 0.875
|
|
0.90196078 0.92307692 0.92307692 0.9122807 ]
|
|
|
|
mean value: 0.909711427365788
|
|
|
|
key: train_fscore
|
|
value: [0.99145299 0.9957265 0.99574468 0.9957265 0.99357602 0.95578947
|
|
0.9978678 0.99574468 0.98924731 0.99787686]
|
|
|
|
mean value: 0.9908752808838321
|
|
|
|
key: test_precision
|
|
value: [0.85185185 0.89285714 0.95454545 0.96296296 0.86666667 0.95454545
|
|
0.92 0.92307692 0.92307692 0.83870968]
|
|
|
|
mean value: 0.9088293057002734
|
|
|
|
key: train_precision
|
|
value: [0.99570815 0.9957265 0.99574468 1. 1. 0.94583333
|
|
1. 0.99574468 1. 0.99576271]
|
|
|
|
mean value: 0.9924520057132802
|
|
|
|
key: test_recall
|
|
value: [0.88461538 0.92592593 0.80769231 1. 1. 0.80769231
|
|
0.88461538 0.92307692 0.92307692 1. ]
|
|
|
|
mean value: 0.9156695156695157
|
|
|
|
key: train_recall
|
|
value: [0.98723404 0.9957265 0.99574468 0.99148936 0.98723404 0.96595745
|
|
0.99574468 0.99574468 0.9787234 1. ]
|
|
|
|
mean value: 0.9893598836152028
|
|
|
|
key: test_roc_auc
|
|
value: [0.86823362 0.90527066 0.88461538 0.98076923 0.92307692 0.88461538
|
|
0.90384615 0.92307692 0.92307692 0.90384615]
|
|
|
|
mean value: 0.9100427350427351
|
|
|
|
key: train_roc_auc
|
|
value: [0.99148027 0.99573559 0.99574468 0.99574468 0.99361702 0.95531915
|
|
0.99787234 0.99574468 0.9893617 0.99787234]
|
|
|
|
mean value: 0.9908492453173304
|
|
|
|
key: test_jcc
|
|
value: [0.76666667 0.83333333 0.77777778 0.96296296 0.86666667 0.77777778
|
|
0.82142857 0.85714286 0.85714286 0.83870968]
|
|
|
|
mean value: 0.8359609148318826
|
|
|
|
key: train_jcc
|
|
value: [0.98305085 0.99148936 0.99152542 0.99148936 0.98723404 0.91532258
|
|
0.99574468 0.99152542 0.9787234 0.99576271]
|
|
|
|
mean value: 0.9821867838488652
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0287106 0.02211738 0.02123046 0.02213478 0.01955199 0.02208471
|
|
0.0208869 0.02009106 0.02331758 0.02205229]
|
|
|
|
mean value: 0.022217774391174318
|
|
|
|
key: score_time
|
|
value: [0.01214767 0.00964355 0.00877619 0.00889492 0.00886869 0.00891852
|
|
0.00902438 0.00888228 0.00897551 0.0089376 ]
|
|
|
|
mean value: 0.009306931495666504
|
|
|
|
key: test_mcc
|
|
value: [0.81688878 0.92704716 0.92307692 0.88527041 0.84866842 0.96225045
|
|
0.84615385 0.84866842 0.77151675 1. ]
|
|
|
|
mean value: 0.8829541177251579
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.90566038 0.96226415 0.96153846 0.94230769 0.92307692 0.98076923
|
|
0.92307692 0.92307692 0.88461538 1. ]
|
|
|
|
mean value: 0.9406386066763426
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.90909091 0.96428571 0.96153846 0.94339623 0.92592593 0.98039216
|
|
0.92307692 0.92592593 0.88888889 1. ]
|
|
|
|
mean value: 0.9422521132010588
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.86206897 0.93103448 0.96153846 0.92592593 0.89285714 1.
|
|
0.92307692 0.89285714 0.85714286 1. ]
|
|
|
|
mean value: 0.9246501901674316
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96153846 1. 0.96153846 0.96153846 0.96153846 0.96153846
|
|
0.92307692 0.96153846 0.92307692 1. ]
|
|
|
|
mean value: 0.9615384615384616
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.90669516 0.96153846 0.96153846 0.94230769 0.92307692 0.98076923
|
|
0.92307692 0.92307692 0.88461538 1. ]
|
|
|
|
mean value: 0.9406695156695157
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.83333333 0.93103448 0.92592593 0.89285714 0.86206897 0.96153846
|
|
0.85714286 0.86206897 0.8 1. ]
|
|
|
|
mean value: 0.8925970134590824
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.87
|
|
|
|
Accuracy on Blind test: 0.93
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.13061213 0.12142134 0.12343693 0.12211251 0.12081861 0.12108946
|
|
0.119946 0.12066436 0.12075686 0.12019539]
|
|
|
|
mean value: 0.12210536003112793
|
|
|
|
key: score_time
|
|
value: [0.01764417 0.01821375 0.01812148 0.01800776 0.01792765 0.01793003
|
|
0.01794624 0.01797581 0.01795745 0.01803088]
|
|
|
|
mean value: 0.017975521087646485
|
|
|
|
key: test_mcc
|
|
value: [0.74106548 0.70042867 0.81312325 0.88527041 0.89056356 0.84866842
|
|
0.76923077 0.88527041 0.89056356 0.66628253]
|
|
|
|
mean value: 0.8090467064561273
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.86792453 0.8490566 0.90384615 0.94230769 0.94230769 0.92307692
|
|
0.88461538 0.94230769 0.94230769 0.82692308]
|
|
|
|
mean value: 0.902467343976778
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.87272727 0.84615385 0.89795918 0.94117647 0.94545455 0.92
|
|
0.88461538 0.94117647 0.94545455 0.84210526]
|
|
|
|
mean value: 0.9036822982413429
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.82758621 0.88 0.95652174 0.96 0.89655172 0.95833333
|
|
0.88461538 0.96 0.89655172 0.77419355]
|
|
|
|
mean value: 0.8994353660638663
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.92307692 0.81481481 0.84615385 0.92307692 1. 0.88461538
|
|
0.88461538 0.92307692 1. 0.92307692]
|
|
|
|
mean value: 0.9122507122507123
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.86894587 0.8497151 0.90384615 0.94230769 0.94230769 0.92307692
|
|
0.88461538 0.94230769 0.94230769 0.82692308]
|
|
|
|
mean value: 0.9026353276353276
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.77419355 0.73333333 0.81481481 0.88888889 0.89655172 0.85185185
|
|
0.79310345 0.88888889 0.89655172 0.72727273]
|
|
|
|
mean value: 0.8265450949989326
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.81
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01057506 0.01090622 0.01024985 0.0104661 0.01062894 0.01125455
|
|
0.01053858 0.01054215 0.01159382 0.01146913]
|
|
|
|
mean value: 0.010822439193725586
|
|
|
|
key: score_time
|
|
value: [0.00924444 0.00917506 0.00940752 0.00961065 0.00959969 0.00898838
|
|
0.00904393 0.00929976 0.00917506 0.00896525]
|
|
|
|
mean value: 0.009250974655151368
|
|
|
|
key: test_mcc
|
|
value: [0.43536101 0.53035501 0.35273781 0.66628253 0.73568294 0.69230769
|
|
0.58789635 0.73131034 0.69230769 0.63245553]
|
|
|
|
mean value: 0.605669690697596
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.71698113 0.75471698 0.67307692 0.82692308 0.86538462 0.84615385
|
|
0.78846154 0.86538462 0.84615385 0.80769231]
|
|
|
|
mean value: 0.7990928882438316
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.69387755 0.72340426 0.63829787 0.80851064 0.85714286 0.84615385
|
|
0.76595745 0.8627451 0.84615385 0.82758621]
|
|
|
|
mean value: 0.7869829618172682
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.73913043 0.85 0.71428571 0.9047619 0.91304348 0.84615385
|
|
0.85714286 0.88 0.84615385 0.75 ]
|
|
|
|
mean value: 0.8300672081541647
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.65384615 0.62962963 0.57692308 0.73076923 0.80769231 0.84615385
|
|
0.69230769 0.84615385 0.84615385 0.92307692]
|
|
|
|
mean value: 0.7552706552706553
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.71581197 0.75712251 0.67307692 0.82692308 0.86538462 0.84615385
|
|
0.78846154 0.86538462 0.84615385 0.80769231]
|
|
|
|
mean value: 0.7992165242165242
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.53125 0.56666667 0.46875 0.67857143 0.75 0.73333333
|
|
0.62068966 0.75862069 0.73333333 0.70588235]
|
|
|
|
mean value: 0.6547097459673524
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.72
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.82980156 1.81833506 1.86908293 1.83907104 1.79196358 1.76676679
|
|
1.80318642 1.84923625 1.91046953 1.88992739]
|
|
|
|
mean value: 1.836784052848816
|
|
|
|
key: score_time
|
|
value: [0.10081434 0.10086036 0.09425735 0.09559989 0.09269404 0.09352255
|
|
0.09992027 0.10102367 0.10104275 0.10122085]
|
|
|
|
mean value: 0.09809560775756836
|
|
|
|
key: test_mcc
|
|
value: [0.81688878 0.92450142 0.9258201 0.92307692 0.9258201 0.9258201
|
|
0.89056356 0.96225045 0.9258201 0.9258201 ]
|
|
|
|
mean value: 0.914638163414845
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.90566038 0.96226415 0.96153846 0.96153846 0.96153846 0.96153846
|
|
0.94230769 0.98076923 0.96153846 0.96153846]
|
|
|
|
mean value: 0.9560232220609579
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.90909091 0.96296296 0.96 0.96153846 0.96296296 0.96
|
|
0.93877551 0.98039216 0.96296296 0.96296296]
|
|
|
|
mean value: 0.9561648889548049
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.86206897 0.96296296 1. 0.96153846 0.92857143 1.
|
|
1. 1. 0.92857143 0.92857143]
|
|
|
|
mean value: 0.9572284675732952
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96153846 0.96296296 0.92307692 0.96153846 1. 0.92307692
|
|
0.88461538 0.96153846 1. 1. ]
|
|
|
|
mean value: 0.9578347578347578
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.90669516 0.96225071 0.96153846 0.96153846 0.96153846 0.96153846
|
|
0.94230769 0.98076923 0.96153846 0.96153846]
|
|
|
|
mean value: 0.9561253561253562
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.83333333 0.92857143 0.92307692 0.92592593 0.92857143 0.92307692
|
|
0.88461538 0.96153846 0.92857143 0.92857143]
|
|
|
|
mean value: 0.9165852665852666
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.81
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC0...05', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.00343037 0.91479254 0.98317862 0.97180915 0.9955368 1.05391216
|
|
0.92968369 0.95286179 1.04950261 0.98119926]
|
|
|
|
mean value: 0.9835906982421875
|
|
|
|
key: score_time
|
|
value: [0.20472693 0.27165556 0.25223088 0.27607179 0.22573662 0.22320628
|
|
0.12021565 0.27948952 0.23237991 0.22426629]
|
|
|
|
mean value: 0.23099794387817382
|
|
|
|
key: test_mcc
|
|
value: [0.81688878 0.77350427 0.9258201 0.92307692 0.9258201 0.88527041
|
|
0.89056356 0.96225045 0.9258201 0.88527041]
|
|
|
|
mean value: 0.891428510912105
|
|
|
|
key: train_mcc
|
|
value: [0.96588471 0.95309971 0.95744681 0.95320012 0.95748148 0.95320012
|
|
0.95744681 0.95320012 0.94893617 0.95320012]
|
|
|
|
mean value: 0.955309616755026
|
|
|
|
key: test_accuracy
|
|
value: [0.90566038 0.88679245 0.96153846 0.96153846 0.96153846 0.94230769
|
|
0.94230769 0.98076923 0.96153846 0.94230769]
|
|
|
|
mean value: 0.9446298984034833
|
|
|
|
key: train_accuracy
|
|
value: [0.98294243 0.97654584 0.9787234 0.97659574 0.9787234 0.97659574
|
|
0.9787234 0.97659574 0.97446809 0.97659574]
|
|
|
|
mean value: 0.9776509549516853
|
|
|
|
key: test_fscore
|
|
value: [0.90909091 0.88888889 0.96 0.96153846 0.96296296 0.94117647
|
|
0.93877551 0.98039216 0.96296296 0.94339623]
|
|
|
|
mean value: 0.9449184549514342
|
|
|
|
key: train_fscore
|
|
value: [0.98297872 0.9764454 0.9787234 0.97654584 0.97863248 0.97654584
|
|
0.9787234 0.97654584 0.97446809 0.97654584]
|
|
|
|
mean value: 0.9776154860669302
|
|
|
|
key: test_precision
|
|
value: [0.86206897 0.88888889 1. 0.96153846 0.92857143 0.96
|
|
1. 1. 0.92857143 0.92592593]
|
|
|
|
mean value: 0.9455565099013374
|
|
|
|
key: train_precision
|
|
value: [0.98297872 0.97854077 0.9787234 0.97863248 0.98283262 0.97863248
|
|
0.9787234 0.97863248 0.97446809 0.97863248]
|
|
|
|
mean value: 0.9790796922109131
|
|
|
|
key: test_recall
|
|
value: [0.96153846 0.88888889 0.92307692 0.96153846 1. 0.92307692
|
|
0.88461538 0.96153846 1. 0.96153846]
|
|
|
|
mean value: 0.9465811965811965
|
|
|
|
key: train_recall
|
|
value: [0.98297872 0.97435897 0.9787234 0.97446809 0.97446809 0.97446809
|
|
0.9787234 0.97446809 0.97446809 0.97446809]
|
|
|
|
mean value: 0.9761593016912166
|
|
|
|
key: test_roc_auc
|
|
value: [0.90669516 0.88675214 0.96153846 0.96153846 0.96153846 0.94230769
|
|
0.94230769 0.98076923 0.96153846 0.94230769]
|
|
|
|
mean value: 0.9447293447293448
|
|
|
|
key: train_roc_auc
|
|
value: [0.98294235 0.97654119 0.9787234 0.97659574 0.9787234 0.97659574
|
|
0.9787234 0.97659574 0.97446809 0.97659574]
|
|
|
|
mean value: 0.977650481905801
|
|
|
|
key: test_jcc
|
|
value: [0.83333333 0.8 0.92307692 0.92592593 0.92857143 0.88888889
|
|
0.88461538 0.96153846 0.92857143 0.89285714]
|
|
|
|
mean value: 0.8967378917378918
|
|
|
|
key: train_jcc
|
|
value: [0.9665272 0.9539749 0.95833333 0.95416667 0.958159 0.95416667
|
|
0.95833333 0.95416667 0.95020747 0.95416667]
|
|
|
|
mean value: 0.9562201890079111
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02466607 0.01180482 0.01069355 0.01015139 0.01118636 0.01110482
|
|
0.01029754 0.01127911 0.01162481 0.0100522 ]
|
|
|
|
mean value: 0.012286067008972168
|
|
|
|
key: score_time
|
|
value: [0.01343274 0.01010823 0.00889921 0.00962663 0.00955248 0.00959492
|
|
0.00926352 0.00954437 0.01042676 0.00954938]
|
|
|
|
mean value: 0.009999823570251466
|
|
|
|
key: test_mcc
|
|
value: [0.73646724 0.50997151 0.6172134 0.88527041 0.69436507 0.69230769
|
|
0.69436507 0.77151675 0.69436507 0.69230769]
|
|
|
|
mean value: 0.6988149917958625
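Note on the "mean value" lines: each one appears to be the arithmetic mean of the ten per-fold values printed just above it. The sketch below checks this for the Naive Bayes test_mcc block, with the fold values copied from the log at the printed 8-digit precision. The variable names are illustrative and not taken from the original script, and the standard deviation is an extra summary the log does not report.

```python
from statistics import mean, stdev

# Per-fold test_mcc values copied from the Naive Bayes (BernoulliNB) block above.
test_mcc = [
    0.73646724, 0.50997151, 0.6172134, 0.88527041, 0.69436507,
    0.69230769, 0.69436507, 0.77151675, 0.69436507, 0.69230769,
]

# Arithmetic mean over the 10 folds; this should reproduce the logged
# "mean value" up to the rounding of the printed fold values.
fold_mean = mean(test_mcc)

# Sample standard deviation across folds (not in the log; shown only to
# indicate how much the score varies between folds).
fold_sd = stdev(test_mcc)

print(f"mean test_mcc over {len(test_mcc)} folds: {fold_mean:.6f}")
print(f"fold-to-fold standard deviation: {fold_sd:.6f}")
```

The same check can be repeated for any other key by pasting in its ten fold values.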
|
|
|
|
key: train_mcc
|
|
value: [0.74840423 0.76129503 0.71066404 0.74894295 0.75778307 0.77046393
|
|
0.71925314 0.68550371 0.74075423 0.75330062]
|
|
|
|
mean value: 0.739636494142086
|
|
|
|
key: test_accuracy
|
|
value: [0.86792453 0.75471698 0.80769231 0.94230769 0.84615385 0.84615385
|
|
0.84615385 0.88461538 0.84615385 0.84615385]
|
|
|
|
mean value: 0.8488026124818577
|
|
|
|
key: train_accuracy
|
|
value: [0.87420043 0.88059701 0.85531915 0.87446809 0.8787234 0.88510638
|
|
0.85957447 0.84255319 0.87021277 0.87659574]
|
|
|
|
mean value: 0.8697350632853967
|
|
|
|
key: test_fscore
|
|
value: [0.86792453 0.75471698 0.8 0.94117647 0.85185185 0.84615385
|
|
0.84 0.88 0.85185185 0.84615385]
|
|
|
|
mean value: 0.8479829376033594
|
|
|
|
key: train_fscore
|
|
value: [0.87473461 0.87931034 0.85470085 0.87473461 0.88050314 0.88655462
|
|
0.8583691 0.83982684 0.86825054 0.87553648]
|
|
|
|
mean value: 0.869252113965142
|
|
|
|
key: test_precision
|
|
value: [0.85185185 0.76923077 0.83333333 0.96 0.82142857 0.84615385
|
|
0.875 0.91666667 0.82142857 0.84615385]
|
|
|
|
mean value: 0.8541247456247456
|
|
|
|
key: train_precision
|
|
value: [0.87288136 0.88695652 0.8583691 0.87288136 0.8677686 0.87551867
|
|
0.86580087 0.85462555 0.88157895 0.88311688]
|
|
|
|
mean value: 0.8719497846503439
|
|
|
|
key: test_recall
|
|
value: [0.88461538 0.74074074 0.76923077 0.92307692 0.88461538 0.84615385
|
|
0.80769231 0.84615385 0.88461538 0.84615385]
|
|
|
|
mean value: 0.8433048433048433
|
|
|
|
key: train_recall
|
|
value: [0.87659574 0.87179487 0.85106383 0.87659574 0.89361702 0.89787234
|
|
0.85106383 0.82553191 0.85531915 0.86808511]
|
|
|
|
mean value: 0.8667539552645935
|
|
|
|
key: test_roc_auc
|
|
value: [0.86823362 0.75498575 0.80769231 0.94230769 0.84615385 0.84615385
|
|
0.84615385 0.88461538 0.84615385 0.84615385]
|
|
|
|
mean value: 0.8488603988603989
|
|
|
|
key: train_roc_auc
|
|
value: [0.87419531 0.88057829 0.85531915 0.87446809 0.8787234 0.88510638
|
|
0.85957447 0.84255319 0.87021277 0.87659574]
|
|
|
|
mean value: 0.8697326786688488
|
|
|
|
key: test_jcc
|
|
value: [0.76666667 0.60606061 0.66666667 0.88888889 0.74193548 0.73333333
|
|
0.72413793 0.78571429 0.74193548 0.73333333]
|
|
|
|
mean value: 0.7388672679440199
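Note on test_jcc versus test_fscore: for a binary problem scored on the positive class, the Jaccard index J and the F1 score F are linked by J = F / (2 - F), since F = 2TP / (2TP + FP + FN) and J = TP / (TP + FP + FN). The sketch below assumes that test_jcc is the binary Jaccard score and test_fscore the binary F1 (the log does not say so explicitly) and checks the identity fold by fold on the Naive Bayes values above. The helper name `jaccard_from_f1` is hypothetical.

```python
def jaccard_from_f1(f1: float) -> float:
    """Convert a binary F1 score to the matching Jaccard index: J = F / (2 - F)."""
    return f1 / (2.0 - f1)

# Per-fold values copied from the Naive Bayes (BernoulliNB) block above.
test_fscore = [
    0.86792453, 0.75471698, 0.8, 0.94117647, 0.85185185,
    0.84615385, 0.84, 0.88, 0.85185185, 0.84615385,
]
test_jcc = [
    0.76666667, 0.60606061, 0.66666667, 0.88888889, 0.74193548,
    0.73333333, 0.72413793, 0.78571429, 0.74193548, 0.73333333,
]

# Largest absolute difference between the logged Jaccard values and the
# values derived from the logged F1 scores.
max_abs_diff = max(
    abs(jaccard_from_f1(f) - j) for f, j in zip(test_fscore, test_jcc)
)
print(f"largest |J - F/(2-F)| across folds: {max_abs_diff:.2e}")
```

Because the two metrics are monotonically related within a fold, they rank folds identically. Their means still differ, because the transformation is not linear.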
|
|
|
|
key: train_jcc
|
|
value: [0.77735849 0.78461538 0.74626866 0.77735849 0.78651685 0.79622642
|
|
0.7518797 0.7238806 0.76717557 0.77862595]
|
|
|
|
mean value: 0.7689906114471405
|
|
|
|
MCC on Blind test: 0.76
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC0...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.0810225 0.0706861 0.15280747 0.10494733 0.06948256 0.08076954
|
|
0.07875824 0.08115697 0.0695591 0.0707655 ]
|
|
|
|
mean value: 0.08599553108215333
|
|
|
|
key: score_time
|
|
value: [0.01104665 0.010849 0.01349568 0.01159859 0.01075864 0.01105809
|
|
0.01134682 0.01259756 0.01232362 0.01064587]
|
|
|
|
mean value: 0.011572051048278808
|
|
|
|
key: test_mcc
|
|
value: [0.85164138 0.96291111 0.96225045 0.96225045 0.9258201 0.96225045
|
|
0.84866842 0.9258201 0.9258201 0.96225045]
|
|
|
|
mean value: 0.9289683006952936
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9245283 0.98113208 0.98076923 0.98076923 0.96153846 0.98076923
|
|
0.92307692 0.96153846 0.96153846 0.98076923]
|
|
|
|
mean value: 0.9636429608127721
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.92592593 0.98181818 0.98039216 0.98113208 0.96296296 0.98039216
|
|
0.92 0.96 0.96296296 0.98113208]
|
|
|
|
mean value: 0.963671849833892
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.89285714 0.96428571 1. 0.96296296 0.92857143 1.
|
|
0.95833333 1. 0.92857143 0.96296296]
|
|
|
|
mean value: 0.9598544973544973
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96153846 1. 0.96153846 1. 1. 0.96153846
|
|
0.88461538 0.92307692 1. 1. ]
|
|
|
|
mean value: 0.9692307692307692
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.92521368 0.98076923 0.98076923 0.98076923 0.96153846 0.98076923
|
|
0.92307692 0.96153846 0.96153846 0.98076923]
|
|
|
|
mean value: 0.9636752136752137
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.86206897 0.96428571 0.96153846 0.96296296 0.92857143 0.96153846
|
|
0.85185185 0.92307692 0.92857143 0.96296296]
|
|
|
|
mean value: 0.9307429160877436
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.86
|
|
|
|
Accuracy on Blind test: 0.93
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.03567243 0.04290986 0.0472014 0.07730103 0.06250048 0.05814052
|
|
0.0723269 0.04689837 0.08232975 0.05255485]
|
|
|
|
mean value: 0.05778355598449707
|
|
|
|
key: score_time
|
|
value: [0.01217079 0.01232409 0.01893544 0.01794481 0.01220798 0.01889324
|
|
0.01251721 0.01215696 0.01258445 0.0193646 ]
|
|
|
|
mean value: 0.014909958839416504
|
|
|
|
key: test_mcc
|
|
value: [0.73646724 0.73997003 0.77849894 0.88527041 0.89056356 0.77849894
|
|
0.57735027 0.73131034 0.74466871 0.71151247]
|
|
|
|
mean value: 0.7574110914556645
|
|
|
|
key: train_mcc
|
|
value: [0.89794254 0.89379475 0.91542421 0.90220118 0.91922384 0.91084449
|
|
0.91084449 0.91922384 0.91935705 0.90641581]
|
|
|
|
mean value: 0.909527219993544
|
|
|
|
key: test_accuracy
|
|
value: [0.86792453 0.86792453 0.88461538 0.94230769 0.94230769 0.88461538
|
|
0.78846154 0.86538462 0.86538462 0.84615385]
|
|
|
|
mean value: 0.8755079825834543
|
|
|
|
key: train_accuracy
|
|
value: [0.94882729 0.9466951 0.95744681 0.95106383 0.95957447 0.95531915
|
|
0.95531915 0.95957447 0.95957447 0.95319149]
|
|
|
|
mean value: 0.9546586217846935
|
|
|
|
key: test_fscore
|
|
value: [0.86792453 0.87719298 0.875 0.94339623 0.94545455 0.875
|
|
0.79245283 0.8627451 0.87719298 0.86206897]
|
|
|
|
mean value: 0.8778428158828944
|
|
|
|
key: train_fscore
|
|
value: [0.94957983 0.94736842 0.958159 0.95137421 0.95983087 0.95578947
|
|
0.95578947 0.95983087 0.96 0.95338983]
|
|
|
|
mean value: 0.9551111967481583
|
|
|
|
key: test_precision
|
|
value: [0.85185185 0.83333333 0.95454545 0.92592593 0.89655172 0.95454545
|
|
0.77777778 0.88 0.80645161 0.78125 ]
|
|
|
|
mean value: 0.8662233135020955
|
|
|
|
key: train_precision
|
|
value: [0.93775934 0.93360996 0.94238683 0.94537815 0.95378151 0.94583333
|
|
0.94583333 0.95378151 0.95 0.94936709]
|
|
|
|
mean value: 0.945773105762638
|
|
|
|
key: test_recall
|
|
value: [0.88461538 0.92592593 0.80769231 0.96153846 1. 0.80769231
|
|
0.80769231 0.84615385 0.96153846 0.96153846]
|
|
|
|
mean value: 0.8964387464387464
|
|
|
|
key: train_recall
|
|
value: [0.96170213 0.96153846 0.97446809 0.95744681 0.96595745 0.96595745
|
|
0.96595745 0.96595745 0.97021277 0.95744681]
|
|
|
|
mean value: 0.9646644844517185
|
|
|
|
key: test_roc_auc
|
|
value: [0.86823362 0.86680912 0.88461538 0.94230769 0.94230769 0.88461538
|
|
0.78846154 0.86538462 0.86538462 0.84615385]
|
|
|
|
mean value: 0.8754273504273504
|
|
|
|
key: train_roc_auc
|
|
value: [0.94879978 0.94672668 0.95744681 0.95106383 0.95957447 0.95531915
|
|
0.95531915 0.95957447 0.95957447 0.95319149]
|
|
|
|
mean value: 0.9546590289143482
|
|
|
|
key: test_jcc
|
|
value: [0.76666667 0.78125 0.77777778 0.89285714 0.89655172 0.77777778
|
|
0.65625 0.75862069 0.78125 0.75757576]
|
|
|
|
mean value: 0.7846577536448226
|
|
|
|
key: train_jcc
|
|
value: [0.904 0.9 0.91967871 0.90725806 0.92276423 0.91532258
|
|
0.91532258 0.92276423 0.92307692 0.91093117]
|
|
|
|
mean value: 0.9141118493116434
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.0105176 0.01079559 0.01000309 0.00996852 0.00984311 0.00979662
|
|
0.01083136 0.00997591 0.00995231 0.01000428]
|
|
|
|
mean value: 0.010168838500976562
|
|
|
|
key: score_time
|
|
value: [0.01233459 0.0136776 0.00898337 0.00869012 0.00872898 0.0086844
|
|
0.00921631 0.00893378 0.00907898 0.00953197]
|
|
|
|
mean value: 0.009786009788513184
|
|
|
|
key: test_mcc
|
|
value: [0.6980057 0.51359557 0.73568294 0.84615385 0.84615385 0.73568294
|
|
0.65824263 0.65433031 0.73568294 0.6172134 ]
|
|
|
|
mean value: 0.704074411237807
|
|
|
|
key: train_mcc
|
|
value: [0.68485508 0.7273009 0.70669657 0.70654292 0.74910575 0.75745367
|
|
0.69818215 0.6730782 0.69863813 0.71087004]
|
|
|
|
mean value: 0.7112723392902409
|
|
|
|
key: test_accuracy
|
|
value: [0.8490566 0.75471698 0.86538462 0.92307692 0.92307692 0.86538462
|
|
0.82692308 0.82692308 0.86538462 0.80769231]
|
|
|
|
mean value: 0.8507619738751815
|
|
|
|
key: train_accuracy
|
|
value: [0.84221748 0.86353945 0.85319149 0.85319149 0.87446809 0.8787234
|
|
0.84893617 0.83617021 0.84893617 0.85531915]
|
|
|
|
mean value: 0.8554693099850292
|
|
|
|
key: test_fscore
|
|
value: [0.84615385 0.74509804 0.85714286 0.92307692 0.92307692 0.85714286
|
|
0.81632653 0.82352941 0.87272727 0.8 ]
|
|
|
|
mean value: 0.8464274660913317
|
|
|
|
key: train_fscore
|
|
value: [0.83982684 0.86147186 0.85097192 0.8516129 0.87311828 0.87846482
|
|
0.84665227 0.83224401 0.8453159 0.85344828]
|
|
|
|
mean value: 0.8533127081638621
|
|
|
|
key: test_precision
|
|
value: [0.84615385 0.79166667 0.91304348 0.92307692 0.92307692 0.91304348
|
|
0.86956522 0.84 0.82758621 0.83333333]
|
|
|
|
mean value: 0.8680546073117288
|
|
|
|
key: train_precision
|
|
value: [0.85462555 0.87280702 0.86403509 0.86086957 0.8826087 0.88034188
|
|
0.85964912 0.85267857 0.86607143 0.86462882]
|
|
|
|
mean value: 0.8658315740903113
|
|
|
|
key: test_recall
|
|
value: [0.84615385 0.7037037 0.80769231 0.92307692 0.92307692 0.80769231
|
|
0.76923077 0.80769231 0.92307692 0.76923077]
|
|
|
|
mean value: 0.8280626780626781
|
|
|
|
key: train_recall
|
|
value: [0.82553191 0.85042735 0.83829787 0.84255319 0.86382979 0.87659574
|
|
0.83404255 0.81276596 0.82553191 0.84255319]
|
|
|
|
mean value: 0.8412129478086925
|
|
|
|
key: test_roc_auc
|
|
value: [0.84900285 0.75569801 0.86538462 0.92307692 0.92307692 0.86538462
|
|
0.82692308 0.82692308 0.86538462 0.80769231]
|
|
|
|
mean value: 0.8508547008547008
|
|
|
|
key: train_roc_auc
|
|
value: [0.84225314 0.86351155 0.85319149 0.85319149 0.87446809 0.8787234
|
|
0.84893617 0.83617021 0.84893617 0.85531915]
|
|
|
|
mean value: 0.8554700854700855
|
|
|
|
key: test_jcc
|
|
value: [0.73333333 0.59375 0.75 0.85714286 0.85714286 0.75
|
|
0.68965517 0.7 0.77419355 0.66666667]
|
|
|
|
mean value: 0.7371884435086604
|
|
|
|
key: train_jcc
|
|
value: [0.7238806 0.75665399 0.7406015 0.74157303 0.77480916 0.78326996
|
|
0.7340824 0.71268657 0.73207547 0.7443609 ]
|
|
|
|
mean value: 0.7443993587281833
|
|
|
|
MCC on Blind test: 0.76
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01451564 0.01753974 0.02073836 0.02360868 0.0248158 0.01841426
|
|
0.02023935 0.01849508 0.02454448 0.02476811]
|
|
|
|
mean value: 0.020767951011657716
|
|
|
|
key: score_time
|
|
value: [0.01012969 0.01133466 0.01186538 0.01195502 0.01203322 0.01204181
|
|
0.01217699 0.0120542 0.01200485 0.01193643]
|
|
|
|
mean value: 0.011753225326538086
|
|
|
|
key: test_mcc
|
|
value: [0.77350427 0.70527596 0.74466871 0.88527041 0.85634884 0.6789146
|
|
0.77849894 0.79056942 0.88527041 0.69436507]
|
|
|
|
mean value: 0.7792686643983939
|
|
|
|
key: train_mcc
|
|
value: [0.87814682 0.88584735 0.88164966 0.92424143 0.93221879 0.84593758
|
|
0.87157206 0.76874221 0.86448019 0.85856681]
|
|
|
|
mean value: 0.8711402909907915
|
|
|
|
key: test_accuracy
|
|
value: [0.88679245 0.8490566 0.86538462 0.94230769 0.92307692 0.82692308
|
|
0.88461538 0.88461538 0.94230769 0.84615385]
|
|
|
|
mean value: 0.8851233671988389
|
|
|
|
key: train_accuracy
|
|
value: [0.93816631 0.9424307 0.94042553 0.96170213 0.96595745 0.9212766
|
|
0.93404255 0.87446809 0.92978723 0.92553191]
|
|
|
|
mean value: 0.9333788504287075
|
|
|
|
key: test_fscore
|
|
value: [0.88461538 0.86206897 0.85106383 0.94117647 0.92857143 0.8
|
|
0.875 0.86956522 0.94117647 0.84 ]
|
|
|
|
mean value: 0.8793237767059063
|
|
|
|
key: train_fscore
|
|
value: [0.93626374 0.94363257 0.93913043 0.96086957 0.96551724 0.91759465
|
|
0.93095768 0.85851319 0.9258427 0.92027335]
|
|
|
|
mean value: 0.9298595118619817
|
|
|
|
key: test_precision
|
|
value: [0.88461538 0.80645161 0.95238095 0.96 0.86666667 0.94736842
|
|
0.95454545 1. 0.96 0.875 ]
|
|
|
|
mean value: 0.9207028492164315
|
|
|
|
key: train_precision
|
|
value: [0.96818182 0.92244898 0.96 0.98222222 0.97816594 0.96261682
|
|
0.97663551 0.98351648 0.98095238 0.99019608]
|
|
|
|
mean value: 0.9704936238209341
|
|
|
|
key: test_recall
|
|
value: [0.88461538 0.92592593 0.76923077 0.92307692 1. 0.69230769
|
|
0.80769231 0.76923077 0.92307692 0.80769231]
|
|
|
|
mean value: 0.8502849002849003
|
|
|
|
key: train_recall
|
|
value: [0.90638298 0.96581197 0.91914894 0.94042553 0.95319149 0.87659574
|
|
0.8893617 0.76170213 0.87659574 0.85957447]
|
|
|
|
mean value: 0.8948790689216222
|
|
|
|
key: test_roc_auc
|
|
value: [0.88675214 0.84757835 0.86538462 0.94230769 0.92307692 0.82692308
|
|
0.88461538 0.88461538 0.94230769 0.84615385]
|
|
|
|
mean value: 0.88497150997151
|
|
|
|
key: train_roc_auc
|
|
value: [0.93823422 0.94248045 0.94042553 0.96170213 0.96595745 0.9212766
|
|
0.93404255 0.87446809 0.92978723 0.92553191]
|
|
|
|
mean value: 0.9333906164757229
|
|
|
|
key: test_jcc
|
|
value: [0.79310345 0.75757576 0.74074074 0.88888889 0.86666667 0.66666667
|
|
0.77777778 0.76923077 0.88888889 0.72413793]
|
|
|
|
mean value: 0.7873677535746502
|
|
|
|
key: train_jcc
|
|
value: [0.88016529 0.89328063 0.8852459 0.92468619 0.93333333 0.84773663
|
|
0.87083333 0.75210084 0.86192469 0.85232068]
|
|
|
|
mean value: 0.8701627509590387
|
|
|
|
MCC on Blind test: 0.71
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02051973 0.02308512 0.02264977 0.02195239 0.01951814 0.01922989
|
|
0.01749086 0.02073479 0.02035427 0.01736856]
|
|
|
|
mean value: 0.020290350914001463
|
|
|
|
key: score_time
|
|
value: [0.01103067 0.01198816 0.01198959 0.0119803 0.0121727 0.01211762
|
|
0.01197362 0.01196384 0.01207113 0.01193643]
|
|
|
|
mean value: 0.011922407150268554
|
|
|
|
key: test_mcc
|
|
value: [0.81688878 0.68308228 0.80829038 0.75878691 0.84615385 0.81312325
|
|
0.6789146 0.84866842 0.84866842 0.74466871]
|
|
|
|
mean value: 0.7847245604589297
|
|
|
|
key: train_mcc
|
|
value: [0.82318874 0.80844901 0.89143025 0.73855496 0.85379422 0.90278998
|
|
0.67317249 0.89094414 0.89427309 0.82331429]
|
|
|
|
mean value: 0.8299911175592818
|
|
|
|
key: test_accuracy
|
|
value: [0.90566038 0.83018868 0.90384615 0.86538462 0.92307692 0.90384615
|
|
0.82692308 0.92307692 0.92307692 0.86538462]
|
|
|
|
mean value: 0.8870464441219158
|
|
|
|
key: train_accuracy
|
|
value: [0.90618337 0.89765458 0.94468085 0.85531915 0.92340426 0.95106383
|
|
0.81489362 0.94468085 0.94680851 0.90851064]
|
|
|
|
mean value: 0.9093199655219344
|
|
|
|
key: test_fscore
|
|
value: [0.90909091 0.85245902 0.90566038 0.88135593 0.92307692 0.89795918
|
|
0.8 0.92 0.92592593 0.85106383]
|
|
|
|
mean value: 0.8866592097509784
|
|
|
|
key: train_fscore
|
|
value: [0.91338583 0.90588235 0.94650206 0.87265918 0.91818182 0.9519833
|
|
0.7751938 0.94298246 0.94780793 0.90249433]
|
|
|
|
mean value: 0.9077073048926279
|
|
|
|
key: test_precision
|
|
value: [0.86206897 0.76470588 0.88888889 0.78787879 0.92307692 0.95652174
|
|
0.94736842 0.95833333 0.89285714 0.95238095]
|
|
|
|
mean value: 0.8934081036469277
|
|
|
|
key: train_precision
|
|
value: [0.84981685 0.83695652 0.91633466 0.77926421 0.98536585 0.93442623
|
|
0.98684211 0.97285068 0.93032787 0.96601942]
|
|
|
|
mean value: 0.9158204400448495
|
|
|
|
key: test_recall
|
|
value: [0.96153846 0.96296296 0.92307692 1. 0.92307692 0.84615385
|
|
0.69230769 0.88461538 0.96153846 0.76923077]
|
|
|
|
mean value: 0.8924501424501424
|
|
|
|
key: train_recall
|
|
value: [0.98723404 0.98717949 0.9787234 0.99148936 0.85957447 0.97021277
|
|
0.63829787 0.91489362 0.96595745 0.84680851]
|
|
|
|
mean value: 0.9140370976541189
|
|
|
|
key: test_roc_auc
|
|
value: [0.90669516 0.82763533 0.90384615 0.86538462 0.92307692 0.90384615
|
|
0.82692308 0.92307692 0.92307692 0.86538462]
|
|
|
|
mean value: 0.8868945868945869
|
|
|
|
key: train_roc_auc
|
|
value: [0.90601018 0.89784506 0.94468085 0.85531915 0.92340426 0.95106383
|
|
0.81489362 0.94468085 0.94680851 0.90851064]
|
|
|
|
mean value: 0.9093216948536097
|
|
|
|
key: test_jcc
|
|
value: [0.83333333 0.74285714 0.82758621 0.78787879 0.85714286 0.81481481
|
|
0.66666667 0.85185185 0.86206897 0.74074074]
|
|
|
|
mean value: 0.7984941367699988
|
|
|
|
key: train_jcc
|
|
value: [0.84057971 0.82795699 0.8984375 0.77408638 0.8487395 0.90836653
|
|
0.63291139 0.89211618 0.90079365 0.82231405]
|
|
|
|
mean value: 0.8346301883150747
|
|
|
|
MCC on Blind test: 0.69
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.18447971 0.18334126 0.18150234 0.1833899 0.18306136 0.18128395
|
|
0.18111897 0.17938328 0.18043876 0.18003964]
|
|
|
|
mean value: 0.18180391788482667
|
|
|
|
key: score_time
|
|
value: [0.015414 0.01569867 0.0158639 0.01544142 0.01620245 0.01581216
|
|
0.01540232 0.01535916 0.01594925 0.01538944]
|
|
|
|
mean value: 0.015653276443481447
|
|
|
|
key: test_mcc
|
|
value: [0.85164138 0.96291111 0.96225045 0.96225045 0.9258201 0.96225045
|
|
0.81312325 0.96225045 0.9258201 0.92307692]
|
|
|
|
mean value: 0.9251394652761287
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9245283 0.98113208 0.98076923 0.98076923 0.96153846 0.98076923
|
|
0.90384615 0.98076923 0.96153846 0.96153846]
|
|
|
|
mean value: 0.9617198838896952
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.92592593 0.98181818 0.98039216 0.98113208 0.96296296 0.98039216
|
|
0.89795918 0.98039216 0.96296296 0.96153846]
|
|
|
|
mean value: 0.9615476224941898
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.89285714 0.96428571 1. 0.96296296 0.92857143 1.
|
|
0.95652174 1. 0.92857143 0.96153846]
|
|
|
|
mean value: 0.9595308877917573
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96153846 1. 0.96153846 1. 1. 0.96153846
|
|
0.84615385 0.96153846 1. 0.96153846]
|
|
|
|
mean value: 0.9653846153846154
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.92521368 0.98076923 0.98076923 0.98076923 0.96153846 0.98076923
|
|
0.90384615 0.98076923 0.96153846 0.96153846]
|
|
|
|
mean value: 0.9617521367521368
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.86206897 0.96428571 0.96153846 0.96296296 0.92857143 0.96153846
|
|
0.81481481 0.96153846 0.92857143 0.92592593]
|
|
|
|
mean value: 0.9271816625264901
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.9
|
|
|
|
Accuracy on Blind test: 0.95
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.06588101 0.05572891 0.06235313 0.06345272 0.08077598 0.07156444
|
|
0.08671784 0.07806897 0.08278823 0.08851504]
|
|
|
|
mean value: 0.07358462810516357
|
|
|
|
key: score_time
|
|
value: [0.02843833 0.02826238 0.02912283 0.02685213 0.03997207 0.02721906
|
|
0.03725529 0.02411723 0.03921318 0.03339529]
|
|
|
|
mean value: 0.03138477802276611
|
|
|
|
key: test_mcc
|
|
value: [0.85164138 0.92450142 0.9258201 0.96225045 0.9258201 0.96225045
|
|
0.84866842 0.88527041 0.89056356 0.96225045]
|
|
|
|
mean value: 0.9139036744202026
|
|
|
|
key: train_mcc
|
|
value: [0.98721586 0.98721563 0.98312115 0.97873227 0.9957537 0.9873145
|
|
0.98724298 0.99152527 0.97478586 0.98297872]
|
|
|
|
mean value: 0.9855885925245994
|
|
|
|
key: test_accuracy
|
|
value: [0.9245283 0.96226415 0.96153846 0.98076923 0.96153846 0.98076923
|
|
0.92307692 0.94230769 0.94230769 0.98076923]
|
|
|
|
mean value: 0.9559869375907112
|
|
|
|
key: train_accuracy
|
|
value: [0.99360341 0.99360341 0.99148936 0.9893617 0.99787234 0.99361702
|
|
0.99361702 0.99574468 0.98723404 0.99148936]
|
|
|
|
mean value: 0.9927632354942613
|
|
|
|
key: test_fscore
|
|
value: [0.92592593 0.96296296 0.96 0.98113208 0.96296296 0.98039216
|
|
0.92 0.94339623 0.94545455 0.98113208]
|
|
|
|
mean value: 0.9563358931527632
|
|
|
|
key: train_fscore
|
|
value: [0.99360341 0.99357602 0.99141631 0.98933902 0.9978678 0.99357602
|
|
0.99363057 0.9957265 0.98739496 0.99148936]
|
|
|
|
mean value: 0.9927619966475919
|
|
|
|
key: test_precision
|
|
value: [0.89285714 0.96296296 1. 0.96296296 0.92857143 1.
|
|
0.95833333 0.92592593 0.89655172 0.96296296]
|
|
|
|
mean value: 0.949112844371465
|
|
|
|
key: train_precision
|
|
value: [0.9957265 0.99570815 1. 0.99145299 1. 1.
|
|
0.99152542 1. 0.97510373 0.99148936]
|
|
|
|
mean value: 0.99410061615567
|
|
|
|
key: test_recall
|
|
value: [0.96153846 0.96296296 0.92307692 1. 1. 0.96153846
|
|
0.88461538 0.96153846 1. 1. ]
|
|
|
|
mean value: 0.9655270655270656
|
|
|
|
key: train_recall
|
|
value: [0.99148936 0.99145299 0.98297872 0.98723404 0.99574468 0.98723404
|
|
0.99574468 0.99148936 1. 0.99148936]
|
|
|
|
mean value: 0.9914857246772141
|
|
|
|
key: test_roc_auc
|
|
value: [0.92521368 0.96225071 0.96153846 0.98076923 0.96153846 0.98076923
|
|
0.92307692 0.94230769 0.94230769 0.98076923]
|
|
|
|
mean value: 0.9560541310541311
|
|
|
|
key: train_roc_auc
|
|
value: [0.99360793 0.99359884 0.99148936 0.9893617 0.99787234 0.99361702
|
|
0.99361702 0.99574468 0.98723404 0.99148936]
|
|
|
|
mean value: 0.9927632296781232
|
|
|
|
key: test_jcc
|
|
value: [0.86206897 0.92857143 0.92307692 0.96296296 0.92857143 0.96153846
|
|
0.85185185 0.89285714 0.89655172 0.96296296]
|
|
|
|
mean value: 0.9171013852048335
|
|
|
|
key: train_jcc
|
|
value: [0.98728814 0.98723404 0.98297872 0.97890295 0.99574468 0.98723404
|
|
0.98734177 0.99148936 0.97510373 0.98312236]
|
|
|
|
mean value: 0.9856439809704479
|
|
|
|
MCC on Blind test: 0.9
|
|
|
|
Accuracy on Blind test: 0.95
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.14703774 0.11411095 0.19120908 0.13331914 0.13890243 0.21100497
|
|
0.18627715 0.17440438 0.17817092 0.1670382 ]
|
|
|
|
mean value: 0.1641474962234497
|
|
|
|
key: score_time
|
|
value: [0.0247128 0.0149591 0.02889299 0.01530313 0.01501536 0.02427626
|
|
0.02445054 0.0240407 0.02411985 0.02399731]
|
|
|
|
mean value: 0.021976804733276366
|
|
|
|
key: test_mcc
|
|
value: [0.73646724 0.50997151 0.50336201 0.77151675 0.6789146 0.65433031
|
|
0.69436507 0.81312325 0.81312325 0.53846154]
|
|
|
|
mean value: 0.6713635523699482
|
|
|
|
key: train_mcc
|
|
value: [0.98728791 0.99150708 0.9873145 0.9873145 0.9873145 0.99152527
|
|
0.9873145 0.9873145 0.9873145 0.9873145 ]
|
|
|
|
mean value: 0.9881521740698564
|
|
|
|
key: test_accuracy
|
|
value: [0.86792453 0.75471698 0.75 0.88461538 0.82692308 0.82692308
|
|
0.84615385 0.90384615 0.90384615 0.76923077]
|
|
|
|
mean value: 0.8334179970972424
|
|
|
|
key: train_accuracy
|
|
value: [0.99360341 0.99573561 0.99361702 0.99361702 0.99361702 0.99574468
|
|
0.99361702 0.99361702 0.99361702 0.99361702]
|
|
|
|
mean value: 0.9940402848977
|
|
|
|
key: test_fscore
|
|
value: [0.86792453 0.75471698 0.73469388 0.88 0.84745763 0.82352941
|
|
0.84 0.89795918 0.90909091 0.76923077]
|
|
|
|
mean value: 0.832460328786348
|
|
|
|
key: train_fscore
|
|
value: [0.99357602 0.99570815 0.99357602 0.99357602 0.99357602 0.9957265
|
|
0.99357602 0.99357602 0.99357602 0.99357602]
|
|
|
|
mean value: 0.9940042787277901
|
|
|
|
key: test_precision
|
|
value: [0.85185185 0.76923077 0.7826087 0.91666667 0.75757576 0.84
|
|
0.875 0.95652174 0.86206897 0.76923077]
|
|
|
|
mean value: 0.8380755214855664
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.88461538 0.74074074 0.69230769 0.84615385 0.96153846 0.80769231
|
|
0.80769231 0.84615385 0.96153846 0.76923077]
|
|
|
|
mean value: 0.8317663817663817
|
|
|
|
key: train_recall
|
|
value: [0.98723404 0.99145299 0.98723404 0.98723404 0.98723404 0.99148936
|
|
0.98723404 0.98723404 0.98723404 0.98723404]
|
|
|
|
mean value: 0.9880814693580651
|
|
|
|
key: test_roc_auc
|
|
value: [0.86823362 0.75498575 0.75 0.88461538 0.82692308 0.82692308
|
|
0.84615385 0.90384615 0.90384615 0.76923077]
|
|
|
|
mean value: 0.8334757834757835
|
|
|
|
key: train_roc_auc
|
|
value: [0.99361702 0.9957265 0.99361702 0.99361702 0.99361702 0.99574468
|
|
0.99361702 0.99361702 0.99361702 0.99361702]
|
|
|
|
mean value: 0.9940407346790325
|
|
|
|
key: test_jcc
|
|
value: [0.76666667 0.60606061 0.58064516 0.78571429 0.73529412 0.7
|
|
0.72413793 0.81481481 0.83333333 0.625 ]
|
|
|
|
mean value: 0.7171666916561571
|
|
|
|
key: train_jcc
|
|
value: [0.98723404 0.99145299 0.98723404 0.98723404 0.98723404 0.99148936
|
|
0.98723404 0.98723404 0.98723404 0.98723404]
|
|
|
|
mean value: 0.9880814693580651
|
|
|
|
MCC on Blind test: 0.62
|
|
|
|
Accuracy on Blind test: 0.81
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.73474646 0.72688985 0.72868824 0.73179102 0.73014355 0.73290229
|
|
0.73310161 0.72938943 0.7374897 0.73583198]
|
|
|
|
mean value: 0.7320974111557007
|
|
|
|
key: score_time
|
|
value: [0.00944114 0.00925851 0.00926185 0.00938869 0.00955248 0.00936866
|
|
0.00937533 0.00944853 0.0102365 0.00951147]
|
|
|
|
mean value: 0.009484314918518066
|
|
|
|
key: test_mcc
|
|
value: [0.85164138 0.92704716 0.96225045 0.96225045 0.9258201 0.96225045
|
|
0.88527041 0.92307692 0.89056356 0.96225045]
|
|
|
|
mean value: 0.9252421331587133
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9245283 0.96226415 0.98076923 0.98076923 0.96153846 0.98076923
|
|
0.94230769 0.96153846 0.94230769 0.98076923]
|
|
|
|
mean value: 0.9617561683599419
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.92592593 0.96428571 0.98039216 0.98113208 0.96296296 0.98039216
|
|
0.94117647 0.96153846 0.94545455 0.98113208]
|
|
|
|
mean value: 0.9624392545424731
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.89285714 0.93103448 1. 0.96296296 0.92857143 1.
|
|
0.96 0.96153846 0.89655172 0.96296296]
|
|
|
|
mean value: 0.9496479165789511
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96153846 1. 0.96153846 1. 1. 0.96153846
|
|
0.92307692 0.96153846 1. 1. ]
|
|
|
|
mean value: 0.9769230769230769
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.92521368 0.96153846 0.98076923 0.98076923 0.96153846 0.98076923
|
|
0.94230769 0.96153846 0.94230769 0.98076923]
|
|
|
|
mean value: 0.9617521367521368
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.86206897 0.93103448 0.96153846 0.96296296 0.92857143 0.96153846
|
|
0.88888889 0.92592593 0.89655172 0.96296296]
|
|
|
|
mean value: 0.9282044264802886
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.9
|
|
|
|
Accuracy on Blind test: 0.95
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.03148365 0.05373335 0.05133128 0.03273869 0.03149581 0.03127074
|
|
0.03117824 0.03164434 0.03118968 0.03135705]
|
|
|
|
mean value: 0.03574228286743164
|
|
|
|
key: score_time
|
|
value: [0.01274085 0.01517105 0.01338124 0.01331782 0.01315212 0.01508641
|
|
0.01483965 0.01540351 0.014956 0.01507926]
|
|
|
|
mean value: 0.01431279182434082
|
|
|
|
key: test_mcc
|
|
value: [0.48187381 0.35897436 0.6172134 0.73131034 0.4259217 0.39528471
|
|
0.35273781 0.50951017 0.5990423 0.38575837]
|
|
|
|
mean value: 0.48576269695469176
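Note: each `mean value` line is the arithmetic mean of the ten per-fold scores on the `value` line above it. A minimal sketch that reproduces the figure above, assuming the fold scores are copied from this log as printed (the variable names are hypothetical):

```python
from statistics import fmean

# Per-fold test_mcc for QDA, copied from the log (printed to 8 decimals).
qda_test_mcc = [0.48187381, 0.35897436, 0.6172134, 0.73131034, 0.4259217,
                0.39528471, 0.35273781, 0.50951017, 0.5990423, 0.38575837]

# Arithmetic mean over the 10 cross-validation folds.
mean_mcc = fmean(qda_test_mcc)
print(f"mean test_mcc: {mean_mcc:.6f}")
```

Because the printed fold values are rounded, the recomputed mean agrees with the logged one only to about eight decimal places.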
|
|
|
|
key: train_mcc
|
|
value: [0.86418083 0.95749365 0.97029183 0.95361464 0.92156343 0.80568158
|
|
0.86066297 0.97873227 0.79494933 0.926125 ]
|
|
|
|
mean value: 0.9033295534903398
|
|
|
|
key: test_accuracy
|
|
value: [0.73584906 0.67924528 0.80769231 0.86538462 0.71153846 0.69230769
|
|
0.67307692 0.75 0.78846154 0.69230769]
|
|
|
|
mean value: 0.7395863570391872
|
|
|
|
key: train_accuracy
|
|
value: [0.92750533 0.97867804 0.98510638 0.97659574 0.95957447 0.89361702
|
|
0.92553191 0.9893617 0.88723404 0.96170213]
|
|
|
|
mean value: 0.9484906773125255
|
|
|
|
key: test_fscore
|
|
value: [0.69565217 0.67924528 0.8 0.86792453 0.69387755 0.65217391
|
|
0.63829787 0.77192982 0.75555556 0.68 ]
|
|
|
|
mean value: 0.7234656701755069
|
|
|
|
key: train_fscore
|
|
value: [0.92201835 0.97844828 0.98501071 0.9769392 0.9580574 0.88095238
|
|
0.91954023 0.98938429 0.87290168 0.96017699]
|
|
|
|
mean value: 0.9443429499014124
|
|
|
|
key: test_precision
|
|
value: [0.8 0.69230769 0.83333333 0.85185185 0.73913043 0.75
|
|
0.71428571 0.70967742 0.89473684 0.70833333]
|
|
|
|
mean value: 0.7693656621354635
|
|
|
|
key: train_precision
|
|
value: [1. 0.98695652 0.99137931 0.96280992 0.99541284 1.
|
|
1. 0.98728814 1. 1. ]
|
|
|
|
mean value: 0.9923846729069248
|
|
|
|
key: test_recall
|
|
value: [0.61538462 0.66666667 0.76923077 0.88461538 0.65384615 0.57692308
|
|
0.57692308 0.84615385 0.65384615 0.65384615]
|
|
|
|
mean value: 0.6897435897435897
|
|
|
|
key: train_recall
|
|
value: [0.85531915 0.97008547 0.9787234 0.99148936 0.92340426 0.78723404
|
|
0.85106383 0.99148936 0.77446809 0.92340426]
|
|
|
|
mean value: 0.9046681214766321
|
|
|
|
key: test_roc_auc
|
|
value: [0.73361823 0.67948718 0.80769231 0.86538462 0.71153846 0.69230769
|
|
0.67307692 0.75 0.78846154 0.69230769]
|
|
|
|
mean value: 0.7393874643874644
|
|
|
|
key: train_roc_auc
|
|
value: [0.92765957 0.97865976 0.98510638 0.97659574 0.95957447 0.89361702
|
|
0.92553191 0.9893617 0.88723404 0.96170213]
|
|
|
|
mean value: 0.9485042735042735
|
|
|
|
key: test_jcc
|
|
value: [0.53333333 0.51428571 0.66666667 0.76666667 0.53125 0.48387097
|
|
0.46875 0.62857143 0.60714286 0.51515152]
|
|
|
|
mean value: 0.5715689149560117
|
|
|
|
key: train_jcc
|
|
value: [0.85531915 0.95780591 0.97046414 0.95491803 0.91949153 0.78723404
|
|
0.85106383 0.9789916 0.77446809 0.92340426]
|
|
|
|
mean value: 0.897316055874549
|
|
|
|
MCC on Blind test: 0.57
|
|
|
|
Accuracy on Blind test: 0.79
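Note: the `MCC` figures in this log are Matthews correlation coefficients. For reference, a minimal pure-Python sketch of the standard formula from binary confusion-matrix counts. The function name `mcc` and the counts in the usage line are hypothetical and are not taken from this run:

```python
from math import sqrt

def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0 when any marginal total is empty.
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical counts, for illustration only.
print(mcc(tp=20, tn=22, fp=4, fn=6))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why the two blind-test figures can differ noticeably for the same model.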
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02873778 0.03986144 0.03904796 0.03797555 0.03872371 0.03989911
|
|
0.04019141 0.05784249 0.0290029 0.0340817 ]
|
|
|
|
mean value: 0.03853640556335449
|
|
|
|
key: score_time
|
|
value: [0.02153182 0.02466536 0.01894593 0.01888561 0.02026582 0.01964355
|
|
0.01979423 0.03150344 0.03011656 0.02050829]
|
|
|
|
mean value: 0.0225860595703125
|
|
|
|
key: test_mcc
|
|
value: [0.85164138 0.70527596 0.77849894 0.92307692 0.89056356 0.74466871
|
|
0.73131034 0.88527041 0.81312325 0.77849894]
|
|
|
|
mean value: 0.8101928417182709
|
|
|
|
key: train_mcc
|
|
value: [0.86403192 0.87219919 0.86461295 0.86411148 0.86828166 0.85995606
|
|
0.88136192 0.87262489 0.86847048 0.86395495]
|
|
|
|
mean value: 0.8679605508568666
|
|
|
|
key: test_accuracy
|
|
value: [0.9245283 0.8490566 0.88461538 0.96153846 0.94230769 0.86538462
|
|
0.86538462 0.94230769 0.90384615 0.88461538]
|
|
|
|
mean value: 0.9023584905660378
|
|
|
|
key: train_accuracy
|
|
value: [0.93176972 0.93603412 0.93191489 0.93191489 0.93404255 0.92978723
|
|
0.94042553 0.93617021 0.93404255 0.93191489]
|
|
|
|
mean value: 0.9338016603910538
|
|
|
|
key: test_fscore
|
|
value: [0.92592593 0.86206897 0.875 0.96153846 0.94545455 0.85106383
|
|
0.86792453 0.94117647 0.90909091 0.89285714]
|
|
|
|
mean value: 0.9032100779061583
|
|
|
|
key: train_fscore
|
|
value: [0.93305439 0.93644068 0.93333333 0.93277311 0.93473684 0.93081761
|
|
0.94142259 0.93697479 0.93501048 0.93248945]
|
|
|
|
mean value: 0.9347053283732041
|
|
|
|
key: test_precision
|
|
value: [0.89285714 0.80645161 0.95454545 0.96153846 0.89655172 0.95238095
|
|
0.85185185 0.96 0.86206897 0.83333333]
|
|
|
|
mean value: 0.8971579499065595
|
|
|
|
key: train_precision
|
|
value: [0.91769547 0.92857143 0.91428571 0.92116183 0.925 0.91735537
|
|
0.92592593 0.9253112 0.9214876 0.92468619]
|
|
|
|
mean value: 0.9221480738754971
|
|
|
|
key: test_recall
|
|
value: [0.96153846 0.92592593 0.80769231 0.96153846 1. 0.76923077
|
|
0.88461538 0.92307692 0.96153846 0.96153846]
|
|
|
|
mean value: 0.9156695156695157
|
|
|
|
key: train_recall
|
|
value: [0.94893617 0.94444444 0.95319149 0.94468085 0.94468085 0.94468085
|
|
0.95744681 0.94893617 0.94893617 0.94042553]
|
|
|
|
mean value: 0.9476359338061466
|
|
|
|
key: test_roc_auc
|
|
value: [0.92521368 0.84757835 0.88461538 0.96153846 0.94230769 0.86538462
|
|
0.86538462 0.94230769 0.90384615 0.88461538]
|
|
|
|
mean value: 0.9022792022792023
|
|
|
|
key: train_roc_auc
|
|
value: [0.93173304 0.93605201 0.93191489 0.93191489 0.93404255 0.92978723
|
|
0.94042553 0.93617021 0.93404255 0.93191489]
|
|
|
|
mean value: 0.9337997817785052
|
|
|
|
key: test_jcc
|
|
value: [0.86206897 0.75757576 0.77777778 0.92592593 0.89655172 0.74074074
|
|
0.76666667 0.88888889 0.83333333 0.80645161]
|
|
|
|
mean value: 0.8255981393467489
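Note: the `test_jcc` and `test_fscore` rows are linked. Assuming `fscore` is the F1 score and `jcc` is the Jaccard index of the positive class (the printed numbers are consistent with this), then J = F1 / (2 - F1). A short sketch checking the identity against the first two Ridge Classifier folds above; the helper name `jaccard_from_f1` is hypothetical:

```python
def jaccard_from_f1(f1: float) -> float:
    """Jaccard index implied by an F1 score for the same binary confusion matrix."""
    # F1 = 2TP / (2TP + FP + FN) and J = TP / (TP + FP + FN), so J = F1 / (2 - F1).
    return f1 / (2.0 - f1)

# (test_fscore, test_jcc) for the first two Ridge Classifier folds, as printed.
folds = [(0.92592593, 0.86206897), (0.86206897, 0.75757576)]
for f1, jcc in folds:
    implied = jaccard_from_f1(f1)
    print(f"F1={f1:.8f} -> implied Jaccard={implied:.8f} (logged {jcc:.8f})")
```

The two metrics therefore rank folds identically, and the Jaccard value is never higher than F1.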
|
|
|
|
key: train_jcc
|
|
value: [0.8745098 0.88047809 0.875 0.87401575 0.87747036 0.87058824
|
|
0.88932806 0.88142292 0.87795276 0.87351779]
|
|
|
|
mean value: 0.877428376123688
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.14161825 0.28556347 0.29658747 0.24531269 0.1590662 0.27728081
|
|
0.2977221 0.21515226 0.22903538 0.18153119]
|
|
|
|
mean value: 0.23288698196411134
|
|
|
|
key: score_time
|
|
value: [0.0169847 0.02052855 0.02147937 0.02026939 0.01914334 0.02140737
|
|
0.02591467 0.01721787 0.02034116 0.01228237]
|
|
|
|
mean value: 0.019556879997253418
|
|
|
|
key: test_mcc
|
|
value: [0.85164138 0.62867836 0.74466871 0.92307692 0.89056356 0.74466871
|
|
0.73131034 0.88527041 0.81312325 0.77849894]
|
|
|
|
mean value: 0.7991500590969782
|
|
|
|
key: train_mcc
|
|
value: [0.86403192 0.80817284 0.80058734 0.86411148 0.86828166 0.85995606
|
|
0.88136192 0.87262489 0.86847048 0.86395495]
|
|
|
|
mean value: 0.855155354590099
|
|
|
|
key: test_accuracy
|
|
value: [0.9245283 0.81132075 0.86538462 0.96153846 0.94230769 0.86538462
|
|
0.86538462 0.94230769 0.90384615 0.88461538]
|
|
|
|
mean value: 0.8966618287373005
|
|
|
|
key: train_accuracy
|
|
value: [0.93176972 0.90405117 0.9 0.93191489 0.93404255 0.92978723
|
|
0.94042553 0.93617021 0.93404255 0.93191489]
|
|
|
|
mean value: 0.9274118767862813
|
|
|
|
key: test_fscore
|
|
value: [0.92592593 0.82758621 0.85106383 0.96153846 0.94545455 0.85106383
|
|
0.86792453 0.94117647 0.90909091 0.89285714]
|
|
|
|
mean value: 0.8973681850228127
|
|
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_sl.py:128: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
smnc_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_sl.py:131: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
smnc_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
key: train_fscore
|
|
value: [0.93305439 0.9044586 0.90187891 0.93277311 0.93473684 0.93081761
|
|
0.94142259 0.93697479 0.93501048 0.93248945]
|
|
|
|
mean value: 0.9283616785563731
|
|
|
|
key: test_precision
|
|
value: [0.89285714 0.77419355 0.95238095 0.96153846 0.89655172 0.95238095
|
|
0.85185185 0.96 0.86206897 0.83333333]
|
|
|
|
mean value: 0.8937156932384963
|
|
|
|
key: train_precision
|
|
value: [0.91769547 0.89873418 0.8852459 0.92116183 0.925 0.91735537
|
|
0.92592593 0.9253112 0.9214876 0.92468619]
|
|
|
|
mean value: 0.9162603674752363
|
|
|
|
key: test_recall
|
|
value: [0.96153846 0.88888889 0.76923077 0.96153846 1. 0.76923077
|
|
0.88461538 0.92307692 0.96153846 0.96153846]
|
|
|
|
mean value: 0.9081196581196581
|
|
|
|
key: train_recall
|
|
value: [0.94893617 0.91025641 0.91914894 0.94468085 0.94468085 0.94468085
|
|
0.95744681 0.94893617 0.94893617 0.94042553]
|
|
|
|
mean value: 0.9408128750681942
|
|
|
|
key: test_roc_auc
|
|
value: [0.92521368 0.80982906 0.86538462 0.96153846 0.94230769 0.86538462
|
|
0.86538462 0.94230769 0.90384615 0.88461538]
|
|
|
|
mean value: 0.8965811965811966
|
|
|
|
key: train_roc_auc
|
|
value: [0.93173304 0.90406438 0.9 0.93191489 0.93404255 0.92978723
|
|
0.94042553 0.93617021 0.93404255 0.93191489]
|
|
|
|
mean value: 0.9274095290052737
|
|
|
|
key: test_jcc
|
|
value: [0.86206897 0.70588235 0.74074074 0.92592593 0.89655172 0.74074074
|
|
0.76666667 0.88888889 0.83333333 0.80645161]
|
|
|
|
mean value: 0.8167250951795871
|
|
|
|
key: train_jcc
|
|
value: [0.8745098 0.8255814 0.82129278 0.87401575 0.87747036 0.87058824
|
|
0.88932806 0.88142292 0.87795276 0.87351779]
|
|
|
|
mean value: 0.8665679844601714
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03287292 0.03884459 0.03730083 0.0362246 0.03666592 0.03770685
|
|
0.03795171 0.03666735 0.04778528 0.03587842]
|
|
|
|
mean value: 0.03778984546661377
|
|
|
|
key: score_time
|
|
value: [0.01219273 0.01410508 0.03120208 0.01247478 0.01239181 0.01472044
|
|
0.01600051 0.01480579 0.01244426 0.01247621]
|
|
|
|
mean value: 0.015281367301940917
|
|
|
|
key: test_mcc
|
|
value: [0.85164138 0.73997003 0.77849894 0.96225045 0.89056356 0.74466871
|
|
0.80829038 0.88527041 0.84866842 0.79056942]
|
|
|
|
mean value: 0.8300391695672018
|
|
|
|
key: train_mcc
|
|
value: [0.8593409 0.87640715 0.86411148 0.85113319 0.86815585 0.86395495
|
|
0.87246682 0.8597691 0.8597691 0.86386107]
|
|
|
|
mean value: 0.8638969612781384
|
|
|
|
key: test_accuracy
|
|
value: [0.9245283 0.86792453 0.88461538 0.98076923 0.94230769 0.86538462
|
|
0.90384615 0.94230769 0.92307692 0.88461538]
|
|
|
|
mean value: 0.9119375907111756
|
|
|
|
key: train_accuracy
|
|
value: [0.92963753 0.93816631 0.93191489 0.92553191 0.93404255 0.93191489
|
|
0.93617021 0.92978723 0.92978723 0.93191489]
|
|
|
|
mean value: 0.9318867667740326
|
|
|
|
key: test_fscore
|
|
value: [0.92592593 0.87719298 0.875 0.98113208 0.94545455 0.85106383
|
|
0.90566038 0.94117647 0.92592593 0.89655172]
|
|
|
|
mean value: 0.9125083857106127
|
|
|
|
key: train_fscore
|
|
value: [0.93023256 0.93842887 0.93277311 0.92600423 0.93446089 0.93248945
|
|
0.93670886 0.93052632 0.93052632 0.93162393]
|
|
|
|
mean value: 0.9323774533836076
|
|
|
|
key: test_precision
|
|
value: [0.89285714 0.83333333 0.95454545 0.96296296 0.89655172 0.95238095
|
|
0.88888889 0.96 0.89285714 0.8125 ]
|
|
|
|
mean value: 0.9046877601963809
|
|
|
|
key: train_precision
|
|
value: [0.92436975 0.93248945 0.92116183 0.92016807 0.92857143 0.92468619
|
|
0.92887029 0.92083333 0.92083333 0.93562232]
|
|
|
|
mean value: 0.9257605990519295
|
|
|
|
key: test_recall
|
|
value: [0.96153846 0.92592593 0.80769231 1. 1. 0.76923077
|
|
0.92307692 0.92307692 0.96153846 1. ]
|
|
|
|
mean value: 0.9272079772079772
|
|
|
|
key: train_recall
|
|
value: [0.93617021 0.94444444 0.94468085 0.93191489 0.94042553 0.94042553
|
|
0.94468085 0.94042553 0.94042553 0.92765957]
|
|
|
|
mean value: 0.9391252955082743
|
|
|
|
key: test_roc_auc
|
|
value: [0.92521368 0.86680912 0.88461538 0.98076923 0.94230769 0.86538462
|
|
0.90384615 0.94230769 0.92307692 0.88461538]
|
|
|
|
mean value: 0.9118945868945869
|
|
|
|
key: train_roc_auc
|
|
value: [0.92962357 0.93817967 0.93191489 0.92553191 0.93404255 0.93191489
|
|
0.93617021 0.92978723 0.92978723 0.93191489]
|
|
|
|
mean value: 0.9318867066739408
|
|
|
|
key: test_jcc
|
|
value: [0.86206897 0.78125 0.77777778 0.96296296 0.89655172 0.74074074
|
|
0.82758621 0.88888889 0.86206897 0.8125 ]
|
|
|
|
mean value: 0.8412396232439335
|
|
|
|
key: train_jcc
|
|
value: [0.86956522 0.884 0.87401575 0.86220472 0.87698413 0.87351779
|
|
0.88095238 0.87007874 0.87007874 0.872 ]
|
|
|
|
mean value: 0.8733397464644983
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=169)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegressionCV(random_state=42))])

key: fit_time
value: [0.99820614 0.89078689 1.11161804 0.91593313 0.99522948 1.27120042
1.16889167 0.97342539 0.93785477 0.97674584]

mean value: 1.0239891767501832

key: score_time
value: [0.01472306 0.01496816 0.01541257 0.01507759 0.01992369 0.01551318
0.01557875 0.0176332 0.01491976 0.01504946]

mean value: 0.0158799409866333

key: test_mcc
value: [0.81196581 0.8116984 0.77849894 0.96225045 0.89056356 0.77849894
0.84615385 0.84866842 0.84866842 0.82305489]

mean value: 0.8400021693785612

key: train_mcc
value: [0.91045482 0.91484796 0.90667855 0.89790486 0.91064654 0.90233192
0.90252815 0.90233192 0.88965172 0.90220118]

mean value: 0.9039577621659811

key: test_accuracy
value: [0.90566038 0.90566038 0.88461538 0.98076923 0.94230769 0.88461538
0.92307692 0.92307692 0.92307692 0.90384615]

mean value: 0.9176705370101597

key: train_accuracy
value: [0.95522388 0.95735608 0.95319149 0.94893617 0.95531915 0.95106383
0.95106383 0.95106383 0.94468085 0.95106383]

mean value: 0.9518962936079481

key: test_fscore
value: [0.90566038 0.90909091 0.875 0.98113208 0.94545455 0.875
0.92307692 0.92 0.92592593 0.9122807 ]

mean value: 0.9172621458132878

key: train_fscore
value: [0.95541401 0.95762712 0.95378151 0.94915254 0.95541401 0.95157895
0.95178197 0.95157895 0.94537815 0.95137421]

mean value: 0.95230814229351

key: test_precision
value: [0.88888889 0.89285714 0.95454545 0.96296296 0.89655172 0.95454545
0.92307692 0.95833333 0.89285714 0.83870968]

mean value: 0.9163328704624589

key: train_precision
value: [0.95338983 0.94957983 0.94190871 0.94514768 0.95338983 0.94166667
0.93801653 0.94166667 0.93360996 0.94537815]

mean value: 0.9443753857993245

key: test_recall
value: [0.92307692 0.92592593 0.80769231 1. 1. 0.80769231
0.92307692 0.88461538 0.96153846 1. ]

mean value: 0.9233618233618234

key: train_recall
value: [0.95744681 0.96581197 0.96595745 0.95319149 0.95744681 0.96170213
0.96595745 0.96170213 0.95744681 0.95744681]

mean value: 0.9604109838152391

key: test_roc_auc
value: [0.90598291 0.90527066 0.88461538 0.98076923 0.94230769 0.88461538
0.92307692 0.92307692 0.92307692 0.90384615]

mean value: 0.9176638176638177

key: train_roc_auc
value: [0.95521913 0.95737407 0.95319149 0.94893617 0.95531915 0.95106383
0.95106383 0.95106383 0.94468085 0.95106383]

mean value: 0.9518976177486816

key: test_jcc
value: [0.82758621 0.83333333 0.77777778 0.96296296 0.89655172 0.77777778
0.85714286 0.85185185 0.86206897 0.83870968]

mean value: 0.848576313481764

key: train_jcc
value: [0.91463415 0.91869919 0.91164659 0.90322581 0.91463415 0.90763052
0.908 0.90763052 0.89641434 0.90725806]

mean value: 0.9089773323794109

MCC on Blind test: 0.77

Accuracy on Blind test: 0.88

Model_name: Gaussian NB
Model func: GaussianNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=169)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GaussianNB())])

key: fit_time
value: [0.01577473 0.01172471 0.01135039 0.01113343 0.01135993 0.01071382
0.01026487 0.01151204 0.01141429 0.01121664]

mean value: 0.011646485328674317

key: score_time
value: [0.01278234 0.01041102 0.00982523 0.00985003 0.01006365 0.00986242
0.00955105 0.01019549 0.00995421 0.01008344]

mean value: 0.010257887840270995

key: test_mcc
value: [0.66048569 0.40912228 0.74466871 0.77151675 0.80829038 0.62279916
0.57735027 0.54006172 0.54006172 0.73568294]

mean value: 0.6410039620540624

key: train_mcc
value: [0.66639366 0.7005426 0.701239 0.68145013 0.69523029 0.7097907
0.66017245 0.67359644 0.68812845 0.69162595]

mean value: 0.6868169671603843

key: test_accuracy
value: [0.83018868 0.69811321 0.86538462 0.88461538 0.90384615 0.80769231
0.78846154 0.76923077 0.76923077 0.86538462]

mean value: 0.8182148040638607

key: train_accuracy
value: [0.8315565 0.84861407 0.84893617 0.83829787 0.84680851 0.85319149
0.82978723 0.83404255 0.84255319 0.84468085]

mean value: 0.8418468448033389

key: test_fscore
value: [0.82352941 0.66666667 0.85106383 0.88 0.90196078 0.79166667
0.78431373 0.77777778 0.76 0.85714286]

mean value: 0.809412171960983

key: train_fscore
value: [0.82326622 0.84044944 0.84116331 0.8280543 0.84140969 0.84563758
0.83333333 0.82272727 0.83482143 0.83813747]

mean value: 0.8349000049484545

key: test_precision
value: [0.84 0.76190476 0.95238095 0.91666667 0.92 0.86363636
0.8 0.75 0.79166667 0.91304348]

mean value: 0.850929888951628

key: train_precision
value: [0.86792453 0.88625592 0.88679245 0.88405797 0.87214612 0.89150943
0.81632653 0.88292683 0.87793427 0.875 ]

mean value: 0.8740874061181917

key: test_recall
value: [0.80769231 0.59259259 0.76923077 0.84615385 0.88461538 0.73076923
0.76923077 0.80769231 0.73076923 0.80769231]

mean value: 0.7746438746438746

key: train_recall
value: [0.78297872 0.7991453 0.8 0.7787234 0.81276596 0.80425532
0.85106383 0.77021277 0.79574468 0.80425532]

mean value: 0.7999145299145299

key: test_roc_auc
value: [0.82977208 0.70014245 0.86538462 0.88461538 0.90384615 0.80769231
0.78846154 0.76923077 0.76923077 0.86538462]

mean value: 0.8183760683760685

key: train_roc_auc
value: [0.8316603 0.84850882 0.84893617 0.83829787 0.84680851 0.85319149
0.82978723 0.83404255 0.84255319 0.84468085]

mean value: 0.8418466993998909

key: test_jcc
value: [0.7 0.5 0.74074074 0.78571429 0.82142857 0.65517241
0.64516129 0.63636364 0.61290323 0.75 ]

mean value: 0.684748416416937

key: train_jcc
value: [0.69961977 0.7248062 0.72586873 0.70656371 0.72623574 0.73255814
0.71428571 0.6988417 0.7164751 0.72137405]

mean value: 0.7166628841540069

MCC on Blind test: 0.63

Accuracy on Blind test: 0.81

Model_name: Naive Bayes
Model func: BernoulliNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=169)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', BernoulliNB())])

key: fit_time
value: [0.01205468 0.0115869 0.01135087 0.01066089 0.01150608 0.01090503
0.01032591 0.01174426 0.01039815 0.01049948]

mean value: 0.011103224754333497

key: score_time
value: [0.01041389 0.00969172 0.0097506 0.00891495 0.00965595 0.00975871
0.00914407 0.00987816 0.00919986 0.00920725]

mean value: 0.009561514854431153

key: test_mcc
value: [0.73646724 0.47360961 0.65433031 0.88527041 0.69436507 0.69230769
0.65824263 0.77151675 0.69436507 0.65433031]

mean value: 0.6914805091639882

key: train_mcc
value: [0.73140924 0.75708961 0.72356805 0.74043224 0.76629748 0.77032436
0.70276422 0.67337154 0.73659716 0.75330062]

mean value: 0.7355154534366981

key: test_accuracy
value: [0.86792453 0.73584906 0.82692308 0.94230769 0.84615385 0.84615385
0.82692308 0.88461538 0.84615385 0.82692308]

mean value: 0.8449927431059506

key: train_accuracy
value: [0.86567164 0.87846482 0.86170213 0.87021277 0.88297872 0.88510638
0.85106383 0.83617021 0.86808511 0.87659574]

mean value: 0.8676051354171392

key: test_fscore
value: [0.86792453 0.73076923 0.82352941 0.94117647 0.85185185 0.84615385
0.81632653 0.88 0.85185185 0.82352941]

mean value: 0.843311313365856

key: train_fscore
value: [0.86509636 0.87688985 0.86021505 0.87048832 0.88469602 0.88607595
0.84782609 0.83150985 0.86580087 0.87553648]

mean value: 0.8664134831445992

key: test_precision
value: [0.85185185 0.76 0.84 0.96 0.82142857 0.84615385
0.86956522 0.91666667 0.82142857 0.84 ]

mean value: 0.8527094724920812

key: train_precision
value: [0.87068966 0.88646288 0.86956522 0.86864407 0.87190083 0.87866109
0.86666667 0.85585586 0.88105727 0.88311688]

mean value: 0.873262041113066

key: test_recall
value: [0.88461538 0.7037037 0.80769231 0.92307692 0.88461538 0.84615385
0.76923077 0.84615385 0.88461538 0.80769231]

mean value: 0.8357549857549857

key: train_recall
value: [0.85957447 0.86752137 0.85106383 0.87234043 0.89787234 0.89361702
0.82978723 0.80851064 0.85106383 0.86808511]

mean value: 0.8599436261138389

key: test_roc_auc
value: [0.86823362 0.73646724 0.82692308 0.94230769 0.84615385 0.84615385
0.82692308 0.88461538 0.84615385 0.82692308]

mean value: 0.8450854700854701

key: train_roc_auc
value: [0.86568467 0.87844153 0.86170213 0.87021277 0.88297872 0.88510638
0.85106383 0.83617021 0.86808511 0.87659574]

mean value: 0.8676041098381524

key: test_jcc
value: [0.76666667 0.57575758 0.7 0.88888889 0.74193548 0.73333333
0.68965517 0.78571429 0.74193548 0.7 ]

mean value: 0.7323886890516479

key: train_jcc
value: [0.76226415 0.78076923 0.75471698 0.77067669 0.79323308 0.79545455
0.73584906 0.71161049 0.76335878 0.77862595]

mean value: 0.7646558959054925

MCC on Blind test: 0.76

Accuracy on Blind test: 0.88

Model_name: K-Nearest Neighbors
Model func: KNeighborsClassifier()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=169)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', KNeighborsClassifier())])

key: fit_time
value: [0.00998068 0.01181459 0.01135254 0.01113796 0.01147676 0.01108146
0.01116586 0.01172829 0.01143646 0.01152492]

mean value: 0.011269950866699218

key: score_time
value: [0.0131557 0.01342392 0.01350808 0.01317334 0.01755738 0.01329255
0.0138526 0.01361537 0.01634336 0.01371002]

mean value: 0.01416323184967041

key: test_mcc
value: [0.54793065 0.3223969 0.34641016 0.56591646 0.62279916 0.4233902
0.43929769 0.73568294 0.65824263 0.27104108]

mean value: 0.49331078658150723

key: train_mcc
value: [0.7231531 0.71458471 0.73192152 0.69364214 0.71917498 0.71066404
0.71495188 0.68936794 0.72008837 0.73208062]

mean value: 0.7149629304142088

key: test_accuracy
value: [0.77358491 0.66037736 0.67307692 0.76923077 0.80769231 0.71153846
0.71153846 0.86538462 0.82692308 0.63461538]

mean value: 0.7433962264150944

key: train_accuracy
value: [0.86140725 0.85714286 0.86595745 0.84680851 0.85957447 0.85531915
0.85744681 0.84468085 0.85957447 0.86595745]

mean value: 0.8573869255545978

key: test_fscore
value: [0.76 0.65384615 0.67924528 0.72727273 0.82142857 0.71698113
0.66666667 0.87272727 0.81632653 0.6122449 ]

mean value: 0.732673923560716

key: train_fscore
value: [0.85961123 0.85466377 0.86567164 0.84615385 0.85897436 0.85470085
0.85653105 0.84434968 0.8558952 0.86451613]

mean value: 0.8561067762085006

key: test_precision
value: [0.79166667 0.68 0.66666667 0.88888889 0.76666667 0.7037037
0.78947368 0.82758621 0.86956522 0.65217391]

mean value: 0.7636391614134453

key: train_precision
value: [0.87280702 0.86784141 0.86752137 0.84978541 0.86266094 0.8583691
0.86206897 0.84615385 0.87892377 0.87391304]

mean value: 0.8640044867366126

key: test_recall
value: [0.73076923 0.62962963 0.69230769 0.61538462 0.88461538 0.73076923
0.57692308 0.92307692 0.76923077 0.57692308]

mean value: 0.7129629629629629

key: train_recall
value: [0.84680851 0.84188034 0.86382979 0.84255319 0.85531915 0.85106383
0.85106383 0.84255319 0.83404255 0.85531915]

mean value: 0.8484433533369704

key: test_roc_auc
value: [0.77279202 0.66096866 0.67307692 0.76923077 0.80769231 0.71153846
0.71153846 0.86538462 0.82692308 0.63461538]

mean value: 0.7433760683760684

key: train_roc_auc
value: [0.86143844 0.85711038 0.86595745 0.84680851 0.85957447 0.85531915
0.85744681 0.84468085 0.85957447 0.86595745]

mean value: 0.8573867975995636

key: test_jcc
value: [0.61290323 0.48571429 0.51428571 0.57142857 0.6969697 0.55882353
0.5 0.77419355 0.68965517 0.44117647]

mean value: 0.584515021500561

key: train_jcc
value: [0.75378788 0.74621212 0.76315789 0.73333333 0.75280899 0.74626866
0.74906367 0.73062731 0.7480916 0.76136364]

mean value: 0.7484715089652758

MCC on Blind test: 0.33

Accuracy on Blind test: 0.67

Model_name: SVM
Model func: SVC(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=169)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SVC(random_state=42))])

key: fit_time
value: [0.02623487 0.02498984 0.02270436 0.02161646 0.02071953 0.02152848
0.02124238 0.02357602 0.02255344 0.02224708]

mean value: 0.022741246223449706

key: score_time
value: [0.0136168 0.01234031 0.01236081 0.01204133 0.01282573 0.01181722
0.01200223 0.01174998 0.01184368 0.01281047]

mean value: 0.012340855598449708

key: test_mcc
value: [0.81196581 0.69957726 0.77849894 0.92307692 0.84866842 0.77849894
0.73131034 0.88527041 0.81312325 0.73568294]

mean value: 0.800567324577806

key: train_mcc
value: [0.7995781 0.81236588 0.80451759 0.78726255 0.79574468 0.80451759
0.80428445 0.79155386 0.80000724 0.80851796]

mean value: 0.8008349901879068

key: test_accuracy
value: [0.90566038 0.8490566 0.88461538 0.96153846 0.92307692 0.88461538
0.86538462 0.94230769 0.90384615 0.86538462]

mean value: 0.8985486211901307

key: train_accuracy
value: [0.89978678 0.90618337 0.90212766 0.89361702 0.89787234 0.90212766
0.90212766 0.89574468 0.9 0.90425532]

mean value: 0.9003842489679263

key: test_fscore
value: [0.90566038 0.85714286 0.875 0.96153846 0.92592593 0.875
0.86792453 0.94117647 0.90909091 0.87272727]

mean value: 0.8991186802674039

key: train_fscore
value: [0.90021231 0.90598291 0.90336134 0.8940678 0.89787234 0.90336134
0.90254237 0.89640592 0.89978678 0.90405117]

mean value: 0.9007644291954064

key: test_precision
value: [0.88888889 0.82758621 0.95454545 0.96153846 0.89285714 0.95454545
0.85185185 0.96 0.86206897 0.82758621]

mean value: 0.8981468633537599

key: train_precision
value: [0.89830508 0.90598291 0.89211618 0.89029536 0.89787234 0.89211618
0.89873418 0.8907563 0.9017094 0.90598291]

mean value: 0.8973870842377724

key: test_recall
value: [0.92307692 0.88888889 0.80769231 0.96153846 0.96153846 0.80769231
0.88461538 0.92307692 0.96153846 0.92307692]

mean value: 0.9042735042735043

key: train_recall
value: [0.90212766 0.90598291 0.91489362 0.89787234 0.89787234 0.91489362
0.90638298 0.90212766 0.89787234 0.90212766]

mean value: 0.9042153118748864

key: test_roc_auc
value: [0.90598291 0.8482906 0.88461538 0.96153846 0.92307692 0.88461538
0.86538462 0.94230769 0.90384615 0.86538462]

mean value: 0.8985042735042735

key: train_roc_auc
value: [0.89978178 0.90618294 0.90212766 0.89361702 0.89787234 0.90212766
0.90212766 0.89574468 0.9 0.90425532]

mean value: 0.9003837061283869

key: test_jcc
value: [0.82758621 0.75 0.77777778 0.92592593 0.86206897 0.77777778
0.76666667 0.88888889 0.83333333 0.77419355]

mean value: 0.818421909117126

key: train_jcc
value: [0.81853282 0.828125 0.82375479 0.80842912 0.81467181 0.82375479
0.82239382 0.81226054 0.81782946 0.82490272]

mean value: 0.819465487041468

MCC on Blind test: 0.72

Accuracy on Blind test: 0.86

Model_name: MLP
Model func: MLPClassifier(max_iter=500, random_state=42)
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.69608831 1.78880715 2.0408814 2.11449838 1.90394425 1.97782683
|
|
1.97935748 2.07013917 2.04062533 2.04656172]
|
|
|
|
mean value: 1.9658730030059814
|
|
|
|
key: score_time
|
|
value: [0.01268101 0.01269102 0.01288629 0.01712489 0.01619506 0.01270795
|
|
0.01263285 0.01373196 0.01519346 0.01510119]
|
|
|
|
mean value: 0.01409456729888916
|
|
|
|
key: test_mcc
|
|
value: [0.77350427 0.8116984 0.74466871 0.96225045 0.85634884 0.73568294
|
|
0.84866842 0.84866842 0.84615385 0.82305489]
|
|
|
|
mean value: 0.825069919863237
|
|
|
|
key: train_mcc
|
|
value: [0.97037106 0.98721563 0.99148936 0.9873145 1. 0.9957537
|
|
0.9957537 1. 0.98312115 0.9957537 ]
|
|
|
|
mean value: 0.9906772783362988
|
|
|
|
key: test_accuracy
|
|
value: [0.88679245 0.90566038 0.86538462 0.98076923 0.92307692 0.86538462
|
|
0.92307692 0.92307692 0.92307692 0.90384615]
|
|
|
|
mean value: 0.9100145137880987
|
|
|
|
key: train_accuracy
|
|
value: [0.98507463 0.99360341 0.99574468 0.99361702 1. 0.99787234
|
|
0.99787234 1. 0.99148936 0.99787234]
|
|
|
|
mean value: 0.9953146123485914
|
|
|
|
key: test_fscore
|
|
value: [0.88461538 0.90909091 0.85106383 0.98113208 0.92857143 0.85714286
|
|
0.92 0.92 0.92307692 0.9122807 ]
|
|
|
|
mean value: 0.908697410951082
|
|
|
|
key: train_fscore
|
|
value: [0.98494624 0.99357602 0.99574468 0.99357602 1. 0.9978678
|
|
0.9978678 1. 0.99141631 0.99787686]
|
|
|
|
mean value: 0.9952871726109697
|
|
|
|
key: test_precision
|
|
value: [0.88461538 0.89285714 0.95238095 0.96296296 0.86666667 0.91304348
|
|
0.95833333 0.95833333 0.92307692 0.83870968]
|
|
|
|
mean value: 0.9150979854906923
|
|
|
|
key: train_precision
|
|
value: [0.99565217 0.99570815 0.99574468 1. 1. 1.
|
|
1. 1. 1. 0.99576271]
|
|
|
|
mean value: 0.9982867721134951
|
|
|
|
key: test_recall
|
|
value: [0.88461538 0.92592593 0.76923077 1. 1. 0.80769231
|
|
0.88461538 0.88461538 0.92307692 1. ]
|
|
|
|
mean value: 0.9079772079772079
|
|
|
|
key: train_recall
|
|
value: [0.97446809 0.99145299 0.99574468 0.98723404 1. 0.99574468
|
|
0.99574468 1. 0.98297872 1. ]
|
|
|
|
mean value: 0.9923367885070012
|
|
|
|
key: test_roc_auc
|
|
value: [0.88675214 0.90527066 0.86538462 0.98076923 0.92307692 0.86538462
|
|
0.92307692 0.92307692 0.92307692 0.90384615]
|
|
|
|
mean value: 0.90997150997151
|
|
|
|
key: train_roc_auc
|
|
value: [0.98509729 0.99359884 0.99574468 0.99361702 1. 0.99787234
|
|
0.99787234 1. 0.99148936 0.99787234]
|
|
|
|
mean value: 0.995316421167485
|
|
|
|
key: test_jcc
|
|
value: [0.79310345 0.83333333 0.74074074 0.96296296 0.86666667 0.75
|
|
0.85185185 0.85185185 0.85714286 0.83870968]
|
|
|
|
mean value: 0.8346363390245481
|
|
|
|
key: train_jcc
|
|
value: [0.97033898 0.98723404 0.99152542 0.98723404 1. 0.99574468
|
|
0.99574468 1. 0.98297872 0.99576271]
|
|
|
|
mean value: 0.9906563288856833
|
|
|
|
MCC on Blind test: 0.73
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02953148 0.02271771 0.02113962 0.02237082 0.01985717 0.02250838
|
|
0.02102804 0.02116036 0.02456594 0.02486062]
|
|
|
|
mean value: 0.022974014282226562
|
|
|
|
key: score_time
|
|
value: [0.01245403 0.00974846 0.00945234 0.00895834 0.00919867 0.00918245
|
|
0.00941133 0.00912023 0.00937629 0.01030064]
|
|
|
|
mean value: 0.009720277786254884
|
|
|
|
key: test_mcc
|
|
value: [0.81688878 0.92704716 0.92307692 0.88527041 0.84866842 0.96225045
|
|
0.84615385 0.84866842 0.77151675 1. ]
|
|
|
|
mean value: 0.8829541177251579
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.90566038 0.96226415 0.96153846 0.94230769 0.92307692 0.98076923
|
|
0.92307692 0.92307692 0.88461538 1. ]
|
|
|
|
mean value: 0.9406386066763426
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.90909091 0.96428571 0.96153846 0.94339623 0.92592593 0.98039216
|
|
0.92307692 0.92592593 0.88888889 1. ]
|
|
|
|
mean value: 0.9422521132010588
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.86206897 0.93103448 0.96153846 0.92592593 0.89285714 1.
|
|
0.92307692 0.89285714 0.85714286 1. ]
|
|
|
|
mean value: 0.9246501901674316
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96153846 1. 0.96153846 0.96153846 0.96153846 0.96153846
|
|
0.92307692 0.96153846 0.92307692 1. ]
|
|
|
|
mean value: 0.9615384615384616
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.90669516 0.96153846 0.96153846 0.94230769 0.92307692 0.98076923
|
|
0.92307692 0.92307692 0.88461538 1. ]
|
|
|
|
mean value: 0.9406695156695157
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.83333333 0.93103448 0.92592593 0.89285714 0.86206897 0.96153846
|
|
0.85714286 0.86206897 0.8 1. ]
|
|
|
|
mean value: 0.8925970134590824
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.87
|
|
|
|
Accuracy on Blind test: 0.93
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.12382007 0.12043929 0.12239695 0.12167001 0.12279487 0.1219244
|
|
0.12101841 0.12090349 0.12148833 0.12095571]
|
|
|
|
mean value: 0.12174115180969239
|
|
|
|
key: score_time
|
|
value: [0.01793957 0.01816487 0.01760411 0.01803041 0.0179038 0.01825953
|
|
0.01798344 0.01780105 0.01787972 0.01786542]
|
|
|
|
mean value: 0.017943191528320312
|
|
|
|
key: test_mcc
|
|
value: [0.77603503 0.66096866 0.77151675 0.88527041 0.88527041 0.81312325
|
|
0.76923077 0.92307692 0.89056356 0.71151247]
|
|
|
|
mean value: 0.8086568232416546
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.88679245 0.83018868 0.88461538 0.94230769 0.94230769 0.90384615
|
|
0.88461538 0.96153846 0.94230769 0.84615385]
|
|
|
|
mean value: 0.902467343976778
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 0.83018868 0.88 0.94117647 0.94339623 0.89795918
|
|
0.88461538 0.96153846 0.94545455 0.86206897]
|
|
|
|
mean value: 0.9035286805936604
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.85714286 0.84615385 0.91666667 0.96 0.92592593 0.95652174
|
|
0.88461538 0.96153846 0.89655172 0.78125 ]
|
|
|
|
mean value: 0.8986366605311508
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.92307692 0.81481481 0.84615385 0.92307692 0.96153846 0.84615385
|
|
0.88461538 0.96153846 1. 0.96153846]
|
|
|
|
mean value: 0.9122507122507123
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.88746439 0.83048433 0.88461538 0.94230769 0.94230769 0.90384615
|
|
0.88461538 0.96153846 0.94230769 0.84615385]
|
|
|
|
mean value: 0.9025641025641026
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.8 0.70967742 0.78571429 0.88888889 0.89285714 0.81481481
|
|
0.79310345 0.92592593 0.89655172 0.75757576]
|
|
|
|
mean value: 0.8265109407545448
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.81
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0100944 0.01009727 0.01013374 0.01006222 0.01016641 0.01010799
|
|
0.01006365 0.01008892 0.01019859 0.01012897]
|
|
|
|
mean value: 0.010114216804504394
|
|
|
|
key: score_time
|
|
value: [0.00880814 0.00876117 0.00883174 0.0087719 0.00877357 0.00879431
|
|
0.00871038 0.00887156 0.00874829 0.0087359 ]
|
|
|
|
mean value: 0.008780694007873536
|
|
|
|
key: test_mcc
|
|
value: [ 0.43447293 0.43366663 0.4233902 0.50336201 0.6172134 0.46291005
|
|
0.43929769 0.73568294 0.54006172 -0.08084521]
|
|
|
|
mean value: 0.4509212363913005
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.71698113 0.71698113 0.71153846 0.75 0.80769231 0.73076923
|
|
0.71153846 0.86538462 0.76923077 0.46153846]
|
|
|
|
mean value: 0.7241654571843251
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.71698113 0.72727273 0.70588235 0.76363636 0.81481481 0.74074074
|
|
0.74576271 0.87272727 0.77777778 0.53333333]
|
|
|
|
mean value: 0.7398929227184086
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.7037037 0.71428571 0.72 0.72413793 0.78571429 0.71428571
|
|
0.66666667 0.82758621 0.75 0.47058824]
|
|
|
|
mean value: 0.7076968457881236
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.73076923 0.74074074 0.69230769 0.80769231 0.84615385 0.76923077
|
|
0.84615385 0.92307692 0.80769231 0.61538462]
|
|
|
|
mean value: 0.777920227920228
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.71723647 0.71652422 0.71153846 0.75 0.80769231 0.73076923
|
|
0.71153846 0.86538462 0.76923077 0.46153846]
|
|
|
|
mean value: 0.7241452991452991
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.55882353 0.57142857 0.54545455 0.61764706 0.6875 0.58823529
|
|
0.59459459 0.77419355 0.63636364 0.36363636]
|
|
|
|
mean value: 0.5937877142217749
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.79
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.76943898 1.83275151 1.84324455 1.89071012 1.83631063 1.77790737
|
|
1.78165555 1.7913177 1.78348899 1.77847815]
|
|
|
|
mean value: 1.8085303544998168
|
|
|
|
key: score_time
|
|
value: [0.09400439 0.09472203 0.10127854 0.10282397 0.09269404 0.09585142
|
|
0.09383512 0.09480047 0.15068531 0.09217691]
|
|
|
|
mean value: 0.10128722190856934
|
|
|
|
key: test_mcc
|
|
value: [0.81688878 0.92450142 0.9258201 0.92307692 0.9258201 0.9258201
|
|
0.9258201 0.96225045 0.9258201 0.9258201 ]
|
|
|
|
mean value: 0.9181638177359281
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.90566038 0.96226415 0.96153846 0.96153846 0.96153846 0.96153846
|
|
0.96153846 0.98076923 0.96153846 0.96153846]
|
|
|
|
mean value: 0.9579462989840348
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.90909091 0.96296296 0.96 0.96153846 0.96296296 0.96
|
|
0.96 0.98039216 0.96296296 0.96296296]
|
|
|
|
mean value: 0.9582873379343968
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.86206897 0.96296296 1. 0.96153846 0.92857143 1.
|
|
1. 1. 0.92857143 0.92857143]
|
|
|
|
mean value: 0.9572284675732952
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96153846 0.96296296 0.92307692 0.96153846 1. 0.92307692
|
|
0.92307692 0.96153846 1. 1. ]
|
|
|
|
mean value: 0.9616809116809117
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.90669516 0.96225071 0.96153846 0.96153846 0.96153846 0.96153846
|
|
0.96153846 0.98076923 0.96153846 0.96153846]
|
|
|
|
mean value: 0.9580484330484331
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.83333333 0.92857143 0.92307692 0.92592593 0.92857143 0.92307692
|
|
0.92307692 0.96153846 0.92857143 0.92857143]
|
|
|
|
mean value: 0.9204314204314205
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.81
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC0...05', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.92528653 0.9376781 0.96513009 1.02676868 1.00038338 1.00743413
|
|
0.93951488 1.03116369 0.93464684 0.97854924]
|
|
|
|
mean value: 0.974655556678772
|
|
|
|
key: score_time
|
|
value: [0.26444244 0.20015526 0.24083567 0.22042966 0.2193284 0.27657557
|
|
0.22205067 0.23441887 0.12471294 0.23463321]
|
|
|
|
mean value: 0.22375826835632323
|
|
|
|
key: test_mcc
|
|
value: [0.81688878 0.77350427 0.9258201 0.92307692 0.9258201 0.88527041
|
|
0.9258201 0.96225045 0.9258201 0.9258201 ]
|
|
|
|
mean value: 0.8990091339347005
|
|
|
|
key: train_mcc
|
|
value: [0.96162939 0.95309971 0.95744681 0.95320012 0.95748148 0.95320012
|
|
0.95744681 0.95320012 0.94893617 0.95748148]
|
|
|
|
mean value: 0.9553122213642232
|
|
|
|
key: test_accuracy
|
|
value: [0.90566038 0.88679245 0.96153846 0.96153846 0.96153846 0.94230769
|
|
0.96153846 0.98076923 0.96153846 0.96153846]
|
|
|
|
mean value: 0.9484760522496372
|
|
|
|
key: train_accuracy
|
|
value: [0.98081023 0.97654584 0.9787234 0.97659574 0.9787234 0.97659574
|
|
0.9787234 0.97659574 0.97446809 0.9787234 ]
|
|
|
|
mean value: 0.9776505012929274
|
|
|
|
key: test_fscore
|
|
value: [0.90909091 0.88888889 0.96 0.96153846 0.96296296 0.94117647
|
|
0.96 0.98039216 0.96296296 0.96296296]
|
|
|
|
mean value: 0.9489975775858129
|
|
|
|
key: train_fscore
|
|
value: [0.98081023 0.9764454 0.9787234 0.97654584 0.97863248 0.97654584
|
|
0.9787234 0.97654584 0.97446809 0.97863248]
|
|
|
|
mean value: 0.9776073008221619
|
|
|
|
key: test_precision
|
|
value: [0.86206897 0.88888889 1. 0.96153846 0.92857143 0.96
|
|
1. 1. 0.92857143 0.92857143]
|
|
|
|
mean value: 0.9458210601658877
|
|
|
|
key: train_precision
|
|
value: [0.98290598 0.97854077 0.9787234 0.97863248 0.98283262 0.97863248
|
|
0.9787234 0.97863248 0.97446809 0.98283262]
|
|
|
|
mean value: 0.979492432100413
|
|
|
|
key: test_recall
|
|
value: [0.96153846 0.88888889 0.92307692 0.96153846 1. 0.92307692
|
|
0.92307692 0.96153846 1. 1. ]
|
|
|
|
mean value: 0.9542735042735043
|
|
|
|
key: train_recall
|
|
value: [0.9787234 0.97435897 0.9787234 0.97446809 0.97446809 0.97446809
|
|
0.9787234 0.97446809 0.97446809 0.97446809]
|
|
|
|
mean value: 0.975733769776323
|
|
|
|
key: test_roc_auc
|
|
value: [0.90669516 0.88675214 0.96153846 0.96153846 0.96153846 0.94230769
|
|
0.96153846 0.98076923 0.96153846 0.96153846]
|
|
|
|
mean value: 0.9485754985754986
|
|
|
|
key: train_roc_auc
|
|
value: [0.98081469 0.97654119 0.9787234 0.97659574 0.9787234 0.97659574
|
|
0.9787234 0.97659574 0.97446809 0.9787234 ]
|
|
|
|
mean value: 0.977650481905801
|
|
|
|
key: test_jcc
|
|
value: [0.83333333 0.8 0.92307692 0.92592593 0.92857143 0.88888889
|
|
0.92307692 0.96153846 0.92857143 0.92857143]
|
|
|
|
mean value: 0.9041554741554741
|
|
|
|
key: train_jcc
|
|
value: [0.9623431 0.9539749 0.95833333 0.95416667 0.958159 0.95416667
|
|
0.95833333 0.95416667 0.95020747 0.958159 ]
|
|
|
|
mean value: 0.9562010118809934
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01119733 0.01217985 0.01093841 0.01117277 0.01136184 0.0108006
|
|
0.01067758 0.01132941 0.01079106 0.01140022]
|
|
|
|
mean value: 0.011184906959533692
|
|
|
|
key: score_time
|
|
value: [0.00896358 0.00987315 0.00957155 0.00922394 0.00965858 0.00929475
|
|
0.0097177 0.00967407 0.00959873 0.00960207]
|
|
|
|
mean value: 0.009517812728881836
|
|
|
|
key: test_mcc
|
|
value: [0.73646724 0.47360961 0.65433031 0.88527041 0.69436507 0.69230769
|
|
0.65824263 0.77151675 0.69436507 0.65433031]
|
|
|
|
mean value: 0.6914805091639882
|
|
|
|
key: train_mcc
|
|
value: [0.73140924 0.75708961 0.72356805 0.74043224 0.76629748 0.77032436
|
|
0.70276422 0.67337154 0.73659716 0.75330062]
|
|
|
|
mean value: 0.7355154534366981
|
|
|
|
key: test_accuracy
|
|
value: [0.86792453 0.73584906 0.82692308 0.94230769 0.84615385 0.84615385
|
|
0.82692308 0.88461538 0.84615385 0.82692308]
|
|
|
|
mean value: 0.8449927431059506
|
|
|
|
key: train_accuracy
|
|
value: [0.86567164 0.87846482 0.86170213 0.87021277 0.88297872 0.88510638
|
|
0.85106383 0.83617021 0.86808511 0.87659574]
|
|
|
|
mean value: 0.8676051354171392
|
|
|
|
key: test_fscore
|
|
value: [0.86792453 0.73076923 0.82352941 0.94117647 0.85185185 0.84615385
|
|
0.81632653 0.88 0.85185185 0.82352941]
|
|
|
|
mean value: 0.843311313365856
|
|
|
|
key: train_fscore
|
|
value: [0.86509636 0.87688985 0.86021505 0.87048832 0.88469602 0.88607595
|
|
0.84782609 0.83150985 0.86580087 0.87553648]
|
|
|
|
mean value: 0.8664134831445992
|
|
|
|
key: test_precision
|
|
value: [0.85185185 0.76 0.84 0.96 0.82142857 0.84615385
|
|
0.86956522 0.91666667 0.82142857 0.84 ]
|
|
|
|
mean value: 0.8527094724920812
|
|
|
|
key: train_precision
|
|
value: [0.87068966 0.88646288 0.86956522 0.86864407 0.87190083 0.87866109
|
|
0.86666667 0.85585586 0.88105727 0.88311688]
|
|
|
|
mean value: 0.873262041113066
|
|
|
|
key: test_recall
|
|
value: [0.88461538 0.7037037 0.80769231 0.92307692 0.88461538 0.84615385
|
|
0.76923077 0.84615385 0.88461538 0.80769231]
|
|
|
|
mean value: 0.8357549857549857
|
|
|
|
key: train_recall
|
|
value: [0.85957447 0.86752137 0.85106383 0.87234043 0.89787234 0.89361702
|
|
0.82978723 0.80851064 0.85106383 0.86808511]
|
|
|
|
mean value: 0.8599436261138389
|
|
|
|
key: test_roc_auc
|
|
value: [0.86823362 0.73646724 0.82692308 0.94230769 0.84615385 0.84615385
|
|
0.82692308 0.88461538 0.84615385 0.82692308]
|
|
|
|
mean value: 0.8450854700854701
|
|
|
|
key: train_roc_auc
|
|
value: [0.86568467 0.87844153 0.86170213 0.87021277 0.88297872 0.88510638
|
|
0.85106383 0.83617021 0.86808511 0.87659574]
|
|
|
|
mean value: 0.8676041098381524
|
|
|
|
key: test_jcc
|
|
value: [0.76666667 0.57575758 0.7 0.88888889 0.74193548 0.73333333
|
|
0.68965517 0.78571429 0.74193548 0.7 ]
|
|
|
|
mean value: 0.7323886890516479
|
|
|
|
key: train_jcc
|
|
value: [0.76226415 0.78076923 0.75471698 0.77067669 0.79323308 0.79545455
|
|
0.73584906 0.71161049 0.76335878 0.77862595]
|
|
|
|
mean value: 0.7646558959054925
|
|
|
|
MCC on Blind test: 0.76
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC0...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.08577728 0.07616615 0.09151053 0.07672358 0.0770278 0.08412862
|
|
0.06949353 0.0763917 0.08024454 0.08369136]
|
|
|
|
mean value: 0.08011550903320312
|
|
|
|
key: score_time
|
|
value: [0.01101613 0.01094556 0.0117805 0.01137733 0.0115149 0.01199055
|
|
0.01296544 0.01189756 0.01115489 0.0111413 ]
|
|
|
|
mean value: 0.01157841682434082
|
|
|
|
key: test_mcc
|
|
value: [0.88746439 0.96291111 0.96225045 0.96225045 0.9258201 0.96225045
|
|
0.88527041 0.9258201 0.9258201 0.96225045]
|
|
|
|
mean value: 0.9362108002286528
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.94339623 0.98113208 0.98076923 0.98076923 0.96153846 0.98076923
|
|
0.94230769 0.96153846 0.96153846 0.98076923]
|
|
|
|
mean value: 0.9674528301886792
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94339623 0.98181818 0.98039216 0.98113208 0.96296296 0.98039216
|
|
0.94117647 0.96 0.96296296 0.98113208]
|
|
|
|
mean value: 0.9675365269416324
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.92592593 0.96428571 1. 0.96296296 0.92857143 1.
|
|
0.96 1. 0.92857143 0.96296296]
|
|
|
|
mean value: 0.9633280423280424
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96153846 1. 0.96153846 1. 1. 0.96153846
|
|
0.92307692 0.92307692 1. 1. ]
|
|
|
|
mean value: 0.9730769230769231
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.94373219 0.98076923 0.98076923 0.98076923 0.96153846 0.98076923
|
|
0.94230769 0.96153846 0.96153846 0.98076923]
|
|
|
|
mean value: 0.9674501424501425
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.89285714 0.96428571 0.96153846 0.96296296 0.92857143 0.96153846
|
|
0.88888889 0.92307692 0.92857143 0.96296296]
|
|
|
|
mean value: 0.9375254375254375
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.86
|
|
|
|
Accuracy on Blind test: 0.93
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.04911208 0.07845759 0.0794065 0.07849884 0.07238936 0.04618835
|
|
0.09146833 0.09963346 0.07126188 0.0715549 ]
|
|
|
|
mean value: 0.07379713058471679
|
|
|
|
key: score_time
|
|
value: [0.01892233 0.0187223 0.01950288 0.0187819 0.01244593 0.01247454
|
|
0.01240921 0.01876998 0.02508521 0.01220512]
|
|
|
|
mean value: 0.016931939125061034
|
|
|
|
key: test_mcc
|
|
value: [0.77603503 0.73997003 0.77849894 0.88527041 0.84866842 0.81312325
|
|
0.6172134 0.80829038 0.74466871 0.71151247]
|
|
|
|
mean value: 0.7723251042423636
|
|
|
|
key: train_mcc
|
|
value: [0.89794254 0.89379475 0.91104256 0.90651431 0.91922384 0.91084449
|
|
0.91084449 0.91502618 0.91519196 0.90641581]
|
|
|
|
mean value: 0.9086840926765474
|
|
|
|
key: test_accuracy
|
|
value: [0.88679245 0.86792453 0.88461538 0.94230769 0.92307692 0.90384615
|
|
0.80769231 0.90384615 0.86538462 0.84615385]
|
|
|
|
mean value: 0.8831640058055152
|
|
|
|
key: train_accuracy
|
|
value: [0.94882729 0.9466951 0.95531915 0.95319149 0.95957447 0.95531915
|
|
0.95531915 0.95744681 0.95744681 0.95319149]
|
|
|
|
mean value: 0.9542330898697999
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 0.87719298 0.875 0.94339623 0.92592593 0.89795918
|
|
0.81481481 0.90196078 0.87719298 0.86206897]
|
|
|
|
mean value: 0.8864400754461441
|
|
|
|
key: train_fscore
|
|
value: [0.94957983 0.94736842 0.95597484 0.9535865 0.95983087 0.95578947
|
|
0.95578947 0.95780591 0.95798319 0.95338983]
|
|
|
|
mean value: 0.9547098338777809
|
|
|
|
key: test_precision
|
|
value: [0.85714286 0.83333333 0.95454545 0.92592593 0.89285714 0.95652174
|
|
0.78571429 0.92 0.80645161 0.78125 ]
|
|
|
|
mean value: 0.871374235155266
|
|
|
|
key: train_precision
|
|
value: [0.93775934 0.93360996 0.94214876 0.94560669 0.95378151 0.94583333
|
|
0.94583333 0.94979079 0.94605809 0.94936709]
|
|
|
|
mean value: 0.9449788903641747
|
|
|
|
key: test_recall
|
|
value: [0.92307692 0.92592593 0.80769231 0.96153846 0.96153846 0.84615385
|
|
0.84615385 0.88461538 0.96153846 0.96153846]
|
|
|
|
mean value: 0.9079772079772079
|
|
|
|
key: train_recall
|
|
value: [0.96170213 0.96153846 0.97021277 0.96170213 0.96595745 0.96595745
|
|
0.96595745 0.96595745 0.97021277 0.95744681]
|
|
|
|
mean value: 0.9646644844517185
|
|
|
|
key: test_roc_auc
|
|
value: [0.88746439 0.86680912 0.88461538 0.94230769 0.92307692 0.90384615
|
|
0.80769231 0.90384615 0.86538462 0.84615385]
|
|
|
|
mean value: 0.8831196581196582
|
|
|
|
key: train_roc_auc
|
|
value: [0.94879978 0.94672668 0.95531915 0.95319149 0.95957447 0.95531915
|
|
0.95531915 0.95744681 0.95744681 0.95319149]
|
|
|
|
mean value: 0.9542334969994545
|
|
|
|
key: test_jcc
|
|
value: [0.8 0.78125 0.77777778 0.89285714 0.86206897 0.81481481
|
|
0.6875 0.82142857 0.78125 0.75757576]
|
|
|
|
mean value: 0.7976523029971305
|
|
|
|
key: train_jcc
|
|
value: [0.904 0.9 0.91566265 0.91129032 0.92276423 0.91532258
|
|
0.91532258 0.91902834 0.91935484 0.91093117]
|
|
|
|
mean value: 0.9133676714995371
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01811981 0.0114007 0.00968623 0.00974464 0.01074147 0.00977039
|
|
0.01065302 0.01098585 0.00996947 0.00996566]
|
|
|
|
mean value: 0.01110372543334961
|
|
|
|
key: score_time
|
|
value: [0.0099473 0.00906754 0.00856543 0.00886488 0.00860167 0.00880289
|
|
0.00895095 0.00944519 0.00875688 0.00867391]
|
|
|
|
mean value: 0.00896766185760498
|
|
|
|
key: test_mcc
|
|
value: [0.6980057 0.51359557 0.81312325 0.84615385 0.84615385 0.70064905
|
|
0.65824263 0.65433031 0.73568294 0.6172134 ]
|
|
|
|
mean value: 0.7083150533360429
|
|
|
|
key: train_mcc
|
|
value: [0.68508531 0.71894691 0.70253486 0.69424587 0.74910575 0.74478875
|
|
0.69401929 0.67747959 0.69041892 0.71541847]
|
|
|
|
mean value: 0.7072043720898497
|
|
|
|
key: test_accuracy
|
|
value: [0.8490566 0.75471698 0.90384615 0.92307692 0.92307692 0.84615385
|
|
0.82692308 0.82692308 0.86538462 0.80769231]
|
|
|
|
mean value: 0.8526850507982584
|
|
|
|
key: train_accuracy
|
|
value: [0.84221748 0.85927505 0.85106383 0.84680851 0.87446809 0.87234043
|
|
0.84680851 0.83829787 0.84468085 0.85744681]
|
|
|
|
mean value: 0.8533407430930454
|
|
|
|
key: test_fscore
|
|
value: [0.84615385 0.74509804 0.89795918 0.92307692 0.92307692 0.83333333
|
|
0.81632653 0.82352941 0.87272727 0.8 ]
|
|
|
|
mean value: 0.8481281463634405
|
|
|
|
key: train_fscore
|
|
value: [0.83913043 0.85652174 0.84848485 0.84347826 0.87311828 0.87124464
|
|
0.84415584 0.83406114 0.84026258 0.85466377]
|
|
|
|
mean value: 0.850512153401787
|
|
|
|
key: test_precision
|
|
value: [0.84615385 0.79166667 0.95652174 0.92307692 0.92307692 0.90909091
|
|
0.86956522 0.84 0.82758621 0.83333333]
|
|
|
|
mean value: 0.8720071764816892
|
|
|
|
key: train_precision
|
|
value: [0.85777778 0.87168142 0.86343612 0.86222222 0.8826087 0.87878788
|
|
0.85903084 0.85650224 0.86486486 0.87168142]
|
|
|
|
mean value: 0.8668593473668214
|
|
|
|
key: test_recall
|
|
value: [0.84615385 0.7037037 0.84615385 0.92307692 0.92307692 0.76923077
|
|
0.76923077 0.80769231 0.92307692 0.76923077]
|
|
|
|
mean value: 0.8280626780626781
|
|
|
|
key: train_recall
|
|
value: [0.8212766 0.84188034 0.83404255 0.82553191 0.86382979 0.86382979
|
|
0.82978723 0.81276596 0.81702128 0.83829787]
|
|
|
|
mean value: 0.8348263320603746
|
|
|
|
key: test_roc_auc
|
|
value: [0.84900285 0.75569801 0.90384615 0.92307692 0.92307692 0.84615385
|
|
0.82692308 0.82692308 0.86538462 0.80769231]
|
|
|
|
mean value: 0.8527777777777779
|
|
|
|
key: train_roc_auc
|
|
value: [0.84226223 0.85923804 0.85106383 0.84680851 0.87446809 0.87234043
|
|
0.84680851 0.83829787 0.84468085 0.85744681]
|
|
|
|
mean value: 0.853341516639389
|
|
|
|
key: test_jcc
|
|
value: [0.73333333 0.59375 0.81481481 0.85714286 0.85714286 0.71428571
|
|
0.68965517 0.7 0.77419355 0.66666667]
|
|
|
|
mean value: 0.7400984964187133
|
|
|
|
key: train_jcc
|
|
value: [0.72284644 0.74904943 0.73684211 0.72932331 0.77480916 0.77186312
|
|
0.73033708 0.71535581 0.7245283 0.74621212]
|
|
|
|
mean value: 0.7401166870309306
|
|
|
|
MCC on Blind test: 0.76
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=169)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01385999 0.0178628 0.02119446 0.02347088 0.0253675 0.01830435
 0.02012086 0.01882529 0.02110195 0.020648 ]

mean value: 0.020075607299804687

key: score_time
value: [0.00991964 0.01136136 0.01177001 0.01195335 0.01180339 0.01197219
 0.01181316 0.01186776 0.01188731 0.01246667]

mean value: 0.011681485176086425

key: test_mcc
value: [0.73609205 0.70527596 0.74466871 0.88527041 0.82305489 0.64676167
 0.74466871 0.82305489 0.77151675 0.66666667]

mean value: 0.7547030706307085

key: train_mcc
value: [0.86611567 0.88176453 0.88164966 0.91163756 0.87947498 0.84270412
 0.86448019 0.76515574 0.91064654 0.74380085]

mean value: 0.8547429840717615

key: test_accuracy
value: [0.86792453 0.8490566 0.86538462 0.94230769 0.90384615 0.80769231
 0.86538462 0.90384615 0.88461538 0.80769231]

mean value: 0.8697750362844703

key: train_accuracy
value: [0.93176972 0.94029851 0.94042553 0.95531915 0.93829787 0.91914894
 0.92978723 0.87234043 0.95531915 0.85957447]

mean value: 0.9242280996234632

key: test_fscore
value: [0.8627451 0.86206897 0.85106383 0.94117647 0.9122807 0.77272727
 0.85106383 0.89361702 0.88888889 0.83870968]

mean value: 0.8674341755785658

key: train_fscore
value: [0.92920354 0.94166667 0.93913043 0.95424837 0.9406953 0.91479821
 0.9258427 0.85576923 0.95541401 0.8754717 ]

mean value: 0.9232240148337406

key: test_precision
value: [0.88 0.80645161 0.95238095 0.96 0.83870968 0.94444444
 0.95238095 1. 0.85714286 0.72222222]

mean value: 0.8913732718894009

key: train_precision
value: [0.96774194 0.91869919 0.96 0.97767857 0.90551181 0.96682464
 0.98095238 0.98342541 0.95338983 0.78644068]

mean value: 0.9400664453269295

key: test_recall
value: [0.84615385 0.92592593 0.76923077 0.92307692 1. 0.65384615
 0.76923077 0.80769231 0.92307692 1. ]

mean value: 0.8618233618233618

key: train_recall
value: [0.89361702 0.96581197 0.91914894 0.93191489 0.9787234 0.86808511
 0.87659574 0.75744681 0.95744681 0.98723404]

mean value: 0.9136024731769412

key: test_roc_auc
value: [0.86752137 0.84757835 0.86538462 0.94230769 0.90384615 0.80769231
 0.86538462 0.90384615 0.88461538 0.80769231]

mean value: 0.8695868945868945

key: train_roc_auc
value: [0.93185125 0.94035279 0.94042553 0.95531915 0.93829787 0.91914894
 0.92978723 0.87234043 0.95531915 0.85957447]

mean value: 0.9242416803055101

key: test_jcc
value: [0.75862069 0.75757576 0.74074074 0.88888889 0.83870968 0.62962963
 0.74074074 0.80769231 0.8 0.72222222]

mean value: 0.7684820654564815

key: train_jcc
value: [0.8677686 0.88976378 0.8852459 0.9125 0.88803089 0.84297521
 0.86192469 0.74789916 0.91463415 0.77852349]

mean value: 0.8589265852981367

MCC on Blind test: 0.71

Accuracy on Blind test: 0.83

Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=169)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01879811 0.01746774 0.02222395 0.02218533 0.02078748 0.01897097
 0.02105546 0.02173853 0.02300715 0.02159905]

mean value: 0.020783376693725587

key: score_time
value: [0.01111412 0.01320291 0.01229048 0.01212955 0.0149734 0.01178837
 0.01732373 0.01178002 0.01172805 0.0117414 ]

mean value: 0.012807202339172364

key: test_mcc
value: [0.85164138 0.65110205 0.60697698 0.85634884 0.80829038 0.61494005
 0.77849894 0.84866842 0.79056942 0.73131034]

mean value: 0.7538346799690747

key: train_mcc
value: [0.88621044 0.79855158 0.65963501 0.84046667 0.82974725 0.83758899
 0.87436938 0.91519196 0.87093638 0.88085106]

mean value: 0.8393548746757886

key: test_accuracy
value: [0.9245283 0.81132075 0.76923077 0.92307692 0.90384615 0.78846154
 0.88461538 0.92307692 0.88461538 0.86538462]

mean value: 0.8678156748911466

key: train_accuracy
value: [0.9424307 0.89339019 0.80638298 0.91702128 0.90851064 0.91489362
 0.93617021 0.95744681 0.93404255 0.94042553]

mean value: 0.9150714512543665

key: test_fscore
value: [0.92592593 0.83870968 0.7 0.92857143 0.90196078 0.74418605
 0.875 0.92 0.89655172 0.86792453]

mean value: 0.8598830115181881

key: train_fscore
value: [0.94409938 0.9015748 0.76240209 0.92184369 0.8997669 0.9086758
 0.9339207 0.95689655 0.93660532 0.94042553]

mean value: 0.9106210762491109

key: test_precision
value: [0.89285714 0.74285714 1. 0.86666667 0.92 0.94117647
 0.95454545 0.95833333 0.8125 0.85185185]

mean value: 0.8940788062699827

key: train_precision
value: [0.91935484 0.83576642 0.98648649 0.87121212 0.99484536 0.98029557
 0.96803653 0.96943231 0.9015748 0.94042553]

mean value: 0.93674299762485

key: test_recall
value: [0.96153846 0.96296296 0.53846154 1. 0.88461538 0.61538462
 0.80769231 0.88461538 1. 0.88461538]

mean value: 0.853988603988604

key: train_recall
value: [0.97021277 0.97863248 0.6212766 0.9787234 0.8212766 0.84680851
 0.90212766 0.94468085 0.97446809 0.94042553]

mean value: 0.8978632478632479

key: test_roc_auc
value: [0.92521368 0.80840456 0.76923077 0.92307692 0.90384615 0.78846154
 0.88461538 0.92307692 0.88461538 0.86538462]

mean value: 0.8675925925925926

key: train_roc_auc
value: [0.94237134 0.89357156 0.80638298 0.91702128 0.90851064 0.91489362
 0.93617021 0.95744681 0.93404255 0.94042553]

mean value: 0.9150836515730132

key: test_jcc
value: [0.86206897 0.72222222 0.53846154 0.86666667 0.82142857 0.59259259
 0.77777778 0.85185185 0.8125 0.76666667]

mean value: 0.7612236853185129

key: train_jcc
value: [0.89411765 0.82078853 0.61603376 0.85501859 0.81779661 0.83263598
 0.87603306 0.91735537 0.88076923 0.8875502 ]

mean value: 0.839809897491723

MCC on Blind test: 0.73

Accuracy on Blind test: 0.86

Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=169)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', AdaBoostClassifier(random_state=42))])

key: fit_time
value: [0.18865156 0.17809844 0.17845082 0.18266034 0.1782515 0.17729211
 0.17885733 0.17905545 0.18050551 0.18259931]

mean value: 0.1804422378540039

key: score_time
value: [0.01532364 0.01549387 0.01535821 0.01534915 0.01540208 0.01566911
 0.01548195 0.01556087 0.01541352 0.01596475]

mean value: 0.015501713752746582

key: test_mcc
value: [0.8116984 0.92450142 0.96225045 0.96225045 0.9258201 0.96225045
 0.81312325 0.9258201 0.96225045 0.92307692]

mean value: 0.9173041989812138

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.90566038 0.96226415 0.98076923 0.98076923 0.96153846 0.98076923
 0.90384615 0.96153846 0.98076923 0.96153846]

mean value: 0.9579462989840348

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.90196078 0.96296296 0.98039216 0.98113208 0.96296296 0.98039216
 0.89795918 0.96 0.98113208 0.96153846]

mean value: 0.9570432820120469

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.92 0.96296296 1. 0.96296296 0.92857143 1.
 0.95652174 1. 0.96296296 0.96153846]

mean value: 0.9655520518129214

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.88461538 0.96296296 0.96153846 1. 1. 0.96153846
 0.84615385 0.92307692 1. 0.96153846]

mean value: 0.9501424501424501

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.90527066 0.96225071 0.98076923 0.98076923 0.96153846 0.98076923
 0.90384615 0.96153846 0.98076923 0.96153846]

mean value: 0.957905982905983

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.82142857 0.92857143 0.96153846 0.96296296 0.92857143 0.96153846
 0.81481481 0.92307692 0.96296296 0.92592593]

mean value: 0.9191391941391941

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.95

Accuracy on Blind test: 0.98

Model_name: Bagging Classifier
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
  warn(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
  oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=169)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
BaggingClassifier(n_jobs=10, oob_score=True,
random_state=42))])

key: fit_time
value: [0.06805992 0.06910896 0.06634355 0.0792799 0.0696907 0.07058072
 0.088202 0.07411814 0.07483792 0.07089567]

mean value: 0.07311174869537354

key: score_time
value: [0.02007461 0.0277791 0.03617716 0.03125453 0.03761816 0.03880787
 0.03588963 0.03868651 0.02243757 0.03866386]

mean value: 0.032738900184631346

key: test_mcc
value: [0.85164138 0.96291111 0.9258201 0.96225045 0.9258201 0.96225045
 0.84866842 0.88527041 0.89056356 0.96225045]

mean value: 0.9177446427850156

key: train_mcc
value: [0.98721586 0.98721563 0.9873145 0.97873227 0.9957537 0.9873145
 0.98724298 0.99152527 0.97478586 0.98297872]

mean value: 0.9860079271688947

key: test_accuracy
value: [0.9245283 0.98113208 0.96153846 0.98076923 0.96153846 0.98076923
 0.92307692 0.94230769 0.94230769 0.98076923]

mean value: 0.9578737300435414

key: train_accuracy
value: [0.99360341 0.99360341 0.99361702 0.9893617 0.99787234 0.99361702
 0.99361702 0.99574468 0.98723404 0.99148936]

mean value: 0.9929760014517081

key: test_fscore
value: [0.92592593 0.98181818 0.96 0.98113208 0.96296296 0.98039216
 0.92 0.94339623 0.94545455 0.98113208]

mean value: 0.9582214150382852

key: train_fscore
value: [0.99360341 0.99357602 0.99357602 0.98933902 0.9978678 0.99357602
 0.99363057 0.9957265 0.98739496 0.99148936]

mean value: 0.9929779674593665

key: test_precision
value: [0.89285714 0.96428571 1. 0.96296296 0.92857143 1.
 0.95833333 0.92592593 0.89655172 0.96296296]

mean value: 0.9492451195037402

key: train_precision
value: [0.9957265 0.99570815 1. 0.99145299 1. 1.
 0.99152542 1. 0.97510373 0.99148936]

mean value: 0.99410061615567

key: test_recall
value: [0.96153846 1. 0.92307692 1. 1. 0.96153846
 0.88461538 0.96153846 1. 1. ]

mean value: 0.9692307692307692

key: train_recall
value: [0.99148936 0.99145299 0.98723404 0.98723404 0.99574468 0.98723404
 0.99574468 0.99148936 1. 0.99148936]

mean value: 0.9919112565921077

key: test_roc_auc
value: [0.92521368 0.98076923 0.96153846 0.98076923 0.96153846 0.98076923
 0.92307692 0.94230769 0.94230769 0.98076923]

mean value: 0.957905982905983

key: train_roc_auc
value: [0.99360793 0.99359884 0.99361702 0.9893617 0.99787234 0.99361702
 0.99361702 0.99574468 0.98723404 0.99148936]

mean value: 0.9929759956355702

key: test_jcc
value: [0.86206897 0.96428571 0.92307692 0.96296296 0.92857143 0.96153846
 0.85185185 0.89285714 0.89655172 0.96296296]

mean value: 0.9206728137762621

key: train_jcc
value: [0.98728814 0.98723404 0.98723404 0.97890295 0.99574468 0.98723404
 0.98734177 0.99148936 0.97510373 0.98312236]

mean value: 0.9860695128853415

MCC on Blind test: 0.9

Accuracy on Blind test: 0.95

Model_name: Gaussian Process
Model func: GaussianProcessClassifier(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=169)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GaussianProcessClassifier(random_state=42))])

key: fit_time
value: [0.16664958 0.13899064 0.1713109 0.12175751 0.12432766 0.15624762
 0.1744194 0.12778425 0.15947342 0.16238427]

mean value: 0.1503345251083374

key: score_time
value: [0.02422047 0.01502085 0.02464485 0.02465963 0.01550269 0.01529956
 0.02474427 0.02651215 0.02499962 0.02835417]

mean value: 0.022395825386047362

key: test_mcc
value: [0.6980057 0.54700855 0.50336201 0.77151675 0.63245553 0.65433031
 0.76923077 0.81312325 0.73131034 0.50037023]

mean value: 0.662071343296795

key: train_mcc
value: [0.98728791 0.99150708 0.9873145 0.9873145 0.9873145 0.99152527
 0.9873145 0.9873145 0.9873145 0.9873145 ]

mean value: 0.9881521740698564

key: test_accuracy
value: [0.8490566 0.77358491 0.75 0.88461538 0.80769231 0.82692308
 0.88461538 0.90384615 0.86538462 0.75 ]

mean value: 0.8295718432510886

key: train_accuracy
value: [0.99360341 0.99573561 0.99361702 0.99361702 0.99361702 0.99574468
 0.99361702 0.99361702 0.99361702 0.99361702]

mean value: 0.9940402848977

key: test_fscore
value: [0.84615385 0.77777778 0.73469388 0.88 0.82758621 0.82352941
 0.88461538 0.89795918 0.86792453 0.74509804]

mean value: 0.8285338255950329

key: train_fscore
value: [0.99357602 0.99570815 0.99357602 0.99357602 0.99357602 0.9957265
 0.99357602 0.99357602 0.99357602 0.99357602]

mean value: 0.9940042787277901

key: test_precision
value: [0.84615385 0.77777778 0.7826087 0.91666667 0.75 0.84
 0.88461538 0.95652174 0.85185185 0.76 ]

mean value: 0.8366195961848135

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.84615385 0.77777778 0.69230769 0.84615385 0.92307692 0.80769231
 0.88461538 0.84615385 0.88461538 0.73076923]

mean value: 0.8239316239316239

key: train_recall
value: [0.98723404 0.99145299 0.98723404 0.98723404 0.98723404 0.99148936
 0.98723404 0.98723404 0.98723404 0.98723404]

mean value: 0.9880814693580651

key: test_roc_auc
value: [0.84900285 0.77350427 0.75 0.88461538 0.80769231 0.82692308
 0.88461538 0.90384615 0.86538462 0.75 ]

mean value: 0.8295584045584046

key: train_roc_auc
value: [0.99361702 0.9957265 0.99361702 0.99361702 0.99361702 0.99574468
 0.99361702 0.99361702 0.99361702 0.99361702]

mean value: 0.9940407346790325

key: test_jcc
value: [0.73333333 0.63636364 0.58064516 0.78571429 0.70588235 0.7
 0.79310345 0.81481481 0.76666667 0.59375 ]

mean value: 0.7110273699400098

key: train_jcc
value: [0.98723404 0.99145299 0.98723404 0.98723404 0.98723404 0.99148936
 0.98723404 0.98723404 0.98723404 0.98723404]

mean value: 0.9880814693580651

MCC on Blind test: 0.58

Accuracy on Blind test: 0.79

Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.7296083 0.72827125 0.7328434 0.73026657 0.72733808 0.73085546
|
|
0.72294044 0.72570992 0.74692559 0.73820591]
|
|
|
|
mean value: 0.7312964916229248
|
|
|
|
key: score_time
|
|
value: [0.00976491 0.00955319 0.00944519 0.00955868 0.00969505 0.00937343
|
|
0.00935888 0.00958729 0.01031232 0.00962877]
|
|
|
|
mean value: 0.009627771377563477
|
|
|
|
key: test_mcc
|
|
value: [0.85164138 0.92704716 0.96225045 0.96225045 0.9258201 0.96225045
|
|
0.88527041 0.92307692 0.9258201 0.96225045]
|
|
|
|
mean value: 0.9287677874797963
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9245283 0.96226415 0.98076923 0.98076923 0.96153846 0.98076923
|
|
0.94230769 0.96153846 0.96153846 0.98076923]
|
|
|
|
mean value: 0.9636792452830188
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.92592593 0.96428571 0.98039216 0.98113208 0.96296296 0.98039216
|
|
0.94117647 0.96153846 0.96296296 0.98113208]
|
|
|
|
mean value: 0.964190096293315
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.89285714 0.93103448 1. 0.96296296 0.92857143 1.
|
|
0.96 0.96153846 0.92857143 0.96296296]
|
|
|
|
mean value: 0.9528498870223008
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96153846 1. 0.96153846 1. 1. 0.96153846
|
|
0.92307692 0.96153846 1. 1. ]
|
|
|
|
mean value: 0.9769230769230769
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.92521368 0.96153846 0.98076923 0.98076923 0.96153846 0.98076923
|
|
0.94230769 0.96153846 0.96153846 0.98076923]
|
|
|
|
mean value: 0.9636752136752137
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.86206897 0.93103448 0.96153846 0.96296296 0.92857143 0.96153846
|
|
0.88888889 0.92592593 0.92857143 0.96296296]
|
|
|
|
mean value: 0.9314063969236382
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.86
|
|
|
|
Accuracy on Blind test: 0.93
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.03424716 0.03178167 0.03212857 0.05195475 0.05340314 0.04832959
|
|
0.04696679 0.03173852 0.0314827 0.03206182]
|
|
|
|
mean value: 0.0394094705581665
|
|
|
|
key: score_time
|
|
value: [0.01278758 0.0395515 0.03736591 0.0196743 0.02260709 0.01882935
|
|
0.01456594 0.01464891 0.01470041 0.02665329]
|
|
|
|
mean value: 0.022138428688049317
|
|
|
|
key: test_mcc
|
|
value: [0.54700855 0.25905207 0.62279916 0.34684399 0.33333333 0.53846154
|
|
0.51916999 0.18257419 0.63245553 0.31622777]
|
|
|
|
mean value: 0.4297926104204046
|
|
|
|
key: train_mcc
|
|
value: [0.95006652 0.73515544 0.88334763 0.5920935 0.83105203 0.97880317
|
|
0.93009643 0.6846532 0.96191988 0.85319469]
|
|
|
|
mean value: 0.840038248069776
|
|
|
|
key: test_accuracy
|
|
value: [0.77358491 0.62264151 0.80769231 0.65384615 0.65384615 0.76923077
|
|
0.75 0.57692308 0.80769231 0.65384615]
|
|
|
|
mean value: 0.7069303338171262
|
|
|
|
key: train_accuracy
|
|
value: [0.97441365 0.85074627 0.93829787 0.75957447 0.90851064 0.9893617
|
|
0.96382979 0.81914894 0.98085106 0.9212766 ]
|
|
|
|
mean value: 0.9106010978541941
|
|
|
|
key: test_fscore
|
|
value: [0.76923077 0.6875 0.82142857 0.71875 0.70967742 0.76923077
|
|
0.77966102 0.66666667 0.82758621 0.68965517]
|
|
|
|
mean value: 0.7439386592171113
|
|
|
|
key: train_fscore
|
|
value: [0.97510373 0.86988848 0.94188377 0.80617496 0.91617934 0.98929336
|
|
0.9650924 0.84684685 0.98105263 0.9270217 ]
|
|
|
|
mean value: 0.9218537211188351
|
|
|
|
key: test_precision
|
|
value: [0.76923077 0.59459459 0.76666667 0.60526316 0.61111111 0.76923077
|
|
0.6969697 0.55 0.75 0.625 ]
|
|
|
|
mean value: 0.6738066765698345
|
|
|
|
key: train_precision
|
|
value: [0.951417 0.76973684 0.89015152 0.67528736 0.84532374 0.99568966
|
|
0.93253968 0.734375 0.97083333 0.86397059]
|
|
|
|
mean value: 0.8629324717915119
|
|
|
|
key: test_recall
|
|
value: [0.76923077 0.81481481 0.88461538 0.88461538 0.84615385 0.76923077
|
|
0.88461538 0.84615385 0.92307692 0.76923077]
|
|
|
|
mean value: 0.8391737891737892
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 0.98297872
|
|
1. 1. 0.99148936 1. ]
|
|
|
|
mean value: 0.9974468085106383
|
|
|
|
key: test_roc_auc
|
|
value: [0.77350427 0.61894587 0.80769231 0.65384615 0.65384615 0.76923077
|
|
0.75 0.57692308 0.80769231 0.65384615]
|
|
|
|
mean value: 0.7065527065527065
|
|
|
|
key: train_roc_auc
|
|
value: [0.97435897 0.85106383 0.93829787 0.75957447 0.90851064 0.9893617
|
|
0.96382979 0.81914894 0.98085106 0.9212766 ]
|
|
|
|
mean value: 0.9106273867975996
|
|
|
|
key: test_jcc
|
|
value: [0.625 0.52380952 0.6969697 0.56097561 0.55 0.625
|
|
0.63888889 0.5 0.70588235 0.52631579]
|
|
|
|
mean value: 0.5952841861839068
|
|
|
|
key: train_jcc
|
|
value: [0.951417 0.76973684 0.89015152 0.67528736 0.84532374 0.97881356
|
|
0.93253968 0.734375 0.96280992 0.86397059]
|
|
|
|
mean value: 0.8604425206086777
|
|
|
|
MCC on Blind test: 0.41
|
|
|
|
Accuracy on Blind test: 0.67
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02965641 0.03908777 0.03907251 0.03891897 0.03882527 0.03877044
|
|
0.03892875 0.03883457 0.03862453 0.0391221 ]
|
|
|
|
mean value: 0.03798413276672363
|
|
|
|
key: score_time
|
|
value: [0.01900005 0.01908803 0.01899123 0.01887655 0.01898837 0.01889229
|
|
0.01906919 0.0189023 0.0188899 0.01907778]
|
|
|
|
mean value: 0.01897757053375244
|
|
|
|
key: test_mcc
|
|
value: [0.85164138 0.73997003 0.77849894 0.92307692 0.89056356 0.74466871
|
|
0.73131034 0.88527041 0.81312325 0.77849894]
|
|
|
|
mean value: 0.8136622485831022
|
|
|
|
key: train_mcc
|
|
value: [0.85528213 0.86366944 0.85168866 0.85544308 0.85535013 0.85581519
|
|
0.8769849 0.86847048 0.86411148 0.85107154]
|
|
|
|
mean value: 0.8597887013936425
|
|
|
|
key: test_accuracy
|
|
value: [0.9245283 0.86792453 0.88461538 0.96153846 0.94230769 0.86538462
|
|
0.86538462 0.94230769 0.90384615 0.88461538]
|
|
|
|
mean value: 0.904245283018868
|
|
|
|
key: train_accuracy
|
|
value: [0.92750533 0.93176972 0.92553191 0.92765957 0.92765957 0.92765957
|
|
0.93829787 0.93404255 0.93191489 0.92553191]
|
|
|
|
mean value: 0.929757292564533
|
|
|
|
key: test_fscore
|
|
value: [0.92592593 0.87719298 0.875 0.96153846 0.94545455 0.85106383
|
|
0.86792453 0.94117647 0.90909091 0.89285714]
|
|
|
|
mean value: 0.9047224796000481
|
|
|
|
key: train_fscore
|
|
value: [0.92857143 0.93220339 0.92693111 0.92827004 0.9279661 0.92887029
|
|
0.93920335 0.93501048 0.93277311 0.92569002]
|
|
|
|
mean value: 0.9305489328602898
|
|
|
|
key: test_precision
|
|
value: [0.89285714 0.83333333 0.95454545 0.96153846 0.89655172 0.95238095
|
|
0.85185185 0.96 0.86206897 0.83333333]
|
|
|
|
mean value: 0.8998461219495703
|
|
|
|
key: train_precision
|
|
value: [0.91701245 0.92436975 0.90983607 0.92050209 0.92405063 0.91358025
|
|
0.92561983 0.9214876 0.92116183 0.92372881]
|
|
|
|
mean value: 0.9201349310782884
|
|
|
|
key: test_recall
|
|
value: [0.96153846 0.92592593 0.80769231 0.96153846 1. 0.76923077
|
|
0.88461538 0.92307692 0.96153846 0.96153846]
|
|
|
|
mean value: 0.9156695156695157
|
|
|
|
key: train_recall
|
|
value: [0.94042553 0.94017094 0.94468085 0.93617021 0.93191489 0.94468085
|
|
0.95319149 0.94893617 0.94468085 0.92765957]
|
|
|
|
mean value: 0.9412511365702856
|
|
|
|
key: test_roc_auc
|
|
value: [0.92521368 0.86680912 0.88461538 0.96153846 0.94230769 0.86538462
|
|
0.86538462 0.94230769 0.90384615 0.88461538]
|
|
|
|
mean value: 0.9042022792022792
|
|
|
|
key: train_roc_auc
|
|
value: [0.92747772 0.9317876 0.92553191 0.92765957 0.92765957 0.92765957
|
|
0.93829787 0.93404255 0.93191489 0.92553191]
|
|
|
|
mean value: 0.9297563193307874
|
|
|
|
key: test_jcc
|
|
value: [0.86206897 0.78125 0.77777778 0.92592593 0.89655172 0.74074074
|
|
0.76666667 0.88888889 0.83333333 0.80645161]
|
|
|
|
mean value: 0.8279655635891732
|
|
|
|
key: train_jcc
|
|
value: [0.86666667 0.87301587 0.86381323 0.86614173 0.86561265 0.8671875
|
|
0.88537549 0.87795276 0.87401575 0.86166008]
|
|
|
|
mean value: 0.870144172681887
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.27485728 0.28193116 0.35548592 0.28141141 0.28373504 0.31686974
|
|
0.28387547 0.28604722 0.28356791 0.33537459]
|
|
|
|
mean value: 0.29831557273864745
|
|
|
|
key: score_time
|
|
value: [0.01901937 0.01924253 0.01896763 0.01907969 0.01897073 0.01907015
|
|
0.01907754 0.01901555 0.01911354 0.01900244]
|
|
|
|
mean value: 0.019055914878845216
|
|
|
|
key: test_mcc
|
|
value: [0.85164138 0.62867836 0.77849894 0.92307692 0.89056356 0.74466871
|
|
0.73131034 0.88527041 0.81312325 0.77849894]
|
|
|
|
mean value: 0.8025330823545165
|
|
|
|
key: train_mcc
|
|
value: [0.85528213 0.80817284 0.80498447 0.85544308 0.85535013 0.85581519
|
|
0.8769849 0.86847048 0.86411148 0.85107154]
|
|
|
|
mean value: 0.8495686235274289
|
|
|
|
key: test_accuracy
|
|
value: [0.9245283 0.81132075 0.88461538 0.96153846 0.94230769 0.86538462
|
|
0.86538462 0.94230769 0.90384615 0.88461538]
|
|
|
|
mean value: 0.8985849056603774
|
|
|
|
key: train_accuracy
|
|
value: [0.92750533 0.90405117 0.90212766 0.92765957 0.92765957 0.92765957
|
|
0.93829787 0.93404255 0.93191489 0.92553191]
|
|
|
|
mean value: 0.924645012021957
|
|
|
|
key: test_fscore
|
|
value: [0.92592593 0.82758621 0.875 0.96153846 0.94545455 0.85106383
|
|
0.86792453 0.94117647 0.90909091 0.89285714]
|
|
|
|
mean value: 0.8997618020440893
|
|
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_sl.py:148: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
ros_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_sl.py:151: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
ros_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
key: train_fscore
|
|
value: [0.92857143 0.9044586 0.90416667 0.92827004 0.9279661 0.92887029
|
|
0.93920335 0.93501048 0.93277311 0.92569002]
|
|
|
|
mean value: 0.9254980097693355
|
|
|
|
key: test_precision
|
|
value: [0.89285714 0.77419355 0.95454545 0.96153846 0.89655172 0.95238095
|
|
0.85185185 0.96 0.86206897 0.83333333]
|
|
|
|
mean value: 0.8939321434549465
|
|
|
|
key: train_precision
|
|
value: [0.91701245 0.89873418 0.88571429 0.92050209 0.92405063 0.91358025
|
|
0.92561983 0.9214876 0.92116183 0.92372881]
|
|
|
|
mean value: 0.9151591960239429
|
|
|
|
key: test_recall
|
|
value: [0.96153846 0.88888889 0.80769231 0.96153846 1. 0.76923077
|
|
0.88461538 0.92307692 0.96153846 0.96153846]
|
|
|
|
mean value: 0.911965811965812
|
|
|
|
key: train_recall
|
|
value: [0.94042553 0.91025641 0.92340426 0.93617021 0.93191489 0.94468085
|
|
0.95319149 0.94893617 0.94468085 0.92765957]
|
|
|
|
mean value: 0.9361320240043645
|
|
|
|
key: test_roc_auc
|
|
value: [0.92521368 0.80982906 0.88461538 0.96153846 0.94230769 0.86538462
|
|
0.86538462 0.94230769 0.90384615 0.88461538]
|
|
|
|
mean value: 0.8985042735042735
|
|
|
|
key: train_roc_auc
|
|
value: [0.92747772 0.90406438 0.90212766 0.92765957 0.92765957 0.92765957
|
|
0.93829787 0.93404255 0.93191489 0.92553191]
|
|
|
|
mean value: 0.9246435715584652
|
|
|
|
key: test_jcc
|
|
value: [0.86206897 0.70588235 0.77777778 0.92592593 0.89655172 0.74074074
|
|
0.76666667 0.88888889 0.83333333 0.80645161]
|
|
|
|
mean value: 0.8204287988832908
|
|
|
|
key: train_jcc
|
|
value: [0.86666667 0.8255814 0.82509506 0.86614173 0.86561265 0.8671875
|
|
0.88537549 0.87795276 0.87401575 0.86166008]
|
|
|
|
mean value: 0.861528907661407
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=169)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegression(random_state=42))])
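
The pipeline printed above scales the numeric columns with MinMaxScaler, one-hot encodes the categorical columns, and passes the result to the classifier. A minimal sketch of how such a pipeline could be assembled follows. The data frame is a hypothetical stand-in with random values; only the three column names and the step names `prep`, `num`, `cat` and `model` are taken from the log, and the script that produced this log may build its pipeline differently.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Hypothetical stand-in data: column names come from the log, values are random.
rng = np.random.default_rng(42)
X = pd.DataFrame({
    "ligand_distance": rng.uniform(0, 30, size=60),
    "ddg_foldx": rng.normal(0, 2, size=60),
    "ss_class": rng.choice(["helix", "strand", "loop"], size=60),
})
y = np.array([0, 1] * 30)  # balanced dummy labels

# Column lists are written out explicitly here (an assumption for the sketch).
numerical_cols = ["ligand_distance", "ddg_foldx"]
categorical_cols = ["ss_class"]

# Scale numeric columns to [0, 1], one-hot encode categorical columns,
# and pass any remaining columns through unchanged.
prep = ColumnTransformer(
    transformers=[
        ("num", MinMaxScaler(), numerical_cols),
        ("cat", OneHotEncoder(), categorical_cols),
    ],
    remainder="passthrough",
)

pipe = Pipeline(steps=[("prep", prep),
                       ("model", LogisticRegression(random_state=42))])
pipe.fit(X, y)
pred = pipe.predict(X)
```

Keeping the preprocessing inside the pipeline means the scaler and encoder are refitted on each training fold during cross-validation, which avoids leaking information from the held-out fold.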

key: fit_time
value: [0.03136301 0.03624725 0.03802299 0.0339272 0.03786993 0.03451753
0.03579712 0.0273068 0.03700662 0.03485203]

mean value: 0.03469104766845703

key: score_time
value: [0.01213574 0.01400828 0.01402617 0.01210189 0.01214027 0.01211405
0.0120852 0.01208687 0.01415348 0.01223993]

mean value: 0.01270918846130371

key: test_mcc
value: [0.8459178 0.92427578 0.85407434 0.84544958 0.80461538 0.76662339
0.80431528 0.88289781 0.72057669 0.68 ]

mean value: 0.8128746057870775

key: train_mcc
value: [0.85120279 0.85123255 0.86870834 0.86453248 0.85558875 0.86874413
0.85565707 0.8690155 0.86912823 0.87776273]

mean value: 0.863157258117954

key: test_accuracy
value: [0.92156863 0.96078431 0.92156863 0.92156863 0.90196078 0.88235294
0.90196078 0.94117647 0.86 0.84 ]

mean value: 0.9052941176470588

key: train_accuracy
value: [0.92560175 0.92560175 0.93435449 0.9321663 0.92778993 0.93435449
0.92778993 0.93435449 0.93449782 0.93886463]

mean value: 0.9315375574517691

key: test_fscore
value: [0.92307692 0.95833333 0.92592593 0.91666667 0.90196078 0.88888889
0.90566038 0.94339623 0.85714286 0.84 ]

mean value: 0.9061051983121906

key: train_fscore
value: [0.92576419 0.92608696 0.93449782 0.93304536 0.92778993 0.93449782
0.92810458 0.93506494 0.93506494 0.93913043]

mean value: 0.9319046952651103

key: test_precision
value: [0.88888889 1. 0.86206897 0.95652174 0.92 0.85714286
0.88888889 0.92592593 0.875 0.84 ]

mean value: 0.9014437265494237

key: train_precision
value: [0.92576419 0.92207792 0.93449782 0.92307692 0.92576419 0.93043478
0.92207792 0.92307692 0.92703863 0.93506494]

mean value: 0.9268874235466126

key: test_recall
value: [0.96 0.92 1. 0.88 0.88461538 0.92307692
0.92307692 0.96153846 0.84 0.84 ]

mean value: 0.9132307692307693

key: train_recall
value: [0.92576419 0.930131 0.93449782 0.94323144 0.92982456 0.93859649
0.93421053 0.94736842 0.94323144 0.94323144]

mean value: 0.9370087336244541

key: test_roc_auc
value: [0.92230769 0.96 0.92307692 0.92076923 0.90230769 0.88153846
0.90153846 0.94076923 0.86 0.84 ]

mean value: 0.9052307692307693

key: train_roc_auc
value: [0.92560139 0.92559182 0.93435417 0.93214204 0.92779438 0.93436375
0.92780395 0.9343829 0.93449782 0.93886463]

mean value: 0.9315396843637478

key: test_jcc
value: [0.85714286 0.92 0.86206897 0.84615385 0.82142857 0.8
0.82758621 0.89285714 0.75 0.72413793]

mean value: 0.8301375521030694

key: train_jcc
value: [0.86178862 0.86234818 0.87704918 0.87449393 0.86530612 0.87704918
0.86585366 0.87804878 0.87804878 0.8852459 ]

mean value: 0.8725232327405593
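
The `key` / `value` / `mean value` triples above have the shape of the dictionary returned by scikit-learn's `cross_validate` when it is given several named scorers and `return_train_score=True`. The sketch below shows one way such output could be produced. The synthetic data, the fold settings and the exact scorer definitions are assumptions made for illustration; only the scorer names (`mcc`, `accuracy`, `fscore`, `precision`, `recall`, `roc_auc`, `jcc`) and the ten folds are taken from the log.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import jaccard_score, make_scorer, matthews_corrcoef
from sklearn.model_selection import StratifiedKFold, cross_validate

# Synthetic binary classification data (assumption: stands in for the real features).
X, y = make_classification(n_samples=200, n_features=10, random_state=42)

# Scorer names mirror the keys in the log; the definitions are assumed.
scoring = {
    "mcc": make_scorer(matthews_corrcoef),
    "accuracy": "accuracy",
    "fscore": "f1",
    "precision": "precision",
    "recall": "recall",
    "roc_auc": "roc_auc",
    "jcc": make_scorer(jaccard_score),
}

# Ten stratified folds, matching the ten values per key in the log.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)

scores = cross_validate(
    LogisticRegression(random_state=42),
    X, y,
    cv=cv,
    scoring=scoring,
    return_train_score=True,  # adds the train_* keys alongside test_*
)

# Print each metric in the same key / value / mean layout as the log.
for key, value in scores.items():
    print("key:", key)
    print("value:", value)
    print("mean value:", np.mean(value))
```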

MCC on Blind test: 0.77

Accuracy on Blind test: 0.88
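
As a quick check on the Logistic Regression figures above, the per-fold MCC values can be averaged by hand and the train and test means compared. The sketch below copies the fold values from the log; the variable names are illustrative only.

```python
from statistics import mean

# Per-fold MCC values copied from the test_mcc and train_mcc entries in the log.
test_mcc = [0.8459178, 0.92427578, 0.85407434, 0.84544958, 0.80461538,
            0.76662339, 0.80431528, 0.88289781, 0.72057669, 0.68]
train_mcc = [0.85120279, 0.85123255, 0.86870834, 0.86453248, 0.85558875,
             0.86874413, 0.85565707, 0.8690155, 0.86912823, 0.87776273]

mean_test = mean(test_mcc)
mean_train = mean(train_mcc)
gap = mean_train - mean_test  # a large gap would hint at overfitting

print(f"mean test MCC:  {mean_test:.4f}")
print(f"mean train MCC: {mean_train:.4f}")
print(f"train - test:   {gap:.4f}")
```

The recomputed means agree with the logged `mean value` lines to the precision of the printed fold values, and the train/test gap is about 0.05. The blind-test MCC of 0.77 sits a little below the cross-validated test mean.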

Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(random_state=42)
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
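
The warning above names two remedies: raising `max_iter` or scaling the data. Since the pipeline printed earlier already applies MinMaxScaler to the numeric columns, raising `max_iter` is probably the more relevant option for this run. The sketch below is an illustration on synthetic data rather than the real feature set; the helper `count_convergence_warnings` and the two `max_iter` values are hypothetical choices for the demonstration.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegressionCV

# Synthetic data (assumption: stands in for the real, already scaled features).
X, y = make_classification(n_samples=300, n_features=20, random_state=42)


def count_convergence_warnings(max_iter):
    """Fit LogisticRegressionCV and count the ConvergenceWarnings it emits."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, not just the first
        LogisticRegressionCV(max_iter=max_iter, random_state=42).fit(X, y)
    return sum(issubclass(w.category, ConvergenceWarning) for w in caught)


# A very small iteration budget versus a generous one.
few_iter_warnings = count_convergence_warnings(max_iter=2)
many_iter_warnings = count_convergence_warnings(max_iter=5000)

print("warnings with max_iter=2:   ", few_iter_warnings)
print("warnings with max_iter=5000:", many_iter_warnings)
```

LogisticRegressionCV fits one model per fold and per candidate regularisation strength, so a single call can emit the same warning many times, which is consistent with the long run of identical warnings in this log.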
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.08443904 0.96518111 0.84693241 0.97842479 0.87160254 0.96375942
|
|
0.91595888 0.93315125 0.96591616 0.96043491]
|
|
|
|
mean value: 0.9485800504684448
|
|
|
|
key: score_time
|
|
value: [0.01489997 0.01467752 0.01478934 0.02118349 0.01491809 0.01476455
|
|
0.01487756 0.01552033 0.01489472 0.01519156]
|
|
|
|
mean value: 0.0155717134475708
|
|
|
|
key: test_mcc
|
|
value: [0.8459178 0.88289781 0.88872671 0.80904133 0.88307692 0.76461538
|
|
0.80431528 0.92153846 0.76 0.6821865 ]
|
|
|
|
mean value: 0.8242316201242978
|
|
|
|
key: train_mcc
|
|
value: [0.90375223 0.89066391 0.91247223 0.95627191 0.88184708 0.90375591
|
|
0.89059986 0.89956325 0.90406806 0.96510231]
|
|
|
|
mean value: 0.9108096761378437
|
|
|
|
key: test_accuracy
|
|
value: [0.92156863 0.94117647 0.94117647 0.90196078 0.94117647 0.88235294
|
|
0.90196078 0.96078431 0.88 0.84 ]
|
|
|
|
mean value: 0.9112156862745098
|
|
|
|
key: train_accuracy
|
|
value: [0.95185996 0.9452954 0.95623632 0.97811816 0.94091904 0.95185996
|
|
0.9452954 0.94967177 0.95196507 0.98253275]
|
|
|
|
mean value: 0.9553753834099357
|
|
|
|
key: test_fscore
|
|
value: [0.92307692 0.93877551 0.94339623 0.89361702 0.94117647 0.88461538
|
|
0.90566038 0.96153846 0.88 0.83333333]
|
|
|
|
mean value: 0.91051897084066
|
|
|
|
key: train_fscore
|
|
value: [0.95217391 0.94577007 0.95633188 0.97826087 0.94091904 0.95196507
|
|
0.9452954 0.95010846 0.95238095 0.98245614]
|
|
|
|
mean value: 0.9555661785530866
|
|
|
|
key: test_precision
|
|
value: [0.88888889 0.95833333 0.89285714 0.95454545 0.96 0.88461538
|
|
0.88888889 0.96153846 0.88 0.86956522]
|
|
|
|
mean value: 0.9139232772058858
|
|
|
|
key: train_precision
|
|
value: [0.94805195 0.93965517 0.95633188 0.97402597 0.93886463 0.94782609
|
|
0.94323144 0.93991416 0.94420601 0.98678414]
|
|
|
|
mean value: 0.9518891441689473
|
|
|
|
key: test_recall
|
|
value: [0.96 0.92 1. 0.84 0.92307692 0.88461538
|
|
0.92307692 0.96153846 0.88 0.8 ]
|
|
|
|
mean value: 0.9092307692307693
|
|
|
|
key: train_recall
|
|
value: [0.95633188 0.95196507 0.95633188 0.98253275 0.94298246 0.95614035
|
|
0.94736842 0.96052632 0.96069869 0.97816594]
|
|
|
|
mean value: 0.9593043744733012
|
|
|
|
key: test_roc_auc
|
|
value: [0.92230769 0.94076923 0.94230769 0.90076923 0.94153846 0.88230769
|
|
0.90153846 0.96076923 0.88 0.84 ]
|
|
|
|
mean value: 0.9112307692307692
|
|
|
|
key: train_roc_auc
|
|
value: [0.95185015 0.94528078 0.95623611 0.97810848 0.94092354 0.9518693
|
|
0.94529993 0.94969547 0.95196507 0.98253275]
|
|
|
|
mean value: 0.955376158737455
|
|
|
|
key: test_jcc
|
|
value: [0.85714286 0.88461538 0.89285714 0.80769231 0.88888889 0.79310345
|
|
0.82758621 0.92592593 0.78571429 0.71428571]
|
|
|
|
mean value: 0.8377812162294921
|
|
|
|
key: train_jcc
|
|
value: [0.90871369 0.89711934 0.91631799 0.95744681 0.88842975 0.90833333
|
|
0.89626556 0.90495868 0.90909091 0.96551724]
|
|
|
|
mean value: 0.9152193308373876
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01459789 0.01174212 0.01132417 0.01122999 0.0112493 0.00993657
|
|
0.00992107 0.01049495 0.00995612 0.01000881]
|
|
|
|
mean value: 0.011046099662780761
|
|
|
|
key: score_time
|
|
value: [0.01251364 0.01037288 0.00981545 0.0099647 0.0097928 0.00900888
|
|
0.00895166 0.00888586 0.0090785 0.00888777]
|
|
|
|
mean value: 0.009727215766906739
|
|
|
|
key: test_mcc
|
|
value: [0.7531751 0.6938347 0.64715023 0.64769231 0.70728397 0.60769231
|
|
0.77353193 0.5372904 0.68 0.61806423]
|
|
|
|
mean value: 0.6665715181657266
|
|
|
|
key: train_mcc
|
|
value: [0.7051679 0.68663317 0.70105568 0.71111913 0.69491764 0.71441791
|
|
0.68598516 0.69626736 0.71979689 0.71353415]
|
|
|
|
mean value: 0.7028894982851523
|
|
|
|
key: test_accuracy
|
|
value: [0.8627451 0.84313725 0.82352941 0.82352941 0.84313725 0.80392157
|
|
0.88235294 0.76470588 0.84 0.8 ]
|
|
|
|
mean value: 0.8287058823529412
|
|
|
|
key: train_accuracy
|
|
value: [0.8512035 0.84245077 0.84901532 0.85339168 0.84682713 0.85557987
|
|
0.84026258 0.84682713 0.8580786 0.8558952 ]
|
|
|
|
mean value: 0.8499531785997535
|
|
|
|
key: test_fscore
|
|
value: [0.8372093 0.82608696 0.81632653 0.82352941 0.82608696 0.80769231
|
|
0.89285714 0.75 0.84 0.77272727]
|
|
|
|
mean value: 0.8192515881022734
|
|
|
|
key: train_fscore
|
|
value: [0.84474886 0.83710407 0.84210526 0.84526559 0.84162896 0.84792627
|
|
0.82903981 0.83944954 0.85057471 0.85067873]
|
|
|
|
mean value: 0.8428521809081373
|
|
|
|
key: test_precision
|
|
value: [1. 0.9047619 0.83333333 0.80769231 0.95 0.80769231
|
|
0.83333333 0.81818182 0.84 0.89473684]
|
|
|
|
mean value: 0.8689731847100268
|
|
|
|
key: train_precision
|
|
value: [0.88516746 0.8685446 0.88461538 0.89705882 0.86915888 0.89320388
|
|
0.88944724 0.87980769 0.89805825 0.88262911]
|
|
|
|
mean value: 0.8847691324095417
|
|
|
|
key: test_recall
|
|
value: [0.72 0.76 0.8 0.84 0.73076923 0.80769231
|
|
0.96153846 0.69230769 0.84 0.68 ]
|
|
|
|
mean value: 0.7832307692307692
|
|
|
|
key: train_recall
|
|
value: [0.80786026 0.80786026 0.80349345 0.79912664 0.81578947 0.80701754
|
|
0.77631579 0.80263158 0.80786026 0.8209607 ]
|
|
|
|
mean value: 0.8048915958017314
|
|
|
|
key: test_roc_auc
|
|
value: [0.86 0.84153846 0.82307692 0.82384615 0.84538462 0.80384615
|
|
0.88076923 0.76615385 0.84 0.8 ]
|
|
|
|
mean value: 0.8284615384615385
|
|
|
|
key: train_roc_auc
|
|
value: [0.85129855 0.84252662 0.84911515 0.85351069 0.84675937 0.85547384
|
|
0.84012296 0.84673064 0.8580786 0.8558952 ]
|
|
|
|
mean value: 0.8499511606527235
|
|
|
|
key: test_jcc
|
|
value: [0.72 0.7037037 0.68965517 0.7 0.7037037 0.67741935
|
|
0.80645161 0.6 0.72413793 0.62962963]
|
|
|
|
mean value: 0.6954701108227248
|
|
|
|
key: train_jcc
|
|
value: [0.7312253 0.71984436 0.72727273 0.732 0.7265625 0.736
|
|
0.708 0.72332016 0.74 0.74015748]
|
|
|
|
mean value: 0.7284382520109796
|
|
|
|
MCC on Blind test: 0.63
|
|
|
|
Accuracy on Blind test: 0.81
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01254296 0.01074314 0.0115149 0.01157999 0.01032472 0.01010489
|
|
0.01028371 0.01140451 0.01022911 0.01108527]
|
|
|
|
mean value: 0.010981321334838867
|
|
|
|
key: score_time
|
|
value: [0.00986099 0.00936627 0.00975776 0.00993228 0.00894356 0.0091629
|
|
0.00905657 0.00938177 0.00915146 0.01000309]
|
|
|
|
mean value: 0.009461665153503418
|
|
|
|
key: test_mcc
|
|
value: [0.88289781 0.62355907 0.77487835 0.68875274 0.72615385 0.72573276
|
|
0.68779719 0.61017022 0.60783067 0.72524067]
|
|
|
|
mean value: 0.705301332855258
|
|
|
|
key: train_mcc
|
|
value: [0.75071367 0.74619319 0.75930821 0.77253746 0.74212413 0.74619319
|
|
0.72014338 0.75504732 0.77747792 0.76862491]
|
|
|
|
mean value: 0.7538363372752251
|
|
|
|
key: test_accuracy
|
|
value: [0.94117647 0.80392157 0.88235294 0.84313725 0.8627451 0.8627451
|
|
0.84313725 0.80392157 0.8 0.86 ]
|
|
|
|
mean value: 0.8503137254901961
|
|
|
|
key: train_accuracy
|
|
value: [0.87527352 0.87308534 0.87964989 0.88621444 0.87089716 0.87308534
|
|
0.85995624 0.87746171 0.88864629 0.88427948]
|
|
|
|
mean value: 0.876854939657726
|
|
|
|
key: test_fscore
|
|
value: [0.93877551 0.77272727 0.88888889 0.84615385 0.8627451 0.86792453
|
|
0.85185185 0.8 0.81481481 0.85106383]
|
|
|
|
mean value: 0.8494945640769093
|
|
|
|
key: train_fscore
|
|
value: [0.87688985 0.87391304 0.87964989 0.88744589 0.86859688 0.8722467
|
|
0.85777778 0.87826087 0.88984881 0.88503254]
|
|
|
|
mean value: 0.8769662245721188
|
|
|
|
key: test_precision
|
|
value: [0.95833333 0.89473684 0.82758621 0.81481481 0.88 0.85185185
|
|
0.82142857 0.83333333 0.75862069 0.90909091]
|
|
|
|
mean value: 0.8549796552509801
|
|
|
|
key: train_precision
|
|
value: [0.86752137 0.87012987 0.88157895 0.87982833 0.88235294 0.87610619
|
|
0.86936937 0.87068966 0.88034188 0.87931034]
|
|
|
|
mean value: 0.8757228896777902
|
|
|
|
key: test_recall
|
|
value: [0.92 0.68 0.96 0.88 0.84615385 0.88461538
|
|
0.88461538 0.76923077 0.88 0.8 ]
|
|
|
|
mean value: 0.8504615384615385
|
|
|
|
key: train_recall
|
|
value: [0.88646288 0.87772926 0.87772926 0.89519651 0.85526316 0.86842105
|
|
0.84649123 0.88596491 0.89956332 0.89082969]
|
|
|
|
mean value: 0.878365126790776
|
|
|
|
key: test_roc_auc
|
|
value: [0.94076923 0.80153846 0.88384615 0.84384615 0.86307692 0.86230769
|
|
0.84230769 0.80461538 0.8 0.86 ]
|
|
|
|
mean value: 0.8502307692307692
|
|
|
|
key: train_roc_auc
|
|
value: [0.87524898 0.87307516 0.8796541 0.88619474 0.87086302 0.87307516
|
|
0.85992684 0.87748027 0.88864629 0.88427948]
|
|
|
|
mean value: 0.8768444035853827
|
|
|
|
key: test_jcc
|
|
value: [0.88461538 0.62962963 0.8 0.73333333 0.75862069 0.76666667
|
|
0.74193548 0.66666667 0.6875 0.74074074]
|
|
|
|
mean value: 0.7409708595178561
|
|
|
|
key: train_jcc
|
|
value: [0.78076923 0.77606178 0.78515625 0.79766537 0.76771654 0.7734375
|
|
0.75097276 0.78294574 0.80155642 0.79377432]
|
|
|
|
mean value: 0.7810055900293517
|
|
|
|
MCC on Blind test: 0.76
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.0107193 0.01060081 0.01052904 0.01056099 0.01059866 0.01077914
|
|
0.0097878 0.00977492 0.00972342 0.00962353]
|
|
|
|
mean value: 0.010269761085510254
|
|
|
|
key: score_time
|
|
value: [0.01285338 0.01341796 0.01517129 0.01273727 0.01308703 0.0128603
|
|
0.01233506 0.01229858 0.01202655 0.01252079]
|
|
|
|
mean value: 0.012930822372436524
|
|
|
|
key: test_mcc
|
|
value: [0.72984534 0.2668549 0.61017022 0.49076923 0.48998517 0.68875274
|
|
0.61017022 0.41140265 0.6 0.5 ]
|
|
|
|
mean value: 0.539795046786689
|
|
|
|
key: train_mcc
|
|
value: [0.69010909 0.70240558 0.68928004 0.69803298 0.72889968 0.71606598
|
|
0.68082181 0.72482631 0.70759226 0.72995395]
|
|
|
|
mean value: 0.7067987676078814
|
|
|
|
key: test_accuracy
|
|
value: [0.8627451 0.62745098 0.80392157 0.74509804 0.74509804 0.84313725
|
|
0.80392157 0.70588235 0.8 0.74 ]
|
|
|
|
mean value: 0.7677254901960785
|
|
|
|
key: train_accuracy
|
|
value: [0.84463895 0.8512035 0.84463895 0.84901532 0.8643326 0.85776805
|
|
0.84026258 0.86214442 0.85371179 0.86462882]
|
|
|
|
mean value: 0.8532344987721326
|
|
|
|
key: test_fscore
|
|
value: [0.85106383 0.53658537 0.80769231 0.74509804 0.75471698 0.84
|
|
0.8 0.71698113 0.8 0.69767442]
|
|
|
|
mean value: 0.7549812074361085
|
|
|
|
key: train_fscore
|
|
value: [0.84116331 0.85152838 0.8453159 0.8496732 0.86222222 0.85458613
|
|
0.83741648 0.8590604 0.85209713 0.86160714]
|
|
|
|
mean value: 0.8514670310824969
|
|
|
|
key: test_precision
|
|
value: [0.90909091 0.6875 0.77777778 0.73076923 0.74074074 0.875
|
|
0.83333333 0.7037037 0.8 0.83333333]
|
|
|
|
mean value: 0.7891249028749029
|
|
|
|
key: train_precision
|
|
value: [0.86238532 0.85152838 0.84347826 0.84782609 0.87387387 0.87214612
|
|
0.85067873 0.87671233 0.86160714 0.88127854]
|
|
|
|
mean value: 0.8621514789270541
|
|
|
|
key: test_recall
|
|
value: [0.8 0.44 0.84 0.76 0.76923077 0.80769231
|
|
0.76923077 0.73076923 0.8 0.6 ]
|
|
|
|
mean value: 0.7316923076923078
|
|
|
|
key: train_recall
|
|
value: [0.8209607 0.85152838 0.84716157 0.85152838 0.85087719 0.8377193
|
|
0.8245614 0.84210526 0.84279476 0.84279476]
|
|
|
|
mean value: 0.8412031716846702
|
|
|
|
key: test_roc_auc
|
|
value: [0.86153846 0.62384615 0.80461538 0.74538462 0.74461538 0.84384615
|
|
0.80461538 0.70538462 0.8 0.74 ]
|
|
|
|
mean value: 0.7673846153846153
|
|
|
|
key: train_roc_auc
|
|
value: [0.84469088 0.85120279 0.84463342 0.84900981 0.86430323 0.85772428
|
|
0.8402283 0.86210067 0.85371179 0.86462882]
|
|
|
|
mean value: 0.8532233969202482
|
|
|
|
key: test_jcc
|
|
value: [0.74074074 0.36666667 0.67741935 0.59375 0.60606061 0.72413793
|
|
0.66666667 0.55882353 0.66666667 0.53571429]
|
|
|
|
mean value: 0.613664644780059
|
|
|
|
key: train_jcc
|
|
value: [0.72586873 0.74144487 0.73207547 0.73863636 0.7578125 0.74609375
|
|
0.72030651 0.75294118 0.74230769 0.75686275]
|
|
|
|
mean value: 0.7414349805409636
|
|
|
|
MCC on Blind test: 0.38
|
|
|
|
Accuracy on Blind test: 0.69
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02395844 0.01976371 0.02095556 0.02216721 0.02027059 0.0211978
|
|
0.01987791 0.02031898 0.02200055 0.01978111]
|
|
|
|
mean value: 0.021029186248779298
|
|
|
|
key: score_time
|
|
value: [0.01175141 0.0115068 0.01230741 0.01178312 0.01150537 0.01177835
|
|
0.01278973 0.01174521 0.0115788 0.0126636 ]
|
|
|
|
mean value: 0.011940979957580566
|
|
|
|
key: test_mcc
|
|
value: [0.8459178 0.88823731 0.85407434 0.80431528 0.80461538 0.80431528
|
|
0.80461538 0.80431528 0.76 0.64051262]
|
|
|
|
mean value: 0.8010918682007854
|
|
|
|
key: train_mcc
|
|
value: [0.79431931 0.7943723 0.79881623 0.80306832 0.80307209 0.80746615
|
|
0.80307209 0.80306832 0.81223482 0.81662503]
|
|
|
|
mean value: 0.8036114663208501
|
|
|
|
key: test_accuracy
|
|
value: [0.92156863 0.94117647 0.92156863 0.90196078 0.90196078 0.90196078
|
|
0.90196078 0.90196078 0.88 0.82 ]
|
|
|
|
mean value: 0.8994117647058824
|
|
|
|
key: train_accuracy
|
|
value: [0.89715536 0.89715536 0.89934354 0.90153173 0.90153173 0.90371991
|
|
0.90153173 0.90153173 0.90611354 0.90829694]
|
|
|
|
mean value: 0.9017911574441249
|
|
|
|
key: test_fscore
|
|
value: [0.92307692 0.93617021 0.92592593 0.89795918 0.90196078 0.90566038
|
|
0.90196078 0.90566038 0.88 0.81632653]
|
|
|
|
mean value: 0.8994701099398953
|
|
|
|
key: train_fscore
|
|
value: [0.89715536 0.89804772 0.89867841 0.90196078 0.90153173 0.9030837
|
|
0.90153173 0.9010989 0.90631808 0.90869565]
|
|
|
|
mean value: 0.9018102075636133
|
|
|
|
key: test_precision
|
|
value: [0.88888889 1. 0.86206897 0.91666667 0.92 0.88888889
|
|
0.92 0.88888889 0.88 0.83333333]
|
|
|
|
mean value: 0.8998735632183907
|
|
|
|
key: train_precision
|
|
value: [0.89912281 0.89224138 0.90666667 0.9 0.89956332 0.90707965
|
|
0.89956332 0.9030837 0.90434783 0.9047619 ]
|
|
|
|
mean value: 0.901643056785623
|
|
|
|
key: test_recall
|
|
value: [0.96 0.88 1. 0.88 0.88461538 0.92307692
|
|
0.88461538 0.92307692 0.88 0.8 ]
|
|
|
|
mean value: 0.9015384615384615
|
|
|
|
key: train_recall
|
|
value: [0.89519651 0.90393013 0.89082969 0.90393013 0.90350877 0.89912281
|
|
0.90350877 0.89912281 0.90829694 0.91266376]
|
|
|
|
mean value: 0.902011031946679
|
|
|
|
key: test_roc_auc
|
|
value: [0.92230769 0.94 0.92307692 0.90153846 0.90230769 0.90153846
|
|
0.90230769 0.90153846 0.88 0.82 ]
|
|
|
|
mean value: 0.8994615384615384
|
|
|
|
key: train_roc_auc
|
|
value: [0.89715966 0.8971405 0.89936222 0.90152647 0.90153605 0.90370988
|
|
0.90153605 0.90152647 0.90611354 0.90829694]
|
|
|
|
mean value: 0.9017907760668046
|
|
|
|
key: test_jcc
|
|
value: [0.85714286 0.88 0.86206897 0.81481481 0.82142857 0.82758621
|
|
0.82142857 0.82758621 0.78571429 0.68965517]
|
|
|
|
mean value: 0.8187425652253238
|
|
|
|
key: train_jcc
|
|
value: [0.81349206 0.81496063 0.816 0.82142857 0.82071713 0.82329317
|
|
0.82071713 0.82 0.82868526 0.83266932]
|
|
|
|
mean value: 0.8211963282154172
|
|
|
|
MCC on Blind test: 0.72
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.95274234 1.77280283 1.89924169 1.68007755 2.21786213 1.90000033
|
|
1.98987079 1.87301683 1.61986661 2.00075674]
|
|
|
|
mean value: 1.8906237840652467
|
|
|
|
key: score_time
|
|
value: [0.01277399 0.01594472 0.01440883 0.01712847 0.0163753 0.01307678
|
|
0.0148952 0.02330542 0.01241088 0.01312971]
|
|
|
|
mean value: 0.015344929695129395
|
|
|
|
key: test_mcc
|
|
value: [0.8459178 0.88289781 0.8459178 0.80904133 0.84307692 0.65224812
|
|
0.84307692 0.80431528 0.72057669 0.64051262]
|
|
|
|
mean value: 0.7887581298524704
|
|
|
|
key: train_mcc
|
|
value: [0.99124722 0.9956331 1. 0.98695627 0.99128503 0.99563319
|
|
0.98688041 0.99563319 0.96966347 1. ]
|
|
|
|
mean value: 0.9912931889984316
|
|
|
|
key: test_accuracy
|
|
value: [0.92156863 0.94117647 0.92156863 0.90196078 0.92156863 0.82352941
|
|
0.92156863 0.90196078 0.86 0.82 ]
|
|
|
|
mean value: 0.8934901960784314
|
|
|
|
key: train_accuracy
|
|
value: [0.99562363 0.99781182 1. 0.99343545 0.99562363 0.99781182
|
|
0.99343545 0.99781182 0.98471616 1. ]
|
|
|
|
mean value: 0.9956269767708522
|
|
|
|
key: test_fscore
|
|
value: [0.92307692 0.93877551 0.92307692 0.89361702 0.92307692 0.81632653
|
|
0.92307692 0.90566038 0.8627451 0.81632653]
|
|
|
|
mean value: 0.8925758760410566
|
|
|
|
key: train_fscore
|
|
value: [0.99563319 0.99782135 1. 0.99340659 0.99559471 0.99781182
|
|
0.99343545 0.99781182 0.98488121 1. ]
|
|
|
|
mean value: 0.9956396136064475
|
|
|
|
key: test_precision
|
|
value: [0.88888889 0.95833333 0.88888889 0.95454545 0.92307692 0.86956522
|
|
0.92307692 0.88888889 0.84615385 0.83333333]
|
|
|
|
mean value: 0.8974751697577784
|
|
|
|
key: train_precision
|
|
value: [0.99563319 0.99565217 1. 1. 1. 0.99563319
|
|
0.99126638 0.99563319 0.97435897 1. ]
|
|
|
|
mean value: 0.9948177087136647
|
|
|
|
key: test_recall
|
|
value: [0.96 0.92 0.96 0.84 0.92307692 0.76923077
|
|
0.92307692 0.92307692 0.88 0.8 ]
|
|
|
|
mean value: 0.8898461538461538
|
|
|
|
key: train_recall
|
|
value: [0.99563319 1. 1. 0.98689956 0.99122807 1.
|
|
0.99561404 1. 0.99563319 1. ]
|
|
|
|
mean value: 0.9965008044127787
|
|
|
|
key: test_roc_auc
|
|
value: [0.92230769 0.94076923 0.92230769 0.90076923 0.92153846 0.82461538
|
|
0.92153846 0.90153846 0.86 0.82 ]
|
|
|
|
mean value: 0.8935384615384615
|
|
|
|
key: train_roc_auc
|
|
value: [0.99562361 0.99780702 1. 0.99344978 0.99561404 0.99781659
|
|
0.99344021 0.99781659 0.98471616 1. ]
|
|
|
|
mean value: 0.9956283996016241
|
|
|
|
key: test_jcc
|
|
value: [0.85714286 0.88461538 0.85714286 0.80769231 0.85714286 0.68965517
|
|
0.85714286 0.82758621 0.75862069 0.68965517]
|
|
|
|
mean value: 0.8086396362258431
|
|
|
|
key: train_jcc
|
|
value: [0.99130435 0.99565217 1. 0.98689956 0.99122807 0.99563319
|
|
0.98695652 0.99563319 0.97021277 1. ]
|
|
|
|
mean value: 0.9913519818475776
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.04281473 0.02258086 0.01959491 0.02242303 0.01986456 0.02302599
|
|
0.02203774 0.0218904 0.02118134 0.02031946]
|
|
|
|
mean value: 0.02357330322265625
|
|
|
|
key: score_time
|
|
value: [0.01040149 0.00911903 0.00908256 0.0089097 0.00903273 0.00893998
|
|
0.00948405 0.00905752 0.00999212 0.00940347]
|
|
|
|
mean value: 0.009342265129089356
|
|
|
|
key: test_mcc
|
|
value: [0.96153846 0.76662339 0.92450033 0.64715023 0.96148034 0.92153846
|
|
0.96148034 0.96153846 0.88070485 0.76 ]
|
|
|
|
mean value: 0.8746554854753938
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98039216 0.88235294 0.96078431 0.82352941 0.98039216 0.96078431
|
|
0.98039216 0.98039216 0.94 0.88 ]
|
|
|
|
mean value: 0.9369019607843136
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98039216 0.875 0.96153846 0.81632653 0.98113208 0.96153846
|
|
0.98113208 0.98039216 0.93877551 0.88 ]
|
|
|
|
mean value: 0.9356227428562136
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96153846 0.91304348 0.92592593 0.83333333 0.96296296 0.96153846
|
|
0.96296296 1. 0.95833333 0.88 ]
|
|
|
|
mean value: 0.9359638919856311
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.84 1. 0.8 1. 0.96153846
|
|
1. 0.96153846 0.92 0.88 ]
|
|
|
|
mean value: 0.9363076923076923
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98076923 0.88153846 0.96153846 0.82307692 0.98 0.96076923
|
|
0.98 0.98076923 0.94 0.88 ]
|
|
|
|
mean value: 0.9368461538461539
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96153846 0.77777778 0.92592593 0.68965517 0.96296296 0.92592593
|
|
0.96296296 0.96153846 0.88461538 0.78571429]
|
|
|
|
mean value: 0.8838617321375942
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.82
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.12462068 0.12486506 0.12026739 0.12022614 0.12115216 0.11893344
|
|
0.11995935 0.11945629 0.11925888 0.11902761]
|
|
|
|
mean value: 0.12077670097351074
|
|
|
|
key: score_time
|
|
value: [0.01896501 0.01762033 0.01811218 0.01792049 0.01771355 0.01781607
|
|
0.01778412 0.01798749 0.01778889 0.01803756]
|
|
|
|
mean value: 0.01797456741333008
|
|
|
|
key: test_mcc
|
|
value: [0.92153846 0.78581168 0.82041265 0.88289781 0.76733527 0.68779719
|
|
0.80904133 0.73107432 0.6821865 0.72057669]
|
|
|
|
mean value: 0.7808671912348781
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.96078431 0.88235294 0.90196078 0.94117647 0.88235294 0.84313725
|
|
0.90196078 0.8627451 0.84 0.86 ]
|
|
|
|
mean value: 0.8876470588235295
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.96 0.86363636 0.90909091 0.93877551 0.88 0.85185185
|
|
0.90909091 0.85714286 0.84615385 0.8627451 ]
|
|
|
|
mean value: 0.8878487345210034
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96 1. 0.83333333 0.95833333 0.91666667 0.82142857
|
|
0.86206897 0.91304348 0.81481481 0.84615385]
|
|
|
|
mean value: 0.8925843009508676
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96 0.76 1. 0.92 0.84615385 0.88461538
|
|
0.96153846 0.80769231 0.88 0.88 ]
|
|
|
|
mean value: 0.89
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96076923 0.88 0.90384615 0.94076923 0.88307692 0.84230769
|
|
0.90076923 0.86384615 0.84 0.86 ]
|
|
|
|
mean value: 0.8875384615384615
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.92307692 0.76 0.83333333 0.88461538 0.78571429 0.74193548
|
|
0.83333333 0.75 0.73333333 0.75862069]
|
|
|
|
mean value: 0.8003962766932734
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.81
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01009369 0.0099442 0.00995088 0.00998259 0.01005554 0.01011753
|
|
0.00999284 0.01003075 0.01016021 0.01010418]
|
|
|
|
mean value: 0.01004323959350586
|
|
|
|
key: score_time
|
|
value: [0.00866437 0.00865102 0.00874305 0.008708 0.00879359 0.00871396
|
|
0.00872636 0.00874496 0.00876403 0.00880837]
|
|
|
|
mean value: 0.008731770515441894
|
|
|
|
key: test_mcc
|
|
value: [0.72984534 0.53444024 0.61648638 0.41306141 0.61017022 0.30559708
|
|
0.5301448 0.29366622 0.52167203 0.24174689]
|
|
|
|
mean value: 0.47968306142454376
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.8627451 0.76470588 0.80392157 0.70588235 0.80392157 0.64705882
|
|
0.76470588 0.64705882 0.76 0.62 ]
|
|
|
|
mean value: 0.738
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.85106383 0.73913043 0.81481481 0.68085106 0.8 0.60869565
|
|
0.77777778 0.66666667 0.76923077 0.64150943]
|
|
|
|
mean value: 0.7349740443025836
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.90909091 0.80952381 0.75862069 0.72727273 0.83333333 0.7
|
|
0.75 0.64285714 0.74074074 0.60714286]
|
|
|
|
mean value: 0.7478582209616692
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.8 0.68 0.88 0.64 0.76923077 0.53846154
|
|
0.80769231 0.69230769 0.8 0.68 ]
|
|
|
|
mean value: 0.7287692307692308
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.86153846 0.76307692 0.80538462 0.70461538 0.80461538 0.64923077
|
|
0.76384615 0.64615385 0.76 0.62 ]
|
|
|
|
mean value: 0.7378461538461538
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.74074074 0.5862069 0.6875 0.51612903 0.66666667 0.4375
|
|
0.63636364 0.5 0.625 0.47222222]
|
|
|
|
mean value: 0.5868329194803055
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.79
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.84472108 1.87050414 1.77609229 1.76353192 1.76929617 1.75256467
|
|
1.76388931 1.7507031 1.74432325 1.72300816]
|
|
|
|
mean value: 1.7758634090423584
|
|
|
|
key: score_time
|
|
value: [0.10199165 0.10157084 0.09312439 0.09370327 0.0956409 0.09323788
|
|
0.1439209 0.09496427 0.09200835 0.09272313]
|
|
|
|
mean value: 0.10028855800628662
|
|
|
|
key: test_mcc
|
|
value: [0.96148034 0.88823731 0.88872671 0.84307692 1. 0.88289781
|
|
0.92427578 0.92450033 0.84 0.88070485]
|
|
|
|
mean value: 0.9033900045709102
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98039216 0.94117647 0.94117647 0.92156863 1. 0.94117647
|
|
0.96078431 0.96078431 0.92 0.94 ]
|
|
|
|
mean value: 0.9507058823529412
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.97959184 0.93617021 0.94339623 0.92 1. 0.94339623
|
|
0.96296296 0.96 0.92 0.93877551]
|
|
|
|
mean value: 0.9504292975497886
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.89285714 0.92 1. 0.92592593
|
|
0.92857143 1. 0.92 0.95833333]
|
|
|
|
mean value: 0.9545687830687831
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96 0.88 1. 0.92 1. 0.96153846
|
|
1. 0.92307692 0.92 0.92 ]
|
|
|
|
mean value: 0.9484615384615385
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98 0.94 0.94230769 0.92153846 1. 0.94076923
|
|
0.96 0.96153846 0.92 0.94 ]
|
|
|
|
mean value: 0.9506153846153846
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96 0.88 0.89285714 0.85185185 1. 0.89285714
|
|
0.92857143 0.92307692 0.85185185 0.88461538]
|
|
|
|
mean value: 0.9065681725681726
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.81
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC0...05', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.96350503 0.96648908 0.9615953 0.95503259 0.93873477 0.94843411
|
|
0.95473123 0.98758698 0.94644499 0.94424415]
|
|
|
|
mean value: 0.9566798210144043
|
|
|
|
key: score_time
|
|
value: [0.16573906 0.28320813 0.27402806 0.27604485 0.13133049 0.31606078
|
|
0.21691132 0.13033724 0.27074957 0.25981092]
|
|
|
|
mean value: 0.23242204189300536
|
|
|
|
key: test_mcc
|
|
value: [0.96148034 0.96148034 0.88872671 0.84307692 0.96148034 0.88289781
|
|
0.88823731 0.92450033 0.84 0.88070485]
|
|
|
|
mean value: 0.903258494769246
|
|
|
|
key: train_mcc
|
|
value: [0.9518693 0.94748334 0.95194315 0.96062133 0.93873056 0.95186838
|
|
0.95186838 0.94751863 0.95633188 0.96070785]
|
|
|
|
mean value: 0.9518942794629011
|
|
|
|
key: test_accuracy
|
|
value: [0.98039216 0.98039216 0.94117647 0.92156863 0.98039216 0.94117647
|
|
0.94117647 0.96078431 0.92 0.94 ]
|
|
|
|
mean value: 0.9507058823529412
|
|
|
|
key: train_accuracy
|
|
value: [0.97592998 0.97374179 0.97592998 0.98030635 0.96936543 0.97592998
|
|
0.97592998 0.97374179 0.97816594 0.98034934]
|
|
|
|
mean value: 0.975939055736577
|
|
|
|
key: test_fscore
|
|
value: [0.97959184 0.97959184 0.94339623 0.92 0.98113208 0.94339623
|
|
0.94545455 0.96 0.92 0.93877551]
|
|
|
|
mean value: 0.9511338257429902
|
|
|
|
key: train_fscore
|
|
value: [0.97592998 0.97379913 0.97582418 0.98039216 0.96929825 0.97582418
|
|
0.97582418 0.97356828 0.97816594 0.98039216]
|
|
|
|
mean value: 0.9759018412370725
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.89285714 0.92 0.96296296 0.92592593
|
|
0.89655172 1. 0.92 0.95833333]
|
|
|
|
mean value: 0.9476631089217297
|
|
|
|
key: train_precision
|
|
value: [0.97807018 0.97379913 0.98230088 0.97826087 0.96929825 0.97797357
|
|
0.97797357 0.97787611 0.97816594 0.97826087]
|
|
|
|
mean value: 0.9771979353399569
|
|
|
|
key: test_recall
|
|
value: [0.96 0.96 1. 0.92 1. 0.96153846
|
|
1. 0.92307692 0.92 0.92 ]
|
|
|
|
mean value: 0.9564615384615385
|
|
|
|
key: train_recall
|
|
value: [0.97379913 0.97379913 0.96943231 0.98253275 0.96929825 0.97368421
|
|
0.97368421 0.96929825 0.97816594 0.98253275]
|
|
|
|
mean value: 0.9746226921014326
|
|
|
|
key: test_roc_auc
|
|
value: [0.98 0.98 0.94230769 0.92153846 0.98 0.94076923
|
|
0.94 0.96153846 0.92 0.94 ]
|
|
|
|
mean value: 0.9506153846153846
|
|
|
|
key: train_roc_auc
|
|
value: [0.97593465 0.97374167 0.97594423 0.98030146 0.96936528 0.97592507
|
|
0.97592507 0.97373209 0.97816594 0.98034934]
|
|
|
|
mean value: 0.9759384815751169
|
|
|
|
key: test_jcc
|
|
value: [0.96 0.96 0.89285714 0.85185185 0.96296296 0.89285714
|
|
0.89655172 0.92307692 0.85185185 0.88461538]
|
|
|
|
mean value: 0.9076624984211191
|
|
|
|
key: train_jcc
|
|
value: [0.95299145 0.94893617 0.9527897 0.96153846 0.94042553 0.9527897
|
|
0.9527897 0.94849785 0.95726496 0.96153846]
|
|
|
|
mean value: 0.9529561988250692
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01163793 0.01159668 0.01061368 0.01021719 0.01039076 0.01039505
|
|
0.01055193 0.01151991 0.01037455 0.01054192]
|
|
|
|
mean value: 0.010783958435058593
|
|
|
|
key: score_time
|
|
value: [0.01002407 0.00981021 0.00904036 0.00957513 0.00927067 0.00931168
|
|
0.00938725 0.00996947 0.00909567 0.0091548 ]
|
|
|
|
mean value: 0.009463930130004882
|
|
|
|
key: test_mcc
|
|
value: [0.88289781 0.62355907 0.77487835 0.68875274 0.72615385 0.72573276
|
|
0.68779719 0.61017022 0.60783067 0.72524067]
|
|
|
|
mean value: 0.705301332855258
|
|
|
|
key: train_mcc
|
|
value: [0.75071367 0.74619319 0.75930821 0.77253746 0.74212413 0.74619319
|
|
0.72014338 0.75504732 0.77747792 0.76862491]
|
|
|
|
mean value: 0.7538363372752251
|
|
|
|
key: test_accuracy
|
|
value: [0.94117647 0.80392157 0.88235294 0.84313725 0.8627451 0.8627451
|
|
0.84313725 0.80392157 0.8 0.86 ]
|
|
|
|
mean value: 0.8503137254901961
|
|
|
|
key: train_accuracy
|
|
value: [0.87527352 0.87308534 0.87964989 0.88621444 0.87089716 0.87308534
|
|
0.85995624 0.87746171 0.88864629 0.88427948]
|
|
|
|
mean value: 0.876854939657726
|
|
|
|
key: test_fscore
|
|
value: [0.93877551 0.77272727 0.88888889 0.84615385 0.8627451 0.86792453
|
|
0.85185185 0.8 0.81481481 0.85106383]
|
|
|
|
mean value: 0.8494945640769093
|
|
|
|
key: train_fscore
|
|
value: [0.87688985 0.87391304 0.87964989 0.88744589 0.86859688 0.8722467
|
|
0.85777778 0.87826087 0.88984881 0.88503254]
|
|
|
|
mean value: 0.8769662245721188
|
|
|
|
key: test_precision
|
|
value: [0.95833333 0.89473684 0.82758621 0.81481481 0.88 0.85185185
|
|
0.82142857 0.83333333 0.75862069 0.90909091]
|
|
|
|
mean value: 0.8549796552509801
|
|
|
|
key: train_precision
|
|
value: [0.86752137 0.87012987 0.88157895 0.87982833 0.88235294 0.87610619
|
|
0.86936937 0.87068966 0.88034188 0.87931034]
|
|
|
|
mean value: 0.8757228896777902
|
|
|
|
key: test_recall
|
|
value: [0.92 0.68 0.96 0.88 0.84615385 0.88461538
|
|
0.88461538 0.76923077 0.88 0.8 ]
|
|
|
|
mean value: 0.8504615384615385
|
|
|
|
key: train_recall
|
|
value: [0.88646288 0.87772926 0.87772926 0.89519651 0.85526316 0.86842105
|
|
0.84649123 0.88596491 0.89956332 0.89082969]
|
|
|
|
mean value: 0.878365126790776
|
|
|
|
key: test_roc_auc
|
|
value: [0.94076923 0.80153846 0.88384615 0.84384615 0.86307692 0.86230769
|
|
0.84230769 0.80461538 0.8 0.86 ]
|
|
|
|
mean value: 0.8502307692307692
|
|
|
|
key: train_roc_auc
|
|
value: [0.87524898 0.87307516 0.8796541 0.88619474 0.87086302 0.87307516
|
|
0.85992684 0.87748027 0.88864629 0.88427948]
|
|
|
|
mean value: 0.8768444035853827
|
|
|
|
key: test_jcc
|
|
value: [0.88461538 0.62962963 0.8 0.73333333 0.75862069 0.76666667
|
|
0.74193548 0.66666667 0.6875 0.74074074]
|
|
|
|
mean value: 0.7409708595178561
|
|
|
|
key: train_jcc
|
|
value: [0.78076923 0.77606178 0.78515625 0.79766537 0.76771654 0.7734375
|
|
0.75097276 0.78294574 0.80155642 0.79377432]
|
|
|
|
mean value: 0.7810055900293517
|
|
|
|
MCC on Blind test: 0.76
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC0...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.09083033 0.08632159 0.07055283 0.07104707 0.07519531 0.07939005
|
|
0.07854795 0.09603834 0.08671165 0.07549286]
|
|
|
|
mean value: 0.08101279735565185
|
|
|
|
key: score_time
|
|
value: [0.01125097 0.01128316 0.01079941 0.01076365 0.01111865 0.01134443
|
|
0.01275802 0.01227999 0.01084495 0.01092196]
|
|
|
|
mean value: 0.011336517333984376
|
|
|
|
key: test_mcc
|
|
value: [1. 1. 0.84307692 0.84307692 0.92153846 0.96153846
|
|
0.96148034 1. 0.88070485 0.76 ]
|
|
|
|
mean value: 0.9171415955282479
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 1. 0.92156863 0.92156863 0.96078431 0.98039216
|
|
0.98039216 1. 0.94 0.88 ]
|
|
|
|
mean value: 0.9584705882352941
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 1. 0.92 0.92 0.96153846 0.98039216
|
|
0.98113208 1. 0.93877551 0.88 ]
|
|
|
|
mean value: 0.9581838204076987
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.92 0.92 0.96153846 1.
|
|
0.96296296 1. 0.95833333 0.88 ]
|
|
|
|
mean value: 0.9602834757834758
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 0.92 0.92 0.96153846 0.96153846
|
|
1. 1. 0.92 0.88 ]
|
|
|
|
mean value: 0.9563076923076923
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 1. 0.92153846 0.92153846 0.96076923 0.98076923
|
|
0.98 1. 0.94 0.88 ]
|
|
|
|
mean value: 0.9584615384615385
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 1. 0.85185185 0.85185185 0.92592593 0.96153846
|
|
0.96296296 1. 0.88461538 0.78571429]
|
|
|
|
mean value: 0.9224460724460725
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.86
|
|
|
|
Accuracy on Blind test: 0.93
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.06496072 0.06503797 0.07527375 0.07898951 0.1082499 0.08013272
|
|
0.07932186 0.04319263 0.06920362 0.0507133 ]
|
|
|
|
mean value: 0.07150759696960449
|
|
|
|
key: score_time
|
|
value: [0.01696396 0.01866412 0.01862216 0.01909184 0.03613448 0.01893544
|
|
0.01222348 0.01733971 0.01222539 0.01219845]
|
|
|
|
mean value: 0.018239903450012206
|
|
|
|
key: test_mcc
|
|
value: [0.88307692 0.80904133 0.80990051 0.76662339 0.80461538 0.72573276
|
|
0.72984534 0.76662339 0.76 0.60192927]
|
|
|
|
mean value: 0.765738828717061
|
|
|
|
key: train_mcc
|
|
value: [0.90817148 0.90830894 0.92560955 0.90426654 0.89956325 0.91693003
|
|
0.90386163 0.91250886 0.91710927 0.91703931]
|
|
|
|
mean value: 0.9113368869349576
|
|
|
|
key: test_accuracy
|
|
value: [0.94117647 0.90196078 0.90196078 0.88235294 0.90196078 0.8627451
|
|
0.8627451 0.88235294 0.88 0.8 ]
|
|
|
|
mean value: 0.8817254901960785
|
|
|
|
key: train_accuracy
|
|
value: [0.95404814 0.95404814 0.96280088 0.95185996 0.94967177 0.95842451
|
|
0.95185996 0.95623632 0.95851528 0.95851528]
|
|
|
|
mean value: 0.9555980239458018
|
|
|
|
key: test_fscore
|
|
value: [0.94117647 0.89361702 0.90566038 0.875 0.90196078 0.86792453
|
|
0.87272727 0.88888889 0.88 0.80769231]
|
|
|
|
mean value: 0.8834647651147403
|
|
|
|
key: train_fscore
|
|
value: [0.95444685 0.95464363 0.96296296 0.9527897 0.95010846 0.95860566
|
|
0.95217391 0.95633188 0.95878525 0.95860566]
|
|
|
|
mean value: 0.9559453974783592
|
|
|
|
key: test_precision
|
|
value: [0.92307692 0.95454545 0.85714286 0.91304348 0.92 0.85185185
|
|
0.82758621 0.85714286 0.88 0.77777778]
|
|
|
|
mean value: 0.8762167406695143
|
|
|
|
key: train_precision
|
|
value: [0.94827586 0.94444444 0.96086957 0.93670886 0.93991416 0.95238095
|
|
0.94396552 0.95217391 0.95258621 0.95652174]
|
|
|
|
mean value: 0.948784122427322
|
|
|
|
key: test_recall
|
|
value: [0.96 0.84 0.96 0.84 0.88461538 0.88461538
|
|
0.92307692 0.92307692 0.88 0.84 ]
|
|
|
|
mean value: 0.8935384615384615
|
|
|
|
key: train_recall
|
|
value: [0.96069869 0.9650655 0.9650655 0.96943231 0.96052632 0.96491228
|
|
0.96052632 0.96052632 0.9650655 0.96069869]
|
|
|
|
mean value: 0.9632517428943538
|
|
|
|
key: test_roc_auc
|
|
value: [0.94153846 0.90076923 0.90307692 0.88153846 0.90230769 0.86230769
|
|
0.86153846 0.88153846 0.88 0.8 ]
|
|
|
|
mean value: 0.8814615384615384
|
|
|
|
key: train_roc_auc
|
|
value: [0.95403356 0.95402398 0.96279591 0.95182142 0.94969547 0.95843867
|
|
0.95187888 0.95624569 0.95851528 0.95851528]
|
|
|
|
mean value: 0.9555964146173294
|
|
|
|
key: test_jcc
|
|
value: [0.88888889 0.80769231 0.82758621 0.77777778 0.82142857 0.76666667
|
|
0.77419355 0.8 0.78571429 0.67741935]
|
|
|
|
mean value: 0.7927367608290856
|
|
|
|
key: train_jcc
|
|
value: [0.91286307 0.91322314 0.92857143 0.90983607 0.90495868 0.92050209
|
|
0.90871369 0.91631799 0.92083333 0.92050209]
|
|
|
|
mean value: 0.9156321584878045
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01775408 0.01069021 0.01090527 0.01043916 0.01051974 0.01007056
|
|
0.00997186 0.01019859 0.01107121 0.01059556]
|
|
|
|
mean value: 0.011221623420715332
|
|
|
|
key: score_time
|
|
value: [0.01215219 0.00924611 0.00937891 0.00882244 0.00891948 0.0093646
|
|
0.00921845 0.00940609 0.00914979 0.00949597]
|
|
|
|
mean value: 0.00951540470123291
|
|
|
|
key: test_mcc
|
|
value: [0.80431528 0.85322916 0.76733527 0.68615385 0.72615385 0.72573276
|
|
0.72573276 0.61648638 0.72057669 0.68887476]
|
|
|
|
mean value: 0.73145907619674
|
|
|
|
key: train_mcc
|
|
value: [0.72878597 0.70713347 0.73311115 0.77687755 0.68978499 0.7418496
|
|
0.7166604 0.71585171 0.76445458 0.74270515]
|
|
|
|
mean value: 0.7317214561206101
|
|
|
|
key: test_accuracy
|
|
value: [0.90196078 0.92156863 0.88235294 0.84313725 0.8627451 0.8627451
|
|
0.8627451 0.80392157 0.86 0.84 ]
|
|
|
|
mean value: 0.8641176470588235
|
|
|
|
key: train_accuracy
|
|
value: [0.8643326 0.85339168 0.86652079 0.88840263 0.84463895 0.87089716
|
|
0.85776805 0.85776805 0.88209607 0.87117904]
|
|
|
|
mean value: 0.8656995021642954
|
|
|
|
key: test_fscore
|
|
value: [0.89795918 0.91304348 0.88461538 0.84 0.8627451 0.86792453
|
|
0.86792453 0.79166667 0.8627451 0.82608696]
|
|
|
|
mean value: 0.8614710922420334
|
|
|
|
key: train_fscore
|
|
value: [0.86343612 0.85144124 0.86593407 0.88791209 0.84116331 0.86975717
|
|
0.85327314 0.85523385 0.88053097 0.8691796 ]
|
|
|
|
mean value: 0.8637861569276664
|
|
|
|
key: test_precision
|
|
value: [0.91666667 1. 0.85185185 0.84 0.88 0.85185185
|
|
0.85185185 0.86363636 0.84615385 0.9047619 ]
|
|
|
|
mean value: 0.8806774336774337
|
|
|
|
key: train_precision
|
|
value: [0.87111111 0.86486486 0.87168142 0.89380531 0.85844749 0.87555556
|
|
0.87906977 0.86877828 0.89237668 0.88288288]
|
|
|
|
mean value: 0.8758573358261803
|
|
|
|
key: test_recall
|
|
value: [0.88 0.84 0.92 0.84 0.84615385 0.88461538
|
|
0.88461538 0.73076923 0.88 0.76 ]
|
|
|
|
mean value: 0.8466153846153845
|
|
|
|
key: train_recall
|
|
value: [0.8558952 0.83842795 0.86026201 0.88209607 0.8245614 0.86403509
|
|
0.82894737 0.84210526 0.86899563 0.8558952 ]
|
|
|
|
mean value: 0.8521221175208764
|
|
|
|
key: test_roc_auc
|
|
value: [0.90153846 0.92 0.88307692 0.84307692 0.86307692 0.86230769
|
|
0.86230769 0.80538462 0.86 0.84 ]
|
|
|
|
mean value: 0.8640769230769231
|
|
|
|
key: train_roc_auc
|
|
value: [0.86435111 0.8534245 0.86653451 0.88841646 0.84459511 0.87088217
|
|
0.85770513 0.85773385 0.88209607 0.87117904]
|
|
|
|
mean value: 0.8656917949896575
|
|
|
|
key: test_jcc
|
|
value: [0.81481481 0.84 0.79310345 0.72413793 0.75862069 0.76666667
|
|
0.76666667 0.65517241 0.75862069 0.7037037 ]
|
|
|
|
mean value: 0.7581507024265645
|
|
|
|
key: train_jcc
|
|
value: [0.75968992 0.74131274 0.76356589 0.79841897 0.72586873 0.76953125
|
|
0.74409449 0.74708171 0.78656126 0.76862745]
|
|
|
|
mean value: 0.7604752419520732
|
|
|
|
MCC on Blind test: 0.76
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01461196 0.01674128 0.02004933 0.01781726 0.01789832 0.02177954
|
|
0.01803398 0.0175209 0.01656842 0.02079844]
|
|
|
|
mean value: 0.01818194389343262
|
|
|
|
key: score_time
|
|
value: [0.01018262 0.01185966 0.01840734 0.01195478 0.01202202 0.01203704
|
|
0.01195407 0.01193404 0.01192832 0.01798701]
|
|
|
|
mean value: 0.013026690483093262
|
|
|
|
key: test_mcc
|
|
value: [0.80990051 0.81912621 0.82041265 0.76733527 0.78581168 0.73107432
|
|
0.80431528 0.78762135 0.76 0.6821865 ]
|
|
|
|
mean value: 0.7767783786155215
|
|
|
|
key: train_mcc
|
|
value: [0.80373177 0.8497961 0.8591878 0.81151328 0.70283343 0.88786716
|
|
0.85514592 0.74595689 0.84755764 0.83552208]
|
|
|
|
mean value: 0.8199112066271259
|
|
|
|
key: test_accuracy
|
|
value: [0.90196078 0.90196078 0.90196078 0.88235294 0.88235294 0.8627451
|
|
0.90196078 0.88235294 0.88 0.84 ]
|
|
|
|
mean value: 0.8837647058823529
|
|
|
|
key: train_accuracy
|
|
value: [0.89715536 0.92341357 0.92778993 0.90153173 0.83588621 0.94310722
|
|
0.92560175 0.85995624 0.92358079 0.91484716]
|
|
|
|
mean value: 0.9052869960727357
|
|
|
|
key: test_fscore
|
|
value: [0.90566038 0.88888889 0.90909091 0.88461538 0.89655172 0.85714286
|
|
0.90566038 0.86956522 0.88 0.84615385]
|
|
|
|
mean value: 0.8843329582138102
|
|
|
|
key: train_fscore
|
|
value: [0.90466531 0.92027335 0.93110647 0.90835031 0.85659656 0.94117647
|
|
0.92165899 0.83838384 0.92239468 0.91958763]
|
|
|
|
mean value: 0.9064193601059057
|
|
|
|
key: test_precision
|
|
value: [0.85714286 1. 0.83333333 0.85185185 0.8125 0.91304348
|
|
0.88888889 1. 0.88 0.81481481]
|
|
|
|
mean value: 0.8851575224292616
|
|
|
|
key: train_precision
|
|
value: [0.84469697 0.96190476 0.892 0.85114504 0.75932203 0.97196262
|
|
0.97087379 0.98809524 0.93693694 0.87109375]
|
|
|
|
mean value: 0.9048031131930347
|
|
|
|
key: test_recall
|
|
value: [0.96 0.8 1. 0.92 1. 0.80769231
|
|
0.92307692 0.76923077 0.88 0.88 ]
|
|
|
|
mean value: 0.894
|
|
|
|
key: train_recall
|
|
value: [0.97379913 0.88209607 0.97379913 0.97379913 0.98245614 0.9122807
|
|
0.87719298 0.72807018 0.90829694 0.97379913]
|
|
|
|
mean value: 0.9185589519650655
|
|
|
|
key: test_roc_auc
|
|
value: [0.90307692 0.9 0.90384615 0.88307692 0.88 0.86384615
|
|
0.90153846 0.88461538 0.88 0.84 ]
|
|
|
|
mean value: 0.884
|
|
|
|
key: train_roc_auc
|
|
value: [0.89698728 0.92350418 0.92768904 0.90137325 0.83620624 0.94303991
|
|
0.92549605 0.85966828 0.92358079 0.91484716]
|
|
|
|
mean value: 0.9052392170382287
|
|
|
|
key: test_jcc
|
|
value: [0.82758621 0.8 0.83333333 0.79310345 0.8125 0.75
|
|
0.82758621 0.76923077 0.78571429 0.73333333]
|
|
|
|
mean value: 0.7932387583680687
|
|
|
|
key: train_jcc
|
|
value: [0.82592593 0.85232068 0.87109375 0.83208955 0.74916388 0.88888889
|
|
0.85470085 0.72173913 0.85596708 0.85114504]
|
|
|
|
mean value: 0.8303034773250645
|
|
|
|
MCC on Blind test: 0.76
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0145123 0.02034068 0.02262974 0.02118397 0.01858521 0.01969814
|
|
0.02044988 0.02166414 0.02026725 0.01864958]
|
|
|
|
mean value: 0.019798088073730468
|
|
|
|
key: score_time
|
|
value: [0.01100039 0.01206136 0.01206374 0.01205349 0.01194215 0.0121634
|
|
0.01195788 0.01196051 0.0119977 0.01203704]
|
|
|
|
mean value: 0.011923766136169434
|
|
|
|
key: test_mcc
|
|
value: [0.80990051 0.84544958 0.85407434 0.73878883 0.80461538 0.61413747
|
|
0.84544958 0.88872671 0.76244374 0.64465837]
|
|
|
|
mean value: 0.7808244519509732
|
|
|
|
key: train_mcc
|
|
value: [0.87909672 0.86883646 0.90426654 0.76612696 0.85658732 0.8392754
|
|
0.89935264 0.87549121 0.88352087 0.82858789]
|
|
|
|
mean value: 0.8601142007805195
|
|
|
|
key: test_accuracy
|
|
value: [0.90196078 0.92156863 0.92156863 0.8627451 0.90196078 0.80392157
|
|
0.92156863 0.94117647 0.88 0.82 ]
|
|
|
|
mean value: 0.8876470588235295
|
|
|
|
key: train_accuracy
|
|
value: [0.93873085 0.93435449 0.95185996 0.87089716 0.92778993 0.91466083
|
|
0.94967177 0.93654267 0.94104803 0.91048035]
|
|
|
|
mean value: 0.9276036042922802
|
|
|
|
key: test_fscore
|
|
value: [0.90566038 0.91666667 0.92592593 0.84444444 0.90196078 0.82142857
|
|
0.92592593 0.93877551 0.88461538 0.83018868]
|
|
|
|
mean value: 0.88955922701285
|
|
|
|
key: train_fscore
|
|
value: [0.94067797 0.93506494 0.9527897 0.85286783 0.92933619 0.92057026
|
|
0.94967177 0.93394077 0.94267516 0.91615542]
|
|
|
|
mean value: 0.9273750009738929
|
|
|
|
key: test_precision
|
|
value: [0.85714286 0.95652174 0.86206897 0.95 0.92 0.76666667
|
|
0.89285714 1. 0.85185185 0.78571429]
|
|
|
|
mean value: 0.8842823508880481
|
|
|
|
key: train_precision
|
|
value: [0.91358025 0.92703863 0.93670886 0.99418605 0.90794979 0.85931559
|
|
0.94759825 0.97156398 0.91735537 0.86153846]
|
|
|
|
mean value: 0.9236835228699787
|
|
|
|
key: test_recall
|
|
value: [0.96 0.88 1. 0.76 0.88461538 0.88461538
|
|
0.96153846 0.88461538 0.92 0.88 ]
|
|
|
|
mean value: 0.9015384615384615
|
|
|
|
key: train_recall
|
|
value: [0.96943231 0.94323144 0.96943231 0.74672489 0.95175439 0.99122807
|
|
0.95175439 0.89912281 0.96943231 0.97816594]
|
|
|
|
mean value: 0.9370278863096606
|
|
|
|
key: test_roc_auc
|
|
value: [0.90307692 0.92076923 0.92307692 0.86076923 0.90230769 0.80230769
|
|
0.92076923 0.94230769 0.88 0.82 ]
|
|
|
|
mean value: 0.8875384615384615
|
|
|
|
key: train_roc_auc
|
|
value: [0.93866353 0.93433502 0.95182142 0.87116946 0.92784226 0.91482801
|
|
0.94967632 0.93646097 0.94104803 0.91048035]
|
|
|
|
mean value: 0.9276325365816287
|
|
|
|
key: test_jcc
|
|
value: [0.82758621 0.84615385 0.86206897 0.73076923 0.82142857 0.6969697
|
|
0.86206897 0.88461538 0.79310345 0.70967742]
|
|
|
|
mean value: 0.8034441735498465
|
|
|
|
key: train_jcc
|
|
value: [0.888 0.87804878 0.90983607 0.74347826 0.868 0.85283019
|
|
0.90416667 0.87606838 0.89156627 0.84528302]
|
|
|
|
mean value: 0.8657277622273594
|
|
|
|
MCC on Blind test: 0.75
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.20065451 0.18761659 0.18804526 0.18726611 0.18589187 0.18081141
|
|
0.18000293 0.18104124 0.18397188 0.17888808]
|
|
|
|
mean value: 0.18541898727416992
|
|
|
|
key: score_time
|
|
value: [0.01699638 0.0164814 0.01690269 0.01632905 0.01682019 0.01586986
|
|
0.01592231 0.0159359 0.01681352 0.01568103]
|
|
|
|
mean value: 0.016375231742858886
|
|
|
|
key: test_mcc
|
|
value: [0.96153846 1. 0.88307692 0.84544958 0.96148034 0.92450033
|
|
0.96148034 1. 0.92 0.80064077]
|
|
|
|
mean value: 0.9258166744058197
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98039216 1. 0.94117647 0.92156863 0.98039216 0.96078431
|
|
0.98039216 1. 0.96 0.9 ]
|
|
|
|
mean value: 0.9624705882352941
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98039216 1. 0.94117647 0.91666667 0.98113208 0.96
|
|
0.98113208 1. 0.96 0.90196078]
|
|
|
|
mean value: 0.9622460229374769
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96153846 1. 0.92307692 0.95652174 0.96296296 1.
|
|
0.96296296 1. 0.96 0.88461538]
|
|
|
|
mean value: 0.961167843428713
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 0.96 0.88 1. 0.92307692
|
|
1. 1. 0.96 0.92 ]
|
|
|
|
mean value: 0.9643076923076923
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98076923 1. 0.94153846 0.92076923 0.98 0.96153846
|
|
0.98 1. 0.96 0.9 ]
|
|
|
|
mean value: 0.9624615384615385
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96153846 1. 0.88888889 0.84615385 0.96296296 0.92307692
|
|
0.96296296 1. 0.92307692 0.82142857]
|
|
|
|
mean value: 0.929008954008954
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.95
|
|
|
|
Accuracy on Blind test: 0.98
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.06497097 0.07499814 0.08114505 0.0727787 0.07150054 0.08404374
|
|
0.08080506 0.05995893 0.06555724 0.06107831]
|
|
|
|
mean value: 0.07168366909027099
|
|
|
|
key: score_time
|
|
value: [0.01859593 0.03967738 0.02885842 0.04054928 0.02583647 0.03430438
|
|
0.03502369 0.02353096 0.0248487 0.02320862]
|
|
|
|
mean value: 0.02944338321685791
|
|
|
|
key: test_mcc
|
|
value: [1. 0.80904133 0.88872671 0.76662339 1. 0.96148034
|
|
0.96148034 1. 0.92 0.80064077]
|
|
|
|
mean value: 0.9107992872362165
|
|
|
|
key: train_mcc
|
|
value: [0.99124722 0.98688016 0.9956331 0.99128503 0.97812763 0.97812763
|
|
0.98249445 0.98695553 0.99126638 0.99126638]
|
|
|
|
mean value: 0.9873283508508414
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.90196078 0.94117647 0.88235294 1. 0.98039216
|
|
0.98039216 1. 0.96 0.9 ]
|
|
|
|
mean value: 0.9546274509803921
|
|
|
|
key: train_accuracy
|
|
value: [0.99562363 0.99343545 0.99781182 0.99562363 0.98905908 0.98905908
|
|
0.99124726 0.99343545 0.99563319 0.99563319]
|
|
|
|
mean value: 0.9936561780359856
|
|
|
|
key: test_fscore
|
|
value: [1. 0.89361702 0.94339623 0.875 1. 0.98113208
|
|
0.98113208 1. 0.96 0.90196078]
|
|
|
|
mean value: 0.9536238182948812
|
|
|
|
key: train_fscore
|
|
value: [0.99563319 0.99346405 0.99782135 0.99565217 0.98905908 0.98905908
|
|
0.99122807 0.99337748 0.99563319 0.99563319]
|
|
|
|
mean value: 0.9936560855826678
|
|
|
|
key: test_precision
|
|
value: [1. 0.95454545 0.89285714 0.91304348 1. 0.96296296
|
|
0.96296296 1. 0.96 0.88461538]
|
|
|
|
mean value: 0.9530987386204778
|
|
|
|
key: train_precision
|
|
value: [0.99563319 0.99130435 0.99565217 0.99134199 0.98689956 0.98689956
|
|
0.99122807 1. 0.99563319 0.99563319]
|
|
|
|
mean value: 0.9930225273212893
|
|
|
|
key: test_recall
|
|
value: [1. 0.84 1. 0.84 1. 1. 1. 1. 0.96 0.92]
|
|
|
|
mean value: 0.956
|
|
|
|
key: train_recall
|
|
value: [0.99563319 0.99563319 1. 1. 0.99122807 0.99122807
|
|
0.99122807 0.98684211 0.99563319 0.99563319]
|
|
|
|
mean value: 0.9943059066881177
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.90076923 0.94230769 0.88153846 1. 0.98
|
|
0.98 1. 0.96 0.9 ]
|
|
|
|
mean value: 0.9544615384615385
|
|
|
|
key: train_roc_auc
|
|
value: [0.99562361 0.99343063 0.99780702 0.99561404 0.98906382 0.98906382
|
|
0.99124722 0.99342105 0.99563319 0.99563319]
|
|
|
|
mean value: 0.9936537577568375
|
|
|
|
key: test_jcc
|
|
value: [1. 0.80769231 0.89285714 0.77777778 1. 0.96296296
|
|
0.96296296 1. 0.92307692 0.82142857]
|
|
|
|
mean value: 0.9148758648758649
|
|
|
|
key: train_jcc
|
|
value: [0.99130435 0.98701299 0.99565217 0.99134199 0.97835498 0.97835498
|
|
0.9826087 0.98684211 0.99130435 0.99130435]
|
|
|
|
mean value: 0.9874080953371571
|
|
|
|
MCC on Blind test: 0.9
|
|
|
|
Accuracy on Blind test: 0.95
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.1736908 0.16033888 0.1746788 0.15446329 0.17124557 0.15994787
|
|
0.13885593 0.10975528 0.1811161 0.09505463]
|
|
|
|
mean value: 0.15191471576690674
|
|
|
|
key: score_time
|
|
value: [0.02411532 0.02435613 0.01529169 0.02446175 0.02488351 0.02440786
|
|
0.01523852 0.02894855 0.02414536 0.0149796 ]
|
|
|
|
mean value: 0.022082829475402833
|
|
|
|
key: test_mcc
|
|
value: [0.80904133 0.60498161 0.75558816 0.76662339 0.69568237 0.52923077
|
|
0.76662339 0.52923077 0.56044854 0.60192927]
|
|
|
|
mean value: 0.6619379576424396
|
|
|
|
key: train_mcc
|
|
value: [0.99128536 0.98695627 0.98695627 0.98695627 0.98695553 0.99128503
|
|
0.98695553 0.98695553 0.98698426 0.99130418]
|
|
|
|
mean value: 0.9882594241358311
|
|
|
|
key: test_accuracy
|
|
value: [0.90196078 0.78431373 0.8627451 0.88235294 0.84313725 0.76470588
|
|
0.88235294 0.76470588 0.78 0.8 ]
|
|
|
|
mean value: 0.8266274509803921
|
|
|
|
key: train_accuracy
|
|
value: [0.99562363 0.99343545 0.99343545 0.99343545 0.99343545 0.99562363
|
|
0.99343545 0.99343545 0.99344978 0.99563319]
|
|
|
|
mean value: 0.9940942925668638
|
|
|
|
key: test_fscore
|
|
value: [0.89361702 0.73170732 0.87719298 0.875 0.83333333 0.76923077
|
|
0.88888889 0.76923077 0.78431373 0.79166667]
|
|
|
|
mean value: 0.821418147364653
|
|
|
|
key: train_fscore
|
|
value: [0.99561404 0.99340659 0.99340659 0.99340659 0.99337748 0.99559471
|
|
0.99337748 0.99337748 0.99340659 0.99561404]
|
|
|
|
mean value: 0.9940581607789326
|
|
|
|
key: test_precision
|
|
value: [0.95454545 0.9375 0.78125 0.91304348 0.90909091 0.76923077
|
|
0.85714286 0.76923077 0.76923077 0.82608696]
|
|
|
|
mean value: 0.8486351963254137
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.84 0.6 1. 0.84 0.76923077 0.76923077
|
|
0.92307692 0.76923077 0.8 0.76 ]
|
|
|
|
mean value: 0.8070769230769231
|
|
|
|
key: train_recall
|
|
value: [0.99126638 0.98689956 0.98689956 0.98689956 0.98684211 0.99122807
|
|
0.98684211 0.98684211 0.98689956 0.99126638]
|
|
|
|
mean value: 0.9881885390331724
|
|
|
|
key: test_roc_auc
|
|
value: [0.90076923 0.78076923 0.86538462 0.88153846 0.84461538 0.76461538
|
|
0.88153846 0.76461538 0.78 0.8 ]
|
|
|
|
mean value: 0.8263846153846154
|
|
|
|
key: train_roc_auc
|
|
value: [0.99563319 0.99344978 0.99344978 0.99344978 0.99342105 0.99561404
|
|
0.99342105 0.99342105 0.99344978 0.99563319]
|
|
|
|
mean value: 0.9940942695165862
|
|
|
|
key: test_jcc
|
|
value: [0.80769231 0.57692308 0.78125 0.77777778 0.71428571 0.625
|
|
0.8 0.625 0.64516129 0.65517241]
|
|
|
|
mean value: 0.7008262580794561
|
|
|
|
key: train_jcc
|
|
value: [0.99126638 0.98689956 0.98689956 0.98689956 0.98684211 0.99122807
|
|
0.98684211 0.98684211 0.98689956 0.99126638]
|
|
|
|
mean value: 0.9881885390331724
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.79
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.72235799 0.70944214 0.70883203 0.71152782 0.71902704 0.71618891
|
|
0.7211957 0.71571803 0.71437287 0.70587111]
|
|
|
|
mean value: 0.7144533634185791
|
|
|
|
key: score_time
|
|
value: [0.00964093 0.00959468 0.00950909 0.00943208 0.01016808 0.00956535
|
|
0.01000834 0.00979233 0.00962782 0.00996852]
|
|
|
|
mean value: 0.009730720520019531
|
|
|
|
key: test_mcc
|
|
value: [1. 0.85322916 0.88872671 0.84307692 0.92427578 0.96148034
|
|
1. 1. 0.88070485 0.80064077]
|
|
|
|
mean value: 0.9152134526595563
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.92156863 0.94117647 0.92156863 0.96078431 0.98039216
|
|
1. 1. 0.94 0.9 ]
|
|
|
|
mean value: 0.9565490196078431
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.91304348 0.94339623 0.92 0.96296296 0.98113208
|
|
1. 1. 0.93877551 0.90196078]
|
|
|
|
mean value: 0.9561271037628433
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.89285714 0.92 0.92857143 0.96296296
|
|
1. 1. 0.95833333 0.88461538]
|
|
|
|
mean value: 0.9547340252340253
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.84 1. 0.92 1. 1. 1. 1. 0.92 0.92]
|
|
|
|
mean value: 0.96
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.92 0.94230769 0.92153846 0.96 0.98
|
|
1. 1. 0.94 0.9 ]
|
|
|
|
mean value: 0.9563846153846154
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.84 0.89285714 0.85185185 0.92857143 0.96296296
|
|
1. 1. 0.88461538 0.82142857]
|
|
|
|
mean value: 0.9182287342287342
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.86
|
|
|
|
Accuracy on Blind test: 0.93
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.03209758 0.04663944 0.04156089 0.03064513 0.03051949 0.03812504
|
|
0.03040957 0.03081679 0.03012967 0.03015852]
|
|
|
|
mean value: 0.034110212326049806
|
|
|
|
key: score_time
|
|
value: [0.01268816 0.01741695 0.01369047 0.01662827 0.01425338 0.01279664
|
|
0.01463175 0.01472878 0.01473212 0.0146606 ]
|
|
|
|
mean value: 0.014622712135314941
|
|
|
|
key: test_mcc
|
|
value: [ 0.50162374 0.33282012 0.4779765 0.38593446 0.38074981 -0.08910647
|
|
0.67109832 0.41306141 0.52678658 0.42874646]
|
|
|
|
mean value: 0.40296909364640227
|
|
|
|
key: train_mcc
|
|
value: [0.89198163 0.97407901 0.94806064 0.63869807 0.73237152 0.51794578
|
|
0.96944796 0.93638281 0.93650904 0.94475499]
|
|
|
|
mean value: 0.8490231449196334
|
|
|
|
key: test_accuracy
|
|
value: [0.74509804 0.66666667 0.7254902 0.68627451 0.66666667 0.47058824
|
|
0.82352941 0.70588235 0.76 0.7 ]
|
|
|
|
mean value: 0.6950196078431372
|
|
|
|
key: train_accuracy
|
|
value: [0.94310722 0.9868709 0.97374179 0.78993435 0.84901532 0.71115974
|
|
0.98468271 0.96717724 0.96724891 0.97161572]
|
|
|
|
mean value: 0.9144553906720304
|
|
|
|
key: test_fscore
|
|
value: [0.76363636 0.65306122 0.75862069 0.71428571 0.73846154 0.59701493
|
|
0.84745763 0.72727273 0.77777778 0.74576271]
|
|
|
|
mean value: 0.7323351299935275
|
|
|
|
key: train_fscore
|
|
value: [0.94628099 0.98672566 0.97424893 0.8267148 0.86857143 0.7755102
|
|
0.98454746 0.96815287 0.96828753 0.97239915]
|
|
|
|
mean value: 0.9271439021368936
|
|
|
|
key: test_precision
|
|
value: [0.7 0.66666667 0.66666667 0.64516129 0.61538462 0.48780488
|
|
0.75757576 0.68965517 0.72413793 0.64705882]
|
|
|
|
mean value: 0.6600111801642755
|
|
|
|
key: train_precision
|
|
value: [0.89803922 1. 0.95780591 0.70461538 0.76767677 0.63333333
|
|
0.99111111 0.9382716 0.93852459 0.94628099]
|
|
|
|
mean value: 0.877565890643361
|
|
|
|
key: test_recall
|
|
value: [0.84 0.64 0.88 0.8 0.92307692 0.76923077
|
|
0.96153846 0.76923077 0.84 0.88 ]
|
|
|
|
mean value: 0.8303076923076923
|
|
|
|
key: train_recall
|
|
value: [1. 0.97379913 0.99126638 1. 1. 1.
|
|
0.97807018 1. 1. 1. ]
|
|
|
|
mean value: 0.9943135677622003
|
|
|
|
key: test_roc_auc
|
|
value: [0.74692308 0.66615385 0.72846154 0.68846154 0.66153846 0.46461538
|
|
0.82076923 0.70461538 0.76 0.7 ]
|
|
|
|
mean value: 0.6941538461538461
|
|
|
|
key: train_roc_auc
|
|
value: [0.94298246 0.98689956 0.97370336 0.78947368 0.84934498 0.71179039
|
|
0.98466828 0.96724891 0.96724891 0.97161572]
|
|
|
|
mean value: 0.9144976250670344
|
|
|
|
key: test_jcc
|
|
value: [0.61764706 0.48484848 0.61111111 0.55555556 0.58536585 0.42553191
|
|
0.73529412 0.57142857 0.63636364 0.59459459]
|
|
|
|
mean value: 0.5817740898924696
|
|
|
|
key: train_jcc
|
|
value: [0.89803922 0.97379913 0.94979079 0.70461538 0.76767677 0.63333333
|
|
0.96956522 0.9382716 0.93852459 0.94628099]
|
|
|
|
mean value: 0.8719897027157442
|
|
|
|
MCC on Blind test: 0.45
|
|
|
|
Accuracy on Blind test: 0.69
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02209425 0.01628399 0.02629995 0.04093146 0.03840542 0.03341603
|
|
0.021245 0.02528167 0.02654409 0.03870654]
|
|
|
|
mean value: 0.028920841217041016
|
|
|
|
key: score_time
|
|
value: [0.02987242 0.01222944 0.01866579 0.01888323 0.01911664 0.01218963
|
|
0.01224184 0.01220989 0.01884508 0.01884842]
|
|
|
|
mean value: 0.017310237884521483
|
|
|
|
key: test_mcc
|
|
value: [0.8459178 0.92427578 0.85407434 0.84544958 0.80461538 0.80431528
|
|
0.84307692 0.80904133 0.76 0.68 ]
|
|
|
|
mean value: 0.8170766413736327
|
|
|
|
key: train_mcc
|
|
value: [0.86024417 0.8425731 0.86433893 0.8559713 0.84690379 0.87309431
|
|
0.84274962 0.86454544 0.8735707 0.86470302]
|
|
|
|
mean value: 0.8588694380310397
|
|
|
|
key: test_accuracy
|
|
value: [0.92156863 0.96078431 0.92156863 0.92156863 0.90196078 0.90196078
|
|
0.92156863 0.90196078 0.88 0.84 ]
|
|
|
|
mean value: 0.9072941176470588
|
|
|
|
key: train_accuracy
|
|
value: [0.92997812 0.92122538 0.9321663 0.92778993 0.92341357 0.93654267
|
|
0.92122538 0.9321663 0.93668122 0.93231441]
|
|
|
|
mean value: 0.9293503291831099
|
|
|
|
key: test_fscore
|
|
value: [0.92307692 0.95833333 0.92592593 0.91666667 0.90196078 0.90566038
|
|
0.92307692 0.90909091 0.88 0.84 ]
|
|
|
|
mean value: 0.9083791842842898
|
|
|
|
key: train_fscore
|
|
value: [0.93103448 0.92207792 0.93246187 0.92903226 0.92374728 0.93654267
|
|
0.92207792 0.93275488 0.93736501 0.93275488]
|
|
|
|
mean value: 0.9299849177077446
|
|
|
|
key: test_precision
|
|
value: [0.88888889 1. 0.86206897 0.95652174 0.92 0.88888889
|
|
0.92307692 0.86206897 0.88 0.84 ]
|
|
|
|
mean value: 0.9021514371019619
|
|
|
|
key: train_precision
|
|
value: [0.91914894 0.91416309 0.93043478 0.91525424 0.91774892 0.93449782
|
|
0.91025641 0.92274678 0.92735043 0.92672414]
|
|
|
|
mean value: 0.9218325537192356
|
|
|
|
key: test_recall
|
|
value: [0.96 0.92 1. 0.88 0.88461538 0.92307692
|
|
0.92307692 0.96153846 0.88 0.84 ]
|
|
|
|
mean value: 0.9172307692307693
|
|
|
|
key: train_recall
|
|
value: [0.94323144 0.930131 0.93449782 0.94323144 0.92982456 0.93859649
|
|
0.93421053 0.94298246 0.94759825 0.93886463]
|
|
|
|
mean value: 0.9383168620240557
|
|
|
|
key: test_roc_auc
|
|
value: [0.92230769 0.96 0.92307692 0.92076923 0.90230769 0.90153846
|
|
0.92153846 0.90076923 0.88 0.84 ]
|
|
|
|
mean value: 0.9072307692307692
|
|
|
|
key: train_roc_auc
|
|
value: [0.92994905 0.92120585 0.93216119 0.92775607 0.92342756 0.93654715
|
|
0.92125373 0.93218992 0.93668122 0.93231441]
|
|
|
|
mean value: 0.929348617176128
|
|
|
|
key: test_jcc
|
|
value: [0.85714286 0.92 0.86206897 0.84615385 0.82142857 0.82758621
|
|
0.85714286 0.83333333 0.78571429 0.72413793]
|
|
|
|
mean value: 0.8334708854364027
|
|
|
|
key: train_jcc
|
|
value: [0.87096774 0.85542169 0.87346939 0.86746988 0.8582996 0.88065844
|
|
0.85542169 0.87398374 0.88211382 0.87398374]
|
|
|
|
mean value: 0.8691789714871334
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.21736836 0.28004289 0.27122521 0.16093874 0.27278447 0.15021992
|
|
0.3257041 0.41062498 0.28041005 0.27491283]
|
|
|
|
mean value: 0.26442315578460696
|
|
|
|
key: score_time
|
|
value: [0.01900768 0.01911998 0.01891971 0.01225471 0.01899242 0.01237059
|
|
0.02265048 0.02372003 0.01925325 0.02483606]
|
|
|
|
mean value: 0.019112491607666017
|
|
|
|
key: test_mcc
|
|
value: [0.8459178 0.92427578 0.80990051 0.84544958 0.80461538 0.80431528
|
|
0.84307692 0.80904133 0.76 0.68 ]
|
|
|
|
mean value: 0.8126592592815737
|
|
|
|
key: train_mcc
|
|
value: [0.86024417 0.8425731 0.91250886 0.8559713 0.84690379 0.87309431
|
|
0.84274962 0.86454544 0.8735707 0.86470302]
|
|
|
|
mean value: 0.8636864306196921
|
|
|
|
key: test_accuracy
|
|
value: [0.92156863 0.96078431 0.90196078 0.92156863 0.90196078 0.90196078
|
|
0.92156863 0.90196078 0.88 0.84 ]
|
|
|
|
mean value: 0.9053333333333333
|
|
|
|
key: train_accuracy
|
|
value: [0.92997812 0.92122538 0.95623632 0.92778993 0.92341357 0.93654267
|
|
0.92122538 0.9321663 0.93668122 0.93231441]
|
|
|
|
mean value: 0.9317573313712937
|
|
|
|
key: test_fscore
|
|
value: [0.92307692 0.95833333 0.90566038 0.91666667 0.90196078 0.90566038
|
|
0.92307692 0.90909091 0.88 0.84 ]
|
|
|
|
mean value: 0.9063526294275461
|
|
|
|
key: train_fscore
|
|
value:/home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_sl.py:168: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rus_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_sl.py:171: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rus_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[0.93103448 0.92207792 0.95614035 0.92903226 0.92374728 0.93654267
 0.92207792 0.93275488 0.93736501 0.93275488]

mean value: 0.9323527654316295

key: test_precision
value: [0.88888889 1.         0.85714286 0.95652174 0.92       0.88888889
 0.92307692 0.86206897 0.88       0.84      ]

mean value: 0.9016588262645234

key: train_precision
value: [0.91914894 0.91416309 0.96035242 0.91525424 0.91774892 0.93449782
 0.91025641 0.92274678 0.92735043 0.92672414]

mean value: 0.9248243177491149

key: test_recall
value: [0.96       0.92       0.96       0.88       0.88461538 0.92307692
 0.92307692 0.96153846 0.88       0.84      ]

mean value: 0.9132307692307693

key: train_recall
value: [0.94323144 0.930131   0.95196507 0.94323144 0.92982456 0.93859649
 0.93421053 0.94298246 0.94759825 0.93886463]

mean value: 0.9400635869148855

key: test_roc_auc
value: [0.92230769 0.96       0.90307692 0.92076923 0.90230769 0.90153846
 0.92153846 0.90076923 0.88       0.84      ]

mean value: 0.9052307692307692

key: train_roc_auc
value: [0.92994905 0.92120585 0.95624569 0.92775607 0.92342756 0.93654715
 0.92125373 0.93218992 0.93668122 0.93231441]

mean value: 0.9317570673408412

key: test_jcc
value: [0.85714286 0.92       0.82758621 0.84615385 0.82142857 0.82758621
 0.85714286 0.83333333 0.78571429 0.72413793]

mean value: 0.8300226095743337

key: train_jcc
value: [0.87096774 0.85542169 0.91596639 0.86746988 0.8582996  0.88065844
 0.85542169 0.87398374 0.88211382 0.87398374]

mean value: 0.8734286713670855

MCC on Blind test: 0.77

Accuracy on Blind test: 0.88
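
Note: the blind-test lines above summarise held-out predictions as single numbers. The sketch below shows how such figures could be computed with scikit-learn. The labels are hypothetical toy values chosen for easy hand-checking and are not data from this run, and the log does not confirm that the script computes its scores this way.

```python
from sklearn.metrics import accuracy_score, jaccard_score, matthews_corrcoef

# Hypothetical labels giving a confusion matrix of TP=3, TN=3, FP=1, FN=1.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

# MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)) = 8 / 16
mcc = matthews_corrcoef(y_true, y_pred)
# Accuracy = (TP + TN) / total = 6 / 8
acc = accuracy_score(y_true, y_pred)
# Jaccard (the "jcc" keys in the log, presumably) = TP / (TP + FP + FN) = 3 / 5
jcc = jaccard_score(y_true, y_pred)

print("MCC on Blind test:", round(mcc, 2))
print("Accuracy on Blind test:", round(acc, 2))
```

MCC stays informative when classes are imbalanced, which is a common reason to report it next to plain accuracy.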

Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
                       n_estimators=1000, n_jobs=10, oob_score=True,
                       random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
              colsample_bynode=None, colsample_bytree=None,
              enable_categorical=False, gamma=None, gpu_id=None,
              importance_type=None, interaction_constraints=None,
              learning_rate=None, max_delta_step=None, max_depth=None,
              min_child_weight=None, missing=nan, monotone_constraints=None,
              n_estimators=100, n_jobs=None, num_parallel_tree=None,
              predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
              scale_pos_weight=None, subsample=None, tree_method=None,
              use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_na_affinity', 'mcsm_ppi2_affinity',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=169)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', LogisticRegression(random_state=42))])
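
Note: the pipeline printed above min-max scales the numerical columns and one-hot encodes the categorical ones before fitting the model. A minimal sketch of an equivalent construction is given below. The toy data frame, the shortened column lists and the `max_iter=1000` setting are all assumptions made for illustration. The logged run uses `LogisticRegression(random_state=42)` with its default iteration limit. A higher limit is one of the remedies the ConvergenceWarning suggests, and it is shown here only as an option.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Hypothetical toy data; the column names are borrowed from the log for readability.
rng = np.random.default_rng(42)
X = pd.DataFrame({
    "ligand_distance": rng.uniform(0, 30, 60),
    "ddg_foldx": rng.normal(0, 2, 60),
    "ss_class": rng.choice(["helix", "strand", "loop"], 60),
    "active_site": rng.choice([0, 1], 60),
})
y = (X["ligand_distance"] < 15).astype(int)

num_cols = ["ligand_distance", "ddg_foldx"]  # scaled to [0, 1]
cat_cols = ["ss_class", "active_site"]       # one-hot encoded

prep = ColumnTransformer(
    transformers=[
        ("num", MinMaxScaler(), num_cols),
        ("cat", OneHotEncoder(), cat_cols),
    ],
    remainder="passthrough",
)

# max_iter=1000 is an assumed value, not the setting used in the logged run.
pipe = Pipeline(steps=[
    ("prep", prep),
    ("model", LogisticRegression(random_state=42, max_iter=1000)),
])

pipe.fit(X, y)
preds = pipe.predict(X)
print(pipe)
```

Keeping the preprocessing inside the pipeline means the scaler and encoder are refit on each training fold during cross-validation, so no information leaks from the held-out fold.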

key: fit_time
value: [0.03605151 0.03672171 0.03721905 0.03759956 0.03047872 0.03578138
 0.03592539 0.036762   0.03830004 0.03602743]

mean value: 0.03608667850494385

key: score_time
value: [0.0120666  0.01405454 0.01406908 0.01425576 0.01207781 0.01208544
 0.01441956 0.01217318 0.01456451 0.01270866]

mean value: 0.013247513771057129

key: test_mcc
value: [0.92704716 0.88746439 0.57735027 0.84866842 0.79056942 0.74466871
 0.84615385 0.84615385 0.77849894 0.84866842]

mean value: 0.8095243434058762

key: train_mcc
value: [0.86365953 0.86787786 0.88530679 0.85535013 0.86386107 0.87262489
 0.87284634 0.86395495 0.86815585 0.86386107]

mean value: 0.867749849139245

key: test_accuracy
value: [0.96226415 0.94339623 0.78846154 0.92307692 0.88461538 0.86538462
 0.92307692 0.92307692 0.88461538 0.92307692]

mean value: 0.9021044992743106

key: train_accuracy
value: [0.93176972 0.93390192 0.94255319 0.92765957 0.93191489 0.93617021
 0.93617021 0.93191489 0.93404255 0.93191489]

mean value: 0.9338012067322959

key: test_fscore
value: [0.96       0.94339623 0.79245283 0.92592593 0.89655172 0.85106383
 0.92307692 0.92307692 0.89285714 0.92592593]

mean value: 0.9034327451391779

key: train_fscore
value: [0.93248945 0.93418259 0.94315789 0.9279661  0.93220339 0.93697479
 0.93723849 0.93248945 0.93446089 0.93220339]

mean value: 0.9343366440868982

key: test_precision
value: [1.         0.96153846 0.77777778 0.89285714 0.8125     0.95238095
 0.92307692 0.92307692 0.83333333 0.89285714]

mean value: 0.8969398656898657

key: train_precision
value: [0.92468619 0.92827004 0.93333333 0.92405063 0.92827004 0.9253112
 0.9218107  0.92468619 0.92857143 0.92827004]

mean value: 0.9267259809243651

key: test_recall
value: [0.92307692 0.92592593 0.80769231 0.96153846 1.         0.76923077
 0.92307692 0.92307692 0.96153846 0.96153846]

mean value: 0.9156695156695157

key: train_recall
value: [0.94042553 0.94017094 0.95319149 0.93191489 0.93617021 0.94893617
 0.95319149 0.94042553 0.94042553 0.93617021]

mean value: 0.9421022004000728

key: test_roc_auc
value: [0.96153846 0.94373219 0.78846154 0.92307692 0.88461538 0.86538462
 0.92307692 0.92307692 0.88461538 0.92307692]

mean value: 0.9020655270655271

key: train_roc_auc
value: [0.93175123 0.93391526 0.94255319 0.92765957 0.93191489 0.93617021
 0.93617021 0.93191489 0.93404255 0.93191489]

mean value: 0.9338006910347336

key: test_jcc
value: [0.92307692 0.89285714 0.65625    0.86206897 0.8125     0.74074074
 0.85714286 0.85714286 0.80645161 0.86206897]

mean value: 0.8270300064898229

key: train_jcc
value: [0.87351779 0.87649402 0.89243028 0.86561265 0.87301587 0.88142292
 0.88188976 0.87351779 0.87698413 0.87301587]

mean value: 0.8767901085829304

MCC on Blind test: 0.77

Accuracy on Blind test: 0.88
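
Note: the `key:` / `value:` / `mean value:` blocks in this report have the shape of the dictionary returned by scikit-learn's `cross_validate` when it is given a multi-metric `scoring` dict and `return_train_score=True` (`fit_time`, `score_time`, then one `test_*` and one `train_*` array per metric). The sketch below reproduces that layout on synthetic data. The scorer definitions, the 10-fold stratified split and the `max_iter=1000` setting are assumptions for illustration and are not confirmed by the log.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, matthews_corrcoef
from sklearn.model_selection import StratifiedKFold, cross_validate

# Synthetic stand-in for the real feature matrix and labels.
X, y = make_classification(n_samples=200, n_features=10, random_state=42)

# Metric names mirror the keys seen in the log; the scorers behind them are guesses.
scoring = {
    "mcc": make_scorer(matthews_corrcoef),
    "accuracy": "accuracy",
    "fscore": "f1",
    "precision": "precision",
    "recall": "recall",
    "roc_auc": "roc_auc",
    "jcc": "jaccard",
}

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_validate(
    LogisticRegression(random_state=42, max_iter=1000),
    X,
    y,
    cv=cv,
    scoring=scoring,
    return_train_score=True,
)

# One array of 10 per-fold values per key, followed by its mean.
for key, value in scores.items():
    print("key:", key)
    print("value:", value)
    print("mean value:", value.mean())
```

A mean train score that sits well above the mean test score for the same metric points to overfitting, which is why both are worth printing.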

Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(random_state=42)
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.94098043 0.98036623 0.98186111 0.90004635 1.07260489 1.11815834
|
|
1.04107022 0.89132261 0.89728475 1.02238417]
|
|
|
|
mean value: 0.9846079111099243
|
|
|
|
key: score_time
|
|
value: [0.0144639 0.01220942 0.01224375 0.0165391 0.01481962 0.0146389
|
|
0.01722312 0.01506114 0.01477838 0.02237964]
|
|
|
|
mean value: 0.01543569564819336
|
|
|
|
key: test_mcc
|
|
value: [0.92704716 0.85164138 0.61538462 0.88527041 0.89056356 0.77849894
|
|
0.84615385 0.80829038 0.77849894 0.80829038]
|
|
|
|
mean value: 0.8189639617732613
|
|
|
|
key: train_mcc
|
|
value: [0.89778103 0.82535469 0.84262186 0.89790486 0.90641581 0.90233192
|
|
0.90252815 0.90651431 0.90220118 0.82571883]
|
|
|
|
mean value: 0.8809372623148111
|
|
|
|
key: test_accuracy
|
|
value: [0.96226415 0.9245283 0.80769231 0.94230769 0.94230769 0.88461538
|
|
0.92307692 0.90384615 0.88461538 0.90384615]
|
|
|
|
mean value: 0.907910014513788
|
|
|
|
key: train_accuracy
|
|
value: [0.94882729 0.91257996 0.9212766 0.94893617 0.95319149 0.95106383
|
|
0.95106383 0.95319149 0.95106383 0.91276596]
|
|
|
|
mean value: 0.9403960440956313
|
|
|
|
key: test_fscore
|
|
value: [0.96 0.92307692 0.80769231 0.94339623 0.94545455 0.875
|
|
0.92307692 0.90196078 0.89285714 0.90566038]
|
|
|
|
mean value: 0.9078175230245152
|
|
|
|
key: train_fscore
|
|
value: [0.94936709 0.91331924 0.9217759 0.94915254 0.95338983 0.95157895
|
|
0.95178197 0.9535865 0.95137421 0.91368421]
|
|
|
|
mean value: 0.9409010432532758
|
|
|
|
key: test_precision
|
|
value: [1. 0.96 0.80769231 0.92592593 0.89655172 0.95454545
|
|
0.92307692 0.92 0.83333333 0.88888889]
|
|
|
|
mean value: 0.9110014557600765
|
|
|
|
key: train_precision
|
|
value: [0.94142259 0.90376569 0.91596639 0.94514768 0.94936709 0.94166667
|
|
0.93801653 0.94560669 0.94537815 0.90416667]
|
|
|
|
mean value: 0.9330504147086066
|
|
|
|
key: test_recall
|
|
value: [0.92307692 0.88888889 0.80769231 0.96153846 1. 0.80769231
|
|
0.92307692 0.88461538 0.96153846 0.92307692]
|
|
|
|
mean value: 0.9081196581196581
|
|
|
|
key: train_recall
|
|
value: [0.95744681 0.92307692 0.92765957 0.95319149 0.95744681 0.96170213
|
|
0.96595745 0.96170213 0.95744681 0.92340426]
|
|
|
|
mean value: 0.9489034369885434
|
|
|
|
key: test_roc_auc
|
|
value: [0.96153846 0.92521368 0.80769231 0.94230769 0.94230769 0.88461538
|
|
0.92307692 0.90384615 0.88461538 0.90384615]
|
|
|
|
mean value: 0.9079059829059829
|
|
|
|
key: train_roc_auc
|
|
value: [0.94880887 0.91260229 0.9212766 0.94893617 0.95319149 0.95106383
|
|
0.95106383 0.95319149 0.95106383 0.91276596]
|
|
|
|
mean value: 0.9403964357155847
|
|
|
|
key: test_jcc
|
|
value: [0.92307692 0.85714286 0.67741935 0.89285714 0.89655172 0.77777778
|
|
0.85714286 0.82142857 0.80645161 0.82758621]
|
|
|
|
mean value: 0.8337435028202548
|
|
|
|
key: train_jcc
|
|
value: [0.90361446 0.84046693 0.85490196 0.90322581 0.91093117 0.90763052
|
|
0.908 0.91129032 0.90725806 0.84108527]
|
|
|
|
mean value: 0.8888404505729317
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01424456 0.01084495 0.01039648 0.00999761 0.01003218 0.01008916
|
|
0.01021504 0.01027536 0.01029849 0.00998974]
|
|
|
|
mean value: 0.01063835620880127
|
|
|
|
key: score_time
|
|
value: [0.01386261 0.00955462 0.00925779 0.00907087 0.00900865 0.00900984
|
|
0.00913215 0.00919223 0.00887728 0.00897932]
|
|
|
|
mean value: 0.009594535827636719
|
|
|
|
key: test_mcc
|
|
value: [0.82552431 0.46464327 0.54494926 0.69230769 0.73131034 0.58789635
|
|
0.57735027 0.77151675 0.54006172 0.61538462]
|
|
|
|
mean value: 0.6350944580966691
|
|
|
|
key: train_mcc
|
|
value: [0.66639366 0.66929675 0.7289762 0.68358593 0.69117257 0.68473679
|
|
0.69424587 0.68473679 0.70419643 0.65795145]
|
|
|
|
mean value: 0.6865292430336383
|
|
|
|
key: test_accuracy
|
|
value: [0.90566038 0.71698113 0.76923077 0.84615385 0.86538462 0.78846154
|
|
0.78846154 0.88461538 0.76923077 0.80769231]
|
|
|
|
mean value: 0.8141872278664731
|
|
|
|
key: train_accuracy
|
|
value: [0.8315565 0.8336887 0.86170213 0.84042553 0.84468085 0.84042553
|
|
0.84680851 0.84042553 0.85106383 0.82765957]
|
|
|
|
mean value: 0.8418436691920338
|
|
|
|
key: test_fscore
|
|
value: [0.89361702 0.66666667 0.78571429 0.84615385 0.86792453 0.76595745
|
|
0.78431373 0.88 0.76 0.80769231]
|
|
|
|
mean value: 0.8058039828104295
|
|
|
|
key: train_fscore
|
|
value: [0.82326622 0.82666667 0.85260771 0.83296214 0.8388521 0.83146067
|
|
0.85 0.83146067 0.84513274 0.81959911]
|
|
|
|
mean value: 0.8352008031680325
|
|
|
|
key: test_precision
|
|
value: [1. 0.83333333 0.73333333 0.84615385 0.85185185 0.85714286
|
|
0.8 0.91666667 0.79166667 0.80769231]
|
|
|
|
mean value: 0.8437840862840863
|
|
|
|
key: train_precision
|
|
value: [0.86792453 0.86111111 0.91262136 0.87383178 0.87155963 0.88095238
|
|
0.83265306 0.88095238 0.88018433 0.85981308]
|
|
|
|
mean value: 0.8721603646403393
|
|
|
|
key: test_recall
|
|
value: [0.80769231 0.55555556 0.84615385 0.84615385 0.88461538 0.69230769
|
|
0.76923077 0.84615385 0.73076923 0.80769231]
|
|
|
|
mean value: 0.7786324786324786
|
|
|
|
key: train_recall
|
|
value: [0.78297872 0.79487179 0.8 0.79574468 0.80851064 0.78723404
|
|
0.86808511 0.78723404 0.81276596 0.78297872]
|
|
|
|
mean value: 0.8020403709765412
|
|
|
|
key: test_roc_auc
|
|
value: [0.90384615 0.72008547 0.76923077 0.84615385 0.86538462 0.78846154
|
|
0.78846154 0.88461538 0.76923077 0.80769231]
|
|
|
|
mean value: 0.8143162393162393
|
|
|
|
key: train_roc_auc
|
|
value: [0.8316603 0.83360611 0.86170213 0.84042553 0.84468085 0.84042553
|
|
0.84680851 0.84042553 0.85106383 0.82765957]
|
|
|
|
mean value: 0.8418457901436625
|
|
|
|
key: test_jcc
|
|
value: [0.80769231 0.5 0.64705882 0.73333333 0.76666667 0.62068966
|
|
0.64516129 0.78571429 0.61290323 0.67741935]
|
|
|
|
mean value: 0.6796638943076161
|
|
|
|
key: train_jcc
|
|
value: [0.69961977 0.70454545 0.743083 0.71374046 0.72243346 0.71153846
|
|
0.73913043 0.71153846 0.73180077 0.69433962]
|
|
|
|
mean value: 0.717176989523702
|
|
|
|
MCC on Blind test: 0.63
|
|
|
|
Accuracy on Blind test: 0.81
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01102686 0.01034999 0.01119971 0.01043487 0.0116086 0.0105567
|
|
0.01021433 0.01092291 0.01056623 0.01027584]
|
|
|
|
mean value: 0.010715603828430176
|
|
|
|
key: score_time
|
|
value: [0.00961232 0.00919366 0.00915337 0.00907326 0.00980663 0.00913072
|
|
0.00897837 0.0090673 0.00913048 0.00903487]
|
|
|
|
mean value: 0.009218096733093262
|
|
|
|
key: test_mcc
|
|
value: [0.92704716 0.59688314 0.50336201 0.73131034 0.70064905 0.69436507
|
|
0.57735027 0.76923077 0.66628253 0.65433031]
|
|
|
|
mean value: 0.6820810650750807
|
|
|
|
key: train_mcc
|
|
value: [0.73987525 0.70625194 0.78298581 0.71521098 0.7745312 0.74043224
|
|
0.69894261 0.74470782 0.76195052 0.74048587]
|
|
|
|
mean value: 0.7405374249725031
|
|
|
|
key: test_accuracy
|
|
value: [0.96226415 0.79245283 0.75 0.86538462 0.84615385 0.84615385
|
|
0.78846154 0.88461538 0.82692308 0.82692308]
|
|
|
|
mean value: 0.8389332365747459
|
|
|
|
key: train_accuracy
|
|
value: [0.86993603 0.85287846 0.89148936 0.85744681 0.88723404 0.87021277
|
|
0.84893617 0.87234043 0.88085106 0.87021277]
|
|
|
|
mean value: 0.8701537903189221
|
|
|
|
key: test_fscore
|
|
value: [0.96 0.7755102 0.76363636 0.86792453 0.85714286 0.84
|
|
0.78431373 0.88461538 0.84210526 0.82352941]
|
|
|
|
mean value: 0.8398777738190921
|
|
|
|
key: train_fscore
|
|
value: [0.87048832 0.8496732 0.89171975 0.85529158 0.88794926 0.87048832
|
|
0.84463895 0.87179487 0.87931034 0.86937901]
|
|
|
|
mean value: 0.8690733611272227
|
|
|
|
key: test_precision
|
|
value: [1. 0.86363636 0.72413793 0.85185185 0.8 0.875
|
|
0.8 0.88461538 0.77419355 0.84 ]
|
|
|
|
mean value: 0.841343507952518
|
|
|
|
key: train_precision
|
|
value: [0.86864407 0.86666667 0.88983051 0.86842105 0.88235294 0.86864407
|
|
0.86936937 0.87553648 0.89082969 0.875 ]
|
|
|
|
mean value: 0.8755294848921722
|
|
|
|
key: test_recall
|
|
value: [0.92307692 0.7037037 0.80769231 0.88461538 0.92307692 0.80769231
|
|
0.76923077 0.88461538 0.92307692 0.80769231]
|
|
|
|
mean value: 0.8434472934472934
|
|
|
|
key: train_recall
|
|
value: [0.87234043 0.83333333 0.89361702 0.84255319 0.89361702 0.87234043
|
|
0.8212766 0.86808511 0.86808511 0.86382979]
|
|
|
|
mean value: 0.8629078014184397
|
|
|
|
key: test_roc_auc
|
|
value: [0.96153846 0.79415954 0.75 0.86538462 0.84615385 0.84615385
|
|
0.78846154 0.88461538 0.82692308 0.82692308]
|
|
|
|
mean value: 0.8390313390313391
|
|
|
|
key: train_roc_auc
|
|
value: [0.8699309 0.85283688 0.89148936 0.85744681 0.88723404 0.87021277
|
|
0.84893617 0.87234043 0.88085106 0.87021277]
|
|
|
|
mean value: 0.8701491180214584
|
|
|
|
key: test_jcc
|
|
value: [0.92307692 0.63333333 0.61764706 0.76666667 0.75 0.72413793
|
|
0.64516129 0.79310345 0.72727273 0.7 ]
|
|
|
|
mean value: 0.7280399378806105
|
|
|
|
key: train_jcc
|
|
value: [0.77067669 0.73863636 0.8045977 0.74716981 0.79847909 0.77067669
|
|
0.73106061 0.77272727 0.78461538 0.76893939]
|
|
|
|
mean value: 0.7687579004360319
|
|
|
|
MCC on Blind test: 0.76
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.00984716 0.01065707 0.01068068 0.01061821 0.01073027 0.0107739
|
|
0.01069117 0.01094604 0.01067162 0.01062965]
|
|
|
|
mean value: 0.010624575614929199
|
|
|
|
key: score_time
|
|
value: [0.01390767 0.01305842 0.01274514 0.01643538 0.01310968 0.01333404
|
|
0.01317954 0.01302528 0.01261353 0.01284766]
|
|
|
|
mean value: 0.013425636291503906
|
|
|
|
key: test_mcc
|
|
value: [0.63760132 0.52028554 0.23145502 0.38575837 0.70064905 0.54006172
|
|
0.31139958 0.77151675 0.73568294 0.46291005]
|
|
|
|
mean value: 0.5297320348876765
|
|
|
|
key: train_mcc
|
|
value: [0.72710906 0.7106402 0.76601987 0.71495188 0.73197454 0.71066404
|
|
0.70669657 0.66043608 0.71980093 0.71490009]
|
|
|
|
mean value: 0.7163193248281741
|
|
|
|
key: test_accuracy
|
|
value: [0.81132075 0.75471698 0.61538462 0.69230769 0.84615385 0.76923077
|
|
0.65384615 0.88461538 0.86538462 0.73076923]
|
|
|
|
mean value: 0.7623730043541365
|
|
|
|
key: train_accuracy
|
|
value: [0.86353945 0.85501066 0.88297872 0.85744681 0.86595745 0.85531915
|
|
0.85319149 0.82978723 0.85957447 0.85744681]
|
|
|
|
mean value: 0.8580252234269382
|
|
|
|
key: test_fscore
|
|
value: [0.7826087 0.73469388 0.6 0.68 0.85714286 0.76
|
|
0.625 0.88888889 0.85714286 0.72 ]
|
|
|
|
mean value: 0.7505477176377797
|
|
|
|
key: train_fscore
|
|
value: [0.86324786 0.85152838 0.88222698 0.85653105 0.86509636 0.8559322
|
|
0.85097192 0.82532751 0.85652174 0.85714286]
|
|
|
|
mean value: 0.856452687007534
|
|
|
|
key: test_precision
|
|
value: [0.9 0.81818182 0.625 0.70833333 0.8 0.79166667
|
|
0.68181818 0.85714286 0.91304348 0.75 ]
|
|
|
|
mean value: 0.7845186335403727
|
|
|
|
key: train_precision
|
|
value: [0.86695279 0.87053571 0.88793103 0.86206897 0.87068966 0.85232068
|
|
0.86403509 0.84753363 0.87555556 0.85897436]
|
|
|
|
mean value: 0.8656597468799392
|
|
|
|
key: test_recall
|
|
value: [0.69230769 0.66666667 0.57692308 0.65384615 0.92307692 0.73076923
|
|
0.57692308 0.92307692 0.80769231 0.69230769]
|
|
|
|
mean value: 0.7243589743589743
|
|
|
|
key: train_recall
|
|
value: [0.85957447 0.83333333 0.87659574 0.85106383 0.85957447 0.85957447
|
|
0.83829787 0.80425532 0.83829787 0.85531915]
|
|
|
|
mean value: 0.8475886524822696
|
|
|
|
key: test_roc_auc
|
|
value: [0.80911681 0.75641026 0.61538462 0.69230769 0.84615385 0.76923077
|
|
0.65384615 0.88461538 0.86538462 0.73076923]
|
|
|
|
mean value: 0.7623219373219373
|
|
|
|
key: train_roc_auc
|
|
value: [0.86354792 0.85496454 0.88297872 0.85744681 0.86595745 0.85531915
|
|
0.85319149 0.82978723 0.85957447 0.85744681]
|
|
|
|
mean value: 0.8580214584469904
|
|
|
|
key: test_jcc
|
|
value: [0.64285714 0.58064516 0.42857143 0.51515152 0.75 0.61290323
|
|
0.45454545 0.8 0.75 0.5625 ]
|
|
|
|
mean value: 0.6097173928222316
|
|
|
|
key: train_jcc
|
|
value: [0.7593985 0.74144487 0.78927203 0.74906367 0.76226415 0.74814815
|
|
0.7406015 0.70260223 0.74904943 0.75 ]
|
|
|
|
mean value: 0.7491844527216088
|
|
|
|
MCC on Blind test: 0.33
|
|
|
|
Accuracy on Blind test: 0.67
|
|
|
|
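Note for readers checking the figures: each `mean value` line appears to be the arithmetic mean of the ten per-fold values printed above it. The sketch below, which is not part of the original run, recomputes the two K-Nearest Neighbors MCC means from the logged fold values and reports the train/test gap. The variable names are illustrative assumptions; the numbers are copied from this log.

```python
from statistics import fmean

# Per-fold MCC values copied from the K-Nearest Neighbors block of this log.
knn_test_mcc = [0.63760132, 0.52028554, 0.23145502, 0.38575837, 0.70064905,
                0.54006172, 0.31139958, 0.77151675, 0.73568294, 0.46291005]
knn_train_mcc = [0.72710906, 0.7106402, 0.76601987, 0.71495188, 0.73197454,
                 0.71066404, 0.70669657, 0.66043608, 0.71980093, 0.71490009]

# Recompute the logged "mean value" lines.
mean_test = fmean(knn_test_mcc)
mean_train = fmean(knn_train_mcc)

# A large positive gap suggests the model fits the training folds
# better than it generalises to the held-out fold.
gap = mean_train - mean_test

print(f"test_mcc mean:  {mean_test:.4f}")
print(f"train_mcc mean: {mean_train:.4f}")
print(f"train - test:   {gap:.4f}")
```

The recomputed means agree with the logged ones only up to the eight-decimal rounding of the printed fold values.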
Model_name: SVM
Model func: SVC(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=169)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SVC(random_state=42))])

key: fit_time
value: [0.02203298 0.02313685 0.02299261 0.02083182 0.02209663 0.01996708
 0.02413058 0.02186084 0.02108955 0.0205934 ]

mean value: 0.02187323570251465

key: score_time
value: [0.01181936 0.01257968 0.01236892 0.01224875 0.01212788 0.01159978
 0.01305223 0.0125947 0.01200342 0.01176119]

mean value: 0.012215590476989746

key: test_mcc
value: [0.92704716 0.81688878 0.61538462 0.88527041 0.74466871 0.77849894
 0.80829038 0.84615385 0.74466871 0.80829038]

mean value: 0.7975161942581178

key: train_mcc
value: [0.78688615 0.79976356 0.82130634 0.79149653 0.80857653 0.80035515
 0.8000652 0.79155386 0.80857653 0.80000724]

mean value: 0.8008587091662202

key: test_accuracy
value: [0.96226415 0.90566038 0.80769231 0.94230769 0.86538462 0.88461538
 0.90384615 0.92307692 0.86538462 0.90384615]

mean value: 0.8964078374455733

key: train_accuracy
value: [0.89339019 0.89978678 0.9106383 0.89574468 0.90425532 0.9
 0.9 0.89574468 0.90425532 0.9 ]

mean value: 0.900381527015379

key: test_fscore
value: [0.96 0.90196078 0.80769231 0.94339623 0.87719298 0.875
 0.90196078 0.92307692 0.87719298 0.90566038]

mean value: 0.8973133368082548

key: train_fscore
value: [0.89451477 0.90063425 0.91101695 0.89596603 0.90364026 0.90146751
 0.90063425 0.89640592 0.90364026 0.90021231]

mean value: 0.9008132498798447

key: test_precision
value: [1. 0.95833333 0.80769231 0.92592593 0.80645161 0.95454545
 0.92 0.92307692 0.80645161 0.88888889]

mean value: 0.8991366059269286

key: train_precision
value: [0.88702929 0.89121339 0.907173 0.8940678 0.90948276 0.88842975
 0.89495798 0.8907563 0.90948276 0.89830508]

mean value: 0.8970898109982571

key: test_recall
value: [0.92307692 0.85185185 0.80769231 0.96153846 0.96153846 0.80769231
 0.88461538 0.92307692 0.96153846 0.92307692]

mean value: 0.9005698005698006

key: train_recall
value: [0.90212766 0.91025641 0.91489362 0.89787234 0.89787234 0.91489362
 0.90638298 0.90212766 0.89787234 0.90212766]

mean value: 0.9046426623022368

key: test_roc_auc
value: [0.96153846 0.90669516 0.80769231 0.94230769 0.86538462 0.88461538
 0.90384615 0.92307692 0.86538462 0.90384615]

mean value: 0.8964387464387464

key: train_roc_auc
value: [0.89337152 0.89980906 0.9106383 0.89574468 0.90425532 0.9
 0.9 0.89574468 0.90425532 0.9 ]

mean value: 0.9003818876159301

key: test_jcc
value: [0.92307692 0.82142857 0.67741935 0.89285714 0.78125 0.77777778
 0.82142857 0.85714286 0.78125 0.82758621]

mean value: 0.8161217405447105

key: train_jcc
value: [0.80916031 0.81923077 0.83657588 0.81153846 0.82421875 0.82061069
 0.81923077 0.81226054 0.82421875 0.81853282]

mean value: 0.819557772278408

MCC on Blind test: 0.72

Accuracy on Blind test: 0.86

Model_name: MLP
Model func: MLPClassifier(max_iter=500, random_state=42)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
[the warning above was emitted 10 times in total]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=169)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MLPClassifier(max_iter=500, random_state=42))])

key: fit_time
value: [2.02500224 2.07535386 2.03847885 2.11725354 2.08513308 2.02912474
 1.15581083 2.13639879 2.00898528 2.09049392]

mean value: 1.9762035131454467

key: score_time
value: [0.01247501 0.01451373 0.01420856 0.01249385 0.01248074 0.02131391
 0.01252007 0.02289152 0.01492286 0.01478481]

mean value: 0.01526050567626953

key: test_mcc
value: [0.92704716 0.77350427 0.54006172 0.82305489 0.9258201 0.77849894
 0.80829038 0.88527041 0.77151675 0.73131034]

mean value: 0.7964374979189948

key: train_mcc
value: [1. 0.99150739 1. 1. 0.99148936 0.9957537
 0.95320012 1. 0.9957537 1. ]

mean value: 0.9927704266438734

key: test_accuracy
value: [0.96226415 0.88679245 0.76923077 0.90384615 0.96153846 0.88461538
 0.90384615 0.94230769 0.88461538 0.86538462]

mean value: 0.89644412191582

key: train_accuracy
value: [1. 0.99573561 1. 1. 0.99574468 0.99787234
 0.97659574 1. 0.99787234 1. ]

mean value: 0.9963820714058885

key: test_fscore
value: [0.96 0.88888889 0.77777778 0.9122807 0.96296296 0.875
 0.90196078 0.94117647 0.88888889 0.86792453]

mean value: 0.8976861003476753

key: train_fscore
value: [1. 0.99574468 1. 1. 0.99574468 0.9978678
 0.97664544 1. 0.99787686 1. ]

mean value: 0.9963879458533711

key: test_precision
value: [1. 0.88888889 0.75 0.83870968 0.92857143 0.95454545
 0.92 0.96 0.85714286 0.85185185]

mean value: 0.8949710158419836

key: train_precision
value: [1. 0.99152542 1. 1. 0.99574468 1.
 0.97457627 1. 0.99576271 1. ]

mean value: 0.9957609087630724

key: test_recall
value: [0.92307692 0.88888889 0.80769231 1. 1. 0.80769231
 0.88461538 0.92307692 0.92307692 0.88461538]

mean value: 0.9042735042735043

key: train_recall
value: [1. 1. 1. 1. 0.99574468 0.99574468
 0.9787234 1. 1. 1. ]

mean value: 0.9970212765957447

key: test_roc_auc
value: [0.96153846 0.88675214 0.76923077 0.90384615 0.96153846 0.88461538
 0.90384615 0.94230769 0.88461538 0.86538462]

mean value: 0.8963675213675214

key: train_roc_auc
value: [1. 0.99574468 1. 1. 0.99574468 0.99787234
 0.97659574 1. 0.99787234 1. ]

mean value: 0.9963829787234042

key: test_jcc
value: [0.92307692 0.8 0.63636364 0.83870968 0.92857143 0.77777778
 0.82142857 0.88888889 0.8 0.76666667]

mean value: 0.8181483570193248

key: train_jcc
value: [1. 0.99152542 1. 1. 0.99152542 0.99574468
 0.95435685 1. 0.99576271 1. ]

mean value: 0.9928915086646127

MCC on Blind test: 0.77

Accuracy on Blind test: 0.88

Model_name: Decision Tree
Model func: DecisionTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=169)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', DecisionTreeClassifier(random_state=42))])

key: fit_time
value: [0.02680755 0.02203512 0.02136636 0.02250838 0.02036333 0.02217436
 0.02208591 0.02092457 0.02369666 0.02297091]

mean value: 0.022493314743041993

key: score_time
value: [0.01222968 0.00935984 0.00914311 0.00899458 0.0090909 0.00908637
 0.00928926 0.00905418 0.0090971 0.00970721]

mean value: 0.009505224227905274

key: test_mcc
value: [0.85164138 0.92450142 0.77849894 0.88527041 0.88527041 0.88527041
 0.84615385 0.88527041 0.96225045 0.84866842]

mean value: 0.8752796119384096

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.9245283 0.96226415 0.88461538 0.94230769 0.94230769 0.94230769
 0.92307692 0.94230769 0.98076923 0.92307692]

mean value: 0.9367561683599419

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.92592593 0.96296296 0.89285714 0.94339623 0.94339623 0.94117647
 0.92307692 0.94339623 0.98113208 0.92592593]

mean value: 0.9383246106054097

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.89285714 0.96296296 0.83333333 0.92592593 0.92592593 0.96
 0.92307692 0.92592593 0.96296296 0.89285714]

mean value: 0.9205828245828246

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.96153846 0.96296296 0.96153846 0.96153846 0.96153846 0.92307692
 0.92307692 0.96153846 1. 0.96153846]

mean value: 0.9578347578347579

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.92521368 0.96225071 0.88461538 0.94230769 0.94230769 0.94230769
 0.92307692 0.94230769 0.98076923 0.92307692]

mean value: 0.9368233618233619

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.86206897 0.92857143 0.80645161 0.89285714 0.89285714 0.88888889
 0.85714286 0.89285714 0.96296296 0.86206897]

mean value: 0.8846727110075274

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.87

Accuracy on Blind test: 0.93

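Note on the `fscore` and `jcc` rows: for a single binary class, the Jaccard index J and the F1 score F are tied by the identity J = F / (2 - F), so the two rows carry the same information. The sketch below, which is not part of the original run, checks that identity against the Decision Tree fold values copied from this log. It assumes `fscore` is F1 and `jcc` is the Jaccard score for the same positive class; the log does not state this, though the numbers are consistent with it.

```python
# Per-fold values copied from the Decision Tree block of this log.
dt_test_fscore = [0.92592593, 0.96296296, 0.89285714, 0.94339623, 0.94339623,
                  0.94117647, 0.92307692, 0.94339623, 0.98113208, 0.92592593]
dt_test_jcc = [0.86206897, 0.92857143, 0.80645161, 0.89285714, 0.89285714,
               0.88888889, 0.85714286, 0.89285714, 0.96296296, 0.86206897]


def jaccard_from_f1(f1: float) -> float:
    """Convert a binary F1 score to the Jaccard index: J = F / (2 - F)."""
    return f1 / (2.0 - f1)


# Largest absolute disagreement between the derived and the logged Jaccard.
max_abs_diff = max(
    abs(jaccard_from_f1(f) - j) for f, j in zip(dt_test_fscore, dt_test_jcc)
)
print(f"max |J(F1) - logged jcc| = {max_abs_diff:.2e}")
```

Any remaining difference comes from the eight-decimal rounding of the printed values.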
Model_name: Extra Trees
Model func: ExtraTreesClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=169)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', ExtraTreesClassifier(random_state=42))])

key: fit_time
value: [0.12310386 0.12195277 0.1237545 0.12339377 0.12358236 0.12080479
 0.12089777 0.12545443 0.12270761 0.12189317]

mean value: 0.12275450229644776

key: score_time
value: [0.01917434 0.01886702 0.01860046 0.01904464 0.01814556 0.01823425
 0.0187676 0.018188 0.01808381 0.01885533]

mean value: 0.01859610080718994

key: test_mcc
value: [0.85122386 0.70692282 0.50336201 0.88527041 0.85634884 0.81312325
 0.84615385 0.88527041 0.82305489 0.89056356]

mean value: 0.8061293898911333

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.9245283 0.8490566 0.75 0.94230769 0.92307692 0.90384615
 0.92307692 0.94230769 0.90384615 0.94230769]

mean value: 0.9004354136429609

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.92 0.84 0.76363636 0.94339623 0.92857143 0.89795918
 0.92307692 0.94339623 0.9122807 0.94545455]

mean value: 0.9017771598997305

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.95833333 0.91304348 0.72413793 0.92592593 0.86666667 0.95652174
 0.92307692 0.92592593 0.83870968 0.89655172]

mean value: 0.8928893324911849

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.88461538 0.77777778 0.80769231 0.96153846 1. 0.84615385
 0.92307692 0.96153846 1. 1. ]

mean value: 0.9162393162393162

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.92378917 0.85042735 0.75 0.94230769 0.92307692 0.90384615
 0.92307692 0.94230769 0.90384615 0.94230769]

mean value: 0.9004985754985755

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.85185185 0.72413793 0.61764706 0.89285714 0.86666667 0.81481481
 0.85714286 0.89285714 0.83870968 0.89655172]

mean value: 0.8253236867605774

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.81

Accuracy on Blind test: 0.9

Model_name: Extra Tree
Model func: ExtraTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=169)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', ExtraTreeClassifier(random_state=42))])

key: fit_time
value: [0.0104301 0.01049423 0.01046538 0.01045418 0.01051521 0.01134562
 0.01081777 0.01171875 0.01168561 0.01107073]

mean value: 0.010899758338928223

key: score_time
value: [0.00904679 0.00917101 0.00891232 0.0090096 0.00909114 0.00902176
 0.00980735 0.00962543 0.00923777 0.00924516]

mean value: 0.009216833114624023

key: test_mcc
value: [0.69957726 0.44368795 0.38575837 0.55339859 0.70064905 0.54494926
 0.43112399 0.57735027 0.34641016 0.34848139]

mean value: 0.5031386296029412

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.8490566 0.71698113 0.69230769 0.76923077 0.84615385 0.76923077
 0.71153846 0.78846154 0.67307692 0.67307692]

mean value: 0.7489114658925979

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.84 0.69387755 0.7037037 0.73913043 0.85714286 0.75
 0.68085106 0.79245283 0.66666667 0.69090909]

mean value: 0.7414734198243802

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.875 0.77272727 0.67857143 0.85 0.8 0.81818182
 0.76190476 0.77777778 0.68 0.65517241]

mean value: 0.7669335472956162

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.80769231 0.62962963 0.73076923 0.65384615 0.92307692 0.69230769
 0.61538462 0.80769231 0.65384615 0.73076923]

mean value: 0.7245014245014245

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.8482906 0.71866097 0.69230769 0.76923077 0.84615385 0.76923077
 0.71153846 0.78846154 0.67307692 0.67307692]

mean value: 0.749002849002849

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.72413793 0.53125 0.54285714 0.5862069 0.75 0.6
 0.51612903 0.65625 0.5 0.52777778]

mean value: 0.5934608780479191

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.58

Accuracy on Blind test: 0.79

Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.85724831 1.80352688 1.76003718 1.77659369 1.78128886 1.81115103
|
|
1.7989316 1.79322004 1.84129739 1.78404427]
|
|
|
|
mean value: 1.8007339239120483
|
|
|
|
key: score_time
|
|
value: [0.09947538 0.09356856 0.09668612 0.09286475 0.10176277 0.10127997
|
|
0.10069108 0.09549236 0.09286833 0.10066128]
|
|
|
|
mean value: 0.09753506183624268
|
|
|
|
key: test_mcc
|
|
value: [0.92450142 0.88746439 0.84615385 0.88527041 0.96225045 0.9258201
|
|
0.88527041 0.92307692 0.9258201 1. ]
|
|
|
|
mean value: 0.9165628054905913
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.96226415 0.94339623 0.92307692 0.94230769 0.98076923 0.96153846
|
|
0.94230769 0.96153846 0.96153846 1. ]
|
|
|
|
mean value: 0.9578737300435414
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.96153846 0.94339623 0.92307692 0.94339623 0.98113208 0.96
|
|
0.94117647 0.96153846 0.96296296 1. ]
|
|
|
|
mean value: 0.9578217808006931
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96153846 0.96153846 0.92307692 0.92592593 0.96296296 1.
|
|
0.96 0.96153846 0.92857143 1. ]
|
|
|
|
mean value: 0.9585152625152625
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96153846 0.92592593 0.92307692 0.96153846 1. 0.92307692
|
|
0.92307692 0.96153846 1. 1. ]
|
|
|
|
mean value: 0.957977207977208
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96225071 0.94373219 0.92307692 0.94230769 0.98076923 0.96153846
|
|
0.94230769 0.96153846 0.96153846 1. ]
|
|
|
|
mean value: 0.957905982905983
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.92592593 0.89285714 0.85714286 0.89285714 0.96296296 0.92307692
|
|
0.88888889 0.92592593 0.92857143 1. ]
|
|
|
|
mean value: 0.9198209198209198
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.81
|
|
|
|
Accuracy on Blind test: 0.9
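Note on the Random Forest block above: the per-fold numbers can be sanity-checked against each other. The sketch below assumes that test_jcc is the binary Jaccard index and test_fscore the binary F1 score computed on the same positive class, in which case J = F1 / (2 - F1). It also recomputes the mean cross-validation MCC and compares it with the blind-test MCC reported above. The variable and function names are illustrative and do not come from the original script.

```python
from statistics import mean

# Per-fold values copied from the Random Forest block of this log.
test_fscore = [0.96153846, 0.94339623, 0.92307692, 0.94339623, 0.98113208,
               0.96, 0.94117647, 0.96153846, 0.96296296, 1.0]
test_jcc = [0.92592593, 0.89285714, 0.85714286, 0.89285714, 0.96296296,
            0.92307692, 0.88888889, 0.92592593, 0.92857143, 1.0]
test_mcc = [0.92450142, 0.88746439, 0.84615385, 0.88527041, 0.96225045,
            0.9258201, 0.88527041, 0.92307692, 0.9258201, 1.0]


def jaccard_from_f1(f1):
    """Binary Jaccard index implied by an F1 score: J = F1 / (2 - F1)."""
    return f1 / (2.0 - f1)


# Largest disagreement between the logged Jaccard values and the values
# implied by the logged F1 scores (assumption: same positive class).
implied_jcc = [jaccard_from_f1(f) for f in test_fscore]
max_jcc_error = max(abs(a - b) for a, b in zip(implied_jcc, test_jcc))

# Mean cross-validation MCC versus the blind-test MCC printed in the log.
cv_mean_mcc = mean(test_mcc)
blind_mcc = 0.81
mcc_gap = cv_mean_mcc - blind_mcc

print(f"max |J - F1/(2 - F1)| = {max_jcc_error:.2e}")
print(f"mean CV test_mcc = {cv_mean_mcc:.4f}, gap to blind test = {mcc_gap:.4f}")
```

A train score of 1.0 on every fold together with a lower blind-test MCC is the pattern to watch for when comparing the models in this log; the gap computed here is one simple way to put a number on it.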
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC0...05', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.93789101 0.9519825 0.9610076 0.95415044 1.06602931 0.96349025
|
|
0.95963335 1.00183535 0.93617368 0.97349167]
|
|
|
|
mean value: 0.9705685138702392
|
|
|
|
key: score_time
|
|
value: [0.27528143 0.22235203 0.27172637 0.26692533 0.2285161 0.21771598
|
|
0.26816726 0.23777795 0.24989247 0.26056457]
|
|
|
|
mean value: 0.24989194869995118
|
|
|
|
key: test_mcc
|
|
value: [0.92450142 0.78307508 0.80829038 0.88527041 0.9258201 0.9258201
|
|
0.88527041 0.92307692 0.9258201 1. ]
|
|
|
|
mean value: 0.8986944924445692
|
|
|
|
key: train_mcc
|
|
value: [0.95309971 0.94457073 0.95744681 0.95326917 0.94893617 0.95320012
|
|
0.96171083 0.95748148 0.94893617 0.94897054]
|
|
|
|
mean value: 0.9527621739445413
|
|
|
|
key: test_accuracy
|
|
value: [0.96226415 0.88679245 0.90384615 0.94230769 0.96153846 0.96153846
|
|
0.94230769 0.96153846 0.96153846 1. ]
|
|
|
|
mean value: 0.948367198838897
|
|
|
|
key: train_accuracy
|
|
value: [0.97654584 0.97228145 0.9787234 0.97659574 0.97446809 0.97659574
|
|
0.98085106 0.9787234 0.97446809 0.97446809]
|
|
|
|
mean value: 0.9763720909132151
|
|
|
|
key: test_fscore
|
|
value: [0.96153846 0.88 0.90566038 0.94339623 0.96296296 0.96
|
|
0.94117647 0.96153846 0.96296296 1. ]
|
|
|
|
mean value: 0.947923592336467
|
|
|
|
key: train_fscore
|
|
value: [0.97664544 0.97216274 0.9787234 0.9764454 0.97446809 0.97654584
|
|
0.98081023 0.97863248 0.97446809 0.97435897]
|
|
|
|
mean value: 0.9763260676507729
|
|
|
|
key: test_precision
|
|
value: [0.96153846 0.95652174 0.88888889 0.92592593 0.92857143 1.
|
|
0.96 0.96153846 0.92857143 1. ]
|
|
|
|
mean value: 0.951155633416503
|
|
|
|
key: train_precision
|
|
value: [0.97457627 0.97424893 0.9787234 0.98275862 0.97446809 0.97863248
|
|
0.98290598 0.98283262 0.97446809 0.97854077]
|
|
|
|
mean value: 0.9782155245479209
|
|
|
|
key: test_recall
|
|
value: [0.96153846 0.81481481 0.92307692 0.96153846 1. 0.92307692
|
|
0.92307692 0.96153846 1. 1. ]
|
|
|
|
mean value: 0.9468660968660969
|
|
|
|
key: train_recall
|
|
value: [0.9787234 0.97008547 0.9787234 0.97021277 0.97446809 0.97446809
|
|
0.9787234 0.97446809 0.97446809 0.97021277]
|
|
|
|
mean value: 0.9744553555191853
|
|
|
|
key: test_roc_auc
|
|
value: [0.96225071 0.88817664 0.90384615 0.94230769 0.96153846 0.96153846
|
|
0.94230769 0.96153846 0.96153846 1. ]
|
|
|
|
mean value: 0.9485042735042736
|
|
|
|
key: train_roc_auc
|
|
value: [0.97654119 0.97227678 0.9787234 0.97659574 0.97446809 0.97659574
|
|
0.98085106 0.9787234 0.97446809 0.97446809]
|
|
|
|
mean value: 0.9763711583924349
|
|
|
|
key: test_jcc
|
|
value: [0.92592593 0.78571429 0.82758621 0.89285714 0.92857143 0.92307692
|
|
0.88888889 0.92592593 0.92857143 1. ]
|
|
|
|
mean value: 0.9027118156428502
|
|
|
|
key: train_jcc
|
|
value: [0.95435685 0.94583333 0.95833333 0.9539749 0.95020747 0.95416667
|
|
0.9623431 0.958159 0.95020747 0.95 ]
|
|
|
|
mean value: 0.9537582105013397
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.88
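Note on the FutureWarning in the Random Forest2 block above: the warning text itself says that `max_features='sqrt'` keeps the past behaviour for RandomForestClassifier. A minimal sketch of that change follows, written as a plain-Python helper so it runs without scikit-learn installed. The helper name `migrate_max_features` and the dictionary `rf2_params` are hypothetical and are not taken from the original script; the keyword values are copied from the "Model func" line above.

```python
def migrate_max_features(params):
    """Return a copy of the keyword arguments with the deprecated
    max_features='auto' replaced by 'sqrt' (the value the warning
    recommends for random forest classifiers)."""
    updated = dict(params)
    if updated.get("max_features") == "auto":
        updated["max_features"] = "sqrt"
    return updated


# Keyword arguments as printed for 'Random Forest2' in this log.
rf2_params = {
    "max_features": "auto",
    "min_samples_leaf": 5,
    "n_estimators": 1000,
    "n_jobs": 10,
    "oob_score": True,
    "random_state": 42,
}

rf2_params_fixed = migrate_max_features(rf2_params)
print(rf2_params_fixed)

# Assumed usage (not executed here):
#   RandomForestClassifier(**rf2_params_fixed)
```

Because 'sqrt' is also the default for this classifier according to the warning, dropping the `max_features` key entirely would be an equivalent fix.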
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02532554 0.01064587 0.01129389 0.0115211 0.01132441 0.0113101
|
|
0.01137424 0.0111444 0.01132274 0.01128101]
|
|
|
|
mean value: 0.012654328346252441
|
|
|
|
key: score_time
|
|
value: [0.01154637 0.00929785 0.00982523 0.00981331 0.00931716 0.00963163
|
|
0.00972509 0.00960636 0.00930762 0.00981116]
|
|
|
|
mean value: 0.009788179397583007
|
|
|
|
key: test_mcc
|
|
value: [0.92704716 0.59688314 0.50336201 0.73131034 0.70064905 0.69436507
|
|
0.57735027 0.76923077 0.66628253 0.65433031]
|
|
|
|
mean value: 0.6820810650750807
|
|
|
|
key: train_mcc
|
|
value: [0.73987525 0.70625194 0.78298581 0.71521098 0.7745312 0.74043224
|
|
0.69894261 0.74470782 0.76195052 0.74048587]
|
|
|
|
mean value: 0.7405374249725031
|
|
|
|
key: test_accuracy
|
|
value: [0.96226415 0.79245283 0.75 0.86538462 0.84615385 0.84615385
|
|
0.78846154 0.88461538 0.82692308 0.82692308]
|
|
|
|
mean value: 0.8389332365747459
|
|
|
|
key: train_accuracy
|
|
value: [0.86993603 0.85287846 0.89148936 0.85744681 0.88723404 0.87021277
|
|
0.84893617 0.87234043 0.88085106 0.87021277]
|
|
|
|
mean value: 0.8701537903189221
|
|
|
|
key: test_fscore
|
|
value: [0.96 0.7755102 0.76363636 0.86792453 0.85714286 0.84
|
|
0.78431373 0.88461538 0.84210526 0.82352941]
|
|
|
|
mean value: 0.8398777738190921
|
|
|
|
key: train_fscore
|
|
value: [0.87048832 0.8496732 0.89171975 0.85529158 0.88794926 0.87048832
|
|
0.84463895 0.87179487 0.87931034 0.86937901]
|
|
|
|
mean value: 0.8690733611272227
|
|
|
|
key: test_precision
|
|
value: [1. 0.86363636 0.72413793 0.85185185 0.8 0.875
|
|
0.8 0.88461538 0.77419355 0.84 ]
|
|
|
|
mean value: 0.841343507952518
|
|
|
|
key: train_precision
|
|
value: [0.86864407 0.86666667 0.88983051 0.86842105 0.88235294 0.86864407
|
|
0.86936937 0.87553648 0.89082969 0.875 ]
|
|
|
|
mean value: 0.8755294848921722
|
|
|
|
key: test_recall
|
|
value: [0.92307692 0.7037037 0.80769231 0.88461538 0.92307692 0.80769231
|
|
0.76923077 0.88461538 0.92307692 0.80769231]
|
|
|
|
mean value: 0.8434472934472934
|
|
|
|
key: train_recall
|
|
value: [0.87234043 0.83333333 0.89361702 0.84255319 0.89361702 0.87234043
|
|
0.8212766 0.86808511 0.86808511 0.86382979]
|
|
|
|
mean value: 0.8629078014184397
|
|
|
|
key: test_roc_auc
|
|
value: [0.96153846 0.79415954 0.75 0.86538462 0.84615385 0.84615385
|
|
0.78846154 0.88461538 0.82692308 0.82692308]
|
|
|
|
mean value: 0.8390313390313391
|
|
|
|
key: train_roc_auc
|
|
value: [0.8699309 0.85283688 0.89148936 0.85744681 0.88723404 0.87021277
|
|
0.84893617 0.87234043 0.88085106 0.87021277]
|
|
|
|
mean value: 0.8701491180214584
|
|
|
|
key: test_jcc
|
|
value: [0.92307692 0.63333333 0.61764706 0.76666667 0.75 0.72413793
|
|
0.64516129 0.79310345 0.72727273 0.7 ]
|
|
|
|
mean value: 0.7280399378806105
|
|
|
|
key: train_jcc
|
|
value: [0.77067669 0.73863636 0.8045977 0.74716981 0.79847909 0.77067669
|
|
0.73106061 0.77272727 0.78461538 0.76893939]
|
|
|
|
mean value: 0.7687579004360319
|
|
|
|
MCC on Blind test: 0.76
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC0...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.10077667 0.09779096 0.08207917 0.07680631 0.07786369 0.07240462
|
|
0.07249331 0.07102299 0.07772779 0.081285 ]
|
|
|
|
mean value: 0.08102505207061768
|
|
|
|
key: score_time
|
|
value: [0.0129652 0.01151061 0.01169777 0.01227379 0.01141953 0.01082182
|
|
0.01121521 0.01066947 0.01165438 0.01114535]
|
|
|
|
mean value: 0.011537313461303711
|
|
|
|
key: test_mcc
|
|
value: [0.85164138 1. 0.84866842 0.9258201 0.96225045 0.9258201
|
|
0.88527041 0.88527041 0.9258201 1. ]
|
|
|
|
mean value: 0.9210561378370106
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9245283 1. 0.92307692 0.96153846 0.98076923 0.96153846
|
|
0.94230769 0.94230769 0.96153846 1. ]
|
|
|
|
mean value: 0.9597605224963716
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.92592593 1. 0.92592593 0.96296296 0.98113208 0.96
|
|
0.94117647 0.94117647 0.96296296 1. ]
|
|
|
|
mean value: 0.9601262794425947
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.89285714 1. 0.89285714 0.92857143 0.96296296 1.
|
|
0.96 0.96 0.92857143 1. ]
|
|
|
|
mean value: 0.9525820105820106
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96153846 1. 0.96153846 1. 1. 0.92307692
|
|
0.92307692 0.92307692 1. 1. ]
|
|
|
|
mean value: 0.9692307692307692
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.92521368 1. 0.92307692 0.96153846 0.98076923 0.96153846
|
|
0.94230769 0.94230769 0.96153846 1. ]
|
|
|
|
mean value: 0.9598290598290599
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.86206897 1. 0.86206897 0.92857143 0.96296296 0.92307692
|
|
0.88888889 0.88888889 0.92857143 1. ]
|
|
|
|
mean value: 0.9245098451995004
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.86
|
|
|
|
Accuracy on Blind test: 0.93
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.05597258 0.06626916 0.07366967 0.09542322 0.04672813 0.05279183
|
|
0.08359885 0.06806421 0.04606438 0.07742214]
|
|
|
|
mean value: 0.06660041809082032
|
|
|
|
key: score_time
|
|
value: [0.02560306 0.01884866 0.01223159 0.02448392 0.0155127 0.01884794
|
|
0.02686143 0.01222873 0.01251125 0.01234651]
|
|
|
|
mean value: 0.017947578430175783
|
|
|
|
key: test_mcc
|
|
value: [0.89227454 0.85164138 0.61538462 0.81312325 0.74466871 0.73568294
|
|
0.65433031 0.80829038 0.73568294 0.81312325]
|
|
|
|
mean value: 0.7664202294856968
|
|
|
|
key: train_mcc
|
|
value: [0.90647462 0.90621761 0.91955698 0.90233192 0.90233192 0.90667855
|
|
0.91502618 0.91071251 0.91492675 0.90220118]
|
|
|
|
mean value: 0.9086458224458123
|
|
|
|
key: test_accuracy
|
|
value: [0.94339623 0.9245283 0.80769231 0.90384615 0.86538462 0.86538462
|
|
0.82692308 0.90384615 0.86538462 0.90384615]
|
|
|
|
mean value: 0.881023222060958
|
|
|
|
key: train_accuracy
|
|
value: [0.95309168 0.95309168 0.95957447 0.95106383 0.95106383 0.95319149
|
|
0.95744681 0.95531915 0.95744681 0.95106383]
|
|
|
|
mean value: 0.9542353581635894
|
|
|
|
key: test_fscore
|
|
value: [0.93877551 0.92307692 0.80769231 0.90909091 0.87719298 0.85714286
|
|
0.83018868 0.90196078 0.87272727 0.90909091]
|
|
|
|
mean value: 0.8826939135040409
|
|
|
|
key: train_fscore
|
|
value: [0.95378151 0.95319149 0.96016771 0.95157895 0.95157895 0.95378151
|
|
0.95780591 0.95560254 0.95762712 0.95137421]
|
|
|
|
mean value: 0.9546489894196434
|
|
|
|
key: test_precision
|
|
value: [1. 0.96 0.80769231 0.86206897 0.80645161 0.91304348
|
|
0.81481481 0.92 0.82758621 0.86206897]
|
|
|
|
mean value: 0.8773726351602252
|
|
|
|
key: train_precision
|
|
value: [0.94190871 0.94915254 0.94628099 0.94166667 0.94166667 0.94190871
|
|
0.94979079 0.94957983 0.9535865 0.94537815]
|
|
|
|
mean value: 0.9460919570890296
|
|
|
|
key: test_recall
|
|
value: [0.88461538 0.88888889 0.80769231 0.96153846 0.96153846 0.80769231
|
|
0.84615385 0.88461538 0.92307692 0.96153846]
|
|
|
|
mean value: 0.8927350427350428
|
|
|
|
key: train_recall
|
|
value: [0.96595745 0.95726496 0.97446809 0.96170213 0.96170213 0.96595745
|
|
0.96595745 0.96170213 0.96170213 0.95744681]
|
|
|
|
mean value: 0.9633860701945808
|
|
|
|
key: test_roc_auc
|
|
value: [0.94230769 0.92521368 0.80769231 0.90384615 0.86538462 0.86538462
|
|
0.82692308 0.90384615 0.86538462 0.90384615]
|
|
|
|
mean value: 0.8809829059829061
|
|
|
|
key: train_roc_auc
|
|
value: [0.95306419 0.95310056 0.95957447 0.95106383 0.95106383 0.95319149
|
|
0.95744681 0.95531915 0.95744681 0.95106383]
|
|
|
|
mean value: 0.9542334969994545
|
|
|
|
key: test_jcc
|
|
value: [0.88461538 0.85714286 0.67741935 0.83333333 0.78125 0.75
|
|
0.70967742 0.82142857 0.77419355 0.83333333]
|
|
|
|
mean value: 0.7922393802434124
|
|
|
|
key: train_jcc
|
|
value: [0.91164659 0.91056911 0.9233871 0.90763052 0.90763052 0.91164659
|
|
0.91902834 0.91497976 0.91869919 0.90725806]
|
|
|
|
mean value: 0.9132475768006711
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.0344193 0.01032948 0.01007748 0.0096755 0.00967216 0.00974727
|
|
0.0099268 0.01015282 0.01100254 0.01081896]
|
|
|
|
mean value: 0.012582230567932128
|
|
|
|
key: score_time
|
|
value: [0.01461482 0.00918388 0.00888848 0.00873566 0.00871611 0.00898671
|
|
0.00880671 0.00905466 0.00932407 0.00937629]
|
|
|
|
mean value: 0.009568738937377929
|
|
|
|
key: test_mcc
|
|
value: [0.89227454 0.57616505 0.54494926 0.80829038 0.70064905 0.71151247
|
|
0.65824263 0.73568294 0.70064905 0.65824263]
|
|
|
|
mean value: 0.6986657991505085
|
|
|
|
key: train_mcc
|
|
value: [0.71462102 0.66795337 0.76214388 0.66895783 0.74910575 0.70654292
|
|
0.70690158 0.71128258 0.73659716 0.68550371]
|
|
|
|
mean value: 0.7109609798884714
|
|
|
|
key: test_accuracy
|
|
value: [0.94339623 0.77358491 0.76923077 0.90384615 0.84615385 0.84615385
|
|
0.82692308 0.86538462 0.84615385 0.82692308]
|
|
|
|
mean value: 0.8447750362844703
|
|
|
|
key: train_accuracy
|
|
value: [0.85714286 0.8336887 0.88085106 0.83404255 0.87446809 0.85319149
|
|
0.85319149 0.85531915 0.86808511 0.84255319]
|
|
|
|
mean value: 0.8552533684162773
|
|
|
|
key: test_fscore
|
|
value: [0.93877551 0.73913043 0.78571429 0.90566038 0.85714286 0.82608696
|
|
0.81632653 0.85714286 0.85714286 0.81632653]
|
|
|
|
mean value: 0.8399449197234267
|
|
|
|
key: train_fscore
|
|
value: [0.85529158 0.82969432 0.87878788 0.82969432 0.87311828 0.8516129
|
|
0.85032538 0.85217391 0.86580087 0.83982684]
|
|
|
|
mean value: 0.8526326282826382
|
|
|
|
key: test_precision
|
|
value: [1. 0.89473684 0.73333333 0.88888889 0.8 0.95
|
|
0.86956522 0.91304348 0.8 0.86956522]
|
|
|
|
mean value: 0.8719132977370964
|
|
|
|
key: train_precision
|
|
value: [0.86842105 0.84821429 0.89427313 0.85201794 0.8826087 0.86086957
|
|
0.86725664 0.87111111 0.88105727 0.85462555]
|
|
|
|
mean value: 0.8680455231850978
|
|
|
|
key: test_recall
|
|
value: [0.88461538 0.62962963 0.84615385 0.92307692 0.92307692 0.73076923
|
|
0.76923077 0.80769231 0.92307692 0.76923077]
|
|
|
|
mean value: 0.8206552706552707
|
|
|
|
key: train_recall
|
|
value: [0.84255319 0.81196581 0.86382979 0.80851064 0.86382979 0.84255319
|
|
0.83404255 0.83404255 0.85106383 0.82553191]
|
|
|
|
mean value: 0.8377923258774322
|
|
|
|
key: test_roc_auc
|
|
value: [0.94230769 0.77635328 0.76923077 0.90384615 0.84615385 0.84615385
|
|
0.82692308 0.86538462 0.84615385 0.82692308]
|
|
|
|
mean value: 0.8449430199430199
|
|
|
|
key: train_roc_auc
|
|
value: [0.85717403 0.83364248 0.88085106 0.83404255 0.87446809 0.85319149
|
|
0.85319149 0.85531915 0.86808511 0.84255319]
|
|
|
|
mean value: 0.8552518639752682
|
|
|
|
key: test_jcc
|
|
value: [0.88461538 0.5862069 0.64705882 0.82758621 0.75 0.7037037
|
|
0.68965517 0.75 0.75 0.68965517]
|
|
|
|
mean value: 0.7278481360124363
|
|
|
|
key: train_jcc
|
|
value: [0.74716981 0.70895522 0.78378378 0.70895522 0.77480916 0.74157303
|
|
0.73962264 0.74242424 0.76335878 0.7238806 ]
|
|
|
|
mean value: 0.7434532496453498
|
|
|
|
MCC on Blind test: 0.76
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01762056 0.01975036 0.01909256 0.01915002 0.0176785 0.02011013
|
|
0.02195501 0.01924419 0.02043986 0.01871943]
|
|
|
|
mean value: 0.019376063346862794
|
|
|
|
key: score_time
|
|
value: [0.01073146 0.01117897 0.01201153 0.0118525 0.01197219 0.01188111
|
|
0.01207972 0.01185107 0.01192999 0.01186323]
|
|
|
|
mean value: 0.011735177040100098
|
|
|
|
key: test_mcc
|
|
value: [0.81688878 0.18759297 0.65433031 0.6789146 0.88527041 0.76923077
|
|
0.80829038 0.74466871 0.72760688 0.80829038]
|
|
|
|
mean value: 0.7081084179489021
|
|
|
|
key: train_mcc
|
|
value: [0.86416967 0.43722856 0.90213583 0.72315664 0.83806613 0.83960257
|
|
0.90233192 0.76845352 0.80635665 0.85958225]
|
|
|
|
mean value: 0.7941083731200913
|
|
|
|
key: test_accuracy
|
|
value: [0.90566038 0.54716981 0.82692308 0.82692308 0.94230769 0.88461538
|
|
0.90384615 0.86538462 0.84615385 0.90384615]
|
|
|
|
mean value: 0.8452830188679246
|
|
|
|
key: train_accuracy
|
|
value: [0.92963753 0.66098081 0.95106383 0.84468085 0.91702128 0.91702128
|
|
0.95106383 0.87234043 0.89787234 0.92978723]
|
|
|
|
mean value: 0.887146940071678
|
|
|
|
key: test_fscore
|
|
value: [0.90909091 0.25 0.82352941 0.8 0.94339623 0.88461538
|
|
0.90196078 0.85106383 0.86666667 0.90566038]
|
|
|
|
mean value: 0.813598359001221
|
|
|
|
key: train_fscore
|
|
value: [0.93333333 0.48543689 0.95116773 0.81704261 0.91275168 0.92152918
|
|
0.95157895 0.85436893 0.90551181 0.92993631]
|
|
|
|
mean value: 0.8662657410357313
|
|
|
|
key: test_precision
|
|
value: [0.86206897 0.8 0.84 0.94736842 0.92592593 0.88461538
|
|
0.92 0.95238095 0.76470588 0.88888889]
|
|
|
|
mean value: 0.8785954420733966
|
|
|
|
key: train_precision
|
|
value: [0.88846154 1. 0.94915254 0.99390244 0.96226415 0.8740458
|
|
0.94166667 0.99435028 0.84249084 0.9279661 ]
|
|
|
|
mean value: 0.9374300365667224
|
|
|
|
key: test_recall
|
|
value: [0.96153846 0.14814815 0.80769231 0.69230769 0.96153846 0.88461538
|
|
0.88461538 0.76923077 1. 0.92307692]
|
|
|
|
mean value: 0.8032763532763533
|
|
|
|
key: train_recall
|
|
value: [0.98297872 0.32051282 0.95319149 0.69361702 0.86808511 0.97446809
|
|
0.96170213 0.74893617 0.9787234 0.93191489]
|
|
|
|
mean value: 0.8414129841789416
|
|
|
|
key: test_roc_auc
|
|
value: [0.90669516 0.5548433 0.82692308 0.82692308 0.94230769 0.88461538
|
|
0.90384615 0.86538462 0.84615385 0.90384615]
|
|
|
|
mean value: 0.8461538461538461
|
|
|
|
key: train_roc_auc
|
|
value: [0.92952355 0.66025641 0.95106383 0.84468085 0.91702128 0.91702128
|
|
0.95106383 0.87234043 0.89787234 0.92978723]
|
|
|
|
mean value: 0.8870631023822513
|
|
|
|
key: test_jcc
|
|
value: [0.83333333 0.14285714 0.7 0.66666667 0.89285714 0.79310345
|
|
0.82142857 0.74074074 0.76470588 0.82758621]
|
|
|
|
mean value: 0.7183279135408953
|
|
|
|
key: train_jcc
|
|
value: [0.875 0.32051282 0.90688259 0.69067797 0.83950617 0.85447761
|
|
0.90763052 0.74576271 0.82733813 0.86904762]
|
|
|
|
mean value: 0.7836836144984219
|
|
|
|
MCC on Blind test: 0.72
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01870584 0.01970434 0.01880717 0.01866412 0.02053785 0.02331161
|
|
0.02050233 0.0209446 0.02116704 0.02013636]
|
|
|
|
mean value: 0.02024812698364258
|
|
|
|
key: score_time
|
|
value: [0.01103997 0.01269507 0.0119617 0.01208353 0.01191449 0.01273751
|
|
0.01195216 0.0127418 0.01198077 0.01185441]
|
|
|
|
mean value: 0.012096142768859864
|
|
|
|
key: test_mcc
|
|
value: [0.75007832 0.59347897 0.65824263 0.66666667 0.9258201 0.80829038
|
|
0.80829038 0.84866842 0.74466871 0.75878691]
|
|
|
|
mean value: 0.7562991488419109
|
|
|
|
key: train_mcc
|
|
value: [0.82318874 0.79500161 0.85288412 0.78776807 0.89946992 0.88344643
|
|
0.89198214 0.86302723 0.90351119 0.77446957]
|
|
|
|
mean value: 0.8474749027324484
|
|
|
|
key: test_accuracy
|
|
value: [0.86792453 0.77358491 0.82692308 0.80769231 0.96153846 0.90384615
|
|
0.90384615 0.92307692 0.86538462 0.86538462]
|
|
|
|
mean value: 0.8699201741654572
|
|
|
|
key: train_accuracy
|
|
value: [0.90618337 0.8891258 0.92340426 0.88723404 0.94893617 0.94042553
|
|
0.94468085 0.92978723 0.95106383 0.8787234 ]
|
|
|
|
mean value: 0.9199564487592433
|
|
|
|
key: test_fscore
|
|
value: [0.87719298 0.72727273 0.81632653 0.83870968 0.96296296 0.90196078
|
|
0.90196078 0.92 0.87719298 0.88135593]
|
|
|
|
mean value: 0.8704935364010411
|
|
|
|
key: train_fscore
|
|
value: [0.91338583 0.87619048 0.91855204 0.89668616 0.94736842 0.94262295
|
|
0.94672131 0.92650334 0.95238095 0.89017341]
|
|
|
|
mean value: 0.9210584885895807
|
|
|
|
key: test_precision
|
|
value: [0.80645161 0.94117647 0.86956522 0.72222222 0.92857143 0.92
|
|
0.92 0.95833333 0.80645161 0.78787879]
|
|
|
|
mean value: 0.8660650685791763
|
|
|
|
key: train_precision
|
|
value: [0.84981685 0.98924731 0.98067633 0.82733813 0.97737557 0.90909091
|
|
0.91304348 0.97196262 0.92741935 0.81338028]
|
|
|
|
mean value: 0.9159350825957544
|
|
|
|
key: test_recall
|
|
value: [0.96153846 0.59259259 0.76923077 1. 1. 0.88461538
|
|
0.88461538 0.88461538 0.96153846 1. ]
|
|
|
|
mean value: 0.8938746438746439
|
|
|
|
key: train_recall
|
|
value: [0.98723404 0.78632479 0.86382979 0.9787234 0.91914894 0.9787234
|
|
0.98297872 0.88510638 0.9787234 0.98297872]
|
|
|
|
mean value: 0.9343771594835424
|
|
|
|
key: test_roc_auc
|
|
value: [0.86965812 0.77706553 0.82692308 0.80769231 0.96153846 0.90384615
|
|
0.90384615 0.92307692 0.86538462 0.86538462]
|
|
|
|
mean value: 0.8704415954415955
|
|
|
|
key: train_roc_auc
|
|
value: [0.90601018 0.88890707 0.92340426 0.88723404 0.94893617 0.94042553
|
|
0.94468085 0.92978723 0.95106383 0.8787234 ]
|
|
|
|
mean value: 0.9199172576832151
|
|
|
|
key: test_jcc
|
|
value: [0.78125 0.57142857 0.68965517 0.72222222 0.92857143 0.82142857
|
|
0.82142857 0.85185185 0.78125 0.78787879]
|
|
|
|
mean value: 0.7756965177223798
|
|
|
|
key: train_jcc
|
|
value: [0.84057971 0.77966102 0.84937238 0.81272085 0.9 0.89147287
|
|
0.89883268 0.86307054 0.90909091 0.80208333]
|
|
|
|
mean value: 0.8546884294973143
|
|
|
|
MCC on Blind test: 0.82
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.18638968 0.18391204 0.18381119 0.18072701 0.18269753 0.18260503
|
|
0.18581295 0.18405628 0.18478727 0.18364573]
|
|
|
|
mean value: 0.1838444709777832
|
|
|
|
key: score_time
|
|
value: [0.0155859 0.01653624 0.01634264 0.01569414 0.01681447 0.01641345
|
|
0.01549268 0.01697898 0.01570892 0.01532364]
|
|
|
|
mean value: 0.016089105606079103
|
|
|
|
key: test_mcc
|
|
value: [0.88730475 0.96296296 0.84866842 0.9258201 0.96225045 0.96225045
|
|
0.81312325 0.92307692 0.96225045 1. ]
|
|
|
|
mean value: 0.9247707758320183
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.94339623 0.98113208 0.92307692 0.96153846 0.98076923 0.98076923
|
|
0.90384615 0.96153846 0.98076923 1. ]
|
|
|
|
mean value: 0.9616835994194485
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94117647 0.98113208 0.92592593 0.96296296 0.98113208 0.98039216
|
|
0.89795918 0.96153846 0.98113208 1. ]
|
|
|
|
mean value: 0.9613351387966895
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96 1. 0.89285714 0.92857143 0.96296296 1.
|
|
0.95652174 0.96153846 0.96296296 1. ]
|
|
|
|
mean value: 0.9625414698023393
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.92307692 0.96296296 0.96153846 1. 1. 0.96153846
|
|
0.84615385 0.96153846 1. 1. ]
|
|
|
|
mean value: 0.9616809116809117
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.94301994 0.98148148 0.92307692 0.96153846 0.98076923 0.98076923
|
|
0.90384615 0.96153846 0.98076923 1. ]
|
|
|
|
mean value: 0.9616809116809117
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.88888889 0.96296296 0.86206897 0.92857143 0.96296296 0.96153846
|
|
0.81481481 0.92592593 0.96296296 1. ]
|
|
|
|
mean value: 0.927069737414565
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.95
|
|
|
|
Accuracy on Blind test: 0.98
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0607388 0.07653952 0.08514023 0.07402754 0.08832049 0.07226753
|
|
0.07678175 0.06374931 0.08635044 0.07893109]
|
|
|
|
mean value: 0.07628467082977294
|
|
|
|
key: score_time
|
|
value: [0.01762295 0.02642894 0.02865911 0.04618526 0.03816152 0.0224731
|
|
0.02270222 0.03807664 0.03926802 0.03778052]
|
|
|
|
mean value: 0.03173582553863526
|
|
|
|
key: test_mcc
|
|
value: [0.92450142 0.96296296 0.84866842 0.9258201 0.96225045 0.96225045
|
|
0.88527041 0.88527041 0.84615385 1. ]
|
|
|
|
mean value: 0.9203148480995895
|
|
|
|
key: train_mcc
|
|
value: [0.99150739 0.98721563 0.98301432 0.99152527 0.9957537 0.98724298
|
|
0.9957537 1. 0.9873145 0.99152527]
|
|
|
|
mean value: 0.991085275446824
|
|
|
|
key: test_accuracy
|
|
value: [0.96226415 0.98113208 0.92307692 0.96153846 0.98076923 0.98076923
|
|
0.94230769 0.94230769 0.92307692 1. ]
|
|
|
|
mean value: 0.9597242380261248
|
|
|
|
key: train_accuracy
|
|
value: [0.99573561 0.99360341 0.99148936 0.99574468 0.99787234 0.99361702
|
|
0.99787234 1. 0.99361702 0.99574468]
|
|
|
|
mean value: 0.9955296465998276
|
|
|
|
key: test_fscore
|
|
value: [0.96153846 0.98113208 0.92592593 0.96296296 0.98113208 0.98039216
|
|
0.94117647 0.94117647 0.92307692 1. ]
|
|
|
|
mean value: 0.9598513522486886
|
|
|
|
key: train_fscore
|
|
value: [0.9957265 0.99357602 0.99145299 0.9957265 0.9978678 0.99363057
|
|
0.99787686 1. 0.99357602 0.9957265 ]
|
|
|
|
mean value: 0.9955159747729551
|
|
|
|
key: test_precision
|
|
value: [0.96153846 1. 0.89285714 0.92857143 0.96296296 1.
|
|
0.96 0.96 0.92307692 1. ]
|
|
|
|
mean value: 0.9589006919006919
|
|
|
|
key: train_precision
|
|
value: [1. 0.99570815 0.99570815 1. 1. 0.99152542
|
|
0.99576271 1. 1. 1. ]
|
|
|
|
mean value: 0.9978704444606096
|
|
|
|
key: test_recall
|
|
value: [0.96153846 0.96296296 0.96153846 1. 1. 0.96153846
|
|
0.92307692 0.92307692 0.92307692 1. ]
|
|
|
|
mean value: 0.9616809116809117
|
|
|
|
key: train_recall
|
|
value: [0.99148936 0.99145299 0.98723404 0.99148936 0.99574468 0.99574468
|
|
1. 1. 0.98723404 0.99148936]
|
|
|
|
mean value: 0.9931878523367885
|
|
|
|
key: test_roc_auc
|
|
value: [0.96225071 0.98148148 0.92307692 0.96153846 0.98076923 0.98076923
|
|
0.94230769 0.94230769 0.92307692 1. ]
|
|
|
|
mean value: 0.9597578347578348
|
|
|
|
key: train_roc_auc
|
|
value: [0.99574468 0.99359884 0.99148936 0.99574468 0.99787234 0.99361702
|
|
0.99787234 1. 0.99361702 0.99574468]
|
|
|
|
mean value: 0.9955300963811602
|
|
|
|
key: test_jcc
|
|
value: [0.92592593 0.96296296 0.86206897 0.92857143 0.96296296 0.96153846
|
|
0.88888889 0.88888889 0.85714286 1. ]
|
|
|
|
mean value: 0.9238951342399618
|
|
|
|
key: train_jcc
|
|
value: [0.99148936 0.98723404 0.98305085 0.99148936 0.99574468 0.98734177
|
|
0.99576271 1. 0.98723404 0.99148936]
|
|
|
|
mean value: 0.9910836182537762
|
|
|
|
MCC on Blind test: 0.9
|
|
|
|
Accuracy on Blind test: 0.95
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.14152217 0.1800046 0.18905711 0.15365958 0.15383959 0.17124987
|
|
0.15433812 0.15641809 0.15753102 0.1591301 ]
|
|
|
|
mean value: 0.16167502403259276
|
|
|
|
key: score_time
|
|
value: [0.02448511 0.02515721 0.02925134 0.0243063 0.02417588 0.02826023
|
|
0.02418399 0.02447724 0.02408624 0.02411389]
|
|
|
|
mean value: 0.025249743461608888
|
|
|
|
key: test_mcc
|
|
value: [0.82552431 0.66524218 0.3086067 0.65433031 0.73568294 0.70064905
|
|
0.76923077 0.69230769 0.76923077 0.54006172]
|
|
|
|
mean value: 0.6660866435725556
|
|
|
|
key: train_mcc
|
|
value: [0.99150739 0.99150708 0.9873145 0.98312115 0.9873145 0.99152527
|
|
0.9873145 0.9873145 0.9873145 0.9873145 ]
|
|
|
|
mean value: 0.9881547880923972
|
|
|
|
key: test_accuracy
|
|
value: [0.90566038 0.83018868 0.65384615 0.82692308 0.86538462 0.84615385
|
|
0.88461538 0.84615385 0.88461538 0.76923077]
|
|
|
|
mean value: 0.831277213352685
|
|
|
|
key: train_accuracy
|
|
value: [0.99573561 0.99573561 0.99361702 0.99148936 0.99361702 0.99574468
|
|
0.99361702 0.99361702 0.99361702 0.99361702]
|
|
|
|
mean value: 0.9940407385564578
|
|
|
|
key: test_fscore
|
|
value: [0.89361702 0.82352941 0.66666667 0.83018868 0.87272727 0.83333333
|
|
0.88461538 0.84615385 0.88461538 0.76 ]
|
|
|
|
mean value: 0.8295447000398473
|
|
|
|
key: train_fscore
|
|
value: [0.9957265 0.99570815 0.99357602 0.99141631 0.99357602 0.9957265
|
|
0.99357602 0.99357602 0.99357602 0.99357602]
|
|
|
|
mean value: 0.9940033557756031
|
|
|
|
key: test_precision
|
|
value: [1. 0.875 0.64285714 0.81481481 0.82758621 0.90909091
|
|
0.88461538 0.84615385 0.88461538 0.79166667]
|
|
|
|
mean value: 0.84764003557107
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.80769231 0.77777778 0.69230769 0.84615385 0.92307692 0.76923077
|
|
0.88461538 0.84615385 0.88461538 0.73076923]
|
|
|
|
mean value: 0.8162393162393162
|
|
|
|
key: train_recall
|
|
value: [0.99148936 0.99145299 0.98723404 0.98297872 0.98723404 0.99148936
|
|
0.98723404 0.98723404 0.98723404 0.98723404]
|
|
|
|
mean value: 0.9880814693580651
|
|
|
|
key: test_roc_auc
|
|
value: [0.90384615 0.83119658 0.65384615 0.82692308 0.86538462 0.84615385
|
|
0.88461538 0.84615385 0.88461538 0.76923077]
|
|
|
|
mean value: 0.8311965811965811
|
|
|
|
key: train_roc_auc
|
|
value: [0.99574468 0.9957265 0.99361702 0.99148936 0.99361702 0.99574468
|
|
0.99361702 0.99361702 0.99361702 0.99361702]
|
|
|
|
mean value: 0.9940407346790325
|
|
|
|
key: test_jcc
|
|
value: [0.80769231 0.7 0.5 0.70967742 0.77419355 0.71428571
|
|
0.79310345 0.73333333 0.79310345 0.61290323]
|
|
|
|
mean value: 0.7138292445411467
|
|
|
|
key: train_jcc
|
|
value: [0.99148936 0.99145299 0.98723404 0.98297872 0.98723404 0.99148936
|
|
0.98723404 0.98723404 0.98723404 0.98723404]
|
|
|
|
mean value: 0.9880814693580651
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.79
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.73881483 0.72350812 0.73274493 0.72523093 0.73023582 0.74169731
|
|
0.7289629 0.72415876 0.72671056 0.72561407]
|
|
|
|
mean value: 0.7297678232192993
|
|
|
|
key: score_time
|
|
value: [0.00964332 0.00934935 0.00935745 0.00994277 0.00966001 0.00946784
|
|
0.00924683 0.00931072 0.00933957 0.00921845]
|
|
|
|
mean value: 0.009453630447387696
|
|
|
|
key: test_mcc
|
|
value: [0.85164138 1. 0.81312325 0.9258201 0.96225045 0.96225045
|
|
0.88527041 0.92307692 0.9258201 1. ]
|
|
|
|
mean value: 0.9249253060070406
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.9245283 1. 0.90384615 0.96153846 0.98076923 0.98076923
|
|
0.94230769 0.96153846 0.96153846 1. ]
|
|
|
|
mean value: 0.9616835994194485
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.92592593 1. 0.90909091 0.96296296 0.98113208 0.98039216
|
|
0.94117647 0.96153846 0.96296296 1. ]
|
|
|
|
mean value: 0.9625181925403901
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.89285714 1. 0.86206897 0.92857143 0.96296296 1.
|
|
0.96 0.96153846 0.92857143 1. ]
|
|
|
|
mean value: 0.9496570390018666
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96153846 1. 0.96153846 1. 1. 0.96153846
|
|
0.92307692 0.96153846 1. 1. ]
|
|
|
|
mean value: 0.9769230769230769
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.92521368 1. 0.90384615 0.96153846 0.98076923 0.98076923
|
|
0.94230769 0.96153846 0.96153846 1. ]
|
|
|
|
mean value: 0.9617521367521368
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.86206897 1. 0.83333333 0.92857143 0.96296296 0.96153846
|
|
0.88888889 0.92592593 0.92857143 1. ]
|
|
|
|
mean value: 0.9291861395309671
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.86
|
|
|
|
Accuracy on Blind test: 0.93
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.03048921 0.02674842 0.03247237 0.03025341 0.03184032 0.04581976
|
|
0.09403157 0.06438708 0.03938842 0.05673742]
|
|
|
|
mean value: 0.04521679878234863
|
|
|
|
key: score_time
|
|
value: [0.01284647 0.01716471 0.02299953 0.01456118 0.03505945 0.02486992
|
|
0.02550769 0.02261448 0.01530409 0.01726437]
|
|
|
|
mean value: 0.02081918716430664
|
|
|
|
key: test_mcc
|
|
value: [0.29676375 0.3960114 0.50951017 0.34684399 0.45095603 0.6172134
|
|
0.27386128 0.36896403 0.4233902 0.13323468]
|
|
|
|
mean value: 0.3816748916887629
|
|
|
|
key: train_mcc
|
|
value: [0.80521616 0.96592046 0.96609741 0.68800744 0.84577093 0.97880317
|
|
0.74239822 0.59537119 0.89871703 0.63481105]
|
|
|
|
mean value: 0.8121113061220316
|
|
|
|
key: test_accuracy
|
|
value: [0.64150943 0.69811321 0.75 0.65384615 0.71153846 0.80769231
|
|
0.61538462 0.67307692 0.71153846 0.55769231]
|
|
|
|
mean value: 0.6820391872278665
|
|
|
|
key: train_accuracy
|
|
value: [0.89339019 0.98294243 0.98297872 0.8212766 0.91702128 0.9893617
|
|
0.85531915 0.76170213 0.94680851 0.78723404]
|
|
|
|
mean value: 0.8938034750260854
|
|
|
|
key: test_fscore
|
|
value: [0.6779661 0.7037037 0.77192982 0.71875 0.75409836 0.8
|
|
0.6969697 0.72131148 0.71698113 0.64615385]
|
|
|
|
mean value: 0.7207864141224611
|
|
|
|
key: train_fscore
|
|
value: [0.90384615 0.98297872 0.98312236 0.84837545 0.92337917 0.98942918
|
|
0.87360595 0.80756014 0.94382022 0.8245614 ]
|
|
|
|
mean value: 0.9080678755351793
|
|
|
|
key: test_precision
|
|
value: [0.60606061 0.7037037 0.70967742 0.60526316 0.65714286 0.83333333
|
|
0.575 0.62857143 0.7037037 0.53846154]
|
|
|
|
mean value: 0.6560917748226747
|
|
|
|
key: train_precision
|
|
value: [0.8245614 0.97881356 0.9748954 0.73667712 0.85766423 0.98319328
|
|
0.77557756 0.67723343 1. 0.70149254]
|
|
|
|
mean value: 0.8510108511659394
|
|
|
|
key: test_recall
|
|
value: [0.76923077 0.7037037 0.84615385 0.88461538 0.88461538 0.76923077
|
|
0.88461538 0.84615385 0.73076923 0.80769231]
|
|
|
|
mean value: 0.8126780626780626
|
|
|
|
key: train_recall
|
|
value: [1. 0.98717949 0.99148936 1. 1. 0.99574468
|
|
1. 1. 0.89361702 1. ]
|
|
|
|
mean value: 0.9868030551009275
|
|
|
|
key: test_roc_auc
|
|
value: [0.64387464 0.6980057 0.75 0.65384615 0.71153846 0.80769231
|
|
0.61538462 0.67307692 0.71153846 0.55769231]
|
|
|
|
mean value: 0.6822649572649573
|
|
|
|
key: train_roc_auc
|
|
value: [0.89316239 0.98295145 0.98297872 0.8212766 0.91702128 0.9893617
|
|
0.85531915 0.76170213 0.94680851 0.78723404]
|
|
|
|
mean value: 0.8937815966539371
|
|
|
|
key: test_jcc
|
|
value: [0.51282051 0.54285714 0.62857143 0.56097561 0.60526316 0.66666667
|
|
0.53488372 0.56410256 0.55882353 0.47727273]
|
|
|
|
mean value: 0.5652237060283873
|
|
|
|
key: train_jcc
|
|
value: [0.8245614 0.9665272 0.96680498 0.73667712 0.85766423 0.9790795
|
|
0.77557756 0.67723343 0.89361702 0.70149254]
|
|
|
|
mean value: 0.8379234972627273
|
|
|
|
MCC on Blind test: 0.41
|
|
|
|
Accuracy on Blind test: 0.67
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02924728 0.04173183 0.03825188 0.03950167 0.05261588 0.02752686
|
|
0.03700113 0.03158307 0.03224421 0.03502917]
|
|
|
|
mean value: 0.03647329807281494
|
|
|
|
key: score_time
|
|
value: [0.02261496 0.01868033 0.01876545 0.0188148 0.02677727 0.0213027
|
|
0.01892281 0.01888084 0.0189209 0.01876068]
|
|
|
|
mean value: 0.020244073867797852
|
|
|
|
key: test_mcc
|
|
value: [0.92704716 0.88746439 0.61538462 0.88527041 0.79056942 0.73568294
|
|
0.80829038 0.84615385 0.74466871 0.84866842]
|
|
|
|
mean value: 0.8089200292799028
|
|
|
|
key: train_mcc
|
|
value: [0.86799458 0.8681985 0.85559807 0.85559807 0.85113319 0.86433077
|
|
0.88136192 0.85559807 0.85958225 0.85113319]
|
|
|
|
mean value: 0.8610528586884763
|
|
|
|
key: test_accuracy
|
|
value: [0.96226415 0.94339623 0.80769231 0.94230769 0.88461538 0.86538462
|
|
0.90384615 0.92307692 0.86538462 0.92307692]
|
|
|
|
mean value: 0.9021044992743106
|
|
|
|
key: train_accuracy
|
|
value: [0.93390192 0.93390192 0.92765957 0.92765957 0.92553191 0.93191489
|
|
0.94042553 0.92765957 0.92978723 0.92553191]
|
|
|
|
mean value: 0.9303974050719049
|
|
|
|
key: test_fscore
|
|
value: [0.96 0.94339623 0.80769231 0.94339623 0.89655172 0.85714286
|
|
0.90196078 0.92307692 0.87719298 0.92592593]
|
|
|
|
mean value: 0.9036335957575999
|
|
|
|
key: train_fscore
|
|
value: [0.93473684 0.93473684 0.92857143 0.92857143 0.92600423 0.93305439
|
|
0.94142259 0.92857143 0.92993631 0.92600423]
|
|
|
|
mean value: 0.9311609719764614
|
|
|
|
key: test_precision
|
|
value: [1. 0.96153846 0.80769231 0.92592593 0.8125 0.91304348
|
|
0.92 0.92307692 0.80645161 0.89285714]
|
|
|
|
mean value: 0.8963085852254856
|
|
|
|
key: train_precision
|
|
value: [0.925 0.92116183 0.91701245 0.91701245 0.92016807 0.91769547
|
|
0.92592593 0.91701245 0.9279661 0.92016807]
|
|
|
|
mean value: 0.9209122805450133
|
|
|
|
key: test_recall
|
|
value: [0.92307692 0.92592593 0.80769231 0.96153846 1. 0.80769231
|
|
0.88461538 0.92307692 0.96153846 0.96153846]
|
|
|
|
mean value: 0.9156695156695157
|
|
|
|
key: train_recall
|
|
value: [0.94468085 0.94871795 0.94042553 0.94042553 0.93191489 0.94893617
|
|
0.95744681 0.94042553 0.93191489 0.93191489]
|
|
|
|
mean value: 0.9416803055100927
|
|
|
|
key: test_roc_auc
|
|
value: [0.96153846 0.94373219 0.80769231 0.94230769 0.88461538 0.86538462
|
|
0.90384615 0.92307692 0.86538462 0.92307692]
|
|
|
|
mean value: 0.9020655270655271
|
|
|
|
key: train_roc_auc
|
|
value: [0.93387889 0.93393344 0.92765957 0.92765957 0.92553191 0.93191489
|
|
0.94042553 0.92765957 0.92978723 0.92553191]
|
|
|
|
mean value: 0.9303982542280415
|
|
|
|
key: test_jcc
|
|
value: [0.92307692 0.89285714 0.67741935 0.89285714 0.8125 0.75
|
|
0.82142857 0.85714286 0.78125 0.86206897]
|
|
|
|
mean value: 0.8270600957718588
|
|
|
|
key: train_jcc
|
|
value: [0.87747036 0.87747036 0.86666667 0.86666667 0.86220472 0.8745098
|
|
0.88932806 0.86666667 0.86904762 0.86220472]
|
|
|
|
mean value: 0.8712235646491643
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.88
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'mcsm_ppi2_affinity',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=169)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.2604568 0.26721501 0.30715322 0.33151817 0.27433705 0.28127027
|
|
0.27576518 0.27336073 0.27455401 0.30313301]
|
|
|
|
mean value: 0.28487634658813477
|
|
|
|
key: score_time
|
|
value: [0.02250957 0.01868081 0.02003098 0.01883531 0.01878572 0.01876616
|
|
0.01875806 0.01878333 0.01887512 0.01968741]
|
|
|
|
mean value: 0.01937124729156494
|
|
|
|
key: test_mcc
|
|
value: [0.92704716 0.88746439 0.61538462 0.88527041 0.79056942 0.73568294
|
|
0.80829038 0.84615385 0.74466871 0.84866842]
|
|
|
|
mean value: 0.8089200292799028
|
|
|
|
key: train_mcc
|
|
value: [0.86799458 0.8681985 0.85559807 0.85559807 0.85113319 0.86433077
|
|
0.88136192 0.85559807 0.85958225 0.85113319]
|
|
|
|
mean value: 0.8610528586884763
|
|
|
|
key: test_accuracy
|
|
value: [0.96226415 0.94339623 0.80769231 0.94230769 0.88461538 0.86538462
|
|
0.90384615 0.92307692 0.86538462 0.92307692]
|
|
|
|
mean value: 0.9021044992743106
|
|
|
|
key: train_accuracy
|
|
value: [0.93390192 0.93390192 0.92765957 0.92765957 0.92553191 0.93191489
|
|
0.94042553 0.92765957 0.92978723 0.92553191]
|
|
|
|
mean value: 0.9303974050719049
|
|
|
|
key: test_fscore
|
|
value: [0.96 0.94339623 0.80769231 0.94339623 0.89655172 0.85714286
|
|
0.90196078 0.92307692 0.87719298 0.92592593]
|
|
|
|
mean value: 0.9036335957575999
|
|
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_sl.py:188: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rouC_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./rpob_sl.py:191: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rouC_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
|
|
key: train_fscore
|
|
value: [0.93473684 0.93473684 0.92857143 0.92857143 0.92600423 0.93305439
|
|
0.94142259 0.92857143 0.92993631 0.92600423]
|
|
|
|
mean value: 0.9311609719764614
|
|
|
|
key: test_precision
|
|
value: [1. 0.96153846 0.80769231 0.92592593 0.8125 0.91304348
|
|
0.92 0.92307692 0.80645161 0.89285714]
|
|
|
|
mean value: 0.8963085852254856
|
|
|
|
key: train_precision
|
|
value: [0.925 0.92116183 0.91701245 0.91701245 0.92016807 0.91769547
|
|
0.92592593 0.91701245 0.9279661 0.92016807]
|
|
|
|
mean value: 0.9209122805450133
|
|
|
|
key: test_recall
|
|
value: [0.92307692 0.92592593 0.80769231 0.96153846 1. 0.80769231
|
|
0.88461538 0.92307692 0.96153846 0.96153846]
|
|
|
|
mean value: 0.9156695156695157
|
|
|
|
key: train_recall
|
|
value: [0.94468085 0.94871795 0.94042553 0.94042553 0.93191489 0.94893617
|
|
0.95744681 0.94042553 0.93191489 0.93191489]
|
|
|
|
mean value: 0.9416803055100927
|
|
|
|
key: test_roc_auc
|
|
value: [0.96153846 0.94373219 0.80769231 0.94230769 0.88461538 0.86538462
|
|
0.90384615 0.92307692 0.86538462 0.92307692]
|
|
|
|
mean value: 0.9020655270655271
|
|
|
|
key: train_roc_auc
|
|
value: [0.93387889 0.93393344 0.92765957 0.92765957 0.92553191 0.93191489
|
|
0.94042553 0.92765957 0.92978723 0.92553191]
|
|
|
|
mean value: 0.9303982542280415
|
|
|
|
key: test_jcc
|
|
value: [0.92307692 0.89285714 0.67741935 0.89285714 0.8125 0.75
|
|
0.82142857 0.85714286 0.78125 0.86206897]
|
|
|
|
mean value: 0.8270600957718588
|
|
|
|
key: train_jcc
|
|
value: [0.87747036 0.87747036 0.86666667 0.86666667 0.86220472 0.8745098
|
|
0.88932806 0.86666667 0.86904762 0.86220472]
|
|
|
|
mean value: 0.8712235646491643
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.88
|