/home/tanu/git/LSHTM_analysis/scripts/ml/ml_data_sl.py:549: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  mask_check.sort_values(by = ['ligand_distance'], ascending = True, inplace = True)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/xgboost/compat.py:36: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
  from pandas import MultiIndex, Int64Index
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
1.22.4
1.4.1

aaindex_df contains non-numerical data

Total no. of non-numerical columns: 2

Selecting numerical data only

PASS: successfully selected numerical columns only for aaindex_df

Now checking for NA in the remaining aaindex_cols

Counting aaindex_df cols with NA
ncols with NA: 4 columns
Dropping these...
Original ncols: 127

Revised df ncols: 123

Checking NA in revised df...

PASS: cols with NA successfully dropped from aaindex_df
Proceeding with combining aa_df with other features_df

PASS: ncols match
Expected ncols: 123
Got: 123

Total no. of columns in clean aa_df: 123

Proceeding to merge, expected nrows in merged_df: 817

PASS: my_features_df and aa_df successfully combined
nrows: 817
ncols: 269
count of NULL values before imputation

or_mychisq          244
log10_or_mychisq    244
dtype: int64
count of NULL values AFTER imputation

mutationinformation    0
or_rawI                0
logorI                 0
dtype: int64

PASS: OR values imputed, data ready for ML

Total no. of features for aaindex: 123

No. of numerical features: 168
No. of categorical features: 7

PASS: x_features has no target variable

No. of columns for x_features: 175

-------------------------------------------------------------
Successfully split data according to scaling law: 1/np.sqrt(x_ncols)
Train data size: (431, 175)
Test data size: 0.07559289460184544 (36, 175)
y_train numbers: Counter({1: 285, 0: 146})
y_train ratio: 0.512280701754386

y_test_numbers: Counter({1: 24, 0: 12})
y_test ratio: 0.5
-------------------------------------------------------------
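
The split sizes reported above can be reproduced with a little arithmetic. The sketch below is an illustration, not the pipeline's own code: the helper name `scaling_law_split` is hypothetical, and rounding the test set up is an assumption, chosen because it matches the (431, 175) and (36, 175) shapes in this log for the 467-row source data.

```python
import math


def scaling_law_split(n_samples: int, n_features: int):
    """Return (test_fraction, n_train, n_test) for a 1/sqrt(n_features) hold-out.

    Hypothetical helper: the rounding rule is an assumption, not taken from the log.
    """
    test_fraction = 1 / math.sqrt(n_features)
    # Round the test set up and give every remaining row to the training set,
    # so the two sizes always add back up to n_samples.
    n_test = math.ceil(test_fraction * n_samples)
    n_train = n_samples - n_test
    return test_fraction, n_train, n_test


test_fraction, n_train, n_test = scaling_law_split(n_samples=467, n_features=175)
print(test_fraction, n_train, n_test)

# The "ratio" lines in the log are consistent with minority count / majority count.
y_train_ratio = 146 / 285
y_test_ratio = 12 / 24
print(y_train_ratio, y_test_ratio)
```

With 175 feature columns the test fraction is about 7.6%, which is why only 36 rows are held out.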

Simple Random OverSampling
Counter({1: 285, 0: 285})
(570, 175)

Simple Random UnderSampling
Counter({0: 146, 1: 146})
(292, 175)

Simple Combined Over and UnderSampling
Counter({0: 285, 1: 285})
(570, 175)

SMOTE_NC OverSampling
Counter({1: 285, 0: 285})
(570, 175)
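
The class counts above follow directly from how random resampling balances a 285/146 training set. The sketch below is a standard-library stand-in, assuming simple random duplication and random removal of rows. It is not the resampling code used in this run, it does not implement SMOTE-NC, and the function names are hypothetical.

```python
import random
from collections import Counter


def random_oversample(labels, seed=42):
    """Return row indices with minority-class rows duplicated at random until balanced."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    indices = list(range(len(labels)))
    for cls, n in counts.items():
        pool = [i for i, y in enumerate(labels) if y == cls]
        # Sample with replacement to fill the gap up to the majority class size.
        indices.extend(rng.choices(pool, k=target - n))
    return indices


def random_undersample(labels, seed=42):
    """Return row indices with majority-class rows dropped at random until balanced."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    indices = []
    for cls in counts:
        pool = [i for i, y in enumerate(labels) if y == cls]
        # Sample without replacement so no row is kept twice.
        indices.extend(rng.sample(pool, k=target))
    return sorted(indices)


y_train = [1] * 285 + [0] * 146  # class counts taken from the log above
over = random_oversample(y_train)
under = random_undersample(y_train)
over_counts = Counter(y_train[i] for i in over)
under_counts = Counter(y_train[i] for i in under)
print(over_counts, len(over))
print(under_counts, len(under))
```

Oversampling grows the training set to 570 rows and undersampling shrinks it to 292, matching the shapes printed in the log.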

#####################################################################

Running ML analysis: scaling law split
Gene name: katG
Drug name: isoniazid

Output directory: /home/tanu/git/Data/isoniazid/output/ml/tts_sl/
Sanity checks:
ML source data size: (467, 175)
Total input features: (431, 175)
Target feature numbers: Counter({1: 285, 0: 146})
Target features ratio: 0.512280701754386

#####################################################################


================================================================

Structural features (n): 36
These are:
Common stability features: ['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_ppi2_affinity', 'interface_dist']
FoldX columns: ['electro_rr', 'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr', 'disulfide_mm', 'disulfide_sm', 'disulfide_ss', 'hbonds_rr', 'hbonds_mm', 'hbonds_sm', 'hbonds_ss', 'partcov_rr', 'partcov_mm', 'partcov_sm', 'partcov_ss', 'vdwclashes_rr', 'vdwclashes_mm', 'vdwclashes_sm', 'vdwclashes_ss', 'volumetric_rr', 'volumetric_mm', 'volumetric_ss']
Other struc columns: ['rsa', 'kd_values', 'rd_values']
================================================================

AAindex features (n): 123
================================================================

Evolutionary features (n): 3
These are:
['consurf_score', 'snap2_score', 'provean_score']
================================================================

Genomic features (n): 6
These are:
['maf', 'logorI']
['lineage_proportion', 'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique']
================================================================

Categorical features (n): 7
These are:
['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site']
================================================================


Pass: No. of features match
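
The feature-count check above can be verified by hand from the group sizes the log reports. The sketch below only redoes that arithmetic. The dictionary names are hypothetical, and how the pipeline itself performs the check is not shown in this log.

```python
# Group sizes as printed in the log above.
feature_groups = {
    "structural": 36,
    "aaindex": 123,
    "evolutionary": 3,
    "genomic": 6,
    "categorical": 7,
}

# The structural group is itself the sum of the three listed sub-lists.
structural_parts = {"common_stability": 10, "foldx": 23, "other_struc": 3}

n_total = sum(feature_groups.values())
n_numerical = n_total - feature_groups["categorical"]
n_structural = sum(structural_parts.values())

print(n_total, n_numerical, n_structural)
```

The totals agree with the 175 columns of x_features, of which 168 are numerical and 7 categorical.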

#####################################################################


Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)),
 ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)),
 ('Gaussian NB', GaussianNB()),
 ('Naive Bayes', BernoulliNB()),
 ('K-Nearest Neighbors', KNeighborsClassifier()),
 ('SVM', SVC(random_state=42)),
 ('MLP', MLPClassifier(max_iter=500, random_state=42)),
 ('Decision Tree', DecisionTreeClassifier(random_state=42)),
 ('Extra Trees', ExtraTreesClassifier(random_state=42)),
 ('Extra Tree', ExtraTreeClassifier(random_state=42)),
 ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)),
 ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
                                           n_estimators=1000, n_jobs=10, oob_score=True,
                                           random_state=42)),
 ('Naive Bayes', BernoulliNB()),
 ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
                           colsample_bynode=None, colsample_bytree=None,
                           enable_categorical=False, gamma=None, gpu_id=None,
                           importance_type=None, interaction_constraints=None,
                           learning_rate=None, max_delta_step=None, max_depth=None,
                           min_child_weight=None, missing=nan, monotone_constraints=None,
                           n_estimators=100, n_jobs=None, num_parallel_tree=None,
                           predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
                           scale_pos_weight=None, subsample=None, tree_method=None,
                           use_label_encoder=False, validate_parameters=None, verbosity=0)),
 ('LDA', LinearDiscriminantAnalysis()),
 ('Multinomial', MultinomialNB()),
 ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)),
 ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)),
 ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)),
 ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)),
 ('Gaussian Process', GaussianProcessClassifier(random_state=42)),
 ('Gradient Boosting', GradientBoostingClassifier(random_state=42)),
 ('QDA', QuadraticDiscriminantAnalysis()),
 ('Ridge Classifier', RidgeClassifier(random_state=42)),
 ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_ppi2_affinity', 'interface_dist',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=168)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', LogisticRegression(random_state=42))])
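
The 'prep' step above rescales the 168 numerical columns with min-max scaling and one-hot encodes the 7 categorical columns. The sketch below shows both transformations on toy data using only the standard library. The function names and input values are hypothetical, and it is not a substitute for scikit-learn's MinMaxScaler or OneHotEncoder.

```python
def min_max_scale(values):
    """Rescale a numeric column linearly so its minimum maps to 0 and maximum to 1."""
    lo, hi = min(values), max(values)
    span = hi - lo
    # A constant column has no spread; map it to 0.0 to avoid dividing by zero.
    return [(v - lo) / span if span else 0.0 for v in values]


def one_hot(values):
    """Return the sorted category list and one indicator row per input value."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


# Hypothetical toy columns; the real values are not shown in this log.
ligand_distance = [2.0, 4.0, 10.0]
ss_class = ["helix", "loop", "helix"]

scaled = min_max_scale(ligand_distance)
categories, encoded = one_hot(ss_class)
print(scaled)
print(categories, encoded)
```

Because every numerical column is already rescaled to a common range in this pipeline, the remaining suggestion in the lbfgs ConvergenceWarning is to raise max_iter or try another solver.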

key: fit_time
value: [0.03372955 0.03613591 0.03644919 0.03534794 0.03659058 0.03599262
 0.03608441 0.04660845 0.03674316 0.03684807]

mean value: 0.03705298900604248

key: score_time
value: [0.01260567 0.01232028 0.01445866 0.0145154 0.01467729 0.0162003
 0.01573014 0.01584959 0.01475835 0.014889 ]

mean value: 0.014600467681884766

key: test_mcc
value: [0.58131836 0.74048587 0.78481149 0.89408867 0.53276418 0.8993825
 0.80104099 0.63660014 0.95079854 0.79313677]

mean value: 0.7614427513528801

key: train_mcc
value: [0.86656096 0.8442975 0.838564 0.85620977 0.85050655 0.83787173
 0.86186304 0.84413863 0.83211139 0.86186304]

mean value: 0.8493986619628706
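
For reference, the Matthews correlation coefficient (MCC) reported per fold is computed from the four cells of a binary confusion matrix, and the "mean value" lines are plain averages over the ten folds. The sketch below shows both steps with the standard library. The `mcc` helper and its example counts are illustrative assumptions, while the fold values are copied from the log above.

```python
import math
from statistics import fmean


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical confusion counts, only to show the range of the metric.
perfect = mcc(tp=10, tn=10, fp=0, fn=0)
chance = mcc(tp=5, tn=5, fp=5, fn=5)

# Per-fold scores copied from the log above.
test_mcc = [0.58131836, 0.74048587, 0.78481149, 0.89408867, 0.53276418,
            0.8993825, 0.80104099, 0.63660014, 0.95079854, 0.79313677]
train_mcc = [0.86656096, 0.8442975, 0.838564, 0.85620977, 0.85050655,
             0.83787173, 0.86186304, 0.84413863, 0.83211139, 0.86186304]

mean_test_mcc = fmean(test_mcc)
mean_train_mcc = fmean(train_mcc)
print(perfect, chance)
print(mean_test_mcc, mean_train_mcc, mean_train_mcc - mean_test_mcc)
```

The train mean sits roughly 0.09 above the test mean, and the test folds vary far more than the train folds (about 0.53 to 0.95 against 0.83 to 0.87).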

key: test_accuracy
value: [0.81818182 0.88372093 0.90697674 0.95348837 0.79069767 0.95348837
 0.90697674 0.8372093 0.97674419 0.90697674]

mean value: 0.893446088794926

key: train_accuracy
value:
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[0.94056848 0.93041237 0.92783505 0.93556701 0.93298969 0.92783505
 0.93814433 0.93041237 0.92525773 0.93814433]

mean value: 0.9327166413596526

key: test_fscore
value: [0.87096774 0.92063492 0.93333333 0.96551724 0.84210526 0.96551724
 0.93333333 0.8852459 0.98181818 0.93103448]

mean value: 0.9229507641369733

key: train_fscore
value: [0.95619048 0.9489603 0.94716981 0.95274102 0.9509434 0.94736842
 0.95488722 0.94934334 0.94559099 0.95488722]

mean value: 0.9508082198090645

key: test_precision
value: [0.81818182 0.85294118 0.90322581 0.96551724 0.85714286 0.93333333
 0.875 0.81818182 1. 0.9 ]

mean value: 0.8923524051141338

key: train_precision
value: [0.9330855 0.91941392 0.91605839 0.92307692 0.91970803 0.91636364
 0.92363636 0.91666667 0.91304348 0.92363636]

mean value: 0.9204689276271143

key: test_recall
value: [0.93103448 1. 0.96551724 0.96551724 0.82758621 1.
 1. 0.96428571 0.96428571 0.96428571]

mean value: 0.9582512315270936

key: train_recall
value: [0.98046875 0.98046875 0.98046875 0.984375 0.984375 0.98054475
 0.98832685 0.9844358 0.98054475 0.98832685]

mean value: 0.9832335238326848

key: test_roc_auc
value: [0.76551724 0.82142857 0.87561576 0.94704433 0.77093596 0.93333333
 0.86666667 0.78214286 0.98214286 0.88214286]

mean value: 0.8626970443349754

key: train_roc_auc
value: [0.92153208 0.90690104 0.90311316 0.91264205 0.90885417 0.90248611
 0.91401075 0.90443164 0.89866932 0.91401075]

mean value: 0.908665107972322

key: test_jcc
value: [0.77142857 0.85294118 0.875 0.93333333 0.72727273 0.93333333
 0.875 0.79411765 0.96428571 0.87096774]

mean value: 0.8597680245118575

key: train_jcc
value: [0.91605839 0.9028777 0.89964158 0.90974729 0.90647482 0.9
 0.91366906 0.90357143 0.89679715 0.91366906]

mean value: 0.9062506492718643

MCC on Blind test: 0.82

Accuracy on Blind test: 0.92

Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegressionCV(random_state=42))])

key: fit_time
value: [0.95363975 0.8347497 0.98288345 0.84292221 0.82918787 0.90894675
 0.84772086 0.95350075 0.82799745 0.85421014]

mean value: 0.8835758924484253

key: score_time
value: [0.01479244 0.01366854 0.0137701 0.01498795 0.01960564 0.01608729
 0.01490307 0.01511002 0.01493359 0.01592612]

mean value: 0.015378475189208984

key: test_mcc
value: [0.85146932 0.84515772 0.81883947 0.84515772 0.9025825 0.94928891
 0.85004744 0.7412616 0.95079854 0.6479516 ]

mean value: 0.8402554820628604

key: train_mcc
value: [0.98846016 0.97701629 1. 0.98276159 0.98854135 0.9769295
 0.98847536 1. 0.9769295 1. ]

mean value: 0.9879113755069235

key: test_accuracy
value: [0.93181818 0.93023256 0.90697674 0.93023256 0.95348837 0.97674419
 0.93023256 0.88372093 0.97674419 0.8372093 ]

mean value: 0.9257399577167019

key: train_accuracy
value: [0.99483204 0.98969072 1. 0.99226804 0.99484536 0.98969072
 0.99484536 1. 0.98969072 1. ]

mean value: 0.994586296917872

key: test_fscore
value: [0.95081967 0.94736842 0.92592593 0.94736842 0.96428571 0.98245614
 0.94915254 0.91525424 0.98181818 0.87272727]

mean value: 0.94371765290054

key: train_fscore
value: [0.99609375 0.9922179 1. 0.99415205 0.99610895 0.99224806
 0.99610895 1. 0.99224806 1. ]

mean value: 0.9959177718480003

key: test_precision
value: [0.90625 0.96428571 1. 0.96428571 1. 0.96551724
 0.90322581 0.87096774 1. 0.88888889]

mean value: 0.9463421107226725

key: train_precision
value: [0.99609375 0.98837209 1. 0.9922179 0.99224806 0.98841699
 0.99610895 1. 0.98841699 1. ]

mean value: 0.9941874730121764

key: test_recall
value: [1. 0.93103448 0.86206897 0.93103448 0.93103448 1.
 1. 0.96428571 0.96428571 0.85714286]

mean value: 0.9440886699507389

key: train_recall
value: [0.99609375 0.99609375 1. 0.99609375 1. 0.99610895
 0.99610895 1. 0.99610895 1. ]

mean value: 0.9976608098249027

key: test_roc_auc
value: [0.9 0.92980296 0.93103448 0.92980296 0.96551724 0.96666667
 0.9 0.84880952 0.98214286 0.82857143]

mean value: 0.9182348111658457

key: train_roc_auc
value: [0.99423008 0.98668324 1. 0.99047112 0.99242424 0.98660409
 0.99423768 1. 0.98660409 1. ]

mean value: 0.9931254546464323

key: test_jcc
value: [0.90625 0.9 0.86206897 0.9 0.93103448 0.96551724
 0.90322581 0.84375 0.96428571 0.77419355]

mean value: 0.8950325758779596

key: train_jcc
value: [0.9922179 0.98455598 1. 0.98837209 0.99224806 0.98461538
 0.99224806 1. 0.98461538 1. ]

mean value: 0.9918872869673703

MCC on Blind test: 0.82

Accuracy on Blind test: 0.92
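
Note: each "mean value" line in this log appears to be the arithmetic mean of the ten per-fold scores printed on the "value" lines above it. The sketch below checks this for the test_mcc scores of the Logistic RegressionCV block. The names test_mcc and summarise are hypothetical and are not taken from the script that produced this log; the fold values are copied from the log at the eight-decimal precision it prints.

```python
from statistics import fmean

# Per-fold test_mcc scores copied from the Logistic RegressionCV block above
# (rounded to 8 decimals, as printed in the log).
test_mcc = [0.85146932, 0.84515772, 0.81883947, 0.84515772, 0.9025825,
            0.94928891, 0.85004744, 0.7412616, 0.95079854, 0.6479516]


def summarise(scores):
    """Return (mean, min, max) of a list of per-fold scores.

    Hypothetical helper for illustration only.
    """
    return fmean(scores), min(scores), max(scores)


mean_mcc, worst, best = summarise(test_mcc)
print(f"mean={mean_mcc:.4f} min={worst:.4f} max={best:.4f}")
```

The fold-to-fold spread (roughly 0.65 to 0.95 here) is worth reading alongside the mean, since a single mean hides how much the score varies between folds.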

Model_name: Gaussian NB
Model func: GaussianNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GaussianNB())])

key: fit_time
value: [0.01457953 0.01114249 0.01063085 0.01069617 0.01100039 0.0102458
 0.01035047 0.0098381 0.01021338 0.01029086]

mean value: 0.010898804664611817

key: score_time
value: [0.01245618 0.01024842 0.01079679 0.00953054 0.00977015 0.00972557
 0.00895238 0.00899076 0.00913095 0.00907373]

mean value: 0.009867548942565918

key: test_mcc
value: [0.45305024 0.56055699 0.51517946 0.57635468 0.35173219 0.7412616
 0.40616479 0.40241617 0.74102654 0.5892454 ]

mean value: 0.5336988065510012

key: train_mcc
value: [0.62582364 0.58240004 0.54248922 0.58757265 0.59314497 0.58075708
 0.63502039 0.58214114 0.60024604 0.61735991]

mean value: 0.5946955073947634

key: test_accuracy
value: [0.75 0.81395349 0.79069767 0.81395349 0.69767442 0.88372093
 0.72093023 0.74418605 0.88372093 0.79069767]

mean value: 0.788953488372093

key: train_accuracy
value: [0.83462532 0.81443299 0.79639175 0.81701031 0.81701031 0.81443299
 0.83762887 0.81443299 0.82216495 0.81701031]

mean value: 0.8185140786914942

key: test_fscore
value: [0.80701754 0.875 0.84745763 0.86206897 0.76363636 0.91525424
 0.77777778 0.82539683 0.9122807 0.82352941]

mean value: 0.8409419454113729

key: train_fscore
value: [0.87692308 0.86100386 0.84719536 0.86319846 0.86105675 0.86153846
 0.87814313 0.86100386 0.86653772 0.85420945]

mean value: 0.8630810124993853

key: test_precision
value: [0.82142857 0.8 0.83333333 0.86206897 0.80769231 0.87096774
 0.80769231 0.74285714 0.89655172 0.91304348]

mean value: 0.8355635572855189

key: train_precision
value: [0.86363636 0.85114504 0.83908046 0.85171103 0.8627451 0.85171103
 0.87307692 0.85440613 0.86153846 0.90434783]

mean value: 0.8613398353816113

key: test_recall
value: [0.79310345 0.96551724 0.86206897 0.86206897 0.72413793 0.96428571
 0.75 0.92857143 0.92857143 0.75 ]

mean value: 0.8528325123152709

key: train_recall
value: [0.890625 0.87109375 0.85546875 0.875 0.859375 0.87159533
 0.88326848 0.86770428 0.87159533 0.80933852]

mean value: 0.8655064445525292

key: test_roc_auc
value: [0.72988506 0.73275862 0.75246305 0.78817734 0.68349754 0.84880952
 0.70833333 0.66428571 0.86428571 0.80833333]

mean value: 0.7580829228243021

key: train_roc_auc
value: [0.80790792 0.7878196 0.76864347 0.78977273 0.79711174 0.7869427
 0.81568004 0.78881397 0.79839309 0.8206998 ]

mean value: 0.796178505644296

key: test_jcc
value: [0.67647059 0.77777778 0.73529412 0.75757576 0.61764706 0.84375
 0.63636364 0.7027027 0.83870968 0.7 ]

mean value: 0.7286291316545112

key: train_jcc
value: [0.78082192 0.7559322 0.73489933 0.75932203 0.75601375 0.75675676
 0.78275862 0.7559322 0.76450512 0.74551971]

mean value: 0.7592461643211699

MCC on Blind test: 0.64

Accuracy on Blind test: 0.83
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01165366 0.01043344 0.01005793 0.01009703 0.00994444 0.00998759
|
|
0.01104665 0.011199 0.01103497 0.01107717]
|
|
|
|
mean value: 0.010653185844421386
|
|
|
|
key: score_time
|
|
value: [0.00956798 0.00930119 0.00931573 0.00896025 0.00888276 0.009022
|
|
0.00979853 0.00989366 0.00984979 0.00986552]
|
|
|
|
mean value: 0.009445738792419434
|
|
|
|
key: test_mcc
|
|
value: [0.49425287 0.55732713 0.67416594 0.74102654 0.36453202 0.79313677
|
|
0.63660014 0.46554006 0.63247577 0.52368994]
|
|
|
|
mean value: 0.5882747173578298
|
|
|
|
key: train_mcc
|
|
value: [0.65753645 0.65942575 0.67797225 0.64082272 0.65903595 0.64878733
|
|
0.64546126 0.66416419 0.6450794 0.6828107 ]
|
|
|
|
mean value: 0.6581095999527902
|
|
|
|
key: test_accuracy
|
|
value: [0.77272727 0.81395349 0.86046512 0.88372093 0.72093023 0.90697674
|
|
0.8372093 0.76744186 0.8372093 0.79069767]
|
|
|
|
mean value: 0.8191331923890064
|
|
|
|
key: train_accuracy
|
|
value: [0.8501292 0.85051546 0.85824742 0.84278351 0.85051546 0.84536082
|
|
0.84536082 0.85309278 0.84536082 0.86082474]
|
|
|
|
mean value: 0.8502191054636511
|
|
|
|
key: test_fscore
|
|
value: [0.82758621 0.87096774 0.9 0.9122807 0.79310345 0.93103448
|
|
0.8852459 0.83333333 0.88135593 0.84745763]
|
|
|
|
mean value: 0.8682365375915616
|
|
|
|
key: train_fscore
|
|
value: [0.89056604 0.89056604 0.89563567 0.88555347 0.89097744 0.88549618
|
|
0.88764045 0.89265537 0.8880597 0.89772727]
|
|
|
|
mean value: 0.8904877637720091
|
|
|
|
key: test_precision
|
|
value: [0.82758621 0.81818182 0.87096774 0.92857143 0.79310345 0.9
|
|
0.81818182 0.78125 0.83870968 0.80645161]
|
|
|
|
mean value: 0.8383003752365543
|
|
|
|
key: train_precision
|
|
value: [0.86131387 0.86131387 0.87084871 0.85198556 0.85869565 0.86891386
|
|
0.85559567 0.8649635 0.85304659 0.87453875]
|
|
|
|
mean value: 0.8621216027021169
|
|
|
|
key: test_recall
|
|
value: [0.82758621 0.93103448 0.93103448 0.89655172 0.79310345 0.96428571
|
|
0.96428571 0.89285714 0.92857143 0.89285714]
|
|
|
|
mean value: 0.9022167487684729
|
|
|
|
key: train_recall
|
|
value: [0.921875 0.921875 0.921875 0.921875 0.92578125 0.90272374
|
|
0.92217899 0.92217899 0.92607004 0.92217899]
|
|
|
|
mean value: 0.9208611989299611
|
|
|
|
key: test_roc_auc
|
|
value: [0.74712644 0.75123153 0.8226601 0.87684729 0.68226601 0.88214286
|
|
0.78214286 0.71309524 0.79761905 0.74642857]
|
|
|
|
mean value: 0.7801559934318555
|
|
|
|
key: train_roc_auc
|
|
value: [0.81589933 0.81699811 0.82836174 0.80563447 0.81516335 0.81777408
|
|
0.80841774 0.81986812 0.80654647 0.8313185 ]
|
|
|
|
mean value: 0.8165981914150152
|
|
|
|
key: test_jcc
|
|
value: [0.70588235 0.77142857 0.81818182 0.83870968 0.65714286 0.87096774
|
|
0.79411765 0.71428571 0.78787879 0.73529412]
|
|
|
|
mean value: 0.7693889285919646
|
|
|
|
key: train_jcc
|
|
value: [0.80272109 0.80272109 0.81099656 0.79461279 0.80338983 0.79452055
|
|
0.7979798 0.80612245 0.79865772 0.81443299]
|
|
|
|
mean value: 0.8026154868282023
|
|
|
|
MCC on Blind test: 0.52
|
|
|
|
Accuracy on Blind test: 0.78
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.00989199 0.0111444 0.01048923 0.0103128 0.010391 0.01045656
|
|
0.01054001 0.01035523 0.01011539 0.01033711]
|
|
|
|
mean value: 0.01040337085723877
|
|
|
|
key: score_time
|
|
value: [0.05038977 0.01279974 0.01173329 0.01209927 0.01200771 0.01221824
|
|
0.01208305 0.01429176 0.01597929 0.01330972]
|
|
|
|
mean value: 0.016691184043884276
|
|
|
|
key: test_mcc
|
|
value: [0.21152604 0.43985131 0.57635468 0.40711743 0.27510532 0.58298976
|
|
0.52368994 0.24187277 0.63247577 0.34309924]
|
|
|
|
mean value: 0.42340822539565004
|
|
|
|
key: train_mcc
|
|
value: [0.69961993 0.65931339 0.67101659 0.6346185 0.66497487 0.63246023
|
|
0.68748675 0.64455576 0.62607402 0.65075438]
|
|
|
|
mean value: 0.6570874415706287
|
|
|
|
key: test_accuracy
|
|
value: [0.68181818 0.76744186 0.81395349 0.74418605 0.69767442 0.81395349
|
|
0.79069767 0.6744186 0.8372093 0.72093023]
|
|
|
|
mean value: 0.7542283298097252
|
|
|
|
key: train_accuracy
|
|
value: [0.86821705 0.85051546 0.8556701 0.84020619 0.85309278 0.84020619
|
|
0.86340206 0.84536082 0.83762887 0.84793814]
|
|
|
|
mean value: 0.8502237672820266
|
|
|
|
key: test_fscore
|
|
value: [0.78787879 0.83870968 0.86206897 0.81355932 0.78688525 0.87096774
|
|
0.84745763 0.76666667 0.88135593 0.80645161]
|
|
|
|
mean value: 0.8262001579578332
|
|
|
|
key: train_fscore
|
|
value: [0.90538033 0.89377289 0.89667897 0.88686131 0.89502762 0.88686131
|
|
0.90130354 0.88929889 0.88482633 0.89134438]
|
|
|
|
mean value: 0.8931355586193345
|
|
|
|
key: test_precision
|
|
value: [0.7027027 0.78787879 0.86206897 0.8 0.75 0.79411765
|
|
0.80645161 0.71875 0.83870968 0.73529412]
|
|
|
|
mean value: 0.7795973511127194
|
|
|
|
key: train_precision
|
|
value: [0.86219081 0.84137931 0.84965035 0.83219178 0.8466899 0.83505155
|
|
0.86428571 0.84561404 0.83448276 0.84615385]
|
|
|
|
mean value: 0.8457690049548049
|
|
|
|
key: test_recall
|
|
value: [0.89655172 0.89655172 0.86206897 0.82758621 0.82758621 0.96428571
|
|
0.89285714 0.82142857 0.92857143 0.89285714]
|
|
|
|
mean value: 0.8810344827586207
|
|
|
|
key: train_recall
|
|
value: [0.953125 0.953125 0.94921875 0.94921875 0.94921875 0.94552529
|
|
0.94163424 0.93774319 0.94163424 0.94163424]
|
|
|
|
mean value: 0.946207745622568
|
|
|
|
key: test_roc_auc
|
|
value: [0.5816092 0.69827586 0.78817734 0.69950739 0.62807882 0.74880952
|
|
0.74642857 0.61071429 0.79761905 0.64642857]
|
|
|
|
mean value: 0.6945648604269294
|
|
|
|
key: train_roc_auc
|
|
value: [0.82770754 0.80232008 0.81173059 0.78900331 0.80794271 0.78955654
|
|
0.82577895 0.80093266 0.78761101 0.80287819]
|
|
|
|
mean value: 0.8045461582612031
|
|
|
|
key: test_jcc
|
|
value: [0.65 0.72222222 0.75757576 0.68571429 0.64864865 0.77142857
|
|
0.73529412 0.62162162 0.78787879 0.67567568]
|
|
|
|
mean value: 0.7056059688412629
|
|
|
|
key: train_jcc
|
|
value: [0.82711864 0.80794702 0.81270903 0.79672131 0.81 0.79672131
|
|
0.82033898 0.80066445 0.79344262 0.80398671]
|
|
|
|
mean value: 0.8069650085778866
|
|
|
|
MCC on Blind test: 0.33
|
|
|
|
Accuracy on Blind test: 0.69
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02155566 0.02098107 0.01806235 0.01775026 0.01926017 0.02073121
|
|
0.01955342 0.02025127 0.01990175 0.01973605]
|
|
|
|
mean value: 0.01977832317352295
|
|
|
|
key: score_time
|
|
value: [0.01182008 0.01138163 0.01180887 0.01104093 0.01332617 0.01129508
|
|
0.01093888 0.01217222 0.01113939 0.01160002]
|
|
|
|
mean value: 0.011652326583862305
|
|
|
|
key: test_mcc
|
|
value: [0.52525148 0.57957513 0.7300872 0.84383267 0.49188359 0.8993825
|
|
0.60246408 0.4630445 0.79313677 0.58298976]
|
|
|
|
mean value: 0.651164767326194
|
|
|
|
key: train_mcc
|
|
value: [0.74953754 0.69029892 0.71608944 0.72111098 0.71127097 0.71860826
|
|
0.70727056 0.72056414 0.67681039 0.75638254]
|
|
|
|
mean value: 0.7167943743978092
|
|
|
|
key: test_accuracy
|
|
value: [0.79545455 0.81395349 0.88372093 0.93023256 0.76744186 0.95348837
|
|
0.81395349 0.76744186 0.90697674 0.81395349]
|
|
|
|
mean value: 0.844661733615222
|
|
|
|
key: train_accuracy
|
|
value: [0.88888889 0.86340206 0.87371134 0.87628866 0.87113402 0.87628866
|
|
0.87113402 0.87628866 0.85824742 0.89175258]
|
|
|
|
mean value: 0.8747136311569301
|
|
|
|
key: test_fscore
|
|
value: [0.85714286 0.87878788 0.91803279 0.95081967 0.82142857 0.96551724
|
|
0.875 0.83870968 0.93103448 0.87096774]
|
|
|
|
mean value: 0.890744090986847
|
|
|
|
key: train_fscore
|
|
value: [0.92051756 0.90275229 0.91042048 0.91176471 0.90909091 0.91143911
|
|
0.90842491 0.91240876 0.89981785 0.92279412]
|
|
|
|
mean value: 0.9109430694169829
|
|
|
|
key: test_precision
|
|
value: [0.79411765 0.78378378 0.875 0.90625 0.85185185 0.93333333
|
|
0.77777778 0.76470588 0.9 0.79411765]
|
|
|
|
mean value: 0.8380937923217335
|
|
|
|
key: train_precision
|
|
value: [0.87368421 0.85121107 0.8556701 0.86111111 0.85034014 0.86666667
|
|
0.85813149 0.85910653 0.84589041 0.87456446]
|
|
|
|
mean value: 0.8596376188103771
|
|
|
|
key: test_recall
|
|
value: [0.93103448 1. 0.96551724 1. 0.79310345 1.
|
|
1. 0.92857143 0.96428571 0.96428571]
|
|
|
|
mean value: 0.954679802955665
|
|
|
|
key: train_recall
|
|
value: [0.97265625 0.9609375 0.97265625 0.96875 0.9765625 0.96108949
|
|
0.96498054 0.97276265 0.96108949 0.9766537 ]
|
|
|
|
mean value: 0.9688138375486381
|
|
|
|
key: test_roc_auc
|
|
value: [0.73218391 0.71428571 0.83990148 0.89285714 0.75369458 0.93333333
|
|
0.73333333 0.69761905 0.88214286 0.74880952]
|
|
|
|
mean value: 0.792816091954023
|
|
|
|
key: train_roc_auc
|
|
value: [0.84892354 0.81758996 0.82723722 0.83285985 0.82161458 0.83550658
|
|
0.82600172 0.82989277 0.80878902 0.85092227]
|
|
|
|
mean value: 0.829933751991992
|
|
|
|
key: test_jcc
|
|
value: [0.75 0.78378378 0.84848485 0.90625 0.6969697 0.93333333
|
|
0.77777778 0.72222222 0.87096774 0.77142857]
|
|
|
|
mean value: 0.8061217975935718
|
|
|
|
key: train_jcc
|
|
value: [0.85273973 0.82274247 0.83557047 0.83783784 0.83333333 0.83728814
|
|
0.83221477 0.83892617 0.81788079 0.85665529]
|
|
|
|
mean value: 0.8365189001908526
|
|
|
|
MCC on Blind test: 0.75
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.38460779 1.60607076 1.54561806 1.4221251 1.57054877 1.54352975
|
|
1.41555548 1.55845952 1.54474926 1.41102624]
|
|
|
|
mean value: 1.5002290725708007
|
|
|
|
key: score_time
|
|
value: [0.01263404 0.01581979 0.01433444 0.01546478 0.01272631 0.01477623
|
|
0.0147686 0.01492023 0.01476097 0.01733851]
|
|
|
|
mean value: 0.014754390716552735
|
|
|
|
key: test_mcc
|
|
value: [0.80277297 0.78481149 0.68226601 0.78817734 0.65625201 0.94928891
|
|
0.80104099 0.79313677 0.80536675 0.63689536]
|
|
|
|
mean value: 0.7700008592726074
|
|
|
|
key: train_mcc
|
|
value: [1. 0.98854135 0.99426489 0.99426489 0.99426489 0.99424345
|
|
0.99424345 0.98847536 0.98849821 1. ]
|
|
|
|
mean value: 0.99367964807764
|
|
|
|
key: test_accuracy
|
|
value: [0.90909091 0.90697674 0.86046512 0.90697674 0.8372093 0.97674419
|
|
0.90697674 0.90697674 0.90697674 0.8372093 ]
|
|
|
|
mean value: 0.8955602536997885
|
|
|
|
key: train_accuracy
|
|
value: [1. 0.99484536 0.99742268 0.99742268 0.99742268 0.99742268
|
|
0.99742268 0.99484536 0.99484536 1. ]
|
|
|
|
mean value: 0.9971649484536083
|
|
|
|
key: test_fscore
|
|
value: [0.93548387 0.93333333 0.89655172 0.93103448 0.87272727 0.98245614
|
|
0.93333333 0.93103448 0.92592593 0.87719298]
|
|
|
|
mean value: 0.9219073548749797
|
|
|
|
key: train_fscore
|
|
value: [1. 0.99610895 0.99805068 0.99805068 0.99805068 0.99805825
|
|
0.99805825 0.99610895 0.99612403 1. ]
|
|
|
|
mean value: 0.9978610481478432
|
|
|
|
key: test_precision
|
|
value: [0.87878788 0.90322581 0.89655172 0.93103448 0.92307692 0.96551724
|
|
0.875 0.9 0.96153846 0.86206897]
|
|
|
|
mean value: 0.9096801483647979
|
|
|
|
key: train_precision
|
|
value: [1. 0.99224806 0.99610895 0.99610895 0.99610895 0.99612403
|
|
0.99612403 0.99610895 0.99227799 1. ]
|
|
|
|
mean value: 0.9961209913974369
|
|
|
|
key: test_recall
|
|
value: [1. 0.96551724 0.89655172 0.93103448 0.82758621 1.
|
|
1. 0.96428571 0.89285714 0.89285714]
|
|
|
|
mean value: 0.9370689655172414
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 0.99610895 1. 1. ]
|
|
|
|
mean value: 0.9996108949416342
|
|
|
|
key: test_roc_auc
|
|
value: [0.86666667 0.87561576 0.841133 0.89408867 0.84236453 0.96666667
|
|
0.86666667 0.88214286 0.91309524 0.81309524]
|
|
|
|
mean value: 0.8761535303776683
|
|
|
|
key: train_roc_auc
|
|
value: [1. 0.99242424 0.99621212 0.99621212 0.99621212 0.99618321
|
|
0.99618321 0.99423768 0.99236641 1. ]
|
|
|
|
mean value: 0.9960031111303128
|
|
|
|
key: test_jcc
|
|
value: [0.87878788 0.875 0.8125 0.87096774 0.77419355 0.96551724
|
|
0.875 0.87096774 0.86206897 0.78125 ]
|
|
|
|
mean value: 0.8566253117942495
|
|
|
|
key: train_jcc
|
|
value: [1. 0.99224806 0.99610895 0.99610895 0.99610895 0.99612403
|
|
0.99612403 0.99224806 0.99227799 1. ]
|
|
|
|
mean value: 0.9957349026573531
|
|
|
|
MCC on Blind test: 0.84
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02398276 0.02131772 0.01714778 0.02014828 0.02126026 0.01924372
|
|
0.01785946 0.02201104 0.02174163 0.02186418]
|
|
|
|
mean value: 0.020657682418823244
|
|
|
|
key: score_time
|
|
value: [0.01223946 0.00949287 0.00887823 0.00909805 0.00917006 0.00899124
|
|
0.00920415 0.00913548 0.00919437 0.00912642]
|
|
|
|
mean value: 0.009453034400939942
|
|
|
|
key: test_mcc
|
|
value: [0.84691397 0.94928891 1. 0.84515772 0.94928891 0.80536675
|
|
0.84515772 0.86258195 0.95079854 0.8993825 ]
|
|
|
|
mean value: 0.8953936967542797
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.93181818 0.97674419 1. 0.93023256 0.97674419 0.90697674
|
|
0.93023256 0.93023256 0.97674419 0.95348837]
|
|
|
|
mean value: 0.9513213530655391
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94915254 0.98245614 1. 0.94736842 0.98245614 0.92592593
|
|
0.94736842 0.94339623 0.98181818 0.96551724]
|
|
|
|
mean value: 0.9625459240718411
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.93333333 1. 1. 0.96428571 1. 0.96153846
|
|
0.93103448 1. 1. 0.93333333]
|
|
|
|
mean value: 0.9723525325249464
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96551724 0.96551724 1. 0.93103448 0.96551724 0.89285714
|
|
0.96428571 0.89285714 0.96428571 1. ]
|
|
|
|
mean value: 0.9541871921182267
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.91609195 0.98275862 1. 0.92980296 0.98275862 0.91309524
|
|
0.91547619 0.94642857 0.98214286 0.93333333]
|
|
|
|
mean value: 0.9501888341543514
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.90322581 0.96551724 1. 0.9 0.96551724 0.86206897
|
|
0.9 0.89285714 0.96428571 0.93333333]
|
|
|
|
mean value: 0.9286805445203665
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.89
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.1146183 0.11344624 0.11548758 0.11227894 0.11485791 0.11730623
|
|
0.11329865 0.11277914 0.11396575 0.11266923]
|
|
|
|
mean value: 0.11407079696655273
|
|
|
|
key: score_time
|
|
value: [0.01814842 0.01810622 0.01765132 0.01797366 0.01852298 0.01800585
|
|
0.01813388 0.01781464 0.01797915 0.01783133]
|
|
|
|
mean value: 0.018016743659973144
|
|
|
|
key: test_mcc
|
|
value: [0.58621892 0.67480294 0.84383267 0.84515772 0.62324149 0.8993825
|
|
0.8993825 0.5773737 0.94928891 0.68920734]
|
|
|
|
mean value: 0.7587888702546957
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.81818182 0.86046512 0.93023256 0.93023256 0.8372093 0.95348837
|
|
0.95348837 0.81395349 0.97674419 0.86046512]
|
|
|
|
mean value: 0.893446088794926
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.875 0.90322581 0.95081967 0.94736842 0.88135593 0.96551724
|
|
0.96551724 0.86666667 0.98245614 0.9 ]
|
|
|
|
mean value: 0.9237927121614946
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.8 0.84848485 0.90625 0.96428571 0.86666667 0.93333333
|
|
0.93333333 0.8125 0.96551724 0.84375 ]
|
|
|
|
mean value: 0.8874121137483206
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96551724 0.96551724 1. 0.93103448 0.89655172 1.
|
|
1. 0.92857143 1. 0.96428571]
|
|
|
|
mean value: 0.9651477832512315
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.74942529 0.80418719 0.89285714 0.92980296 0.80541872 0.93333333
|
|
0.93333333 0.76428571 0.96666667 0.81547619]
|
|
|
|
mean value: 0.8594786535303777
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.77777778 0.82352941 0.90625 0.9 0.78787879 0.93333333
|
|
0.93333333 0.76470588 0.96551724 0.81818182]
|
|
|
|
mean value: 0.8610507586002007
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.89
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01006985 0.01007056 0.01002431 0.01016045 0.01020622 0.00985289
|
|
0.01042724 0.00990152 0.01044726 0.01008797]
|
|
|
|
mean value: 0.010124826431274414
|
|
|
|
key: score_time
|
|
value: [0.00898814 0.00896454 0.00909829 0.00897264 0.00896001 0.00890684
|
|
0.00891232 0.00885296 0.00876856 0.00931239]
|
|
|
|
mean value: 0.00897367000579834
|
|
|
|
key: test_mcc
|
|
value: [0.41084026 0.19099336 0.40711743 0.55732713 0.23158372 0.60576577
|
|
0.53276418 0.13003912 0.38571429 0.36815383]
|
|
|
|
mean value: 0.3820299081033254
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.75 0.65116279 0.74418605 0.81395349 0.6744186 0.81395349
|
|
0.79069767 0.62790698 0.72093023 0.72093023]
|
|
|
|
mean value: 0.7308139534883721
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.82539683 0.74576271 0.81355932 0.87096774 0.76666667 0.85185185
|
|
0.84210526 0.73333333 0.78571429 0.79310345]
|
|
|
|
mean value: 0.8028461450230509
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.76470588 0.73333333 0.8 0.81818182 0.74193548 0.88461538
|
|
0.82758621 0.6875 0.78571429 0.76666667]
|
|
|
|
mean value: 0.781023906163195
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.89655172 0.75862069 0.82758621 0.93103448 0.79310345 0.82142857
|
|
0.85714286 0.78571429 0.78571429 0.82142857]
|
|
|
|
mean value: 0.8278325123152709
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.6816092 0.59359606 0.69950739 0.75123153 0.61083744 0.81071429
|
|
0.76190476 0.55952381 0.69285714 0.67738095]
|
|
|
|
mean value: 0.6839162561576355
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.7027027 0.59459459 0.68571429 0.77142857 0.62162162 0.74193548
|
|
0.72727273 0.57894737 0.64705882 0.65714286]
|
|
|
|
mean value: 0.6728419036298793
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.08
|
|
|
|
Accuracy on Blind test: 0.58
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.62458539 1.63666844 1.60537386 1.65038395 1.66536379 1.63138604
|
|
1.636024 1.60371423 1.62825584 1.61648202]
|
|
|
|
mean value: 1.6298237562179565
|
|
|
|
key: score_time
|
|
value: [0.09931064 0.09000063 0.09789276 0.0941186 0.09213281 0.10003638
|
|
0.09792519 0.09421635 0.09055543 0.09021354]
|
|
|
|
mean value: 0.09464023113250733
|
|
|
|
key: test_mcc
|
|
value: [0.79532948 0.94742759 0.84383267 0.94742759 0.83936556 0.94928891
|
|
0.8993825 0.74102654 1. 0.85004744]
|
|
|
|
mean value: 0.8813128291534343
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.90909091 0.97674419 0.93023256 0.97674419 0.93023256 0.97674419
|
|
0.95348837 0.88372093 1. 0.93023256]
|
|
|
|
mean value: 0.946723044397463
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.93333333 0.98305085 0.95081967 0.98305085 0.94915254 0.98245614
|
|
0.96551724 0.9122807 1. 0.94915254]
|
|
|
|
mean value: 0.9608813868610071
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.90322581 0.96666667 0.90625 0.96666667 0.93333333 0.96551724
|
|
0.93333333 0.89655172 1. 0.90322581]
|
|
|
|
mean value: 0.9374770578420467
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96551724 1. 1. 1. 0.96551724 1.
|
|
1. 0.92857143 1. 1. ]
|
|
|
|
mean value: 0.9859605911330049
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.88275862 0.96428571 0.89285714 0.96428571 0.91133005 0.96666667
|
|
0.93333333 0.86428571 1. 0.9 ]
|
|
|
|
mean value: 0.9279802955665025
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.875 0.96666667 0.90625 0.96666667 0.90322581 0.96551724
|
|
0.93333333 0.83870968 1. 0.90322581]
|
|
|
|
mean value: 0.9258595198368558
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.94
|
|
|
|
Accuracy on Blind test: 0.97
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...05', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.85724068 0.91922164 1.05045414 0.93645144 0.97065115 0.97242212
|
|
0.9203217 0.949862 0.94439721 0.943084 ]
|
|
|
|
mean value: 1.046410608291626
|
|
|
|
key: score_time
|
|
value: [0.23833275 0.25026321 0.27845311 0.22460794 0.25519276 0.24637866
|
|
0.28193164 0.22761822 0.26391268 0.16427517]
|
|
|
|
mean value: 0.24309661388397216
|
|
|
|
key: test_mcc
|
|
value: [0.74381228 0.79227876 0.84383267 0.94742759 0.83936556 0.94928891
|
|
0.8993825 0.7412616 1. 0.85004744]
|
|
|
|
mean value: 0.8606697309329067
|
|
|
|
key: train_mcc
|
|
value: [0.94850869 0.93708276 0.9487737 0.9544252 0.9374974 0.94824314
|
|
0.94824314 0.92545561 0.94290506 0.93724717]
|
|
|
|
mean value: 0.9428381861445998
|
|
|
|
key: test_accuracy
|
|
value: [0.88636364 0.90697674 0.93023256 0.97674419 0.93023256 0.97674419
|
|
0.95348837 0.88372093 1. 0.93023256]
|
|
|
|
mean value: 0.9374735729386892
|
|
|
|
key: train_accuracy
|
|
value: [0.97674419 0.97164948 0.97680412 0.97938144 0.97164948 0.97680412
|
|
0.97680412 0.96649485 0.9742268 0.97164948]
|
|
|
|
mean value: 0.9742208103572285
|
|
|
|
key: test_fscore
|
|
value: [0.91803279 0.93548387 0.95081967 0.98305085 0.94915254 0.98245614
|
|
0.96551724 0.91525424 1. 0.94915254]
|
|
|
|
mean value: 0.9548919881205848
|
|
|
|
key: train_fscore
|
|
value: [0.98272553 0.97888676 0.98272553 0.98461538 0.9789675 0.98272553
|
|
0.98272553 0.9752381 0.98091603 0.97904762]
|
|
|
|
mean value: 0.9808573492217716
|
|
|
|
key: test_precision
|
|
value: [0.875 0.87878788 0.90625 0.96666667 0.93333333 0.96551724
|
|
0.93333333 0.87096774 1. 0.90322581]
|
|
|
|
mean value: 0.9233082001887619
|
|
|
|
key: train_precision
|
|
value: [0.96603774 0.96226415 0.96603774 0.96969697 0.9588015 0.96969697
|
|
0.96969697 0.95522388 0.96254682 0.95895522]
|
|
|
|
mean value: 0.9638957950816772
|
|
|
|
key: test_recall
|
|
value: [0.96551724 1. 1. 1. 0.96551724 1.
|
|
1. 0.96428571 1. 1. ]
|
|
|
|
mean value: 0.9895320197044335
|
|
|
|
key: train_recall
|
|
value: [1. 0.99609375 1. 1. 1. 0.99610895
|
|
0.99610895 0.99610895 1. 1. ]
|
|
|
|
mean value: 0.9984420598249028
|
|
|
|
key: test_roc_auc
|
|
value: [0.84942529 0.85714286 0.89285714 0.96428571 0.91133005 0.96666667
|
|
0.93333333 0.84880952 1. 0.9 ]
|
|
|
|
mean value: 0.9123850574712644
|
|
|
|
key: train_roc_auc
|
|
value: [0.96564885 0.96016809 0.96590909 0.96969697 0.95833333 0.96752012
|
|
0.96752012 0.95225295 0.96183206 0.95801527]
|
|
|
|
mean value: 0.9626896859383592
|
|
|
|
key: test_jcc
|
|
value: [0.84848485 0.87878788 0.90625 0.96666667 0.90322581 0.96551724
|
|
0.93333333 0.84375 1. 0.90322581]
|
|
|
|
mean value: 0.9149241581555263
|
|
|
|
key: train_jcc
|
|
value: [0.96603774 0.95864662 0.96603774 0.96969697 0.9588015 0.96603774
|
|
0.96603774 0.95167286 0.96254682 0.95895522]
|
|
|
|
mean value: 0.962447093057542
|
|
|
|
MCC on Blind test: 0.94
|
|
|
|
Accuracy on Blind test: 0.97
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02334285 0.00983024 0.00985956 0.00978518 0.00993013 0.00973535
|
|
0.00983548 0.00996304 0.0098989 0.00982642]
|
|
|
|
mean value: 0.011200714111328124
|
|
|
|
key: score_time
|
|
value: [0.01340628 0.00875068 0.00893688 0.00888753 0.00888848 0.00883889
|
|
0.00882483 0.00888896 0.00888062 0.00899267]
|
|
|
|
mean value: 0.009329581260681152
|
|
|
|
key: test_mcc
|
|
value: [0.49425287 0.55732713 0.67416594 0.74102654 0.36453202 0.79313677
|
|
0.63660014 0.46554006 0.63247577 0.52368994]
|
|
|
|
mean value: 0.5882747173578298
|
|
|
|
key: train_mcc
|
|
value: [0.65753645 0.65942575 0.67797225 0.64082272 0.65903595 0.64878733
|
|
0.64546126 0.66416419 0.6450794 0.6828107 ]
|
|
|
|
mean value: 0.6581095999527902
|
|
|
|
key: test_accuracy
|
|
value: [0.77272727 0.81395349 0.86046512 0.88372093 0.72093023 0.90697674
|
|
0.8372093 0.76744186 0.8372093 0.79069767]
|
|
|
|
mean value: 0.8191331923890064
|
|
|
|
key: train_accuracy
|
|
value: [0.8501292 0.85051546 0.85824742 0.84278351 0.85051546 0.84536082
|
|
0.84536082 0.85309278 0.84536082 0.86082474]
|
|
|
|
mean value: 0.8502191054636511
|
|
|
|
key: test_fscore
|
|
value: [0.82758621 0.87096774 0.9 0.9122807 0.79310345 0.93103448
|
|
0.8852459 0.83333333 0.88135593 0.84745763]
|
|
|
|
mean value: 0.8682365375915616
|
|
|
|
key: train_fscore
|
|
value: [0.89056604 0.89056604 0.89563567 0.88555347 0.89097744 0.88549618
|
|
0.88764045 0.89265537 0.8880597 0.89772727]
|
|
|
|
mean value: 0.8904877637720091
|
|
|
|
key: test_precision
|
|
value: [0.82758621 0.81818182 0.87096774 0.92857143 0.79310345 0.9
|
|
0.81818182 0.78125 0.83870968 0.80645161]
|
|
|
|
mean value: 0.8383003752365543
|
|
|
|
key: train_precision
|
|
value: [0.86131387 0.86131387 0.87084871 0.85198556 0.85869565 0.86891386
|
|
0.85559567 0.8649635 0.85304659 0.87453875]
|
|
|
|
mean value: 0.8621216027021169
|
|
|
|
key: test_recall
|
|
value: [0.82758621 0.93103448 0.93103448 0.89655172 0.79310345 0.96428571
|
|
0.96428571 0.89285714 0.92857143 0.89285714]
|
|
|
|
mean value: 0.9022167487684729
|
|
|
|
key: train_recall
|
|
value: [0.921875 0.921875 0.921875 0.921875 0.92578125 0.90272374
|
|
0.92217899 0.92217899 0.92607004 0.92217899]
|
|
|
|
mean value: 0.9208611989299611
|
|
|
|
key: test_roc_auc
value: [0.74712644 0.75123153 0.8226601 0.87684729 0.68226601 0.88214286
 0.78214286 0.71309524 0.79761905 0.74642857]

mean value: 0.7801559934318555

key: train_roc_auc
value: [0.81589933 0.81699811 0.82836174 0.80563447 0.81516335 0.81777408
 0.80841774 0.81986812 0.80654647 0.8313185 ]

mean value: 0.8165981914150152

key: test_jcc
value: [0.70588235 0.77142857 0.81818182 0.83870968 0.65714286 0.87096774
 0.79411765 0.71428571 0.78787879 0.73529412]

mean value: 0.7693889285919646

key: train_jcc
value: [0.80272109 0.80272109 0.81099656 0.79461279 0.80338983 0.79452055
 0.7979798 0.80612245 0.79865772 0.81443299]

mean value: 0.8026154868282023

MCC on Blind test: 0.52

Accuracy on Blind test: 0.78

Model_name: XGBoost
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000...
interaction_constraints=None, learning_rate=None,
max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan,
monotone_constraints=None, n_estimators=100,
n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None,
reg_lambda=None, scale_pos_weight=None,
subsample=None, tree_method=None,
use_label_encoder=False,
validate_parameters=None, verbosity=0))])

key: fit_time
value: [0.25924444 0.05974245 0.06969905 0.06822252 0.05877757 0.06371927
 0.06096601 0.07231355 0.07287455 0.25041699]

mean value: 0.10359764099121094

key: score_time
value: [0.01188898 0.01114655 0.010535 0.01076055 0.01073456 0.01050711
 0.01052427 0.01171732 0.01074386 0.01410055]

mean value: 0.011265873908996582

key: test_mcc
value: [0.84691397 1. 0.94742759 0.89408867 0.94928891 0.89761905
 0.8993825 0.89761905 1. 0.8993825 ]

mean value: 0.9231722243931844

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.93181818 1. 0.97674419 0.95348837 0.97674419 0.95348837
 0.95348837 0.95348837 1. 0.95348837]

mean value: 0.9652748414376321

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.94915254 1. 0.98305085 0.96551724 0.98245614 0.96428571
 0.96551724 0.96428571 1. 0.96551724]

mean value: 0.9739782682890745

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.93333333 1. 0.96666667 0.96551724 1. 0.96428571
 0.93333333 0.96428571 1. 0.93333333]

mean value: 0.9660755336617406

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.96551724 1. 1. 0.96551724 0.96551724 0.96428571
 1. 0.96428571 1. 1. ]

mean value: 0.982512315270936

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.91609195 1. 0.96428571 0.94704433 0.98275862 0.94880952
 0.93333333 0.94880952 1. 0.93333333]

mean value: 0.9574466338259442

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.90322581 1. 0.96666667 0.93333333 0.96551724 0.93103448
 0.93333333 0.93103448 1. 0.93333333]

mean value: 0.9497478680014831

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.84

Accuracy on Blind test: 0.92

Model_name: LDA
Model func: LinearDiscriminantAnalysis()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LinearDiscriminantAnalysis())])

key: fit_time
value: [0.0659194 0.04976654 0.11305714 0.05395603 0.12562203 0.07698584
 0.08231831 0.0750525 0.04251218 0.07967854]

mean value: 0.07648684978485107

key: score_time
value: [0.03880215 0.02202821 0.02107215 0.01819706 0.02468801 0.02061653
 0.02234888 0.01253152 0.02122045 0.02172065]

mean value: 0.022322559356689455

key: test_mcc
value: [0.69655172 0.84515772 0.84515772 0.78817734 0.84515772 0.89761905
 0.94928891 0.63689536 1. 0.69285714]

mean value: 0.8196862694382097

key: train_mcc
value: [0.97111276 0.96555771 0.95973448 0.97128177 0.97712771 0.95958106
 0.97117251 0.9769295 0.9653608 0.96542571]

mean value: 0.9683284032432156

key: test_accuracy
value: [0.86363636 0.93023256 0.93023256 0.90697674 0.93023256 0.95348837
 0.97674419 0.8372093 1. 0.86046512]

mean value: 0.9189217758985201

key: train_accuracy
value: [0.9870801 0.98453608 0.98195876 0.9871134 0.98969072 0.98195876
 0.9871134 0.98969072 0.98453608 0.98453608]

mean value: 0.985821412397773

key: test_fscore
value: [0.89655172 0.94736842 0.94736842 0.93103448 0.94736842 0.96428571
 0.98245614 0.87719298 1. 0.89285714]

mean value: 0.9386483450004321

key: train_fscore
value: [0.99025341 0.98837209 0.98640777 0.99029126 0.99224806 0.98646035
 0.99032882 0.99224806 0.98837209 0.98841699]

mean value: 0.9893398907205294

key: test_precision
value: [0.89655172 0.96428571 0.96428571 0.93103448 0.96428571 0.96428571
 0.96551724 0.86206897 1. 0.89285714]

mean value: 0.9405172413793104

key: train_precision
value: [0.98832685 0.98076923 0.98069498 0.98455598 0.98461538 0.98076923
 0.98461538 0.98841699 0.98455598 0.98084291]

mean value: 0.9838162929119592

key: test_recall
value: [0.89655172 0.93103448 0.93103448 0.93103448 0.93103448 0.96428571
 1. 0.89285714 1. 0.89285714]

mean value: 0.9370689655172414

key: train_recall
value: [0.9921875 0.99609375 0.9921875 0.99609375 1. 0.9922179
 0.99610895 0.99610895 0.9922179 0.99610895]

mean value: 0.9949325145914397

key: test_roc_auc
value: [0.84827586 0.92980296 0.92980296 0.89408867 0.92980296 0.94880952
 0.96666667 0.81309524 1. 0.84642857]

mean value: 0.9106773399014779

key: train_roc_auc
value: [0.98464337 0.97910748 0.97715436 0.98289536 0.98484848 0.97702498
 0.9827873 0.98660409 0.98084177 0.97897051]

mean value: 0.9814877701340265

key: test_jcc
value: [0.8125 0.9 0.9 0.87096774 0.9 0.93103448
 0.96551724 0.78125 1. 0.80645161]

mean value: 0.886772107897664

key: train_jcc
value: [0.98069498 0.97701149 0.97318008 0.98076923 0.98461538 0.97328244
 0.98084291 0.98461538 0.97701149 0.97709924]

mean value: 0.9789122637095788

MCC on Blind test: 0.77

Accuracy on Blind test: 0.89

Model_name: Multinomial
Model func: MultinomialNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MultinomialNB())])

key: fit_time
value: [0.02146697 0.01106024 0.00995302 0.0117507 0.01304126 0.01269269
 0.01129675 0.01066136 0.01041627 0.01036024]

mean value: 0.012269949913024903

key: score_time
value: [0.01147509 0.01147795 0.01027989 0.00955701 0.00969648 0.010355
 0.00968647 0.01100302 0.00903153 0.00946951]

mean value: 0.010203194618225098

key: test_mcc
value: [0.48006374 0.63464776 0.61634173 0.84383267 0.38115218 0.74102654
 0.63660014 0.4630445 0.7952381 0.63247577]

mean value: 0.6224423130557006

key: train_mcc
value: [0.65864868 0.65370117 0.62501937 0.62426504 0.63554065 0.63458211
 0.64594126 0.64794192 0.63340179 0.66958019]

mean value: 0.6428622185866101

key: test_accuracy
value: [0.77272727 0.8372093 0.8372093 0.93023256 0.69767442 0.88372093
 0.8372093 0.76744186 0.90697674 0.8372093 ]

mean value: 0.8307610993657506

key: train_accuracy
value: [0.8501292 0.84793814 0.83505155 0.83505155 0.84020619 0.84020619
 0.84536082 0.84536082 0.84020619 0.8556701 ]

mean value: 0.8435180745358161

key: test_fscore
value: [0.83333333 0.89230769 0.8852459 0.95081967 0.75471698 0.9122807
 0.8852459 0.83870968 0.92857143 0.88135593]

mean value: 0.8762587222131497

key: train_fscore
value: [0.88973384 0.88846881 0.878327 0.87878788 0.88301887 0.88301887
 0.88721805 0.88593156 0.88389513 0.89513109]

mean value: 0.8853531081489168

key: test_precision
value: [0.80645161 0.80555556 0.84375 0.90625 0.83333333 0.89655172
 0.81818182 0.76470588 0.92857143 0.83870968]

mean value: 0.8442061032455589

key: train_precision
value: [0.86666667 0.86080586 0.85555556 0.85294118 0.8540146 0.85714286
 0.85818182 0.866171 0.85198556 0.86281588]

mean value: 0.8586280981124286

key: test_recall
value: [0.86206897 1. 0.93103448 1. 0.68965517 0.92857143
 0.96428571 0.92857143 0.92857143 0.92857143]

mean value: 0.9161330049261084

key: train_recall
value: [0.9140625 0.91796875 0.90234375 0.90625 0.9140625 0.91050584
 0.91828794 0.90661479 0.91828794 0.92996109]

mean value: 0.9138345087548638

key: test_roc_auc
value: [0.73103448 0.75 0.78694581 0.89285714 0.70197044 0.86428571
 0.78214286 0.69761905 0.89761905 0.79761905]

mean value: 0.7902093596059113

key: train_roc_auc
value: [0.81962667 0.81504498 0.8034446 0.80160985 0.8055161 0.80639796
 0.81028901 0.81590281 0.80265542 0.81994238]

mean value: 0.8100429772550632

key: test_jcc
value: [0.71428571 0.80555556 0.79411765 0.90625 0.60606061 0.83870968
 0.79411765 0.72222222 0.86666667 0.78787879]

mean value: 0.7835864524206555

key: train_jcc
value: [0.80136986 0.79931973 0.78305085 0.78378378 0.79054054 0.79054054
 0.7972973 0.79522184 0.79194631 0.81016949]

mean value: 0.7943240243778313

MCC on Blind test: 0.82

Accuracy on Blind test: 0.92

Model_name: Passive Aggresive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01843333 0.02376246 0.02306414 0.02690268 0.0255127 0.01901436
 0.02600908 0.02429175 0.02094388 0.02146673]

mean value: 0.02294011116027832

key: score_time
value: [0.00926924 0.01239872 0.01190901 0.01186895 0.01200056 0.01189756
 0.01200986 0.01208138 0.01191568 0.0124774 ]

mean value: 0.0117828369140625

key: test_mcc
value: [0.80277297 0.81883947 0.84383267 0.79990777 0.84515772 0.85004744
 0.74102654 0.63660014 0.79313677 0.84515772]

mean value: 0.7976479225553117

key: train_mcc
value: [0.9653815 0.83710153 0.85834721 0.9713635 0.96575651 0.80728276
 0.98276159 0.94253379 0.82968701 0.94253379]

mean value: 0.9102749176546568

key: test_accuracy
value: [0.90909091 0.90697674 0.93023256 0.90697674 0.93023256 0.93023256
 0.88372093 0.8372093 0.90697674 0.93023256]

mean value: 0.9071881606765327

key: train_accuracy
value: [0.98449612 0.91752577 0.93556701 0.9871134 0.98453608 0.91237113
 0.99226804 0.9742268 0.92268041 0.9742268 ]

mean value: 0.9585011587948533

key: test_fscore
value: [0.93548387 0.92592593 0.95081967 0.92857143 0.94736842 0.94915254
 0.9122807 0.8852459 0.93103448 0.94736842]

mean value: 0.9313251368226739

key: train_fscore
value: [0.98837209 0.93360996 0.95327103 0.99021526 0.98841699 0.93772894
 0.99415205 0.98084291 0.94464945 0.98084291]

mean value: 0.9692101586933536

key: test_precision
value: [0.87878788 1. 0.90625 0.96296296 0.96428571 0.90322581
 0.89655172 0.81818182 0.9 0.93103448]

mean value: 0.9161280387566539

key: train_precision
value: [0.98076923 0.99557522 0.91397849 0.99215686 0.97709924 0.88581315
 0.99609375 0.96603774 0.89824561 0.96603774]

mean value: 0.9571807030540272

key: test_recall
value: [1. 0.86206897 1. 0.89655172 0.93103448 1.
 0.92857143 0.96428571 0.96428571 0.96428571]

mean value: 0.9511083743842365

key: train_recall
value: [0.99609375 0.87890625 0.99609375 0.98828125 1. 0.99610895
 0.9922179 0.99610895 0.99610895 0.99610895]

mean value: 0.9836028696498055

key: test_roc_auc
value: [0.86666667 0.93103448 0.89285714 0.91256158 0.92980296 0.9
 0.86428571 0.78214286 0.88214286 0.91547619]

mean value: 0.8876970443349754

key: train_roc_auc
value: [0.97896291 0.93566525 0.90713778 0.98656487 0.97727273 0.87210028
 0.99229216 0.96370333 0.88736745 0.96370333]

mean value: 0.9464770073439868

key: test_jcc
value: [0.87878788 0.86206897 0.90625 0.86666667 0.9 0.90322581
 0.83870968 0.79411765 0.87096774 0.9 ]

mean value: 0.8720794383837062

key: train_jcc
value: [0.97701149 0.87548638 0.91071429 0.98062016 0.97709924 0.88275862
 0.98837209 0.96240602 0.8951049 0.96240602]

mean value: 0.9411979191863091

MCC on Blind test: 0.82

Accuracy on Blind test: 0.92

Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01731038 0.02026916 0.01957202 0.03974581 0.02020097 0.02520275
 0.01930141 0.02426219 0.01893759 0.02047873]

mean value: 0.02252810001373291

key: score_time
value: [0.01059461 0.01298666 0.02482533 0.0290792 0.01322126 0.01507759
 0.01336336 0.01371551 0.01302266 0.01305866]

mean value: 0.015894484519958497

key: test_mcc
value: [0.55250625 0.67480294 0.84383267 0.79990777 0.62324149 0.80104099
 0.7952381 0.68920734 0.86258195 0.75210143]

mean value: 0.73944609392203

key: train_mcc
value: [0.5134357 0.86489996 0.92010222 0.93185204 0.90318906 0.80728276
 0.91707893 0.89149349 0.95379209 0.75859788]

mean value: 0.8461724129400746

key: test_accuracy
value: [0.79545455 0.86046512 0.93023256 0.90697674 0.8372093 0.90697674
 0.90697674 0.86046512 0.93023256 0.88372093]

mean value: 0.8818710359408034

key: train_accuracy
value: [0.78036176 0.93814433 0.96391753 0.96907216 0.95618557 0.91237113
 0.96134021 0.95103093 0.97938144 0.88917526]

mean value: 0.9300980313806974

key: test_fscore
value: [0.86567164 0.90322581 0.95081967 0.92857143 0.88135593 0.93333333
 0.92857143 0.9 0.94339623 0.91803279]

mean value: 0.9152978256353725

key: train_fscore
value: [0.85762144 0.95522388 0.97328244 0.97637795 0.96774194 0.93772894
 0.97017893 0.96421846 0.98449612 0.92280072]

mean value: 0.9509670814198928

key: test_precision
value: [0.76315789 0.84848485 0.90625 0.96296296 0.86666667 0.875
 0.92857143 0.84375 1. 0.84848485]

mean value: 0.8843328649907597

key: train_precision
value: [0.75073314 0.91428571 0.95149254 0.98412698 0.94095941 0.88581315
 0.99186992 0.93430657 0.98069498 0.85666667]

mean value: 0.9190949067342966

key: test_recall
value: [1. 0.96551724 1. 0.89655172 0.89655172 1.
 0.92857143 0.96428571 0.89285714 1. ]

mean value: 0.9544334975369458

key: train_recall
value: [1. 1. 0.99609375 0.96875 0.99609375 0.99610895
 0.94941634 0.99610895 0.98832685 1. ]

mean value: 0.9890898589494164

key: test_roc_auc
value: [0.7 0.80418719 0.89285714 0.91256158 0.80541872 0.86666667
 0.89761905 0.81547619 0.94642857 0.83333333]

mean value: 0.8474548440065681

key: train_roc_auc
value: [0.67557252 0.90909091 0.94880445 0.96922348 0.93744081 0.87210028
 0.96707458 0.92935218 0.97507945 0.83587786]

mean value: 0.9019616539715853

key: test_jcc
value: [0.76315789 0.82352941 0.90625 0.86666667 0.78787879 0.875
 0.86666667 0.81818182 0.89285714 0.84848485]

mean value: 0.8448673237237478

key: train_jcc
value: [0.75073314 0.91428571 0.94795539 0.95384615 0.9375 0.88275862
 0.94208494 0.93090909 0.96946565 0.85666667]

mean value: 0.908620536550167

MCC on Blind test: 0.82

Accuracy on Blind test: 0.92

Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.18523073 0.16147161 0.16436124 0.15916276 0.16037703 0.16199589
|
|
0.15805674 0.15851617 0.15941811 0.16083217]
|
|
|
|
mean value: 0.1629422426223755
|
|
|
|
key: score_time
|
|
value: [0.01647878 0.01538658 0.01684141 0.01527929 0.01533222 0.01516366
|
|
0.01532793 0.01527905 0.01649475 0.01618528]
|
|
|
|
mean value: 0.015776896476745607
|
|
|
|
key: test_mcc
|
|
value: [0.84691397 1. 0.94742759 0.89545704 0.89408867 0.94928891
|
|
0.8993825 0.89761905 1. 0.8993825 ]
|
|
|
|
mean value: 0.9229560240493224
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.93181818 1. 0.97674419 0.95348837 0.95348837 0.97674419
|
|
0.95348837 0.95348837 1. 0.95348837]
|
|
|
|
mean value: 0.9652748414376321
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94915254 1. 0.98305085 0.96666667 0.96551724 0.98245614
|
|
0.96551724 0.96428571 1. 0.96551724]
|
|
|
|
mean value: 0.9742163635271698
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.93333333 1. 0.96666667 0.93548387 0.96551724 0.96551724
|
|
0.93333333 0.96428571 1. 0.93333333]
|
|
|
|
mean value: 0.9597470734678744
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96551724 1. 1. 1. 0.96551724 1.
|
|
1. 0.96428571 1. 1. ]
|
|
|
|
mean value: 0.9895320197044335
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.91609195 1. 0.96428571 0.92857143 0.94704433 0.96666667
|
|
0.93333333 0.94880952 1. 0.93333333]
|
|
|
|
mean value: 0.9538136288998358
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.90322581 1. 0.96666667 0.93548387 0.93333333 0.96551724
|
|
0.93333333 0.93103448 1. 0.93333333]
|
|
|
|
mean value: 0.9501928068223953
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.89
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0493753 0.0488708 0.05318427 0.08073163 0.08004856 0.07434177
|
|
0.05143976 0.06507373 0.07488728 0.07277703]
|
|
|
|
mean value: 0.06507301330566406
|
|
|
|
key: score_time
|
|
value: [0.0248158 0.01857805 0.02386665 0.03749061 0.02310753 0.03106332
|
|
0.020509 0.02412271 0.02441359 0.03884387]
|
|
|
|
mean value: 0.026681113243103027
|
|
|
|
key: test_mcc
|
|
value: [0.84691397 0.94928891 0.94742759 0.89408867 0.9025825 0.84984956
|
|
0.8993825 0.89761905 1. 0.8993825 ]
|
|
|
|
mean value: 0.9086535254177944
|
|
|
|
key: train_mcc
|
|
value: [0.98856835 0.98276159 0.99426489 0.99426489 0.98851799 0.98276159
|
|
1. 0.99426489 0.99424345 0.98849821]
|
|
|
|
mean value: 0.9908145839110009
|
|
|
|
key: test_accuracy
|
|
value: [0.93181818 0.97674419 0.97674419 0.95348837 0.95348837 0.93023256
|
|
0.95348837 0.95348837 1. 0.95348837]
|
|
|
|
mean value: 0.9582980972515857
|
|
|
|
key: train_accuracy
|
|
value: [0.99483204 0.99226804 0.99742268 0.99742268 0.99484536 0.99226804
|
|
1. 0.99742268 0.99742268 0.99484536]
|
|
|
|
mean value: 0.9958749567116865
|
|
|
|
key: test_fscore
|
|
value: [0.94915254 0.98245614 0.98305085 0.96551724 0.96428571 0.94545455
|
|
0.96551724 0.96428571 1. 0.96551724]
|
|
|
|
mean value: 0.9685237228345291
|
|
|
|
key: train_fscore
|
|
value: [0.99607843 0.99415205 0.99805068 0.99805068 0.99609375 0.99415205
|
|
1. 0.99805068 0.99805825 0.99612403]
|
|
|
|
mean value: 0.9968810605158362
|
|
|
|
key: test_precision
|
|
value: [0.93333333 1. 0.96666667 0.96551724 1. 0.96296296
|
|
0.93333333 0.96428571 1. 0.93333333]
|
|
|
|
mean value: 0.9659432585294654
|
|
|
|
key: train_precision
|
|
value: [1. 0.9922179 0.99610895 0.99610895 0.99609375 0.99609375
|
|
1. 1. 0.99612403 0.99227799]
|
|
|
|
mean value: 0.9965025320951114
|
|
|
|
key: test_recall
|
|
value: [0.96551724 0.96551724 1. 0.96551724 0.93103448 0.92857143
|
|
1. 0.96428571 1. 1. ]
|
|
|
|
mean value: 0.9720443349753695
|
|
|
|
key: train_recall
|
|
value: [0.9921875 0.99609375 1. 1. 0.99609375 0.9922179
|
|
1. 0.99610895 1. 1. ]
|
|
|
|
mean value: 0.9972701848249027
|
|
|
|
key: test_roc_auc
|
|
value: [0.91609195 0.98275862 0.96428571 0.94704433 0.96551724 0.93095238
|
|
0.93333333 0.94880952 1. 0.93333333]
|
|
|
|
mean value: 0.9522126436781609
|
|
|
|
key: train_roc_auc
|
|
value: [0.99609375 0.99047112 0.99621212 0.99621212 0.994259 0.99229216
|
|
1. 0.99805447 0.99618321 0.99236641]
|
|
|
|
mean value: 0.9952144354612601
|
|
|
|
key: test_jcc
|
|
value: [0.90322581 0.96551724 0.96666667 0.93333333 0.93103448 0.89655172
|
|
0.93333333 0.93103448 1. 0.93333333]
|
|
|
|
mean value: 0.9394030404152762
|
|
|
|
key: train_jcc
|
|
value: [0.9921875 0.98837209 0.99610895 0.99610895 0.9922179 0.98837209
|
|
1. 0.99610895 0.99612403 0.99227799]
|
|
|
|
mean value: 0.9937878456413968
|
|
|
|
MCC on Blind test: 0.89
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.10673809 0.12460399 0.13710308 0.20830345 0.13556242 0.12027812
|
|
0.12287211 0.12001324 0.12852883 0.11976194]
|
|
|
|
mean value: 0.1323765277862549
|
|
|
|
key: score_time
|
|
value: [0.03763318 0.02303171 0.02376223 0.02338004 0.02684474 0.02358699
|
|
0.02333927 0.0231626 0.02329302 0.02722192]
|
|
|
|
mean value: 0.025525569915771484
|
|
|
|
key: test_mcc
|
|
value: [0.34678431 0.36578221 0.61849012 0.68226601 0.49649436 0.68689103
|
|
0.27702563 0.26854231 0.63660014 0.5773737 ]
|
|
|
|
mean value: 0.4956249811050279
|
|
|
|
key: train_mcc
|
|
value: [0.94850869 0.96008603 0.97143696 0.9544252 0.96008603 0.94290506
|
|
0.94290506 0.94290506 0.94857137 0.96562399]
|
|
|
|
mean value: 0.9537453449502975
|
|
|
|
key: test_accuracy
|
|
value: [0.72727273 0.74418605 0.8372093 0.86046512 0.79069767 0.86046512
|
|
0.69767442 0.69767442 0.8372093 0.81395349]
|
|
|
|
mean value: 0.7866807610993658
|
|
|
|
key: train_accuracy
|
|
value: [0.97674419 0.98195876 0.9871134 0.97938144 0.98195876 0.9742268
|
|
0.9742268 0.9742268 0.97680412 0.98453608]
|
|
|
|
mean value: 0.9791177175737233
|
|
|
|
key: test_fscore
|
|
value: [0.82352941 0.83076923 0.88888889 0.89655172 0.85714286 0.89655172
|
|
0.79365079 0.8 0.8852459 0.86666667]
|
|
|
|
mean value: 0.8538997198798349
|
|
|
|
key: train_fscore
|
|
value: [0.98272553 0.98651252 0.99032882 0.98461538 0.98651252 0.98091603
|
|
0.98091603 0.98091603 0.98279159 0.98846154]
|
|
|
|
mean value: 0.984469599779477
|
|
|
|
key: test_precision
|
|
value: [0.71794872 0.75 0.82352941 0.89655172 0.79411765 0.86666667
|
|
0.71428571 0.7027027 0.81818182 0.8125 ]
|
|
|
|
mean value: 0.789648440274708
|
|
|
|
key: train_precision
|
|
value: [0.96603774 0.97338403 0.98084291 0.96969697 0.97338403 0.96254682
|
|
0.96254682 0.96254682 0.96616541 0.97718631]
|
|
|
|
mean value: 0.9694337853019032
|
|
|
|
key: test_recall
|
|
value: [0.96551724 0.93103448 0.96551724 0.89655172 0.93103448 0.92857143
|
|
0.89285714 0.92857143 0.96428571 0.92857143]
|
|
|
|
mean value: 0.9332512315270937
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.61609195 0.64408867 0.76847291 0.841133 0.71551724 0.83095238
|
|
0.61309524 0.59761905 0.78214286 0.76428571]
|
|
|
|
mean value: 0.7173399014778326
|
|
|
|
key: train_roc_auc
|
|
value: [0.96564885 0.97348485 0.98106061 0.96969697 0.97348485 0.96183206
|
|
0.96183206 0.96183206 0.96564885 0.97709924]
|
|
|
|
mean value: 0.9691620402498266
|
|
|
|
key: test_jcc
|
|
value: [0.7 0.71052632 0.8 0.8125 0.75 0.8125
|
|
0.65789474 0.66666667 0.79411765 0.76470588]
|
|
|
|
mean value: 0.7468911248710011
|
|
|
|
key: train_jcc
|
|
value: [0.96603774 0.97338403 0.98084291 0.96969697 0.97338403 0.96254682
|
|
0.96254682 0.96254682 0.96616541 0.97718631]
|
|
|
|
mean value: 0.9694337853019032
|
|
|
|
MCC on Blind test: 0.27
|
|
|
|
Accuracy on Blind test: 0.69
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.61192012 0.59444022 0.59235024 0.59150219 0.58671045 0.59856415
|
|
0.60128736 0.59110236 0.59164453 0.59326339]
|
|
|
|
mean value: 0.5952785015106201
|
|
|
|
key: score_time
|
|
value: [0.0099473 0.00931048 0.00948477 0.00931215 0.0097363 0.00960946
|
|
0.0094378 0.00950861 0.00967026 0.0094254 ]
|
|
|
|
mean value: 0.0095442533493042
|
|
|
|
key: test_mcc
|
|
value: [0.84691397 0.94928891 1. 0.84515772 0.89408867 0.89761905
|
|
0.8993825 0.89761905 1. 0.8993825 ]
|
|
|
|
mean value: 0.9129452373166119
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.93181818 0.97674419 1. 0.93023256 0.95348837 0.95348837
|
|
0.95348837 0.95348837 1. 0.95348837]
|
|
|
|
mean value: 0.9606236786469344
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94915254 0.98245614 1. 0.94736842 0.96551724 0.96428571
|
|
0.96551724 0.96428571 1. 0.96551724]
|
|
|
|
mean value: 0.970410025648575
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.93333333 1. 1. 0.96428571 0.96551724 0.96428571
|
|
0.93333333 0.96428571 1. 0.93333333]
|
|
|
|
mean value: 0.9658374384236453
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96551724 0.96551724 1. 0.93103448 0.96551724 0.96428571
|
|
1. 0.96428571 1. 1. ]
|
|
|
|
mean value: 0.975615763546798
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.91609195 0.98275862 1. 0.92980296 0.94704433 0.94880952
|
|
0.93333333 0.94880952 1. 0.93333333]
|
|
|
|
mean value: 0.9539983579638752
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.90322581 0.96551724 1. 0.9 0.93333333 0.93103448
|
|
0.93333333 0.93103448 1. 0.93333333]
|
|
|
|
mean value: 0.9430812013348164
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.84
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.02912045 0.02907252 0.02943158 0.02924204 0.03977633 0.04929161
|
|
0.04781103 0.05106616 0.04255533 0.02873945]
|
|
|
|
mean value: 0.037610650062561035
|
|
|
|
key: score_time
|
|
value: [0.01276016 0.01829052 0.01310325 0.01456141 0.01490211 0.01842403
|
|
0.01963758 0.01851559 0.0146873 0.01568079]
|
|
|
|
mean value: 0.016056275367736815
|
|
|
|
key: test_mcc
|
|
value: [ 0.0446356 0.1779546 0.29311846 0.0045305 0.1993421 0.02624453
|
|
0.07005059 0.15163508 0.1015749 -0.20044593]
|
|
|
|
mean value: 0.08686404365068628
|
|
|
|
key: train_mcc
|
|
value: [0.3513966 0.34143168 0.31600695 0.34143168 0.33312685 0.34340112
|
|
0.34340112 0.36757639 0.35159962 0.35965479]
|
|
|
|
mean value: 0.34490268018852405
|
|
|
|
key: test_accuracy
|
|
value: [0.63636364 0.6744186 0.72093023 0.65116279 0.69767442 0.60465116
|
|
0.65116279 0.65116279 0.65116279 0.58139535]
|
|
|
|
mean value: 0.6520084566596195
|
|
|
|
key: train_accuracy
|
|
value: [0.72093023 0.71649485 0.70876289 0.71649485 0.71391753 0.71907216
|
|
0.71907216 0.72680412 0.72164948 0.7242268 ]
|
|
|
|
mean value: 0.7187425077918964
|
|
|
|
key: test_fscore
|
|
value: [0.76470588 0.78125 0.81818182 0.7826087 0.8115942 0.73015873
|
|
0.7826087 0.76190476 0.7761194 0.73529412]
|
|
|
|
mean value: 0.7744426307433283
|
|
|
|
key: train_fscore
|
|
value: [0.82580645 0.82315113 0.8192 0.82315113 0.82182986 0.82504013
|
|
0.82504013 0.82903226 0.82636656 0.82769726]
|
|
|
|
mean value: 0.824631489480623
|
|
|
|
key: test_precision
|
|
value: [0.66666667 0.71428571 0.72972973 0.675 0.7 0.65714286
|
|
0.65853659 0.68571429 0.66666667 0.625 ]
|
|
|
|
mean value: 0.6778742505571774
|
|
|
|
key: train_precision
|
|
value: [0.7032967 0.69945355 0.69376694 0.69945355 0.69754768 0.70218579
|
|
0.70218579 0.70798898 0.70410959 0.70604396]
|
|
|
|
mean value: 0.7016032539215682
|
|
|
|
key: test_recall
|
|
value: [0.89655172 0.86206897 0.93103448 0.93103448 0.96551724 0.82142857
|
|
0.96428571 0.85714286 0.92857143 0.89285714]
|
|
|
|
mean value: 0.9050492610837438
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.51494253 0.57389163 0.60837438 0.50123153 0.55418719 0.51071429
|
|
0.51547619 0.56190476 0.53095238 0.44642857]
|
|
|
|
mean value: 0.5318103448275862
|
|
|
|
key: train_roc_auc
|
|
value: [0.58778626 0.58333333 0.5719697 0.58333333 0.57954545 0.58396947
|
|
0.58396947 0.59541985 0.58778626 0.59160305]
|
|
|
|
mean value: 0.5848716169326856
|
|
|
|
key: test_jcc
|
|
value: [0.61904762 0.64102564 0.69230769 0.64285714 0.68292683 0.575
|
|
0.64285714 0.61538462 0.63414634 0.58139535]
|
|
|
|
mean value: 0.6326948373048771
|
|
|
|
key: train_jcc
|
|
value: [0.7032967 0.69945355 0.69376694 0.69945355 0.69754768 0.70218579
|
|
0.70218579 0.70798898 0.70410959 0.70604396]
|
|
|
|
mean value: 0.7016032539215682
|
|
|
|
MCC on Blind test: -0.21
|
|
|
|
Accuracy on Blind test: 0.58
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03475094 0.03894949 0.03813624 0.03801632 0.03756666 0.03808069
|
|
0.03824639 0.03827929 0.03780437 0.03802657]
|
|
|
|
mean value: 0.037785696983337405
|
|
|
|
key: score_time
|
|
value: [0.02142906 0.02273035 0.02185512 0.02089429 0.02069378 0.02342296
|
|
0.02245855 0.02128291 0.02075934 0.0240798 ]
|
|
|
|
mean value: 0.02196061611175537
|
|
|
|
key: test_mcc
|
|
value: [0.85146932 0.94928891 0.89408867 0.89408867 0.73130353 0.94928891
|
|
0.80104099 0.80104099 0.94928891 0.84515772]
|
|
|
|
mean value: 0.8666056604496469
|
|
|
|
key: train_mcc
|
|
value: [0.93647644 0.94844498 0.93680394 0.95413965 0.9544252 0.95381103
|
|
0.95968877 0.93655577 0.94253379 0.94824314]
|
|
|
|
mean value: 0.9471122705214379
|
|
|
|
key: test_accuracy
|
|
value: [0.93181818 0.97674419 0.95348837 0.95348837 0.88372093 0.97674419
|
|
0.90697674 0.90697674 0.97674419 0.93023256]
|
|
|
|
mean value: 0.9396934460887949
|
|
|
|
key: train_accuracy
|
|
value: [0.97157623 0.97680412 0.97164948 0.97938144 0.97938144 0.97938144
|
|
0.98195876 0.97164948 0.9742268 0.97680412]
|
|
|
|
mean value: 0.9762813340792242
|
|
|
|
key: test_fscore
|
|
value: [0.95081967 0.98245614 0.96551724 0.96551724 0.91525424 0.98245614
|
|
0.93333333 0.93333333 0.98245614 0.94736842]
|
|
|
|
mean value: 0.9558511900949833
|
|
|
|
key: train_fscore
|
|
value: [0.97880539 0.98265896 0.97880539 0.98455598 0.98461538 0.98455598
|
|
0.98651252 0.97888676 0.98084291 0.98272553]
|
|
|
|
mean value: 0.982296482327693
|
|
|
|
key: test_precision
|
|
value: [0.90625 1. 0.96551724 0.96551724 0.9 0.96551724
|
|
0.875 0.875 0.96551724 0.93103448]
|
|
|
|
mean value: 0.9349353448275862
|
|
|
|
key: train_precision
|
|
value: [0.96577947 0.96958175 0.96577947 0.97328244 0.96969697 0.97701149
|
|
0.97709924 0.96590909 0.96603774 0.96969697]
|
|
|
|
mean value: 0.969987462420492
|
|
|
|
key: test_recall
|
|
value: [1. 0.96551724 0.96551724 0.96551724 0.93103448 1.
|
|
1. 1. 1. 0.96428571]
|
|
|
|
mean value: 0.9791871921182266
|
|
|
|
key: train_recall
|
|
value: [0.9921875 0.99609375 0.9921875 0.99609375 1. 0.9922179
|
|
0.99610895 0.9922179 0.99610895 0.99610895]
|
|
|
|
mean value: 0.9949325145914397
|
|
|
|
key: test_roc_auc
|
|
value: [0.9 0.98275862 0.94704433 0.94704433 0.85837438 0.96666667
|
|
0.86666667 0.86666667 0.96666667 0.91547619]
|
|
|
|
mean value: 0.9217364532019705
|
|
|
|
key: train_roc_auc
|
|
value: [0.9617426 0.96774384 0.96200284 0.97153172 0.96969697 0.97320819
|
|
0.97515371 0.9617578 0.96370333 0.96752012]
|
|
|
|
mean value: 0.9674061138767979
|
|
|
|
key: test_jcc
|
|
value: [0.90625 0.96551724 0.93333333 0.93333333 0.84375 0.96551724
|
|
0.875 0.875 0.96551724 0.9 ]
|
|
|
|
mean value: 0.9163218390804598
|
|
|
|
key: train_jcc
|
|
value: [0.95849057 0.96590909 0.95849057 0.96958175 0.96969697 0.96958175
|
|
0.97338403 0.95864662 0.96240602 0.96603774]
|
|
|
|
mean value: 0.9652225088626647
|
|
|
|
MCC on Blind test: 0.88
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.22743607 0.27104139 0.27143574 0.37188888 0.2696569 0.27980232
|
|
0.29186916 0.26531053 0.31099105 0.19231105]
|
|
|
|
mean value: 0.27517430782318114
|
|
|
|
key: score_time
|
|
value: [0.02266073 0.02105594 0.04440308 0.01608038 0.01722169 0.02272081
|
|
0.02047729 0.02023792 0.01908875 0.01224136]
|
|
|
|
mean value: 0.02161879539489746
|
|
|
|
key: test_mcc
|
|
value: [0.79532948 0.94928891 0.89408867 0.89408867 0.73130353 0.94928891
|
|
0.80104099 0.63247577 0.94928891 0.84984956]
|
|
|
|
mean value: 0.8446043383090877
|
|
|
|
key: train_mcc
|
|
value: [0.96531572 0.95984378 0.93680394 0.95413965 0.9544252 0.95381103
|
|
0.95968877 0.95958106 0.94253379 0.95381103]
|
|
|
|
mean value: 0.9539953964174639
|
|
|
|
key: test_accuracy
|
|
value: [0.90909091 0.97674419 0.95348837 0.95348837 0.88372093 0.97674419
|
|
0.90697674 0.8372093 0.97674419 0.93023256]
|
|
|
|
mean value: 0.9304439746300212
|
|
|
|
key: train_accuracy
|
|
value: [0.98449612 0.98195876 0.97164948 0.97938144 0.97938144 0.97938144
|
|
0.98195876 0.98195876 0.9742268 0.97938144]
|
|
|
|
mean value: 0.9793774474546472
|
|
|
|
key: test_fscore
|
|
value: [0.93333333 0.98245614 0.96551724 0.96551724 0.91525424 0.98245614
|
|
0.93333333 0.88135593 0.98245614 0.94545455]
|
|
|
|
mean value: 0.948713428542399
|
|
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_sl.py:107: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
baseline_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_sl.py:110: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
baseline_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
key: train_fscore
|
|
value: [0.98832685 0.98646035 0.97880539 0.98455598 0.98461538 0.98455598
|
|
0.98651252 0.98646035 0.98084291 0.98455598]
|
|
|
|
mean value: 0.9845691713809857
|
|
|
|
key: test_precision
|
|
value: [0.90322581 1. 0.96551724 0.96551724 0.9 0.96551724
|
|
0.875 0.83870968 0.96551724 0.96296296]
|
|
|
|
mean value: 0.9341967412351172
|
|
|
|
key: train_precision
|
|
value: [0.98449612 0.97701149 0.96577947 0.97328244 0.96969697 0.97701149
|
|
0.97709924 0.98076923 0.96603774 0.97701149]
|
|
|
|
mean value: 0.9748195690174807
|
|
|
|
key: test_recall
|
|
value: [0.96551724 0.96551724 0.96551724 0.96551724 0.93103448 1.
|
|
1. 0.92857143 1. 0.92857143]
|
|
|
|
mean value: 0.9650246305418719
|
|
|
|
key: train_recall
|
|
value: [0.9921875 0.99609375 0.9921875 0.99609375 1. 0.9922179
|
|
0.99610895 0.9922179 0.99610895 0.9922179 ]
|
|
|
|
mean value: 0.994543409533074
|
|
|
|
key: test_roc_auc
|
|
value: [0.88275862 0.98275862 0.94704433 0.94704433 0.85837438 0.96666667
|
|
0.86666667 0.79761905 0.96666667 0.93095238]
|
|
|
|
mean value: 0.9146551724137931
|
|
|
|
key: train_roc_auc
|
|
value: [0.98082657 0.9753196 0.96200284 0.97153172 0.96969697 0.97320819
|
|
0.97515371 0.97702498 0.96370333 0.97320819]
|
|
|
|
mean value: 0.9721676103876334
|
|
|
|
key: test_jcc
|
|
value: [0.875 0.96551724 0.93333333 0.93333333 0.84375 0.96551724
|
|
0.875 0.78787879 0.96551724 0.89655172]
|
|
|
|
mean value: 0.9041398902821317
|
|
|
|
key: train_jcc
|
|
value: [0.97692308 0.97328244 0.95849057 0.96958175 0.96969697 0.96958175
|
|
0.97338403 0.97328244 0.96240602 0.96958175]
|
|
|
|
mean value: 0.96962107907581
|
|
|
|
MCC on Blind test: 0.82
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.04111385 0.03918624 0.03823042 0.04130435 0.03554368 0.03858113
|
|
0.03941131 0.03819084 0.03714442 0.03736854]
|
|
|
|
mean value: 0.03860747814178467
|
|
|
|
key: score_time
|
|
value: [0.01241827 0.01450682 0.01431775 0.01575279 0.01216865 0.01636386
|
|
0.01581478 0.01859069 0.01580238 0.01748109]
|
|
|
|
mean value: 0.015321707725524903
|
|
|
|
key: test_mcc
|
|
value: [0.8953202 0.92980296 0.78940887 0.8615634 0.65634573 0.89988258
|
|
0.8953202 0.82490815 0.86189955 0.86189955]
|
|
|
|
mean value: 0.8476351166112527
|
|
|
|
key: train_mcc
|
|
value: [0.89480004 0.9025977 0.89491047 0.89108657 0.90309643 0.90259326
|
|
0.88726363 0.89126481 0.88699724 0.90667624]
|
|
|
|
mean value: 0.8961286395835659
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.96491228 0.89473684 0.92982456 0.8245614 0.94736842
|
|
0.94736842 0.9122807 0.92982456 0.92982456]
|
|
|
|
mean value: 0.9228070175438596
|
|
|
|
key: train_accuracy
|
|
value: [0.94736842 0.95126706 0.94736842 0.9454191 0.95126706 0.95126706
|
|
0.94346979 0.9454191 0.94346979 0.95321637]
|
|
|
|
mean value: 0.947953216374269
|
|
|
|
key: test_fscore
|
|
value: [0.94736842 0.96551724 0.89655172 0.93333333 0.81481481 0.94915254
|
|
0.94736842 0.90909091 0.93103448 0.93103448]
|
|
|
|
mean value: 0.9225266372751685
|
|
|
|
key: train_fscore
|
|
value: [0.94757282 0.95145631 0.94777563 0.94594595 0.95201536 0.9516441
|
|
0.94433781 0.94636015 0.94390716 0.95384615]
|
|
|
|
mean value: 0.9484861432129039
|
|
|
|
key: test_precision
|
|
value: [0.96428571 0.96551724 0.89655172 0.90322581 0.88 0.90322581
|
|
0.93103448 0.92592593 0.9 0.9 ]
|
|
|
|
mean value: 0.9169766701390728
|
|
|
|
key: train_precision
|
|
value: [0.94208494 0.94594595 0.93869732 0.9351145 0.93584906 0.94615385
|
|
0.93181818 0.93207547 0.93846154 0.94296578]
|
|
|
|
mean value: 0.9389166584058478
|
|
|
|
key: test_recall
|
|
value: [0.93103448 0.96551724 0.89655172 0.96551724 0.75862069 1.
|
|
0.96428571 0.89285714 0.96428571 0.96428571]
|
|
|
|
mean value: 0.930295566502463
|
|
|
|
key: train_recall
|
|
value: [0.953125 0.95703125 0.95703125 0.95703125 0.96875 0.95719844
|
|
0.95719844 0.96108949 0.94941634 0.96498054]
|
|
|
|
mean value: 0.9582852018482491
|
|
|
|
key: test_roc_auc
|
|
value: [0.9476601 0.96490148 0.89470443 0.92918719 0.82573892 0.94827586
|
|
0.9476601 0.91194581 0.93041872 0.93041872]
|
|
|
|
mean value: 0.9230911330049262
|
|
|
|
key: train_roc_auc
|
|
value: [0.94737962 0.95127827 0.94738722 0.9454417 0.95130107 0.95125547
|
|
0.94344297 0.9453885 0.94345817 0.9531934 ]
|
|
|
|
mean value: 0.9479526386186771
|
|
|
|
key: test_jcc
|
|
value: [0.9 0.93333333 0.8125 0.875 0.6875 0.90322581
|
|
0.9 0.83333333 0.87096774 0.87096774]
|
|
|
|
mean value: 0.8586827956989247
|
|
|
|
key: train_jcc
|
|
value: [0.900369 0.90740741 0.90073529 0.8974359 0.90842491 0.90774908
|
|
0.89454545 0.89818182 0.89377289 0.91176471]
|
|
|
|
mean value: 0.9020386460949191
|
|
|
|
MCC on Blind test: 0.89
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
    n_estimators=1000, n_jobs=10, oob_score=True,
    random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
    colsample_bynode=None, colsample_bytree=None,
    enable_categorical=False, gamma=None, gpu_id=None,
    importance_type=None, interaction_constraints=None,
    learning_rate=None, max_delta_step=None, max_depth=None,
    min_child_weight=None, missing=nan, monotone_constraints=None,
    n_estimators=100, n_jobs=None, num_parallel_tree=None,
    predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
    scale_pos_weight=None, subsample=None, tree_method=None,
    use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
    ColumnTransformer(remainder='passthrough',
    transformers=[('num', MinMaxScaler(),
    Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
    'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
    'mcsm_ppi2_affinity', 'interface_dist',
    ...
    'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
    'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
    dtype='object', length=168)),
    ('cat', OneHotEncoder(),
    Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
    'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
    dtype='object'))])),
    ('model', LogisticRegressionCV(random_state=42))])

key: fit_time
value: [0.85042238 1.0178709 0.85435915 0.93712735 0.84169865 0.8663528
 1.05083084 0.91208911 1.01056862 0.870682 ]

mean value: 0.921200180053711

key: score_time
value: [0.01473236 0.01605916 0.01670599 0.01941013 0.01508021 0.01618195
 0.01466751 0.01649761 0.0150547 0.01528406]

mean value: 0.015967369079589844

key: test_mcc
value: [0.96551724 0.8953202 0.85960591 0.85960591 0.8953202 0.86189955
 0.8951918 0.92980296 0.8951918 0.86789789]

mean value: 0.8925353457793551

key: train_mcc
value: [1. 0.98837192 0.98831165 1. 0.98831165 1.
 0.98443509 0.98443509 1. 1. ]

mean value: 0.9933865392883063

key: test_accuracy
value: [0.98245614 0.94736842 0.92982456 0.92982456 0.94736842 0.92982456
 0.94736842 0.96491228 0.94736842 0.92982456]

mean value: 0.9456140350877192

key: train_accuracy
value: [1. 0.99415205 0.99415205 1. 0.99415205 1.
 0.99220273 0.99220273 1. 1. ]

mean value: 0.9966861598440546

key: test_fscore
value: [0.98245614 0.94736842 0.93103448 0.93103448 0.94736842 0.93103448
 0.94545455 0.96428571 0.94545455 0.92307692]

mean value: 0.9448568159003731

key: train_fscore
value: [1. 0.99417476 0.99415205 1. 0.99415205 1.
 0.99224806 0.99224806 1. 1. ]

mean value: 0.9966974974879813

key: test_precision
value: [1. 0.96428571 0.93103448 0.93103448 0.96428571 0.9
 0.96296296 0.96428571 0.96296296 1. ]

mean value: 0.958085203430031

key: train_precision
value: [1. 0.98841699 0.9922179 1. 0.9922179 1.
 0.98841699 0.98841699 1. 1. ]

mean value: 0.9949686762916335

key: test_recall
value: [0.96551724 0.93103448 0.93103448 0.93103448 0.93103448 0.96428571
 0.92857143 0.96428571 0.92857143 0.85714286]

mean value: 0.9332512315270935

key: train_recall
value: [1. 1. 0.99609375 1. 0.99609375 1.
 0.99610895 0.99610895 1. 1. ]

mean value: 0.9984405398832685

key: test_roc_auc
value: [0.98275862 0.9476601 0.92980296 0.92980296 0.9476601 0.93041872
 0.94704433 0.96490148 0.94704433 0.92857143]

mean value: 0.9455665024630543

key: train_roc_auc
value: [1. 0.99416342 0.99415582 1. 0.99415582 1.
 0.9921951 0.9921951 1. 1. ]

mean value: 0.9966865272373541

key: test_jcc
value: [0.96551724 0.9 0.87096774 0.87096774 0.9 0.87096774
 0.89655172 0.93103448 0.89655172 0.85714286]

mean value: 0.8959701255363102

key: train_jcc
value: [1. 0.98841699 0.98837209 1. 0.98837209 1.
 0.98461538 0.98461538 1. 1. ]

mean value: 0.993439194369427

MCC on Blind test: 0.82

Accuracy on Blind test: 0.92
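
As a cross-check on how the per-fold scores relate to one another, the first test fold of the LogisticRegressionCV run above can be recomputed from a single confusion matrix. The counts below (TP=28, FN=1, FP=0, TN=28) are not printed anywhere in this log; they are an assumption inferred from the reported accuracy (56/57), precision (1.0) and recall (28/29). The sketch also treats test_jcc as the Jaccard index and test_roc_auc as the mean of sensitivity and specificity, which is what ROC AUC reduces to when it is computed from hard class labels. Both readings agree with the logged numbers but are not confirmed by the log itself.

```python
from math import sqrt

# Assumed confusion-matrix counts for test fold 1 (inferred, not printed in the log).
tp, fn, fp, tn = 28, 1, 0, 28

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)            # sensitivity
specificity = tn / (tn + fp)
fscore = 2 * tp / (2 * tp + fp + fn)
jcc = tp / (tp + fp + fn)          # Jaccard index of the positive class
mcc = (tp * tn - fp * fn) / sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)
# ROC AUC of a single operating point equals the mean of sensitivity and specificity.
roc_auc = (recall + specificity) / 2

for name, value in [
    ("test_mcc", mcc),
    ("test_accuracy", accuracy),
    ("test_fscore", fscore),
    ("test_precision", precision),
    ("test_recall", recall),
    ("test_roc_auc", roc_auc),
    ("test_jcc", jcc),
]:
    print(f"{name}: {value:.8f}")
```

Each printed value can be compared with the first entry of the corresponding array in the log.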

Model_name: Gaussian NB
Model func: GaussianNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
    n_estimators=1000, n_jobs=10, oob_score=True,
    random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
    colsample_bynode=None, colsample_bytree=None,
    enable_categorical=False, gamma=None, gpu_id=None,
    importance_type=None, interaction_constraints=None,
    learning_rate=None, max_delta_step=None, max_depth=None,
    min_child_weight=None, missing=nan, monotone_constraints=None,
    n_estimators=100, n_jobs=None, num_parallel_tree=None,
    predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
    scale_pos_weight=None, subsample=None, tree_method=None,
    use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01497316 0.01095653 0.01136231 0.01050925 0.01016259 0.01015019
|
|
0.01007819 0.0104444 0.0105536 0.01053381]
|
|
|
|
mean value: 0.010972404479980468
|
|
|
|
key: score_time
|
|
value: [0.01388407 0.00955224 0.00952005 0.00899887 0.00899982 0.00913286
|
|
0.00914168 0.00911069 0.00929952 0.00911975]
|
|
|
|
mean value: 0.009675955772399903
|
|
|
|
key: test_mcc
|
|
value: [0.65018988 0.8615634 0.622444 0.46490107 0.64901478 0.72242731
|
|
0.71921182 0.79778885 0.82512315 0.61805122]
|
|
|
|
mean value: 0.6930715492282729
|
|
|
|
key: train_mcc
|
|
value: [0.73363539 0.70504029 0.73152229 0.71648082 0.75803947 0.68918921
|
|
0.71149566 0.7453938 0.70358595 0.72724584]
|
|
|
|
mean value: 0.7221628704440908
|
|
|
|
key: test_accuracy
|
|
value: [0.8245614 0.92982456 0.80701754 0.71929825 0.8245614 0.84210526
|
|
0.85964912 0.89473684 0.9122807 0.80701754]
|
|
|
|
mean value: 0.8421052631578947
|
|
|
|
key: train_accuracy
|
|
value: [0.86354776 0.84990253 0.86354776 0.85575049 0.87719298 0.84210526
|
|
0.85380117 0.87134503 0.84795322 0.85964912]
|
|
|
|
mean value: 0.8584795321637426
|
|
|
|
key: test_fscore
|
|
value: [0.83333333 0.93333333 0.82539683 0.76470588 0.82758621 0.86153846
|
|
0.85714286 0.9 0.9122807 0.81355932]
|
|
|
|
mean value: 0.8528876923782588
|
|
|
|
key: train_fscore
|
|
value: [0.87179487 0.85819521 0.87037037 0.86346863 0.88268156 0.85137615
|
|
0.86136784 0.87686567 0.85869565 0.86956522]
|
|
|
|
mean value: 0.8664381178218032
|
|
|
|
key: test_precision
|
|
value: [0.80645161 0.90322581 0.76470588 0.66666667 0.82758621 0.75675676
|
|
0.85714286 0.84375 0.89655172 0.77419355]
|
|
|
|
mean value: 0.809703106169564
|
|
|
|
key: train_precision
|
|
value: [0.82068966 0.81184669 0.82746479 0.81818182 0.84341637 0.80555556
|
|
0.82042254 0.84229391 0.80338983 0.81355932]
|
|
|
|
mean value: 0.8206820472208091
|
|
|
|
key: test_recall
|
|
value: [0.86206897 0.96551724 0.89655172 0.89655172 0.82758621 1.
|
|
0.85714286 0.96428571 0.92857143 0.85714286]
|
|
|
|
mean value: 0.9055418719211823
|
|
|
|
key: train_recall
|
|
value: [0.9296875 0.91015625 0.91796875 0.9140625 0.92578125 0.90272374
|
|
0.90661479 0.91439689 0.92217899 0.93385214]
|
|
|
|
mean value: 0.917742278696498
|
|
|
|
key: test_roc_auc
|
|
value: [0.82389163 0.92918719 0.80541872 0.716133 0.82450739 0.84482759
|
|
0.85960591 0.89593596 0.91256158 0.80788177]
|
|
|
|
mean value: 0.8419950738916256
|
|
|
|
key: train_roc_auc
|
|
value: [0.86367643 0.85001976 0.86365364 0.85586393 0.87728751 0.84198687
|
|
0.85369802 0.87126094 0.84780824 0.8595042 ]
|
|
|
|
mean value: 0.8584759545233464
|
|
|
|
key: test_jcc
|
|
value: [0.71428571 0.875 0.7027027 0.61904762 0.70588235 0.75675676
|
|
0.75 0.81818182 0.83870968 0.68571429]
|
|
|
|
mean value: 0.7466280927049428
|
|
|
|
key: train_jcc
|
|
value: [0.77272727 0.7516129 0.7704918 0.75974026 0.79 0.74121406
|
|
0.75649351 0.7807309 0.75238095 0.76923077]
|
|
|
|
mean value: 0.764462242159521
|
|
|
|
MCC on Blind test: 0.75
|
|
|
|
Accuracy on Blind test: 0.89
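
The per-fold scores logged above for Gaussian NB can be cross-checked against a single confusion matrix. The sketch below is not taken from the pipeline's source: the helper name `binary_metrics` is hypothetical. The counts TP=25, FP=6, FN=4, TN=22 are inferred from the logged first-fold test precision, recall and accuracy and are not printed anywhere in the log. The `roc_auc` entry uses the balanced-accuracy formula, which is what ROC AUC reduces to when computed from hard class predictions. That the pipeline computes it this way is an assumption, although it reproduces the logged 0.82389163.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute the scores named in the log from confusion-matrix counts.

    Assumes non-degenerate counts (no zero denominators).
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Matthews correlation coefficient denominator
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "mcc": (tp * tn - fp * fn) / mcc_den,
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "fscore": 2 * tp / (2 * tp + fp + fn),
        "precision": precision,
        "recall": recall,
        # ROC AUC of hard 0/1 predictions equals balanced accuracy (assumption, see lead-in)
        "roc_auc": (recall + specificity) / 2,
        # Jaccard index of the positive class
        "jcc": tp / (tp + fp + fn),
    }


# Counts inferred from the first-fold Gaussian NB test scores (not logged directly)
fold1 = binary_metrics(tp=25, fp=6, fn=4, tn=22)
for name, value in fold1.items():
    print(f"{name}: {value:.8f}")
```

Each printed value should agree with the first entry of the corresponding `test_*` array in the Gaussian NB block, to the precision shown in the log.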
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.0110116 0.01083779 0.01080775 0.01078486 0.01069784 0.01067305
|
|
0.01072073 0.01070118 0.01066279 0.01066947]
|
|
|
|
mean value: 0.010756707191467286
|
|
|
|
key: score_time
|
|
value: [0.00927234 0.00937843 0.00921917 0.00925756 0.00914478 0.0092566
|
|
0.00927472 0.00924993 0.0091548 0.00928521]
|
|
|
|
mean value: 0.009249353408813476
|
|
|
|
key: test_mcc
|
|
value: [0.71921182 0.86189955 0.71921182 0.65104858 0.51048128 0.7589669
|
|
0.82942474 0.69581469 0.78940887 0.65018988]
|
|
|
|
mean value: 0.7185658130544192
|
|
|
|
key: train_mcc
|
|
value: [0.75451908 0.73892092 0.74278722 0.75068043 0.77951916 0.75838325
|
|
0.75443104 0.75058523 0.7505054 0.75048638]
|
|
|
|
mean value: 0.7530818106690766
|
|
|
|
key: test_accuracy
|
|
value: [0.85964912 0.92982456 0.85964912 0.8245614 0.75438596 0.87719298
|
|
0.9122807 0.84210526 0.89473684 0.8245614 ]
|
|
|
|
mean value: 0.8578947368421053
|
|
|
|
key: train_accuracy
|
|
value: [0.87719298 0.86939571 0.87134503 0.87524366 0.88888889 0.8791423
|
|
0.87719298 0.87524366 0.87524366 0.87524366]
|
|
|
|
mean value: 0.8764132553606238
|
|
|
|
key: test_fscore
|
|
value: [0.86206897 0.92857143 0.86206897 0.82142857 0.75 0.88135593
|
|
0.91525424 0.85245902 0.89285714 0.81481481]
|
|
|
|
mean value: 0.8580879074591409
|
|
|
|
key: train_fscore
|
|
value: [0.87573964 0.8678501 0.87209302 0.87351779 0.89224953 0.87843137
|
|
0.87814313 0.8745098 0.87596899 0.87548638]
|
|
|
|
mean value: 0.8763989764320921
|
|
|
|
key: test_precision
|
|
value: [0.86206897 0.96296296 0.86206897 0.85185185 0.77777778 0.83870968
|
|
0.87096774 0.78787879 0.89285714 0.84615385]
|
|
|
|
mean value: 0.8553297719871691
|
|
|
|
key: train_precision
|
|
value: [0.88446215 0.87649402 0.86538462 0.884 0.86446886 0.88537549
|
|
0.87307692 0.88142292 0.87258687 0.87548638]
|
|
|
|
mean value: 0.876275825111137
|
|
|
|
key: test_recall
|
|
value: [0.86206897 0.89655172 0.86206897 0.79310345 0.72413793 0.92857143
|
|
0.96428571 0.92857143 0.89285714 0.78571429]
|
|
|
|
mean value: 0.8637931034482759
|
|
|
|
key: train_recall
|
|
value: [0.8671875 0.859375 0.87890625 0.86328125 0.921875 0.87159533
|
|
0.88326848 0.86770428 0.87937743 0.87548638]
|
|
|
|
mean value: 0.8768056906614786
|
|
|
|
key: test_roc_auc
|
|
value: [0.85960591 0.93041872 0.85960591 0.82512315 0.75492611 0.87807882
|
|
0.91317734 0.84359606 0.89470443 0.82389163]
|
|
|
|
mean value: 0.8583128078817734
|
|
|
|
key: train_roc_auc
|
|
value: [0.87717352 0.86937622 0.87135974 0.87522039 0.88895306 0.87915704
|
|
0.87718112 0.87525839 0.87523559 0.87524319]
|
|
|
|
mean value: 0.8764158256322957
|
|
|
|
key: test_jcc
|
|
value: [0.75757576 0.86666667 0.75757576 0.6969697 0.6 0.78787879
|
|
0.84375 0.74285714 0.80645161 0.6875 ]
|
|
|
|
mean value: 0.7547225422427035
|
|
|
|
key: train_jcc
|
|
value: [0.77894737 0.76655052 0.77319588 0.7754386 0.80546075 0.78321678
|
|
0.78275862 0.77700348 0.77931034 0.77854671]
|
|
|
|
mean value: 0.7800429060559617
|
|
|
|
MCC on Blind test: 0.6
|
|
|
|
Accuracy on Blind test: 0.81
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.00980592 0.01067448 0.01085329 0.01106048 0.01118255 0.0110743
|
|
0.01100373 0.01079249 0.01016951 0.0098362 ]
|
|
|
|
mean value: 0.010645294189453125
|
|
|
|
key: score_time
|
|
value: [0.01352572 0.01347709 0.01385856 0.01333332 0.01348615 0.01313925
|
|
0.01317501 0.01264691 0.01324797 0.01208138]
|
|
|
|
mean value: 0.01319713592529297
|
|
|
|
key: test_mcc
|
|
value: [0.65104858 0.7589669 0.57881773 0.553659 0.58076493 0.71921182
|
|
0.68736396 0.50862069 0.66268617 0.54377353]
|
|
|
|
mean value: 0.624491332602681
|
|
|
|
key: train_mcc
|
|
value: [0.75057007 0.74270775 0.75884232 0.75083654 0.7548331 0.75486659
|
|
0.77099303 0.73114227 0.77027873 0.75887891]
|
|
|
|
mean value: 0.7543949299234101
|
|
|
|
key: test_accuracy
|
|
value: [0.8245614 0.87719298 0.78947368 0.77192982 0.78947368 0.85964912
|
|
0.84210526 0.75438596 0.8245614 0.77192982]
|
|
|
|
mean value: 0.8105263157894737
|
|
|
|
key: train_accuracy
|
|
value: [0.87524366 0.87134503 0.8791423 0.87524366 0.87719298 0.87719298
|
|
0.88499025 0.86549708 0.88499025 0.8791423 ]
|
|
|
|
mean value: 0.8769980506822612
|
|
|
|
key: test_fscore
|
|
value: [0.82142857 0.87272727 0.79310345 0.75471698 0.78571429 0.85714286
|
|
0.83018868 0.75 0.8 0.76363636]
|
|
|
|
mean value: 0.8028658459302571
|
|
|
|
key: train_fscore
|
|
value: [0.87401575 0.87058824 0.87649402 0.87301587 0.87475149 0.87524752
|
|
0.88223553 0.86444008 0.88362919 0.87698413]
|
|
|
|
mean value: 0.8751401821885225
|
|
|
|
key: test_precision
|
|
value: [0.85185185 0.92307692 0.79310345 0.83333333 0.81481481 0.85714286
|
|
0.88 0.75 0.90909091 0.77777778]
|
|
|
|
mean value: 0.8390191915364329
|
|
|
|
key: train_precision
|
|
value: [0.88095238 0.87401575 0.89430894 0.88709677 0.89068826 0.89112903
|
|
0.9057377 0.87301587 0.896 0.89473684]
|
|
|
|
mean value: 0.8887681557673401
|
|
|
|
key: test_recall
|
|
value: [0.79310345 0.82758621 0.79310345 0.68965517 0.75862069 0.85714286
|
|
0.78571429 0.75 0.71428571 0.75 ]
|
|
|
|
mean value: 0.7719211822660098
|
|
|
|
key: train_recall
|
|
value: [0.8671875 0.8671875 0.859375 0.859375 0.859375 0.85992218
|
|
0.85992218 0.85603113 0.87159533 0.85992218]
|
|
|
|
mean value: 0.861989299610895
|
|
|
|
key: test_roc_auc
|
|
value: [0.82512315 0.87807882 0.78940887 0.77339901 0.79002463 0.85960591
|
|
0.841133 0.75431034 0.8226601 0.77155172]
|
|
|
|
mean value: 0.8105295566502463
|
|
|
|
key: train_roc_auc
|
|
value: [0.87522799 0.87133694 0.87910384 0.87521279 0.87715832 0.87722671
|
|
0.88503921 0.86551556 0.88501642 0.87917984]
|
|
|
|
mean value: 0.8770017631322957
|
|
|
|
key: test_jcc
|
|
value: [0.6969697 0.77419355 0.65714286 0.60606061 0.64705882 0.75
|
|
0.70967742 0.6 0.66666667 0.61764706]
|
|
|
|
mean value: 0.6725416676934703
|
|
|
|
key: train_jcc
|
|
value: [0.77622378 0.77083333 0.78014184 0.77464789 0.77738516 0.77816901
|
|
0.78928571 0.76124567 0.79151943 0.78091873]
|
|
|
|
mean value: 0.778037056551816
|
|
|
|
MCC on Blind test: 0.28
|
|
|
|
Accuracy on Blind test: 0.67
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02196026 0.02291179 0.02152228 0.02228761 0.02168727 0.02183843
|
|
0.02566576 0.02143693 0.02469206 0.0216434 ]
|
|
|
|
mean value: 0.02256457805633545
|
|
|
|
key: score_time
|
|
value: [0.01259232 0.01201963 0.01186705 0.01171136 0.01183677 0.0119884
|
|
0.01219893 0.01177049 0.01200604 0.01177382]
|
|
|
|
mean value: 0.011976480484008789
|
|
|
|
key: test_mcc
|
|
value: [0.8953202 0.96547546 0.72064772 0.82880708 0.54377353 0.7589669
|
|
0.76689254 0.59358067 0.86189955 0.79778885]
|
|
|
|
mean value: 0.7733152498837434
|
|
|
|
key: train_mcc
|
|
value: [0.83278097 0.84616083 0.847201 0.86134265 0.8306883 0.83376616
|
|
0.83513583 0.85135684 0.82366838 0.85557912]
|
|
|
|
mean value: 0.8417680092666877
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.98245614 0.85964912 0.9122807 0.77192982 0.87719298
|
|
0.87719298 0.78947368 0.92982456 0.89473684]
|
|
|
|
mean value: 0.8842105263157894
|
|
|
|
key: train_accuracy
|
|
value: [0.91423002 0.92202729 0.92202729 0.92982456 0.9122807 0.91617934
|
|
0.91617934 0.92397661 0.91033138 0.92592593]
|
|
|
|
mean value: 0.9192982456140351
|
|
|
|
key: test_fscore
|
|
value: [0.94736842 0.98305085 0.86666667 0.91803279 0.77966102 0.88135593
|
|
0.8852459 0.80645161 0.93103448 0.9 ]
|
|
|
|
mean value: 0.8898867668515904
|
|
|
|
key: train_fscore
|
|
value: [0.91821561 0.9245283 0.92509363 0.93181818 0.91712707 0.91871456
|
|
0.91962617 0.9273743 0.9141791 0.92936803]
|
|
|
|
mean value: 0.9226044961753141
|
|
|
|
key: test_precision
|
|
value: [0.96428571 0.96666667 0.83870968 0.875 0.76666667 0.83870968
|
|
0.81818182 0.73529412 0.9 0.84375 ]
|
|
|
|
mean value: 0.8547264338286634
|
|
|
|
key: train_precision
|
|
value: [0.87588652 0.89416058 0.88848921 0.90441176 0.86759582 0.89338235
|
|
0.88489209 0.88928571 0.8781362 0.88967972]
|
|
|
|
mean value: 0.886591997049577
|
|
|
|
key: test_recall
|
|
value: [0.93103448 1. 0.89655172 0.96551724 0.79310345 0.92857143
|
|
0.96428571 0.89285714 0.96428571 0.96428571]
|
|
|
|
mean value: 0.9300492610837439
|
|
|
|
key: train_recall
|
|
value: [0.96484375 0.95703125 0.96484375 0.9609375 0.97265625 0.94552529
|
|
0.95719844 0.9688716 0.95330739 0.97276265]
|
|
|
|
mean value: 0.9617977869649805
|
|
|
|
key: test_roc_auc
|
|
value: [0.9476601 0.98214286 0.85899015 0.91133005 0.77155172 0.87807882
|
|
0.87869458 0.79125616 0.93041872 0.89593596]
|
|
|
|
mean value: 0.8846059113300493
|
|
|
|
key: train_roc_auc
|
|
value: [0.91432849 0.92209539 0.92211059 0.92988509 0.91239816 0.91612202
|
|
0.91609922 0.92388892 0.91024745 0.92583445]
|
|
|
|
mean value: 0.9193009788424125
|
|
|
|
key: test_jcc
|
|
value: [0.9 0.96666667 0.76470588 0.84848485 0.63888889 0.78787879
|
|
0.79411765 0.67567568 0.87096774 0.81818182]
|
|
|
|
mean value: 0.8065567957123935
|
|
|
|
key: train_jcc
|
|
value: [0.84879725 0.85964912 0.86062718 0.87234043 0.84693878 0.84965035
|
|
0.85121107 0.86458333 0.8419244 0.86805556]
|
|
|
|
mean value: 0.856377746223762
|
|
|
|
MCC on Blind test: 0.82
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
  warnings.warn(
|
|
(ConvergenceWarning above emitted 10 times in total)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.87534595 1.95227742 1.74219894 2.32791281 2.08549452 2.0482564
|
|
2.05750775 2.10308409 2.06708074 1.92465091]
|
|
|
|
mean value: 2.018380951881409
|
|
|
|
key: score_time
|
|
value: [0.01258206 0.01540279 0.02229261 0.01906872 0.01492906 0.01484323
|
|
0.01455307 0.01488876 0.01322269 0.01295877]
|
|
|
|
mean value: 0.015474176406860352
|
|
|
|
key: test_mcc
|
|
value: [0.8951918 0.8951918 0.8615634 0.82942474 0.72133224 0.86189955
|
|
0.79161589 0.85960591 0.78940887 0.82512315]
|
|
|
|
mean value: 0.8330357351831268
|
|
|
|
key: train_mcc
|
|
value: [0.98831147 0.99610895 0.99610895 0.99610895 0.99610895 0.9922027
|
|
0.99610889 0.99610895 0.9922027 0.9922027 ]
|
|
|
|
mean value: 0.9941573207007584
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.94736842 0.92982456 0.9122807 0.85964912 0.92982456
|
|
0.89473684 0.92982456 0.89473684 0.9122807 ]
|
|
|
|
mean value: 0.9157894736842105
|
|
|
|
key: train_accuracy
|
|
value: [0.99415205 0.99805068 0.99805068 0.99805068 0.99805068 0.99610136
|
|
0.99805068 0.99805068 0.99610136 0.99610136]
|
|
|
|
mean value: 0.9970760233918128
|
|
|
|
key: test_fscore
|
|
value: [0.94915254 0.94915254 0.93333333 0.90909091 0.85714286 0.93103448
|
|
0.89655172 0.92857143 0.89285714 0.9122807 ]
|
|
|
|
mean value: 0.9159167664392371
|
|
|
|
key: train_fscore
|
|
value: [0.99412916 0.99805068 0.99805068 0.99805068 0.99805068 0.99610895
|
|
0.99805825 0.99805068 0.99610895 0.99610895]
|
|
|
|
mean value: 0.9970767670494975
|
|
|
|
key: test_precision
|
|
value: [0.93333333 0.93333333 0.90322581 0.96153846 0.88888889 0.9
|
|
0.86666667 0.92857143 0.89285714 0.89655172]
|
|
|
|
mean value: 0.91049667857788
|
|
|
|
key: train_precision
|
|
value: [0.99607843 0.99610895 0.99610895 0.99610895 0.99610895 0.99610895
|
|
0.99612403 1. 0.99610895 0.99610895]
|
|
|
|
mean value: 0.9964965108294698
|
|
|
|
key: test_recall
|
|
value: [0.96551724 0.96551724 0.96551724 0.86206897 0.82758621 0.96428571
|
|
0.92857143 0.92857143 0.89285714 0.92857143]
|
|
|
|
mean value: 0.9229064039408867
|
|
|
|
key: train_recall
|
|
value: [0.9921875 1. 1. 1. 1. 0.99610895
|
|
1. 0.99610895 0.99610895 0.99610895]
|
|
|
|
mean value: 0.997662329766537
|
|
|
|
key: test_roc_auc
|
|
value: [0.94704433 0.94704433 0.92918719 0.91317734 0.86022167 0.93041872
|
|
0.8953202 0.92980296 0.89470443 0.91256158]
|
|
|
|
mean value: 0.915948275862069
|
|
|
|
key: train_roc_auc
|
|
value: [0.99414822 0.99805447 0.99805447 0.99805447 0.99805447 0.99610135
|
|
0.99804688 0.99805447 0.99610135 0.99610135]
|
|
|
|
mean value: 0.997077152237354
|
|
|
|
key: test_jcc
|
|
value: [0.90322581 0.90322581 0.875 0.83333333 0.75 0.87096774
|
|
0.8125 0.86666667 0.80645161 0.83870968]
|
|
|
|
mean value: 0.846008064516129
|
|
|
|
key: train_jcc
|
|
value: [0.98832685 0.99610895 0.99610895 0.99610895 0.99610895 0.99224806
|
|
0.99612403 0.99610895 0.99224806 0.99224806]
|
|
|
|
mean value: 0.9941739812385003
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.89
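
The MLP block shows a noticeable difference between training and cross-validated test scores. The sketch below recomputes the logged `mean value` lines for `test_mcc` and `train_mcc` from the per-fold arrays and reports the gap between them. It assumes the logged mean is a plain arithmetic mean over the 10 folds, which matches the logged figures. The variable names are illustrative and not taken from the pipeline.

```python
from statistics import mean

# Per-fold MCC values copied from the MLP block of the log
test_mcc = [0.8951918, 0.8951918, 0.8615634, 0.82942474, 0.72133224,
            0.86189955, 0.79161589, 0.85960591, 0.78940887, 0.82512315]
train_mcc = [0.98831147, 0.99610895, 0.99610895, 0.99610895, 0.99610895,
             0.9922027, 0.99610889, 0.99610895, 0.9922027, 0.9922027]

mean_test = mean(test_mcc)
mean_train = mean(train_mcc)
gap = mean_train - mean_test  # train-minus-test difference in mean MCC

print(f"mean test_mcc:  {mean_test:.4f}")
print(f"mean train_mcc: {mean_train:.4f}")
print(f"train-test gap: {gap:.4f}")
```

A gap of this size is commonly read as a sign of overfitting, though the blind-test MCC of 0.77 logged above is the more direct check.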
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03231049 0.02459979 0.0224762 0.02270222 0.02196383 0.02153134
|
|
0.02374744 0.02357459 0.02356982 0.02394891]
|
|
|
|
mean value: 0.024042463302612303
|
|
|
|
key: score_time
|
|
value: [0.01236773 0.00948834 0.00888968 0.00900292 0.00900364 0.00897312
|
|
0.00892997 0.00917411 0.00890827 0.00901175]
|
|
|
|
mean value: 0.00937495231628418
|
|
|
|
key: test_mcc
|
|
value: [0.96551724 0.93202124 0.92980296 0.92980296 0.8953202 0.85960591
|
|
0.96551724 0.89952865 0.92980296 0.96547546]
|
|
|
|
mean value: 0.9272394802485838
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.96491228 0.96491228 0.96491228 0.94736842 0.92982456
|
|
0.98245614 0.94736842 0.96491228 0.98245614]
|
|
|
|
mean value: 0.9631578947368421
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98245614 0.96666667 0.96551724 0.96551724 0.94736842 0.92857143
|
|
0.98245614 0.94339623 0.96428571 0.98181818]
|
|
|
|
mean value: 0.9628053402270093
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.93548387 0.96551724 0.96551724 0.96428571 0.92857143
|
|
0.96551724 1. 0.96428571 1. ]
|
|
|
|
mean value: 0.968917845224853
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96551724 1. 0.96551724 0.96551724 0.93103448 0.92857143
|
|
1. 0.89285714 0.96428571 0.96428571]
|
|
|
|
mean value: 0.9577586206896552
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98275862 0.96428571 0.96490148 0.96490148 0.9476601 0.92980296
|
|
0.98275862 0.94642857 0.96490148 0.98214286]
|
|
|
|
mean value: 0.9630541871921183
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96551724 0.93548387 0.93333333 0.93333333 0.9 0.86666667
|
|
0.96551724 0.89285714 0.93103448 0.96428571]
|
|
|
|
mean value: 0.9288029026961174
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.79
|
|
|
|
Accuracy on Blind test: 0.89
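The per-fold keys above (mcc, accuracy, fscore, precision, recall, roc_auc, jcc) read like standard binary-classification scores. The sketch below recomputes them from confusion-matrix counts, assuming that `fscore` is F1, `jcc` is the Jaccard index of the positive class, and `roc_auc` was computed from hard 0/1 predictions (so it equals balanced accuracy). The log confirms none of this. The function name `binary_scores` and the counts are hypothetical: the counts were inferred so that they reproduce the first Decision Tree fold, and they are not printed anywhere in the log.

```python
import math


def binary_scores(tp, fp, fn, tn):
    """Compute common binary-classification scores from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # F1 is the harmonic mean of precision and recall.
    fscore = 2 * precision * recall / (precision + recall)
    # Jaccard index of the positive class; equivalent to fscore / (2 - fscore).
    jcc = tp / (tp + fp + fn)
    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    # Assumption: with hard predictions, ROC AUC reduces to balanced accuracy.
    roc_auc = (recall + specificity) / 2
    return {
        "mcc": mcc,
        "accuracy": accuracy,
        "fscore": fscore,
        "precision": precision,
        "recall": recall,
        "roc_auc": roc_auc,
        "jcc": jcc,
    }


# Hypothetical counts for a 57-sample test fold (inferred, not taken from the log).
scores = binary_scores(tp=28, fp=0, fn=1, tn=28)
for key, value in scores.items():
    print(f"{key}: {value:.8f}")
```

Under these assumptions the identity jcc = fscore / (2 - fscore) accounts for `test_jcc` tracking `test_fscore` fold by fold.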
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.12038136 0.12122679 0.119277 0.12026787 0.12030101 0.12140536
|
|
0.12146163 0.11938405 0.12679267 0.11983061]
|
|
|
|
mean value: 0.12103283405303955
|
|
|
|
key: score_time
|
|
value: [0.01790071 0.01808691 0.01780415 0.01798964 0.01912498 0.01788235
|
|
0.01818609 0.01802516 0.01791906 0.01787663]
|
|
|
|
mean value: 0.018079566955566406
|
|
|
|
key: test_mcc
|
|
value: [0.8951918 0.8953202 0.8615634 0.82490815 0.68850906 0.89988258
|
|
0.82942474 0.78940887 0.9321832 0.89988258]
|
|
|
|
mean value: 0.851627457182217
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.94736842 0.92982456 0.9122807 0.84210526 0.94736842
|
|
0.9122807 0.89473684 0.96491228 0.94736842]
|
|
|
|
mean value: 0.9245614035087719
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94915254 0.94736842 0.93333333 0.91525424 0.83636364 0.94915254
|
|
0.91525424 0.89285714 0.96551724 0.94915254]
|
|
|
|
mean value: 0.925340587668097
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.93333333 0.96428571 0.90322581 0.9 0.88461538 0.90322581
|
|
0.87096774 0.89285714 0.93333333 0.90322581]
|
|
|
|
mean value: 0.908907006971523
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96551724 0.93103448 0.96551724 0.93103448 0.79310345 1.
|
|
0.96428571 0.89285714 1. 1. ]
|
|
|
|
mean value: 0.9443349753694581
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.94704433 0.9476601 0.92918719 0.91194581 0.8429803 0.94827586
|
|
0.91317734 0.89470443 0.96551724 0.94827586]
|
|
|
|
mean value: 0.9248768472906403
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.90322581 0.9 0.875 0.84375 0.71875 0.90322581
|
|
0.84375 0.80645161 0.93333333 0.90322581]
|
|
|
|
mean value: 0.8630712365591398
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.75
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01026869 0.01040196 0.01036501 0.01076531 0.01043916 0.01040483
|
|
0.01052999 0.01043129 0.01045632 0.01034832]
|
|
|
|
mean value: 0.010441088676452636
|
|
|
|
key: score_time
|
|
value: [0.00883389 0.00883102 0.00883794 0.00973797 0.00897551 0.00892496
|
|
0.00891161 0.0089035 0.00878906 0.00879788]
|
|
|
|
mean value: 0.008954334259033202
|
|
|
|
key: test_mcc
|
|
value: [0.54592083 0.54433498 0.58076493 0.54592083 0.30745722 0.33621986
|
|
0.71921182 0.26802813 0.72133224 0.54592083]
|
|
|
|
mean value: 0.5115111682505243
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.77192982 0.77192982 0.78947368 0.77192982 0.64912281 0.66666667
|
|
0.85964912 0.63157895 0.85964912 0.77192982]
|
|
|
|
mean value: 0.7543859649122807
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.78688525 0.77192982 0.78571429 0.78688525 0.61538462 0.6779661
|
|
0.85714286 0.57142857 0.86206897 0.75471698]
|
|
|
|
mean value: 0.7470122694379244
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.75 0.78571429 0.81481481 0.75 0.69565217 0.64516129
|
|
0.85714286 0.66666667 0.83333333 0.8 ]
|
|
|
|
mean value: 0.7598485421907581
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.82758621 0.75862069 0.75862069 0.82758621 0.55172414 0.71428571
|
|
0.85714286 0.5 0.89285714 0.71428571]
|
|
|
|
mean value: 0.7402709359605911
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.77093596 0.77216749 0.79002463 0.77093596 0.65086207 0.66748768
|
|
0.85960591 0.62931034 0.86022167 0.77093596]
|
|
|
|
mean value: 0.7542487684729063
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.64864865 0.62857143 0.64705882 0.64864865 0.44444444 0.51282051
|
|
0.75 0.4 0.75757576 0.60606061]
|
|
|
|
mean value: 0.6043828870299459
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.36
|
|
|
|
Accuracy on Blind test: 0.69
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.86249709 1.87444353 1.87393451 1.88897467 1.92013979 1.87955761
|
|
1.87557983 1.8870728 1.8653574 1.87198281]
|
|
|
|
mean value: 1.8799540042877196
|
|
|
|
key: score_time
|
|
value: [0.09142756 0.09387064 0.09172773 0.09915471 0.09149933 0.09835124
|
|
0.0914166 0.09191871 0.09128451 0.0958066 ]
|
|
|
|
mean value: 0.0936457633972168
|
|
|
|
key: test_mcc
|
|
value: [1. 0.93202124 0.93202124 0.8951918 0.8951918 0.89988258
|
|
0.9321832 0.92980296 0.96551724 0.96551724]
|
|
|
|
mean value: 0.9347329303798759
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.96491228 0.96491228 0.94736842 0.94736842 0.94736842
|
|
0.96491228 0.96491228 0.98245614 0.98245614]
|
|
|
|
mean value: 0.9666666666666667
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.96666667 0.96666667 0.94915254 0.94915254 0.94915254
|
|
0.96551724 0.96428571 0.98245614 0.98245614]
|
|
|
|
mean value: 0.9675506196818756
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.93548387 0.93548387 0.93333333 0.93333333 0.90322581
|
|
0.93333333 0.96428571 0.96551724 0.96551724]
|
|
|
|
mean value: 0.9469513745431432
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 0.96551724 0.96551724 1.
|
|
1. 0.96428571 1. 1. ]
|
|
|
|
mean value: 0.9895320197044335
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.96428571 0.96428571 0.94704433 0.94704433 0.94827586
|
|
0.96551724 0.96490148 0.98275862 0.98275862]
|
|
|
|
mean value: 0.9666871921182266
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.93548387 0.93548387 0.90322581 0.90322581 0.90322581
|
|
0.93333333 0.93103448 0.96551724 0.96551724]
|
|
|
|
mean value: 0.9376047460140897
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.84
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...05', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.07170582 0.97667527 1.01005721 0.98215485 0.97630405 0.96141648
|
|
0.98975873 1.01708627 0.98110008 1.01926374]
|
|
|
|
mean value: 0.9985522508621216
|
|
|
|
key: score_time
|
|
value: [0.24334216 0.19834042 0.2711153 0.20498848 0.18867207 0.28040957
|
|
0.27722764 0.26618004 0.23420763 0.25020599]
|
|
|
|
mean value: 0.2414689302444458
|
|
|
|
key: test_mcc
|
|
value: [0.96547546 0.96547546 0.93202124 0.8615634 0.8615634 0.86851042
|
|
0.9321832 0.8953202 0.96551724 0.96551724]
|
|
|
|
mean value: 0.9213147255333994
|
|
|
|
key: train_mcc
|
|
value: [0.9652735 0.96907736 0.97289533 0.96907736 0.965509 0.98057338
|
|
0.96526984 0.96907457 0.96526984 0.96907457]
|
|
|
|
mean value: 0.9691094739529907
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.98245614 0.96491228 0.92982456 0.92982456 0.92982456
|
|
0.96491228 0.94736842 0.98245614 0.98245614]
|
|
|
|
mean value: 0.9596491228070175
|
|
|
|
key: train_accuracy
|
|
value: [0.98245614 0.98440546 0.98635478 0.98440546 0.98245614 0.99025341
|
|
0.98245614 0.98440546 0.98245614 0.98440546]
|
|
|
|
mean value: 0.9844054580896686
|
|
|
|
key: test_fscore
|
|
value: [0.98305085 0.98305085 0.96666667 0.93333333 0.93333333 0.93333333
|
|
0.96551724 0.94736842 0.98245614 0.98245614]
|
|
|
|
mean value: 0.9610566304715618
|
|
|
|
key: train_fscore
|
|
value: [0.98265896 0.98455598 0.98646035 0.98455598 0.98272553 0.99032882
|
|
0.98272553 0.98461538 0.98272553 0.98461538]
|
|
|
|
mean value: 0.9845967449652123
|
|
|
|
key: test_precision
|
|
value: [0.96666667 0.96666667 0.93548387 0.90322581 0.90322581 0.875
|
|
0.93333333 0.93103448 0.96551724 0.96551724]
|
|
|
|
mean value: 0.9345671116054876
|
|
|
|
key: train_precision
|
|
value: [0.96958175 0.97328244 0.97701149 0.97328244 0.96603774 0.98461538
|
|
0.96969697 0.97338403 0.96969697 0.97338403]
|
|
|
|
mean value: 0.9729973249493369
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 0.96551724 0.96551724 1.
|
|
1. 0.96428571 1. 1. ]
|
|
|
|
mean value: 0.9895320197044335
|
|
|
|
key: train_recall
|
|
value: [0.99609375 0.99609375 0.99609375 0.99609375 1. 0.99610895
|
|
0.99610895 0.99610895 0.99610895 0.99610895]
|
|
|
|
mean value: 0.9964919747081712
|
|
|
|
key: test_roc_auc
|
|
value: [0.98214286 0.98214286 0.96428571 0.92918719 0.92918719 0.93103448
|
|
0.96551724 0.9476601 0.98275862 0.98275862]
|
|
|
|
mean value: 0.9596674876847291
|
|
|
|
key: train_roc_auc
|
|
value: [0.98248267 0.9844282 0.98637372 0.9844282 0.98249027 0.99024197
|
|
0.98242947 0.9843826 0.98242947 0.9843826 ]
|
|
|
|
mean value: 0.9844069187743191
|
|
|
|
key: test_jcc
|
|
value: [0.96666667 0.96666667 0.93548387 0.875 0.875 0.875
|
|
0.93333333 0.9 0.96551724 0.96551724]
|
|
|
|
mean value: 0.9258185020393029
|
|
|
|
key: train_jcc
|
|
value: [0.96590909 0.96958175 0.97328244 0.96958175 0.96603774 0.98084291
|
|
0.96603774 0.96969697 0.96603774 0.96969697]
|
|
|
|
mean value: 0.9696705090574546
|
|
|
|
MCC on Blind test: 0.94
|
|
|
|
Accuracy on Blind test: 0.97
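The FutureWarning raised during the Random Forest2 run says that `max_features='auto'` is deprecated and that `'sqrt'` keeps the past behaviour for this classifier. A minimal sketch of that change is below, assuming the other constructor arguments stay as printed in the log. The variable name `rf2` is hypothetical, and the script that builds the model list is not shown here.

```python
from sklearn.ensemble import RandomForestClassifier

# Same settings as the 'Random Forest2' entry in the log, with the deprecated
# max_features='auto' replaced by its stated equivalent, 'sqrt'.
rf2 = RandomForestClassifier(
    max_features="sqrt",
    min_samples_leaf=5,
    n_estimators=1000,
    n_jobs=10,
    oob_score=True,
    random_state=42,
)

print(rf2.get_params()["max_features"])
```

The warning text also notes that `'sqrt'` is the default for this classifier, so dropping the argument altogether should have the same effect.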
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', BernoulliNB())])
key: fit_time
value: [0.02531004 0.01198626 0.01208639 0.01204205 0.01192236 0.01199722
0.01211691 0.01198721 0.01262665 0.0113945 ]
mean value: 0.01334695816040039
key: score_time
value: [0.01192689 0.00937462 0.01031423 0.00997972 0.01006103 0.01001763
0.01013184 0.01009941 0.00948906 0.00999475]
mean value: 0.010138916969299316
key: test_mcc
value: [0.71921182 0.86189955 0.71921182 0.65104858 0.51048128 0.7589669
0.82942474 0.69581469 0.78940887 0.65018988]
mean value: 0.7185658130544192
key: train_mcc
value: [0.75451908 0.73892092 0.74278722 0.75068043 0.77951916 0.75838325
0.75443104 0.75058523 0.7505054 0.75048638]
mean value: 0.7530818106690766
key: test_accuracy
value: [0.85964912 0.92982456 0.85964912 0.8245614 0.75438596 0.87719298
0.9122807 0.84210526 0.89473684 0.8245614 ]
mean value: 0.8578947368421053
key: train_accuracy
value: [0.87719298 0.86939571 0.87134503 0.87524366 0.88888889 0.8791423
0.87719298 0.87524366 0.87524366 0.87524366]
mean value: 0.8764132553606238
key: test_fscore
value: [0.86206897 0.92857143 0.86206897 0.82142857 0.75 0.88135593
0.91525424 0.85245902 0.89285714 0.81481481]
mean value: 0.8580879074591409
key: train_fscore
value: [0.87573964 0.8678501 0.87209302 0.87351779 0.89224953 0.87843137
0.87814313 0.8745098 0.87596899 0.87548638]
mean value: 0.8763989764320921
key: test_precision
value: [0.86206897 0.96296296 0.86206897 0.85185185 0.77777778 0.83870968
0.87096774 0.78787879 0.89285714 0.84615385]
mean value: 0.8553297719871691
key: train_precision
value: [0.88446215 0.87649402 0.86538462 0.884 0.86446886 0.88537549
0.87307692 0.88142292 0.87258687 0.87548638]
mean value: 0.876275825111137
key: test_recall
value: [0.86206897 0.89655172 0.86206897 0.79310345 0.72413793 0.92857143
0.96428571 0.92857143 0.89285714 0.78571429]
mean value: 0.8637931034482759
key: train_recall
value: [0.8671875 0.859375 0.87890625 0.86328125 0.921875 0.87159533
0.88326848 0.86770428 0.87937743 0.87548638]
mean value: 0.8768056906614786
key: test_roc_auc
value: [0.85960591 0.93041872 0.85960591 0.82512315 0.75492611 0.87807882
0.91317734 0.84359606 0.89470443 0.82389163]
mean value: 0.8583128078817734
key: train_roc_auc
value: [0.87717352 0.86937622 0.87135974 0.87522039 0.88895306 0.87915704
0.87718112 0.87525839 0.87523559 0.87524319]
mean value: 0.8764158256322957
key: test_jcc
value: [0.75757576 0.86666667 0.75757576 0.6969697 0.6 0.78787879
0.84375 0.74285714 0.80645161 0.6875 ]
mean value: 0.7547225422427035
key: train_jcc
value: [0.77894737 0.76655052 0.77319588 0.7754386 0.80546075 0.78321678
0.78275862 0.77700348 0.77931034 0.77854671]
mean value: 0.7800429060559617
MCC on Blind test: 0.6
Accuracy on Blind test: 0.81
Model_name: XGBoost
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000...
interaction_constraints=None, learning_rate=None,
max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan,
monotone_constraints=None, n_estimators=100,
n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None,
reg_lambda=None, scale_pos_weight=None,
subsample=None, tree_method=None,
use_label_encoder=False,
validate_parameters=None, verbosity=0))])
key: fit_time
value: [0.10502219 0.07858324 0.08344984 0.08857894 0.08617067 0.090137
0.08256125 0.0888052 0.08122015 0.08864975]
mean value: 0.0873178243637085
key: score_time
value: [0.01173377 0.01130414 0.01159191 0.01140666 0.01149297 0.01123619
0.01140857 0.01112556 0.01298761 0.01339841]
mean value: 0.011768579483032227
key: test_mcc
value: [0.96551724 0.96547546 0.93202124 0.92980296 0.96551724 0.8953202
0.96551724 0.96551724 0.96551724 1. ]
mean value: 0.9550206057552516
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.98245614 0.98245614 0.96491228 0.96491228 0.98245614 0.94736842
0.98245614 0.98245614 0.98245614 1. ]
mean value: 0.9771929824561403
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.98245614 0.98305085 0.96666667 0.96551724 0.98245614 0.94736842
0.98245614 0.98245614 0.98245614 1. ]
mean value: 0.9774883878310622
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [1. 0.96666667 0.93548387 0.96551724 1. 0.93103448
0.96551724 0.96551724 0.96551724 1. ]
mean value: 0.969525398591027
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.96551724 1. 1. 0.96551724 0.96551724 0.96428571
1. 1. 1. 1. ]
mean value: 0.9860837438423645
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.98275862 0.98214286 0.96428571 0.96490148 0.98275862 0.9476601
0.98275862 0.98275862 0.98275862 1. ]
mean value: 0.9772783251231528
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.96551724 0.96666667 0.93548387 0.93333333 0.96551724 0.9
0.96551724 0.96551724 0.96551724 1. ]
mean value: 0.9563070077864294
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.84
Accuracy on Blind test: 0.92
Model_name: LDA
Model func: LinearDiscriminantAnalysis()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LinearDiscriminantAnalysis())])
key: fit_time
value: [0.04534984 0.05460882 0.04289627 0.07200694 0.07619452 0.06920385
0.04467583 0.07594752 0.07032585 0.07389331]
mean value: 0.06251027584075927
key: score_time
value: [0.021312 0.01302409 0.01268458 0.01460743 0.02094841 0.01261997
0.01289773 0.01310945 0.01993465 0.01473045]
mean value: 0.01558687686920166
key: test_mcc
value: [0.89988258 0.9321832 0.85960591 0.8953202 0.86189955 0.86189955
0.96547546 0.82880708 1. 0.85960591]
mean value: 0.8964679428369845
key: train_mcc
value: [0.97277661 0.97672758 0.98051435 0.98057426 0.98057426 0.97663743
0.97277537 0.97663743 0.97277537 0.97277537]
mean value: 0.9762768016203472
key: test_accuracy
value: [0.94736842 0.96491228 0.92982456 0.94736842 0.92982456 0.92982456
0.98245614 0.9122807 1. 0.92982456]
mean value: 0.9473684210526315
key: train_accuracy
value: [0.98635478 0.98830409 0.99025341 0.99025341 0.99025341 0.98830409
0.98635478 0.98830409 0.98635478 0.98635478]
mean value: 0.9881091617933723
key: test_fscore
value: [0.94545455 0.96428571 0.93103448 0.94736842 0.92857143 0.93103448
0.98181818 0.90566038 1. 0.92857143]
mean value: 0.9463799062629662
key: train_fscore
value: [0.98640777 0.98837209 0.99025341 0.99029126 0.99029126 0.98837209
0.98646035 0.98837209 0.98646035 0.98646035]
mean value: 0.9881741026125374
key: test_precision
value: [1. 1. 0.93103448 0.96428571 0.96296296 0.9
1. 0.96 1. 0.92857143]
mean value: 0.9646854588578726
key: train_precision
value: [0.98069498 0.98076923 0.98832685 0.98455598 0.98455598 0.98455598
0.98076923 0.98455598 0.98076923 0.98076923]
mean value: 0.9830322690244869
key: test_recall
value: [0.89655172 0.93103448 0.93103448 0.93103448 0.89655172 0.96428571
0.96428571 0.85714286 1. 0.92857143]
mean value: 0.9300492610837439
key: train_recall
value: [0.9921875 0.99609375 0.9921875 0.99609375 0.99609375 0.9922179
0.9922179 0.9922179 0.9922179 0.9922179 ]
mean value: 0.9933745744163425
key: test_roc_auc
value: [0.94827586 0.96551724 0.92980296 0.9476601 0.93041872 0.93041872
0.98214286 0.91133005 1. 0.92980296]
mean value: 0.9475369458128079
key: train_roc_auc
value: [0.98636612 0.98831925 0.99025717 0.99026477 0.99026477 0.98829645
0.98634332 0.98829645 0.98634332 0.98634332]
mean value: 0.9881094965953308
key: test_jcc
value: [0.89655172 0.93103448 0.87096774 0.9 0.86666667 0.87096774
0.96428571 0.82758621 1. 0.86666667]
mean value: 0.8994726945283119
key: train_jcc
value: [0.97318008 0.97701149 0.98069498 0.98076923 0.98076923 0.97701149
0.97328244 0.97701149 0.97328244 0.97328244]
mean value: 0.976629532986469
MCC on Blind test: 0.77
Accuracy on Blind test: 0.89
Model_name: Multinomial
Model func: MultinomialNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MultinomialNB())])
key: fit_time
value: [0.02329779 0.01084447 0.01144338 0.01139617 0.01017046 0.0110743
0.01069474 0.01005888 0.01109362 0.01126432]
mean value: 0.01213381290435791
key: score_time
value: [0.0097506 0.00933862 0.00944877 0.00938034 0.00893426 0.00938964
0.00931454 0.00969791 0.0094347 0.00964665]
mean value: 0.009433603286743164
key: test_mcc
value: [0.86189955 0.96547546 0.68434084 0.75462449 0.47413793 0.7589669
0.7366424 0.68850906 0.82512315 0.75492611]
mean value: 0.7504645885346989
key: train_mcc
value: [0.73976678 0.72388482 0.76290396 0.75133166 0.78043156 0.74303497
0.75884232 0.76719997 0.73629377 0.75505926]
mean value: 0.7518749069827675
key: test_accuracy
value: [0.92982456 0.98245614 0.84210526 0.87719298 0.73684211 0.87719298
0.85964912 0.84210526 0.9122807 0.87719298]
mean value: 0.8736842105263157
key: train_accuracy
value: [0.86939571 0.86159844 0.88109162 0.87524366 0.88888889 0.87134503
0.8791423 0.88304094 0.86744639 0.87719298]
mean value: 0.875438596491228
key: test_fscore
value: [0.92857143 0.98305085 0.84745763 0.88135593 0.73684211 0.88135593
0.87096774 0.84745763 0.9122807 0.87719298]
mean value: 0.8766532926082291
key: train_fscore
value: [0.87238095 0.86424474 0.8833652 0.8778626 0.89305816 0.87356322
0.88167939 0.88636364 0.87169811 0.88 ]
mean value: 0.8784216009065232
key: test_precision
value: [0.96296296 0.96666667 0.83333333 0.86666667 0.75 0.83870968
0.79411765 0.80645161 0.89655172 0.86206897]
mean value: 0.8577529256666206
key: train_precision
value: [0.85130112 0.84644195 0.86516854 0.85820896 0.85920578 0.86037736
0.86516854 0.86346863 0.84615385 0.8619403 ]
mean value: 0.8577435010694252
key: test_recall
value: [0.89655172 1. 0.86206897 0.89655172 0.72413793 0.92857143
0.96428571 0.89285714 0.92857143 0.89285714]
mean value: 0.8986453201970444
key: train_recall
value: [0.89453125 0.8828125 0.90234375 0.8984375 0.9296875 0.88715953
0.89883268 0.91050584 0.89883268 0.89883268]
mean value: 0.9001975924124513
key: test_roc_auc
value: [0.93041872 0.98214286 0.84174877 0.87684729 0.73706897 0.87807882
0.8614532 0.8429803 0.91256158 0.87746305]
mean value: 0.8740763546798029
key: train_roc_auc
value: [0.86944461 0.86163971 0.88113296 0.87528879 0.88896826 0.87131414
0.87910384 0.88298729 0.86738509 0.87715072]
mean value: 0.8754415430447471
key: test_jcc
value: [0.86666667 0.96666667 0.73529412 0.78787879 0.58333333 0.78787879
0.77142857 0.73529412 0.83870968 0.78125 ]
mean value: 0.7854400726566286
key: train_jcc
value: [0.77364865 0.76094276 0.79109589 0.78231293 0.80677966 0.7755102
0.7883959 0.79591837 0.77257525 0.78571429]
mean value: 0.7832893898605223
MCC on Blind test: 0.84
Accuracy on Blind test: 0.92
Model_name: Passive Aggresive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
key: fit_time
value: [0.02009606 0.02367425 0.02581382 0.02568173 0.02375722 0.02464199
0.02251601 0.02397561 0.02633715 0.02839375]
mean value: 0.02448875904083252
key: score_time
value: [0.01082706 0.0116818 0.01224518 0.01217031 0.01221275 0.01211119
0.01224017 0.01211429 0.01587439 0.01224256]
mean value: 0.012371969223022462
key: test_mcc
value: [0.89988258 0.92980296 0.55317854 0.92980296 0.76689254 0.86189955
0.76689254 0.92980296 0.92980296 0.96547546]
mean value: 0.8533433009431273
key: train_mcc
value: [0.93120523 0.9652735 0.6841678 0.95324446 0.96491921 0.97663814
0.93901501 0.95324588 0.97289329 0.97271663]
mean value: 0.9313319164525132
key: test_accuracy
value: [0.94736842 0.96491228 0.73684211 0.96491228 0.87719298 0.92982456
0.87719298 0.96491228 0.96491228 0.98245614]
mean value: 0.9210526315789473
key: train_accuracy
value: [0.96491228 0.98245614 0.81871345 0.97660819 0.98245614 0.98830409
0.96881092 0.97660819 0.98635478 0.98635478]
mean value: 0.9631578947368421
key: test_fscore
value: [0.94545455 0.96551724 0.79452055 0.96551724 0.86792453 0.93103448
0.8852459 0.96428571 0.96428571 0.98181818]
mean value: 0.9265604099247834
key: train_fscore
value: [0.96385542 0.98265896 0.84628099 0.97647059 0.98238748 0.98828125
0.96969697 0.9765625 0.98651252 0.98640777]
mean value: 0.965911444750535
key: test_precision
value: [1. 0.96551724 0.65909091 0.96551724 0.95833333 0.9
0.81818182 0.96428571 0.96428571 1. ]
mean value: 0.919521197193611
key: train_precision
value: [0.99173554 0.96958175 0.73352436 0.98031496 0.98431373 0.99215686
0.94464945 0.98039216 0.97709924 0.98449612]
mean value: 0.9538264154435027
key: test_recall
value: [0.89655172 0.96551724 1. 0.96551724 0.79310345 0.96428571
0.96428571 0.96428571 0.96428571 0.96428571]
mean value: 0.9442118226600985
key: train_recall
value: [0.9375 0.99609375 1. 0.97265625 0.98046875 0.9844358
0.99610895 0.97276265 0.99610895 0.98832685]
mean value: 0.9824461940661479
key: test_roc_auc
value: [0.94827586 0.96490148 0.73214286 0.96490148 0.87869458 0.93041872
0.87869458 0.96490148 0.96490148 0.98214286]
mean value: 0.9209975369458129
key: train_roc_auc
value: [0.96485895 0.98248267 0.81906615 0.9766005 0.98245227 0.98831165
0.9687576 0.9766157 0.98633572 0.98635092]
mean value: 0.9631832137645915
key: test_jcc
value: [0.89655172 0.93333333 0.65909091 0.93333333 0.76666667 0.87096774
0.79411765 0.93103448 0.93103448 0.96428571]
mean value: 0.8680416035359436
key: train_jcc
value: [0.93023256 0.96590909 0.73352436 0.95402299 0.96538462 0.97683398
0.94117647 0.95419847 0.97338403 0.97318008]
mean value: 0.9367846635991106
MCC on Blind test: 0.77
Accuracy on Blind test: 0.89
Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])
key: fit_time
value: [0.01736093 0.02417779 0.01855564 0.02388573 0.02126384 0.02123237
0.02325678 0.02157664 0.0230937 0.01871109]
mean value: 0.021311450004577636
key: score_time
value: [0.01171088 0.01253772 0.01214719 0.01223183 0.01217747 0.01221824
0.01223803 0.01537561 0.0122745 0.01589489]
mean value: 0.012880635261535645
key: test_mcc
value: [0.93202124 0.92980296 0.7589669 0.8615634 0.75047877 0.89988258
0.86789789 0.86851042 0.8953202 0.89988258]
mean value: 0.8664326936672224
key: train_mcc
value: [0.92383514 0.9652735 0.91600355 0.96509685 0.83295731 0.96147894
0.96497895 0.9496686 0.93531646 0.93864387]
mean value: 0.9353253163476655
key: test_accuracy
value: [0.96491228 0.96491228 0.87719298 0.92982456 0.85964912 0.94736842
0.92982456 0.92982456 0.94736842 0.94736842]
mean value: 0.9298245614035088
key: train_accuracy
value: [0.96101365 0.98245614 0.95711501 0.98245614 0.91033138 0.98050682
0.98245614 0.97465887 0.9668616 0.96881092]
mean value: 0.9666666666666667
key: test_fscore
value: [0.96666667 0.96551724 0.87272727 0.93333333 0.84 0.94915254
0.92307692 0.93333333 0.94736842 0.94915254]
mean value: 0.9280328276315234
key: train_fscore
value: [0.96212121 0.98265896 0.95564516 0.98259188 0.9017094 0.98084291
0.98238748 0.97504798 0.96786389 0.96958175]
mean value: 0.9660450626117191
key: test_precision
value: [0.93548387 0.96551724 0.92307692 0.90322581 1. 0.90322581
1. 0.875 0.93103448 0.90322581]
mean value: 0.9339789937537435
key: train_precision
value: [0.93382353 0.96958175 0.9875 0.97318008 0.99528302 0.96603774
0.98818898 0.96212121 0.94117647 0.94795539]
mean value: 0.9664848159228501
key: test_recall
value: [1. 0.96551724 0.82758621 0.96551724 0.72413793 1.
0.85714286 1. 0.96428571 1. ]
mean value: 0.9304187192118226
|
|
|
|
key: train_recall
|
|
value: [0.9921875 0.99609375 0.92578125 0.9921875 0.82421875 0.99610895
|
|
0.9766537 0.98832685 0.99610895 0.9922179 ]
|
|
|
|
mean value: 0.9679885092412451
|
|
|
|
key: test_roc_auc
|
|
value: [0.96428571 0.96490148 0.87807882 0.92918719 0.86206897 0.94827586
|
|
0.92857143 0.93103448 0.9476601 0.94827586]
|
|
|
|
mean value: 0.9302339901477833
|
|
|
|
key: train_roc_auc
|
|
value: [0.96107429 0.98248267 0.95705405 0.98247507 0.91016385 0.98047635
|
|
0.98246747 0.97463217 0.96680447 0.9687652 ]
|
|
|
|
mean value: 0.966639561040856
|
|
|
|
key: test_jcc
|
|
value: [0.93548387 0.93333333 0.77419355 0.875 0.72413793 0.90322581
|
|
0.85714286 0.875 0.9 0.90322581]
|
|
|
|
mean value: 0.8680743153768737
|
|
|
|
key: train_jcc
|
|
value: [0.9270073 0.96590909 0.91505792 0.96577947 0.82101167 0.96240602
|
|
0.96538462 0.95131086 0.93772894 0.94095941]
|
|
|
|
mean value: 0.9352555285237902
|
|
|
|
MCC on Blind test: 0.88
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.22136092 0.2146697 0.22013402 0.21376967 0.21607924 0.21981454
|
|
0.21257806 0.2166841 0.21159649 0.21193361]
|
|
|
|
mean value: 0.21586203575134277
|
|
|
|
key: score_time
|
|
value: [0.01560521 0.01587868 0.01571321 0.01564407 0.01641417 0.01539636
|
|
0.01562452 0.01541305 0.01549864 0.01545691]
|
|
|
|
mean value: 0.01566448211669922
|
|
|
|
key: test_mcc
|
|
value: [0.96551724 0.96551724 0.93202124 0.96547546 0.96551724 0.86851042
|
|
0.96551724 0.93202124 1. 0.92980296]
|
|
|
|
mean value: 0.9489900277779855
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.98245614 0.96491228 0.98245614 0.98245614 0.92982456
|
|
0.98245614 0.96491228 1. 0.96491228]
|
|
|
|
mean value: 0.9736842105263157
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98245614 0.98245614 0.96666667 0.98305085 0.98245614 0.93333333
|
|
0.98245614 0.96296296 1. 0.96428571]
|
|
|
|
mean value: 0.9740124086109813
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.93548387 0.96666667 1. 0.875
|
|
0.96551724 1. 1. 0.96428571]
|
|
|
|
mean value: 0.9706953493299433
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96551724 0.96551724 1. 1. 0.96551724 1.
|
|
1. 0.92857143 1. 0.96428571]
|
|
|
|
mean value: 0.9789408866995074
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98275862 0.98275862 0.96428571 0.98214286 0.98275862 0.93103448
|
|
0.98275862 0.96428571 1. 0.96490148]
|
|
|
|
mean value: 0.973768472906404
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96551724 0.96551724 0.93548387 0.96666667 0.96551724 0.875
|
|
0.96551724 0.92857143 1. 0.93103448]
|
|
|
|
mean value: 0.94988254144817
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.89
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.07275224 0.07551146 0.07736611 0.08631492 0.08354521 0.08567595
|
|
0.09274006 0.08661175 0.08095288 0.07556176]
|
|
|
|
mean value: 0.08170323371887207
|
|
|
|
key: score_time
|
|
value: [0.02348256 0.03980899 0.02562881 0.04097319 0.04085207 0.04124117
|
|
0.03203225 0.03526425 0.02682853 0.02724028]
|
|
|
|
mean value: 0.033335208892822266
|
|
|
|
key: test_mcc
|
|
value: [0.96551724 0.96551724 0.93202124 0.92980296 0.8953202 0.85960591
|
|
0.92980296 0.96547546 0.96551724 0.96547546]
|
|
|
|
mean value: 0.9374055898309153
|
|
|
|
key: train_mcc
|
|
value: [0.9922027 0.9922027 1. 0.9922027 0.99223298 0.99223298
|
|
0.9922027 0.99610895 0.98831165 0.99610895]
|
|
|
|
mean value: 0.9933806305949693
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.98245614 0.96491228 0.96491228 0.94736842 0.92982456
|
|
0.96491228 0.98245614 0.98245614 0.98245614]
|
|
|
|
mean value: 0.968421052631579
|
|
|
|
key: train_accuracy
|
|
value: [0.99610136 0.99610136 1. 0.99610136 0.99610136 0.99610136
|
|
0.99610136 0.99805068 0.99415205 0.99805068]
|
|
|
|
mean value: 0.9966861598440546
|
|
|
|
key: test_fscore
|
|
value: [0.98245614 0.98245614 0.96666667 0.96551724 0.94736842 0.92857143
|
|
0.96428571 0.98181818 0.98245614 0.98181818]
|
|
|
|
mean value: 0.9683414256644747
|
|
|
|
key: train_fscore
|
|
value: [0.99609375 0.99609375 1. 0.99609375 0.99610895 0.99609375
|
|
0.99610895 0.99805068 0.99415205 0.99805068]
|
|
|
|
mean value: 0.9966846310138728
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.93548387 0.96551724 0.96428571 0.92857143
|
|
0.96428571 1. 0.96551724 1. ]
|
|
|
|
mean value: 0.972366121086922
|
|
|
|
key: train_precision
|
|
value: [0.99609375 0.99609375 1. 0.99609375 0.99224806 1.
|
|
0.99610895 1. 0.99609375 1. ]
|
|
|
|
mean value: 0.9972732011431846
|
|
|
|
key: test_recall
|
|
value: [0.96551724 0.96551724 1. 0.96551724 0.93103448 0.92857143
|
|
0.96428571 0.96428571 1. 0.96428571]
|
|
|
|
mean value: 0.9649014778325123
|
|
|
|
key: train_recall
|
|
value: [0.99609375 0.99609375 1. 0.99609375 1. 0.9922179
|
|
0.99610895 0.99610895 0.9922179 0.99610895]
|
|
|
|
mean value: 0.9961043895914397
|
|
|
|
key: test_roc_auc
|
|
value: [0.98275862 0.98275862 0.96428571 0.96490148 0.9476601 0.92980296
|
|
0.96490148 0.98214286 0.98275862 0.98214286]
|
|
|
|
mean value: 0.9684113300492612
|
|
|
|
key: train_roc_auc
|
|
value: [0.99610135 0.99610135 1. 0.99610135 0.99610895 0.99610895
|
|
0.99610135 0.99805447 0.99415582 0.99805447]
|
|
|
|
mean value: 0.9966888071498055
|
|
|
|
key: test_jcc
|
|
value: [0.96551724 0.96551724 0.93548387 0.93333333 0.9 0.86666667
|
|
0.93103448 0.96428571 0.96551724 0.96428571]
|
|
|
|
mean value: 0.9391641506435723
|
|
|
|
key: train_jcc
|
|
value: [0.9922179 0.9922179 1. 0.9922179 0.99224806 0.9922179
|
|
0.99224806 0.99610895 0.98837209 0.99610895]
|
|
|
|
mean value: 0.9933957711217688
|
|
|
|
MCC on Blind test: 0.79
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.22260284 0.25764489 0.23697352 0.23838496 0.2421689 0.22621894
|
|
0.18569469 0.18221235 0.21668482 0.20190597]
|
|
|
|
mean value: 0.22104918956756592
|
|
|
|
key: score_time
|
|
value: [0.02602887 0.0259223 0.02647781 0.02608204 0.0156517 0.02640224
|
|
0.03134131 0.01563168 0.01565242 0.02606297]
|
|
|
|
mean value: 0.023525333404541014
|
|
|
|
key: test_mcc
|
|
value: [0.71921182 0.7589669 0.65018988 0.68434084 0.64901478 0.75492611
|
|
0.75492611 0.61805122 0.85960591 0.61453202]
|
|
|
|
mean value: 0.706376559482153
|
|
|
|
key: train_mcc
|
|
value: [0.98443509 0.9766081 0.98831165 0.98051435 0.9844054 0.9766081
|
|
0.97277537 0.98051405 0.9766081 0.9766081 ]
|
|
|
|
mean value: 0.9797388296638977
|
|
|
|
key: test_accuracy
|
|
value: [0.85964912 0.87719298 0.8245614 0.84210526 0.8245614 0.87719298
|
|
0.87719298 0.80701754 0.92982456 0.80701754]
|
|
|
|
mean value: 0.8526315789473684
|
|
|
|
key: train_accuracy
|
|
value: [0.99220273 0.98830409 0.99415205 0.99025341 0.99220273 0.98830409
|
|
0.98635478 0.99025341 0.98830409 0.98830409]
|
|
|
|
mean value: 0.9898635477582846
|
|
|
|
key: test_fscore
|
|
value: [0.86206897 0.87272727 0.83333333 0.84745763 0.82758621 0.87719298
|
|
0.87719298 0.81355932 0.92857143 0.80701754]
|
|
|
|
mean value: 0.85467076649703
|
|
|
|
key: train_fscore
|
|
value: [0.99215686 0.98828125 0.99415205 0.99025341 0.9921875 0.98832685
|
|
0.98646035 0.99029126 0.98832685 0.98832685]
|
|
|
|
mean value: 0.9898763225880247
|
|
|
|
key: test_precision
|
|
value: [0.86206897 0.92307692 0.80645161 0.83333333 0.82758621 0.86206897
|
|
0.86206897 0.77419355 0.92857143 0.79310345]
|
|
|
|
mean value: 0.8472523397996146
|
|
|
|
key: train_precision
|
|
value: [0.99606299 0.98828125 0.9922179 0.98832685 0.9921875 0.98832685
|
|
0.98076923 0.98837209 0.98832685 0.98832685]
|
|
|
|
mean value: 0.9891198357747265
|
|
|
|
key: test_recall
|
|
value: [0.86206897 0.82758621 0.86206897 0.86206897 0.82758621 0.89285714
|
|
0.89285714 0.85714286 0.92857143 0.82142857]
|
|
|
|
mean value: 0.863423645320197
|
|
|
|
key: train_recall
|
|
value: [0.98828125 0.98828125 0.99609375 0.9921875 0.9921875 0.98832685
|
|
0.9922179 0.9922179 0.98832685 0.98832685]
|
|
|
|
mean value: 0.9906447592412452
|
|
|
|
key: test_roc_auc
|
|
value: [0.85960591 0.87807882 0.82389163 0.84174877 0.82450739 0.87746305
|
|
0.87746305 0.80788177 0.92980296 0.80726601]
|
|
|
|
mean value: 0.8527709359605911
|
|
|
|
key: train_roc_auc
|
|
value: [0.9921951 0.98830405 0.99415582 0.99025717 0.9922027 0.98830405
|
|
0.98634332 0.99024957 0.98830405 0.98830405]
|
|
|
|
mean value: 0.9898619892996109
|
|
|
|
key: test_jcc
|
|
value: [0.75757576 0.77419355 0.71428571 0.73529412 0.70588235 0.78125
|
|
0.78125 0.68571429 0.86666667 0.67647059]
|
|
|
|
mean value: 0.7478583031453051
|
|
|
|
key: train_jcc
|
|
value: [0.9844358 0.97683398 0.98837209 0.98069498 0.98449612 0.97692308
|
|
0.97328244 0.98076923 0.97692308 0.97692308]
|
|
|
|
mean value: 0.9799653876535144
|
|
|
|
MCC on Blind test: 0.45
|
|
|
|
Accuracy on Blind test: 0.75
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.86680079 0.85779858 0.87546086 0.87063336 0.85156989 0.8605125
|
|
0.87048817 0.85345745 0.84471488 0.86324358]
|
|
|
|
mean value: 0.8614680051803589
|
|
|
|
key: score_time
|
|
value: [0.00975132 0.01042104 0.00950241 0.00965095 0.00974298 0.00994873
|
|
0.00963378 0.0094924 0.01010132 0.00952649]
|
|
|
|
mean value: 0.009777140617370606
|
|
|
|
key: test_mcc
|
|
value: [1. 0.93202124 0.93202124 0.92980296 0.8951918 0.8953202
|
|
0.96551724 0.96547546 0.96551724 0.96547546]
|
|
|
|
mean value: 0.9446342833958448
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.96491228 0.96491228 0.96491228 0.94736842 0.94736842
|
|
0.98245614 0.98245614 0.98245614 0.98245614]
|
|
|
|
mean value: 0.9719298245614034
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.96666667 0.96666667 0.96551724 0.94915254 0.94736842
|
|
0.98245614 0.98181818 0.98245614 0.98181818]
|
|
|
|
mean value: 0.9723920182476274
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.93548387 0.93548387 0.96551724 0.93333333 0.93103448
|
|
0.96551724 1. 0.96551724 1. ]
|
|
|
|
mean value: 0.9631887282165369
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 0.96551724 0.96551724 0.96428571
|
|
1. 0.96428571 1. 0.96428571]
|
|
|
|
mean value: 0.9823891625615764
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.96428571 0.96428571 0.96490148 0.94704433 0.9476601
|
|
0.98275862 0.98214286 0.98275862 0.98214286]
|
|
|
|
mean value: 0.9717980295566503
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.93548387 0.93548387 0.93333333 0.90322581 0.9
|
|
0.96551724 0.96428571 0.96551724 0.96428571]
|
|
|
|
mean value: 0.9467132793050479
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.89
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.03680491 0.03327942 0.03254247 0.03225303 0.03544617 0.03223276
|
|
0.03192067 0.03175664 0.03236866 0.03210115]
|
|
|
|
mean value: 0.03307058811187744
|
|
|
|
key: score_time
|
|
value: [0.0125246 0.01293564 0.01295924 0.0146606 0.01284409 0.01471877
|
|
0.01512527 0.0162437 0.01573086 0.02103305]
|
|
|
|
mean value: 0.014877581596374511
|
|
|
|
key: test_mcc
|
|
value: [0.6317806 0.58358651 0.622444 0.6166424 0.75462449 0.54759338
|
|
0.53222729 0.72706729 0.72064772 0.68850906]
|
|
|
|
mean value: 0.6425122759141091
|
|
|
|
key: train_mcc
|
|
value: [0.86827667 0.89478174 0.98443556 0.97687436 0.92846644 0.98452465
|
|
0.84554532 0.84765255 0.94314064 0.86288853]
|
|
|
|
mean value: 0.9136586453351799
|
|
|
|
key: test_accuracy
|
|
value: [0.80701754 0.78947368 0.80701754 0.80701754 0.87719298 0.77192982
|
|
0.75438596 0.85964912 0.85964912 0.84210526]
|
|
|
|
mean value: 0.8175438596491228
|
|
|
|
key: train_accuracy
|
|
value: [0.92982456 0.9454191 0.99220273 0.98830409 0.96296296 0.99220273
|
|
0.91812865 0.91812865 0.97076023 0.92787524]
|
|
|
|
mean value: 0.9545808966861599
|
|
|
|
key: test_fscore
|
|
value: [0.83076923 0.80645161 0.82539683 0.81967213 0.88135593 0.77966102
|
|
0.78125 0.86666667 0.85185185 0.84745763]
|
|
|
|
mean value: 0.8290532895006528
|
|
|
|
key: train_fscore
|
|
value: [0.93430657 0.94776119 0.9922179 0.98814229 0.96146045 0.99227799
|
|
0.92391304 0.92446043 0.96993988 0.93235832]
|
|
|
|
mean value: 0.9566838066212353
|
|
|
|
key: test_precision
|
|
value: [0.75 0.75757576 0.76470588 0.78125 0.86666667 0.74193548
|
|
0.69444444 0.8125 0.88461538 0.80645161]
|
|
|
|
mean value: 0.7860145232429387
|
|
|
|
key: train_precision
|
|
value: [0.87671233 0.90714286 0.98837209 1. 1. 0.98467433
|
|
0.86440678 0.85953177 1. 0.87931034]
|
|
|
|
mean value: 0.9360150505499005
|
|
|
|
key: test_recall
|
|
value: [0.93103448 0.86206897 0.89655172 0.86206897 0.89655172 0.82142857
|
|
0.89285714 0.92857143 0.82142857 0.89285714]
|
|
|
|
mean value: 0.8805418719211823
|
|
|
|
key: train_recall
|
|
value: [1. 0.9921875 0.99609375 0.9765625 0.92578125 1.
|
|
0.9922179 1. 0.94163424 0.9922179 ]
|
|
|
|
mean value: 0.9816695038910506
|
|
|
|
key: test_roc_auc
|
|
value: [0.80480296 0.78817734 0.80541872 0.80603448 0.87684729 0.77278325
|
|
0.7567734 0.86083744 0.85899015 0.8429803 ]
|
|
|
|
mean value: 0.8173645320197044
|
|
|
|
key: train_roc_auc
|
|
value: [0.92996109 0.94551009 0.9922103 0.98828125 0.96289062 0.9921875
|
|
0.91798395 0.91796875 0.97081712 0.92774957]
|
|
|
|
mean value: 0.9545560250486381
|
|
|
|
key: test_jcc
|
|
value: [0.71052632 0.67567568 0.7027027 0.69444444 0.78787879 0.63888889
|
|
0.64102564 0.76470588 0.74193548 0.73529412]
|
|
|
|
mean value: 0.7093077940276582
|
|
|
|
key: train_jcc
|
|
value: [0.87671233 0.90070922 0.98455598 0.9765625 0.92578125 0.98467433
|
|
0.85858586 0.85953177 0.94163424 0.87328767]
|
|
|
|
mean value: 0.9182035156322301
|
|
|
|
MCC on Blind test: 0.12
|
|
|
|
Accuracy on Blind test: 0.58
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03299141 0.03931022 0.04896212 0.03910494 0.03912568 0.0160625
|
|
0.01604271 0.01604271 0.04011297 0.03951836]
|
|
|
|
mean value: 0.03272736072540283
|
|
|
|
key: score_time
|
|
value: [0.01981163 0.01908088 0.02321577 0.01895833 0.01523948 0.01230335
|
|
0.01237154 0.01829481 0.01899481 0.01901555]
|
|
|
|
mean value: 0.017728614807128906
|
|
|
|
key: test_mcc
|
|
value: [0.96547546 0.96551724 0.8951918 0.8953202 0.8953202 0.89988258
|
|
0.86189955 0.92980296 0.96547546 0.92980296]
|
|
|
|
mean value: 0.9203688385840486
|
|
|
|
key: train_mcc
|
|
value: [0.96127828 0.96907736 0.97277661 0.96892956 0.96113155 0.97663743
|
|
0.96127477 0.96892768 0.96509421 0.96509421]
|
|
|
|
mean value: 0.9670221668898701
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.98245614 0.94736842 0.94736842 0.94736842 0.94736842
|
|
0.92982456 0.96491228 0.98245614 0.96491228]
|
|
|
|
mean value: 0.9596491228070175
|
|
|
|
key: train_accuracy
|
|
value: [0.98050682 0.98440546 0.98635478 0.98440546 0.98050682 0.98830409
|
|
0.98050682 0.98440546 0.98245614 0.98245614]
|
|
|
|
mean value: 0.9834307992202729
|
|
|
|
key: test_fscore
|
|
value: [0.98305085 0.98245614 0.94915254 0.94736842 0.94736842 0.94915254
|
|
0.93103448 0.96428571 0.98181818 0.96428571]
|
|
|
|
mean value: 0.9599973007807762
|
|
|
|
key: train_fscore
|
|
value: [0.98069498 0.98455598 0.98640777 0.98449612 0.98062016 0.98837209
|
|
0.98076923 0.98455598 0.98265896 0.98265896]
|
|
|
|
mean value: 0.983579023873464
|
|
|
|
key: test_precision
|
|
value: [0.96666667 1. 0.93333333 0.96428571 0.96428571 0.90322581
|
|
0.9 0.96428571 1. 0.96428571]
|
|
|
|
mean value: 0.956036866359447
|
|
|
|
key: train_precision
|
|
value: [0.96946565 0.97328244 0.98069498 0.97692308 0.97307692 0.98455598
|
|
0.96958175 0.97701149 0.97328244 0.97328244]
|
|
|
|
mean value: 0.9751157185652505
|
|
|
|
key: test_recall
|
|
value: [1. 0.96551724 0.96551724 0.93103448 0.93103448 1.
|
|
0.96428571 0.96428571 0.96428571 0.96428571]
|
|
|
|
mean value: 0.9650246305418719
|
|
|
|
key: train_recall
|
|
value: [0.9921875 0.99609375 0.9921875 0.9921875 0.98828125 0.9922179
|
|
0.9922179 0.9922179 0.9922179 0.9922179 ]
|
|
|
|
mean value: 0.9922026994163424
|
|
|
|
key: test_roc_auc
|
|
value: [0.98214286 0.98275862 0.94704433 0.9476601 0.9476601 0.94827586
|
|
0.93041872 0.96490148 0.98214286 0.96490148]
|
|
|
|
mean value: 0.9597906403940888
|
|
|
|
key: train_roc_auc
|
|
value: [0.98052955 0.9844282 0.98636612 0.9844206 0.98052195 0.98829645
|
|
0.98048395 0.9843902 0.98243707 0.98243707]
|
|
|
|
mean value: 0.9834311162451362
|
|
|
|
key: test_jcc
|
|
value: [0.96666667 0.96551724 0.90322581 0.9 0.9 0.90322581
|
|
0.87096774 0.93103448 0.96428571 0.93103448]
|
|
|
|
mean value: 0.9235957942687643
|
|
|
|
key: train_jcc
|
|
value: [0.96212121 0.96958175 0.97318008 0.96946565 0.96197719 0.97701149
|
|
0.96226415 0.96958175 0.96590909 0.96590909]
|
|
|
|
mean value: 0.9677001449029624
|
|
|
|
MCC on Blind test: 0.82
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.27509356 0.29359365 0.15107393 0.30144858 0.24914145 0.15170097
|
|
0.25802326 0.17746091 0.35042024 0.2620635 ]
|
|
|
|
mean value: 0.2470020055770874
|
|
|
|
key: score_time
|
|
value: [0.01926804 0.01318574 0.01766825 0.01989317 0.01275682 0.01305699
|
|
0.01279974 0.02457213 0.01348686 0.01300597]
|
|
|
|
mean value: 0.015969371795654295
|
|
|
|
key: test_mcc
|
|
value: [0.96547546 0.96551724 0.8951918 0.8953202 0.8953202 0.89988258
|
|
0.86189955 0.92980296 0.96547546 0.92980296]
|
|
|
|
mean value: 0.9203688385840486
|
|
|
|
key: train_mcc
|
|
value: [0.96127828 0.96907736 0.97277661 0.96892956 0.96113155 0.97663743
|
|
0.96127477 0.96892768 0.96509421 0.96509421]
|
|
|
|
mean value: 0.9670221668898701
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.98245614 0.94736842 0.94736842 0.94736842 0.94736842
|
|
0.92982456 0.96491228 0.98245614 0.96491228]
|
|
|
|
mean value: 0.9596491228070175
|
|
|
|
key: train_accuracy
|
|
value: [0.98050682 0.98440546 0.98635478 0.98440546 0.98050682 0.98830409
|
|
0.98050682 0.98440546 0.98245614 0.98245614]
|
|
|
|
mean value: 0.9834307992202729
|
|
|
|
key: test_fscore
|
|
value: [0.98305085 0.98245614 0.94915254 0.94736842 0.94736842 0.94915254
|
|
0.93103448 0.96428571 0.98181818 0.96428571]
|
|
|
|
mean value: 0.9599973007807762
|
|
|
|
key: train_fscore
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_sl.py:128: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
smnc_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_sl.py:131: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
smnc_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
value: [0.98069498 0.98455598 0.98640777 0.98449612 0.98062016 0.98837209
|
|
0.98076923 0.98455598 0.98265896 0.98265896]
|
|
|
|
mean value: 0.983579023873464
|
|
|
|
key: test_precision
|
|
value: [0.96666667 1. 0.93333333 0.96428571 0.96428571 0.90322581
|
|
0.9 0.96428571 1. 0.96428571]
|
|
|
|
mean value: 0.956036866359447
|
|
|
|
key: train_precision
|
|
value: [0.96946565 0.97328244 0.98069498 0.97692308 0.97307692 0.98455598
|
|
0.96958175 0.97701149 0.97328244 0.97328244]
|
|
|
|
mean value: 0.9751157185652505
|
|
|
|
key: test_recall
|
|
value: [1. 0.96551724 0.96551724 0.93103448 0.93103448 1.
|
|
0.96428571 0.96428571 0.96428571 0.96428571]
|
|
|
|
mean value: 0.9650246305418719
|
|
|
|
key: train_recall
|
|
value: [0.9921875 0.99609375 0.9921875 0.9921875 0.98828125 0.9922179
|
|
0.9922179 0.9922179 0.9922179 0.9922179 ]
|
|
|
|
mean value: 0.9922026994163424
|
|
|
|
key: test_roc_auc
|
|
value: [0.98214286 0.98275862 0.94704433 0.9476601 0.9476601 0.94827586
|
|
0.93041872 0.96490148 0.98214286 0.96490148]
|
|
|
|
mean value: 0.9597906403940888
|
|
|
|
key: train_roc_auc
|
|
value: [0.98052955 0.9844282 0.98636612 0.9844206 0.98052195 0.98829645
|
|
0.98048395 0.9843902 0.98243707 0.98243707]
|
|
|
|
mean value: 0.9834311162451362
|
|
|
|
key: test_jcc
|
|
value: [0.96666667 0.96551724 0.90322581 0.9 0.9 0.90322581
|
|
0.87096774 0.93103448 0.96428571 0.93103448]
|
|
|
|
mean value: 0.9235957942687643
|
|
|
|
key: train_jcc
|
|
value: [0.96212121 0.96958175 0.97318008 0.96946565 0.96197719 0.97701149
|
|
0.96226415 0.96958175 0.96590909 0.96590909]
|
|
|
|
mean value: 0.9677001449029624
|
|
|
|
MCC on Blind test: 0.82
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02893591 0.03623009 0.03782225 0.03774071 0.0414505 0.03827882
|
|
0.0374217 0.03790164 0.03869176 0.03810287]
|
|
|
|
mean value: 0.03725762367248535
|
|
|
|
key: score_time
|
|
value: [0.01233792 0.0122304 0.01600552 0.01422644 0.01441526 0.01328301
|
|
0.01476669 0.01448369 0.01468158 0.01480675]
|
|
|
|
mean value: 0.01412372589111328
|
|
|
|
key: test_mcc
|
|
value: [0.86189955 0.8951918 0.71921182 0.8951918 0.7366424 0.89988258
|
|
0.8953202 0.7589669 0.7589669 0.79778885]
|
|
|
|
mean value: 0.8219062797841707
|
|
|
|
key: train_mcc
|
|
value: [0.88315143 0.91830593 0.91050744 0.90654547 0.90309643 0.92985363
|
|
0.91433828 0.91447603 0.89887645 0.92593212]
|
|
|
|
mean value: 0.9105083215666911
|
|
|
|
key: test_accuracy
|
|
value: [0.92982456 0.94736842 0.85964912 0.94736842 0.85964912 0.94736842
|
|
0.94736842 0.87719298 0.87719298 0.89473684]
|
|
|
|
mean value: 0.9087719298245613
|
|
|
|
key: train_accuracy
|
|
value: [0.94152047 0.95906433 0.95516569 0.95321637 0.95126706 0.96491228
|
|
0.95711501 0.95711501 0.94931774 0.96296296]
|
|
|
|
mean value: 0.9551656920077972
|
|
|
|
key: test_fscore
|
|
value: [0.92857143 0.94915254 0.86206897 0.94915254 0.84615385 0.94915254
|
|
0.94736842 0.88135593 0.88135593 0.9 ]
|
|
|
|
mean value: 0.9094332152820572
|
|
|
|
key: train_fscore
|
|
value: [0.94186047 0.95938104 0.95551257 0.95348837 0.95201536 0.96484375
|
|
0.95752896 0.95769231 0.95 0.9631068 ]
|
|
|
|
mean value: 0.9555429620654722
|
|
|
|
key: test_precision
|
|
value: [0.96296296 0.93333333 0.86206897 0.93333333 0.95652174 0.90322581
|
|
0.93103448 0.83870968 0.83870968 0.84375 ]
|
|
|
|
mean value: 0.9003649978326249
|
|
|
|
key: train_precision
|
|
value: [0.93461538 0.95019157 0.94636015 0.94615385 0.93584906 0.96862745
|
|
0.95019157 0.94676806 0.9391635 0.96124031]
|
|
|
|
mean value: 0.9479160902385434
|
|
|
|
key: test_recall
|
|
value: [0.89655172 0.96551724 0.86206897 0.96551724 0.75862069 1.
|
|
0.96428571 0.92857143 0.92857143 0.96428571]
|
|
|
|
mean value: 0.9233990147783251
|
|
|
|
key: train_recall
|
|
value: [0.94921875 0.96875 0.96484375 0.9609375 0.96875 0.96108949
|
|
0.96498054 0.9688716 0.96108949 0.96498054]
|
|
|
|
mean value: 0.9633511673151751
|
|
|
|
key: test_roc_auc
|
|
value: [0.93041872 0.94704433 0.85960591 0.94704433 0.8614532 0.94827586
|
|
0.9476601 0.87807882 0.87807882 0.89593596]
|
|
|
|
mean value: 0.9093596059113301
|
|
|
|
key: train_roc_auc
|
|
value: [0.94153545 0.95908317 0.95518452 0.9532314 0.95130107 0.96491975
|
|
0.95709965 0.95709205 0.94929475 0.96295902]
|
|
|
|
mean value: 0.9551700814688716
|
|
|
|
key: test_jcc
|
|
value: [0.86666667 0.90322581 0.75757576 0.90322581 0.73333333 0.90322581
|
|
0.9 0.78787879 0.78787879 0.81818182]
|
|
|
|
mean value: 0.836119257086999
|
|
|
|
key: train_jcc
|
|
value: [0.89010989 0.92193309 0.91481481 0.91111111 0.90842491 0.93207547
|
|
0.91851852 0.91881919 0.9047619 0.92883895]
|
|
|
|
mean value: 0.9149407844443863
|
|
|
|
MCC on Blind test: 0.89
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.03073525 0.89607024 0.88266683 1.02899694 0.90770841 0.94768596
|
|
0.87751889 1.00172234 0.94122171 0.88169622]
|
|
|
|
mean value: 0.939602279663086
|
|
|
|
key: score_time
|
|
value: [0.01449108 0.01500177 0.01603222 0.01596642 0.0146718 0.01587105
|
|
0.01488566 0.01601124 0.01503849 0.01837087]
|
|
|
|
mean value: 0.01563405990600586
|
|
|
|
key: test_mcc
|
|
value: [0.96551724 0.9321832 0.86189955 0.89988258 0.9321832 0.9321832
|
|
0.8615634 0.89988258 0.93202124 0.82880708]
|
|
|
|
mean value: 0.9046123268422085
|
|
|
|
key: train_mcc
|
|
value: [1. 0.98443556 1. 1. 0.9844054 0.98443509
|
|
1. 0.99610895 1. 1. ]
|
|
|
|
mean value: 0.9949384998413872
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.96491228 0.92982456 0.94736842 0.96491228 0.96491228
|
|
0.92982456 0.94736842 0.96491228 0.9122807 ]
|
|
|
|
mean value: 0.9508771929824561
|
|
|
|
key: train_accuracy
|
|
value: [1. 0.99220273 1. 1. 0.99220273 0.99220273
|
|
1. 0.99805068 1. 1. ]
|
|
|
|
mean value: 0.9974658869395712
|
|
|
|
key: test_fscore
|
|
value: [0.98245614 0.96428571 0.92857143 0.94545455 0.96428571 0.96551724
|
|
0.92592593 0.94915254 0.96296296 0.90566038]
|
|
|
|
mean value: 0.9494272592947851
|
|
|
|
key: train_fscore
|
|
value: [1. 0.9922179 1. 1. 0.9921875 0.99224806
|
|
1. 0.99805068 1. 1. ]
|
|
|
|
mean value: 0.9974704143109397
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.96296296 1. 1. 0.93333333
|
|
0.96153846 0.90322581 1. 0.96 ]
|
|
|
|
mean value: 0.972106056428637
|
|
|
|
key: train_precision
|
|
value: [1. 0.98837209 1. 1. 0.9921875 0.98841699
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9968976581440244
|
|
|
|
key: test_recall
|
|
value: [0.96551724 0.93103448 0.89655172 0.89655172 0.93103448 1.
|
|
0.89285714 1. 0.92857143 0.85714286]
|
|
|
|
mean value: 0.9299261083743843
|
|
|
|
key: train_recall
|
|
value: [1. 0.99609375 1. 1. 0.9921875 0.99610895
|
|
1. 0.99610895 1. 1. ]
|
|
|
|
mean value: 0.9980499148832684
|
|
|
|
key: test_roc_auc
|
|
value: [0.98275862 0.96551724 0.93041872 0.94827586 0.96551724 0.96551724
|
|
0.92918719 0.94827586 0.96428571 0.91133005]
|
|
|
|
mean value: 0.9511083743842365
|
|
|
|
key: train_roc_auc
|
|
value: [1. 0.9922103 1. 1. 0.9922027 0.9921951
|
|
1. 0.99805447 1. 1. ]
|
|
|
|
mean value: 0.9974662572957198
|
|
|
|
key: test_jcc
|
|
value: [0.96551724 0.93103448 0.86666667 0.89655172 0.93103448 0.93333333
|
|
0.86206897 0.90322581 0.92857143 0.82758621]
|
|
|
|
mean value: 0.9045590338471318
|
|
|
|
key: train_jcc
|
|
value: [1. 0.98455598 1. 1. 0.98449612 0.98461538
|
|
1. 0.99610895 1. 1. ]
|
|
|
|
mean value: 0.994977644261872
|
|
|
|
MCC on Blind test: 0.84
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01544452 0.01079631 0.01084924 0.01048374 0.01036882 0.01137543
|
|
0.01046872 0.01074934 0.01117849 0.01048636]
|
|
|
|
mean value: 0.011220097541809082
|
|
|
|
key: score_time
|
|
value: [0.01254582 0.00956702 0.00937724 0.00905204 0.00924778 0.00919628
|
|
0.00945115 0.00926661 0.00934625 0.00930095]
|
|
|
|
mean value: 0.009635114669799804
|
|
|
|
key: test_mcc
|
|
value: [0.47938227 0.73477227 0.65018988 0.35662633 0.43842365 0.76689254
|
|
0.54377353 0.553659 0.68850906 0.68472906]
|
|
|
|
mean value: 0.5896957592225097
|
|
|
|
key: train_mcc
|
|
value: [0.59993634 0.57835025 0.59632528 0.59569245 0.63963001 0.67451781
|
|
0.61484606 0.59641724 0.65352889 0.68074744]
|
|
|
|
mean value: 0.6229991760196517
|
|
|
|
key: test_accuracy
|
|
value: [0.73684211 0.85964912 0.8245614 0.66666667 0.71929825 0.87719298
|
|
0.77192982 0.77192982 0.84210526 0.84210526]
|
|
|
|
mean value: 0.7912280701754386
|
|
|
|
key: train_accuracy
|
|
value: [0.79727096 0.78557505 0.79532164 0.79532164 0.81871345 0.83625731
|
|
0.80506823 0.79727096 0.8245614 0.83625731]
|
|
|
|
mean value: 0.8091617933723196
|
|
|
|
key: test_fscore
|
|
value: [0.76190476 0.875 0.83333333 0.72463768 0.72413793 0.8852459
|
|
0.76363636 0.78688525 0.84745763 0.84210526]
|
|
|
|
mean value: 0.8044344108885885
|
|
|
|
key: train_fscore
|
|
value: [0.80952381 0.80072464 0.80804388 0.80733945 0.82551595 0.84269663
|
|
0.81684982 0.78947368 0.83455882 0.84837545]
|
|
|
|
mean value: 0.8183102124965754
|
|
|
|
key: test_precision
|
|
value: [0.70588235 0.8 0.80645161 0.625 0.72413793 0.81818182
|
|
0.77777778 0.72727273 0.80645161 0.82758621]
|
|
|
|
mean value: 0.7618742039910986
|
|
|
|
key: train_precision
|
|
value: [0.76206897 0.74662162 0.75945017 0.76124567 0.79422383 0.81227437
|
|
0.7716263 0.82278481 0.79094077 0.79124579]
|
|
|
|
mean value: 0.7812482294147253
|
|
|
|
key: test_recall
|
|
value: [0.82758621 0.96551724 0.86206897 0.86206897 0.72413793 0.96428571
|
|
0.75 0.85714286 0.89285714 0.85714286]
|
|
|
|
mean value: 0.8562807881773399
|
|
|
|
key: train_recall
|
|
value: [0.86328125 0.86328125 0.86328125 0.859375 0.859375 0.87548638
|
|
0.86770428 0.75875486 0.88326848 0.91439689]
|
|
|
|
mean value: 0.8608204644941634
|
|
|
|
key: test_roc_auc
|
|
value: [0.73522167 0.85775862 0.82389163 0.66317734 0.71921182 0.87869458
|
|
0.77155172 0.77339901 0.8429803 0.84236453]
|
|
|
|
mean value: 0.7908251231527094
|
|
|
|
key: train_roc_auc
|
|
value: [0.79739938 0.78572623 0.79545385 0.79544625 0.81879256 0.83618069
|
|
0.80494589 0.79734618 0.82444674 0.83610469]
|
|
|
|
mean value: 0.8091842473249027
|
|
|
|
key: test_jcc
|
|
value: [0.61538462 0.77777778 0.71428571 0.56818182 0.56756757 0.79411765
|
|
0.61764706 0.64864865 0.73529412 0.72727273]
|
|
|
|
mean value: 0.6766177692648281
|
|
|
|
key: train_jcc
|
|
value: [0.68 0.66767372 0.67791411 0.67692308 0.7028754 0.72815534
|
|
0.69040248 0.65217391 0.71608833 0.73667712]
|
|
|
|
mean value: 0.6928883476418292
|
|
|
|
MCC on Blind test: 0.72
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01124644 0.0112834 0.0114224 0.01132727 0.01072431 0.01181841
|
|
0.01175499 0.01123571 0.01174855 0.01170516]
|
|
|
|
mean value: 0.011426663398742676
|
|
|
|
key: score_time
|
|
value: [0.00936103 0.00995636 0.00940537 0.00954866 0.00949144 0.00949359
|
|
0.00991058 0.00980997 0.00985241 0.00984097]
|
|
|
|
mean value: 0.009667038917541504
|
|
|
|
key: test_mcc
|
|
value: [0.65104858 0.65018988 0.57881773 0.54377353 0.51048128 0.58562417
|
|
0.62473685 0.53222729 0.72064772 0.54377353]
|
|
|
|
mean value: 0.5941320574687631
|
|
|
|
key: train_mcc
|
|
value: [0.66095589 0.64523042 0.66547519 0.6456446 0.66679649 0.62183277
|
|
0.66093013 0.66861729 0.64578738 0.66541423]
|
|
|
|
mean value: 0.6546684397270505
|
|
|
|
key: test_accuracy
|
|
value: [0.8245614 0.8245614 0.78947368 0.77192982 0.75438596 0.78947368
|
|
0.80701754 0.75438596 0.85964912 0.77192982]
|
|
|
|
mean value: 0.7947368421052632
|
|
|
|
key: train_accuracy
|
|
value: [0.83040936 0.82261209 0.83235867 0.82261209 0.83235867 0.81091618
|
|
0.83040936 0.83430799 0.82261209 0.83235867]
|
|
|
|
mean value: 0.8270955165692008
|
|
|
|
key: test_fscore
|
|
value: [0.82142857 0.83333333 0.79310345 0.77966102 0.75 0.8
|
|
0.81967213 0.78125 0.85185185 0.76363636]
|
|
|
|
mean value: 0.7993936716622676
|
|
|
|
key: train_fscore
|
|
value: [0.83172147 0.82261209 0.83587786 0.82533589 0.83834586 0.81165049
|
|
0.83236994 0.83495146 0.82666667 0.8365019 ]
|
|
|
|
mean value: 0.8296033627312248
|
|
|
|
key: test_precision
|
|
value: [0.85185185 0.80645161 0.79310345 0.76666667 0.77777778 0.75
|
|
0.75757576 0.69444444 0.88461538 0.77777778]
|
|
|
|
mean value: 0.7860264721888749
|
|
|
|
key: train_precision
|
|
value: [0.82375479 0.82101167 0.81716418 0.81132075 0.80797101 0.81007752
|
|
0.82442748 0.83333333 0.80970149 0.81784387]
|
|
|
|
mean value: 0.817660610307552
|
|
|
|
key: test_recall
|
|
value: [0.79310345 0.86206897 0.79310345 0.79310345 0.72413793 0.85714286
|
|
0.89285714 0.89285714 0.82142857 0.75 ]
|
|
|
|
mean value: 0.8179802955665024
|
|
|
|
key: train_recall
|
|
value: [0.83984375 0.82421875 0.85546875 0.83984375 0.87109375 0.81322957
|
|
0.84046693 0.83657588 0.84435798 0.85603113]
|
|
|
|
mean value: 0.8421130228599222
|
|
|
|
key: test_roc_auc
|
|
value: [0.82512315 0.82389163 0.78940887 0.77155172 0.75492611 0.79064039
|
|
0.80849754 0.7567734 0.85899015 0.77155172]
|
|
|
|
mean value: 0.7951354679802956
|
|
|
|
key: train_roc_auc
|
|
value: [0.83042771 0.82261521 0.83240364 0.82264561 0.83243403 0.81091166
|
|
0.83038971 0.83430356 0.82256961 0.83231244]
|
|
|
|
mean value: 0.8271013193093385
|
|
|
|
key: test_jcc
|
|
value: [0.6969697 0.71428571 0.65714286 0.63888889 0.6 0.66666667
|
|
0.69444444 0.64102564 0.74193548 0.61764706]
|
|
|
|
mean value: 0.6669006452118407
|
|
|
|
key: train_jcc
|
|
value: [0.71192053 0.6986755 0.71803279 0.70261438 0.72168285 0.68300654
|
|
0.71287129 0.71666667 0.70454545 0.71895425]
|
|
|
|
mean value: 0.7088970233011279
|
|
|
|
MCC on Blind test: 0.47
|
|
|
|
Accuracy on Blind test: 0.72
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.01082969 0.011024 0.01055241 0.0112505 0.01139307 0.01134706
|
|
0.01113677 0.01125765 0.01113772 0.01104569]
|
|
|
|
mean value: 0.011097455024719238
|
|
|
|
key: score_time
|
|
value: [0.01366639 0.01299024 0.01354313 0.01347542 0.01680923 0.01358867
|
|
0.01365185 0.01317978 0.01348186 0.01677513]
|
|
|
|
mean value: 0.014116168022155762
|
|
|
|
key: test_mcc
|
|
value: [0.58562417 0.59358067 0.40320623 0.37898808 0.33374384 0.54433498
|
|
0.47938227 0.23201635 0.43881637 0.4464279 ]
|
|
|
|
mean value: 0.443612085640124
|
|
|
|
key: train_mcc
|
|
value: [0.68941296 0.71041292 0.67593155 0.70540158 0.65887319 0.69002013
|
|
0.69956718 0.66878238 0.67386518 0.68754157]
|
|
|
|
mean value: 0.6859808624868353
|
|
|
|
key: test_accuracy
|
|
value: [0.78947368 0.78947368 0.70175439 0.68421053 0.66666667 0.77192982
|
|
0.73684211 0.61403509 0.71929825 0.71929825]
|
|
|
|
mean value: 0.7192982456140351
|
|
|
|
key: train_accuracy
|
|
value: [0.84405458 0.85380117 0.83625731 0.85185185 0.82846004 0.84210526
|
|
0.84795322 0.83235867 0.83625731 0.84210526]
|
|
|
|
mean value: 0.8415204678362573
|
|
|
|
key: test_fscore
|
|
value: [0.77777778 0.76923077 0.71186441 0.65384615 0.66666667 0.77192982
|
|
0.70588235 0.63333333 0.7037037 0.68 ]
|
|
|
|
mean value: 0.7074234988840645
|
|
|
|
key: train_fscore
|
|
value: [0.83870968 0.84662577 0.82716049 0.84615385 0.82113821 0.83160083
|
|
0.84016393 0.82304527 0.8313253 0.83435583]
|
|
|
|
mean value: 0.8340279158596092
|
|
|
|
key: test_precision
|
|
value: [0.84 0.86956522 0.7 0.73913043 0.67857143 0.75862069
|
|
0.7826087 0.59375 0.73076923 0.77272727]
|
|
|
|
mean value: 0.7465742969549192
|
|
|
|
key: train_precision
|
|
value: [0.86666667 0.88841202 0.87391304 0.87815126 0.8559322 0.89285714
|
|
0.88744589 0.87336245 0.85892116 0.87931034]
|
|
|
|
mean value: 0.8754972173577531
|
|
|
|
key: test_recall
|
|
value: [0.72413793 0.68965517 0.72413793 0.5862069 0.65517241 0.78571429
|
|
0.64285714 0.67857143 0.67857143 0.60714286]
|
|
|
|
mean value: 0.6772167487684729
|
|
|
|
key: train_recall
|
|
value: [0.8125 0.80859375 0.78515625 0.81640625 0.7890625 0.77821012
|
|
0.79766537 0.77821012 0.80544747 0.79377432]
|
|
|
|
mean value: 0.7965026142996109
|
|
|
|
key: test_roc_auc
|
|
value: [0.79064039 0.79125616 0.70135468 0.68596059 0.66687192 0.77216749
|
|
0.73522167 0.61514778 0.71859606 0.71736453]
|
|
|
|
mean value: 0.7194581280788177
|
|
|
|
key: train_roc_auc
|
|
value: [0.84399319 0.85371322 0.83615789 0.85178289 0.82838339 0.84223006
|
|
0.84805143 0.83246443 0.83631749 0.84219966]
|
|
|
|
mean value: 0.8415293652723735
|
|
|
|
key: test_jcc
|
|
value: [0.63636364 0.625 0.55263158 0.48571429 0.5 0.62857143
|
|
0.54545455 0.46341463 0.54285714 0.51515152]
|
|
|
|
mean value: 0.5495158767206264
|
|
|
|
key: train_jcc
|
|
value: [0.72222222 0.73404255 0.70526316 0.73333333 0.69655172 0.71174377
|
|
0.72438163 0.6993007 0.71134021 0.71578947]
|
|
|
|
mean value: 0.7153968767633878
|
|
|
|
MCC on Blind test: 0.12
|
|
|
|
Accuracy on Blind test: 0.56
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02510953 0.02347946 0.02306366 0.02614355 0.02323389 0.02468395
|
|
0.02582169 0.02323365 0.02800202 0.02702546]
|
|
|
|
mean value: 0.024979686737060545
|
|
|
|
key: score_time
|
|
value: [0.01257253 0.01330853 0.01220036 0.01338387 0.01318002 0.01259565
|
|
0.01303172 0.01234198 0.01249027 0.0124352 ]
|
|
|
|
mean value: 0.012754011154174804
|
|
|
|
key: test_mcc
|
|
value: [0.75808552 0.93202124 0.65018988 0.75808552 0.50862069 0.82942474
|
|
0.83797038 0.56277738 0.7366424 0.72706729]
|
|
|
|
mean value: 0.7300885045397139
|
|
|
|
key: train_mcc
|
|
value: [0.80503807 0.82685702 0.82430977 0.82136234 0.82365101 0.79844785
|
|
0.80135931 0.83061083 0.7827491 0.83061083]
|
|
|
|
mean value: 0.8144996123456597
|
|
|
|
key: test_accuracy
|
|
value: [0.87719298 0.96491228 0.8245614 0.87719298 0.75438596 0.9122807
|
|
0.9122807 0.77192982 0.85964912 0.85964912]
|
|
|
|
mean value: 0.8614035087719298
|
|
|
|
key: train_accuracy
|
|
value: [0.9005848 0.9122807 0.91033138 0.90838207 0.90838207 0.89668616
|
|
0.89863548 0.9122807 0.88888889 0.9122807 ]
|
|
|
|
mean value: 0.9048732943469785
|
|
|
|
key: test_fscore
|
|
value: [0.8852459 0.96666667 0.83333333 0.8852459 0.75862069 0.91525424
|
|
0.91803279 0.79365079 0.87096774 0.86666667]
|
|
|
|
mean value: 0.8693684719360186
|
|
|
|
key: train_fscore
|
|
value: [0.90502793 0.91525424 0.9141791 0.91280148 0.91376147 0.90239411
|
|
0.9037037 0.91743119 0.89502762 0.91743119]
|
|
|
|
mean value: 0.9097012046994798
|
|
|
|
key: test_precision
|
|
value: [0.84375 0.93548387 0.80645161 0.84375 0.75862069 0.87096774
|
|
0.84848485 0.71428571 0.79411765 0.8125 ]
|
|
|
|
mean value: 0.822841212529101
|
|
|
|
key: train_precision
|
|
value: [0.86476868 0.88363636 0.875 0.86925795 0.8615917 0.85664336
|
|
0.86219081 0.86805556 0.84965035 0.86805556]
|
|
|
|
mean value: 0.8658850323067816
|
|
|
|
key: test_recall
|
|
value: [0.93103448 1. 0.86206897 0.93103448 0.75862069 0.96428571
|
|
1. 0.89285714 0.96428571 0.92857143]
|
|
|
|
mean value: 0.9232758620689655
|
|
|
|
key: train_recall
|
|
value: [0.94921875 0.94921875 0.95703125 0.9609375 0.97265625 0.95330739
|
|
0.94941634 0.97276265 0.94552529 0.97276265]
|
|
|
|
mean value: 0.9582836819066147
|
|
|
|
key: test_roc_auc
|
|
value: [0.87623153 0.96428571 0.82389163 0.87623153 0.75431034 0.91317734
|
|
0.9137931 0.77401478 0.8614532 0.86083744]
|
|
|
|
mean value: 0.8618226600985222
|
|
|
|
key: train_roc_auc
|
|
value: [0.90067941 0.91235257 0.91042224 0.90848431 0.90850711 0.89657557
|
|
0.8985363 0.91216257 0.88877827 0.91216257]
|
|
|
|
mean value: 0.9048660931420234
|
|
|
|
key: test_jcc
|
|
value: [0.79411765 0.93548387 0.71428571 0.79411765 0.61111111 0.84375
|
|
0.84848485 0.65789474 0.77142857 0.76470588]
|
|
|
|
mean value: 0.773538002959068
|
|
|
|
key: train_jcc
|
|
value: [0.82653061 0.84375 0.8419244 0.83959044 0.84121622 0.82214765
|
|
0.82432432 0.84745763 0.81 0.84745763]
|
|
|
|
mean value: 0.8344398900340875
|
|
|
|
MCC on Blind test: 0.82
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [2.04724884 2.13221836 2.09088874 1.98468447 2.13889074 2.03775096
|
|
2.05242133 2.10154319 2.35233235 2.07667589]
|
|
|
|
mean value: 2.1014654874801635
|
|
|
|
key: score_time
|
|
value: [0.0151341 0.01452017 0.01477504 0.02222133 0.01449585 0.0148468
|
|
0.01498914 0.01492929 0.01478672 0.01493883]
|
|
|
|
mean value: 0.015563726425170898
|
|
|
|
key: test_mcc
|
|
value: [0.9321832 0.96551724 0.86189955 0.86851042 0.79778885 0.9321832
|
|
0.86851042 0.86851042 0.89952865 0.78940887]
|
|
|
|
mean value: 0.8784040805411836
|
|
|
|
key: train_mcc
|
|
value: [0.99610889 1. 1. 0.99610889 1. 1.
|
|
0.99610895 1. 0.99610895 0.99610895]
|
|
|
|
mean value: 0.9980544629026649
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.98245614 0.92982456 0.92982456 0.89473684 0.96491228
|
|
0.92982456 0.92982456 0.94736842 0.89473684]
|
|
|
|
mean value: 0.9368421052631579
|
|
|
|
key: train_accuracy
|
|
value: [0.99805068 1. 1. 0.99805068 1. 1.
|
|
0.99805068 1. 0.99805068 0.99805068]
|
|
|
|
mean value: 0.9990253411306043
|
|
|
|
key: test_fscore
|
|
value: [0.96428571 0.98245614 0.92857143 0.92592593 0.88888889 0.96551724
|
|
0.93333333 0.93333333 0.94339623 0.89285714]
|
|
|
|
mean value: 0.9358565375341049
|
|
|
|
key: train_fscore
|
|
value: [0.99804305 1. 1. 0.99804305 1. 1.
|
|
0.99805068 1. 0.99805068 0.99805068]
|
|
|
|
mean value: 0.9990238152458772
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.96296296 1. 0.96 0.93333333
|
|
0.875 0.875 1. 0.89285714]
|
|
|
|
mean value: 0.9499153439153439
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.93103448 0.96551724 0.89655172 0.86206897 0.82758621 1.
|
|
1. 1. 0.89285714 0.89285714]
|
|
|
|
mean value: 0.9268472906403941
|
|
|
|
key: train_recall
|
|
value: [0.99609375 1. 1. 0.99609375 1. 1.
|
|
0.99610895 1. 0.99610895 0.99610895]
|
|
|
|
mean value: 0.9980514348249028
|
|
|
|
key: test_roc_auc
|
|
value: [0.96551724 0.98275862 0.93041872 0.93103448 0.89593596 0.96551724
|
|
0.93103448 0.93103448 0.94642857 0.89470443]
|
|
|
|
mean value: 0.9374384236453202
|
|
|
|
key: train_roc_auc
|
|
value: [0.99804688 1. 1. 0.99804688 1. 1.
|
|
0.99805447 1. 0.99805447 0.99805447]
|
|
|
|
mean value: 0.9990257174124514
|
|
|
|
key: test_jcc
|
|
value: [0.93103448 0.96551724 0.86666667 0.86206897 0.8 0.93333333
|
|
0.875 0.875 0.89285714 0.80645161]
|
|
|
|
mean value: 0.8807929445415541
|
|
|
|
key: train_jcc
|
|
value: [0.99609375 1. 1. 0.99609375 1. 1.
|
|
0.99610895 1. 0.99610895 0.99610895]
|
|
|
|
mean value: 0.9980514348249028
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02580142 0.02164054 0.02364039 0.01933789 0.02146482 0.02186465
|
|
0.0194304 0.0223105 0.02421784 0.02074671]
|
|
|
|
mean value: 0.02204551696777344
|
|
|
|
key: score_time
|
|
value: [0.01239204 0.00951767 0.0091331 0.00935173 0.00931621 0.00931072
|
|
0.00933576 0.00936174 0.00928426 0.00949907]
|
|
|
|
mean value: 0.009650230407714844
|
|
|
|
key: test_mcc
|
|
value: [0.96551724 0.96551724 1. 0.9321832 0.96551724 0.82490815
|
|
0.96551724 0.82512315 1. 1. ]
|
|
|
|
mean value: 0.9444283466299732
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.98245614 1. 0.96491228 0.98245614 0.9122807
|
|
0.98245614 0.9122807 1. 1. ]
|
|
|
|
mean value: 0.9719298245614034
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98245614 0.98245614 1. 0.96428571 0.98245614 0.90909091
|
|
0.98245614 0.9122807 1. 1. ]
|
|
|
|
mean value: 0.9715481886534518
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 1. 1. 1. 0.92592593
|
|
0.96551724 0.89655172 1. 1. ]
|
|
|
|
mean value: 0.9787994891443167
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96551724 0.96551724 1. 0.93103448 0.96551724 0.89285714
|
|
1. 0.92857143 1. 1. ]
|
|
|
|
mean value: 0.9649014778325123
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98275862 0.98275862 1. 0.96551724 0.98275862 0.91194581
|
|
0.98275862 0.91256158 1. 1. ]
|
|
|
|
mean value: 0.9721059113300493
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96551724 0.96551724 1. 0.93103448 0.96551724 0.83333333
|
|
0.96551724 0.83870968 1. 1. ]
|
|
|
|
mean value: 0.9465146459028551
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.84
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.12869287 0.13245654 0.13260221 0.13266063 0.13408566 0.12997079
|
|
0.12685227 0.12470031 0.12397742 0.1276443 ]
|
|
|
|
mean value: 0.1293642997741699
|
|
|
|
key: score_time
|
|
value: [0.01836562 0.02021575 0.01952505 0.01995015 0.01989484 0.01940989
|
|
0.01952267 0.01921463 0.01887035 0.01998615]
|
|
|
|
mean value: 0.019495511054992677
|
|
|
|
key: test_mcc
|
|
value: [0.9321832 0.96551724 0.82880708 0.93202124 0.7589669 0.89988258
|
|
0.86851042 0.7589669 1. 0.82942474]
|
|
|
|
mean value: 0.8774280298980771
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.98245614 0.9122807 0.96491228 0.87719298 0.94736842
|
|
0.92982456 0.87719298 1. 0.9122807 ]
|
|
|
|
mean value: 0.9368421052631579
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.96428571 0.98245614 0.91803279 0.96666667 0.87272727 0.94915254
|
|
0.93333333 0.88135593 1. 0.91525424]
|
|
|
|
mean value: 0.9383264626113517
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.875 0.93548387 0.92307692 0.90322581
|
|
0.875 0.83870968 1. 0.87096774]
|
|
|
|
mean value: 0.9221464019851117
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.93103448 0.96551724 0.96551724 1. 0.82758621 1.
|
|
1. 0.92857143 1. 0.96428571]
|
|
|
|
mean value: 0.9582512315270936
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96551724 0.98275862 0.91133005 0.96428571 0.87807882 0.94827586
|
|
0.93103448 0.87807882 1. 0.91317734]
|
|
|
|
mean value: 0.9372536945812808
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.93103448 0.96551724 0.84848485 0.93548387 0.77419355 0.90322581
|
|
0.875 0.78787879 1. 0.84375 ]
|
|
|
|
mean value: 0.8864568586308019
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.7
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01136732 0.01122093 0.01079583 0.01113224 0.01157928 0.01075482
|
|
0.01083755 0.01173544 0.01077223 0.01102138]
|
|
|
|
mean value: 0.011121702194213868
|
|
|
|
key: score_time
|
|
value: [0.00963473 0.00953341 0.00986958 0.00951362 0.00952601 0.00972843
|
|
0.00951052 0.00955749 0.00977182 0.00967717]
|
|
|
|
mean value: 0.009632277488708495
|
|
|
|
key: test_mcc
|
|
value: [0.64889453 0.92980296 0.93202124 0.65104858 0.50182897 0.75492611
|
|
0.76550573 0.68434084 0.96547546 0.72064772]
|
|
|
|
mean value: 0.7554492130393143
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.80701754 0.96491228 0.96491228 0.8245614 0.73684211 0.87719298
|
|
0.87719298 0.84210526 0.98245614 0.85964912]
|
|
|
|
mean value: 0.8736842105263157
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.7755102 0.96551724 0.96666667 0.82142857 0.69387755 0.87719298
|
|
0.8627451 0.83636364 0.98181818 0.85185185]
|
|
|
|
mean value: 0.8632971985105615
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.95 0.96551724 0.93548387 0.85185185 0.85 0.86206897
|
|
0.95652174 0.85185185 1. 0.88461538]
|
|
|
|
mean value: 0.9107910905313816
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.65517241 0.96551724 1. 0.79310345 0.5862069 0.89285714
|
|
0.78571429 0.82142857 0.96428571 0.82142857]
|
|
|
|
mean value: 0.8285714285714285
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.80972906 0.96490148 0.96428571 0.82512315 0.73953202 0.87746305
|
|
0.87561576 0.84174877 0.98214286 0.85899015]
|
|
|
|
mean value: 0.8739532019704433
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.63333333 0.93333333 0.93548387 0.6969697 0.53125 0.78125
|
|
0.75862069 0.71875 0.96428571 0.74193548]
|
|
|
|
mean value: 0.769521212241596
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.48
|
|
|
|
Accuracy on Blind test: 0.78
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.81794739 1.80400729 1.78327441 1.78702116 1.78159261 1.77446675
|
|
1.77745032 1.78069067 1.81510329 1.79544592]
|
|
|
|
mean value: 1.7916999816894532
|
|
|
|
key: score_time
|
|
value: [0.09758425 0.09366488 0.09205317 0.09208465 0.09284616 0.09224868
|
|
0.09185553 0.09374571 0.09647107 0.14306402]
|
|
|
|
mean value: 0.09856181144714356
|
|
|
|
key: test_mcc
|
|
value: [1. 1. 0.96547546 0.96551724 0.8951918 0.9321832
|
|
0.9321832 0.89988258 1. 0.96551724]
|
|
|
|
mean value: 0.9555950718602617
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 1. 0.98245614 0.98245614 0.94736842 0.96491228
|
|
0.96491228 0.94736842 1. 0.98245614]
|
|
|
|
mean value: 0.9771929824561403
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 1. 0.98305085 0.98245614 0.94915254 0.96551724
|
|
0.96551724 0.94915254 1. 0.98245614]
|
|
|
|
mean value: 0.9777302695663765
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.96666667 1. 0.93333333 0.93333333
|
|
0.93333333 0.90322581 1. 0.96551724]
|
|
|
|
mean value: 0.963540971449759
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 0.96551724 0.96551724 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.993103448275862
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 1. 0.98214286 0.98275862 0.94704433 0.96551724
|
|
0.96551724 0.94827586 1. 0.98275862]
|
|
|
|
mean value: 0.9774014778325123
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 1. 0.96666667 0.96551724 0.90322581 0.93333333
|
|
0.93333333 0.90322581 1. 0.96551724]
|
|
|
|
mean value: 0.957081942899518
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.84
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...05', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.94383526 0.9610858 1.04445434 0.96728754 1.01617026 0.99173713
|
|
0.96424818 1.04056263 0.96362472 0.97514272]
|
|
|
|
mean value: 0.9868148565292358
|
|
|
|
key: score_time
|
|
value: [0.2423315 0.27159905 0.20848846 0.27036047 0.26195121 0.25400162
|
|
0.24411821 0.23739958 0.24373174 0.18249941]
|
|
|
|
mean value: 0.24164812564849852
|
|
|
|
key: test_mcc
|
|
value: [0.96551724 0.96547546 0.93202124 0.96551724 0.85960591 0.89988258
|
|
0.96551724 0.86189955 1. 0.9321832 ]
|
|
|
|
mean value: 0.9347619656218213
|
|
|
|
key: train_mcc
|
|
value: [0.96907736 0.96907736 0.9652735 0.97289533 0.97307329 0.97672617
|
|
0.97289329 0.98057338 0.97289329 0.98057338]
|
|
|
|
mean value: 0.9733056339025454
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.98245614 0.96491228 0.98245614 0.92982456 0.94736842
|
|
0.98245614 0.92982456 1. 0.96491228]
|
|
|
|
mean value: 0.9666666666666667
|
|
|
|
key: train_accuracy
|
|
value: [0.98440546 0.98440546 0.98245614 0.98635478 0.98635478 0.98830409
|
|
0.98635478 0.99025341 0.98635478 0.99025341]
|
|
|
|
mean value: 0.9865497076023392
|
|
|
|
key: test_fscore
|
|
value: [0.98245614 0.98305085 0.96666667 0.98245614 0.93103448 0.94915254
|
|
0.98245614 0.93103448 1. 0.96551724]
|
|
|
|
mean value: 0.9673824684446358
|
|
|
|
key: train_fscore
|
|
value: [0.98455598 0.98455598 0.98265896 0.98646035 0.98651252 0.98841699
|
|
0.98651252 0.99032882 0.98651252 0.99032882]
|
|
|
|
mean value: 0.9866843477715449
|
|
|
|
key: test_precision
|
|
value: [1. 0.96666667 0.93548387 1. 0.93103448 0.90322581
|
|
0.96551724 0.9 1. 0.93333333]
|
|
|
|
mean value: 0.9535261401557286
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
|
|
key: train_precision
|
|
value: [0.97328244 0.97328244 0.96958175 0.97701149 0.97338403 0.98084291
|
|
0.97709924 0.98461538 0.97709924 0.98461538]
|
|
|
|
mean value: 0.9770814313607344
|
|
|
|
key: test_recall
|
|
value: [0.96551724 1. 1. 0.96551724 0.93103448 1.
|
|
1. 0.96428571 1. 1. ]
|
|
|
|
mean value: 0.9826354679802956
|
|
|
|
key: train_recall
|
|
value: [0.99609375 0.99609375 0.99609375 0.99609375 1. 0.99610895
|
|
0.99610895 0.99610895 0.99610895 0.99610895]
|
|
|
|
mean value: 0.9964919747081712
|
|
|
|
key: test_roc_auc
|
|
value: [0.98275862 0.98214286 0.96428571 0.98275862 0.92980296 0.94827586
|
|
0.98275862 0.93041872 1. 0.96551724]
|
|
|
|
mean value: 0.9668719211822661
|
|
|
|
key: train_roc_auc
|
|
value: [0.9844282 0.9844282 0.98248267 0.98637372 0.98638132 0.98828885
|
|
0.98633572 0.99024197 0.98633572 0.99024197]
|
|
|
|
mean value: 0.9865538363326849
|
|
|
|
key: test_jcc
|
|
value: [0.96551724 0.96666667 0.93548387 0.96551724 0.87096774 0.90322581
|
|
0.96551724 0.87096774 1. 0.93333333]
|
|
|
|
mean value: 0.9377196885428254
|
|
|
|
key: train_jcc
|
|
value: [0.96958175 0.96958175 0.96590909 0.97328244 0.97338403 0.97709924
|
|
0.97338403 0.98084291 0.97338403 0.98084291]
|
|
|
|
mean value: 0.9737292183406805
|
|
|
|
MCC on Blind test: 0.89
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02731586 0.01124978 0.01125789 0.011729 0.01140189 0.0112772
|
|
0.01180577 0.0113914 0.01137853 0.0109787 ]
|
|
|
|
mean value: 0.012978601455688476
|
|
|
|
key: score_time
|
|
value: [0.01053786 0.00988889 0.00988317 0.00965619 0.00942922 0.00986814
|
|
0.00963902 0.00995588 0.01001787 0.00911331]
|
|
|
|
mean value: 0.009798955917358399
|
|
|
|
key: test_mcc
|
|
value: [0.65104858 0.65018988 0.57881773 0.54377353 0.51048128 0.58562417
|
|
0.62473685 0.53222729 0.72064772 0.54377353]
|
|
|
|
mean value: 0.5941320574687631
|
|
|
|
key: train_mcc
|
|
value: [0.66095589 0.64523042 0.66547519 0.6456446 0.66679649 0.62183277
|
|
0.66093013 0.66861729 0.64578738 0.66541423]
|
|
|
|
mean value: 0.6546684397270505
|
|
|
|
key: test_accuracy
|
|
value: [0.8245614 0.8245614 0.78947368 0.77192982 0.75438596 0.78947368
|
|
0.80701754 0.75438596 0.85964912 0.77192982]
|
|
|
|
mean value: 0.7947368421052632
|
|
|
|
key: train_accuracy
|
|
value: [0.83040936 0.82261209 0.83235867 0.82261209 0.83235867 0.81091618
|
|
0.83040936 0.83430799 0.82261209 0.83235867]
|
|
|
|
mean value: 0.8270955165692008
|
|
|
|
key: test_fscore
|
|
value: [0.82142857 0.83333333 0.79310345 0.77966102 0.75 0.8
|
|
0.81967213 0.78125 0.85185185 0.76363636]
|
|
|
|
mean value: 0.7993936716622676
|
|
|
|
key: train_fscore
|
|
value: [0.83172147 0.82261209 0.83587786 0.82533589 0.83834586 0.81165049
|
|
0.83236994 0.83495146 0.82666667 0.8365019 ]
|
|
|
|
mean value: 0.8296033627312248
|
|
|
|
key: test_precision
|
|
value: [0.85185185 0.80645161 0.79310345 0.76666667 0.77777778 0.75
|
|
0.75757576 0.69444444 0.88461538 0.77777778]
|
|
|
|
mean value: 0.7860264721888749
|
|
|
|
key: train_precision
|
|
value: [0.82375479 0.82101167 0.81716418 0.81132075 0.80797101 0.81007752
|
|
0.82442748 0.83333333 0.80970149 0.81784387]
|
|
|
|
mean value: 0.817660610307552
|
|
|
|
key: test_recall
|
|
value: [0.79310345 0.86206897 0.79310345 0.79310345 0.72413793 0.85714286
|
|
0.89285714 0.89285714 0.82142857 0.75 ]
|
|
|
|
mean value: 0.8179802955665024
|
|
|
|
key: train_recall
|
|
value: [0.83984375 0.82421875 0.85546875 0.83984375 0.87109375 0.81322957
|
|
0.84046693 0.83657588 0.84435798 0.85603113]
|
|
|
|
mean value: 0.8421130228599222
|
|
|
|
key: test_roc_auc
|
|
value: [0.82512315 0.82389163 0.78940887 0.77155172 0.75492611 0.79064039
|
|
0.80849754 0.7567734 0.85899015 0.77155172]
|
|
|
|
mean value: 0.7951354679802956
|
|
|
|
key: train_roc_auc
|
|
value: [0.83042771 0.82261521 0.83240364 0.82264561 0.83243403 0.81091166
|
|
0.83038971 0.83430356 0.82256961 0.83231244]
|
|
|
|
mean value: 0.8271013193093385
|
|
|
|
key: test_jcc
|
|
value: [0.6969697 0.71428571 0.65714286 0.63888889 0.6 0.66666667
|
|
0.69444444 0.64102564 0.74193548 0.61764706]
|
|
|
|
mean value: 0.6669006452118407
|
|
|
|
key: train_jcc
|
|
value: [0.71192053 0.6986755 0.71803279 0.70261438 0.72168285 0.68300654
|
|
0.71287129 0.71666667 0.70454545 0.71895425]
|
|
|
|
mean value: 0.7088970233011279
|
|
|
|
MCC on Blind test: 0.47
|
|
|
|
Accuracy on Blind test: 0.72
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.10117292 0.06557846 0.07522464 0.07597613 0.23702955 0.07183504
|
|
0.07094789 0.08759308 0.06865072 0.07027149]
|
|
|
|
mean value: 0.09242799282073974
|
|
|
|
key: score_time
|
|
value: [0.01112437 0.01097035 0.01136065 0.01136637 0.01188517 0.01121473
|
|
0.01161575 0.01136327 0.01081252 0.0108099 ]
|
|
|
|
mean value: 0.011252307891845703
|
|
|
|
key: test_mcc
|
|
value: [0.96551724 1. 1. 0.96551724 0.96551724 0.92980296
|
|
0.96551724 0.8953202 1. 1. ]
|
|
|
|
mean value: 0.9687192118226601
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 1. 1. 0.98245614 0.98245614 0.96491228
|
|
0.98245614 0.94736842 1. 1. ]
|
|
|
|
mean value: 0.9842105263157894
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98245614 1. 1. 0.98245614 0.98245614 0.96428571
|
|
0.98245614 0.94736842 1. 1. ]
|
|
|
|
mean value: 0.9841478696741854
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 1. 1. 1. 0.96428571
|
|
0.96551724 0.93103448 1. 1. ]
|
|
|
|
mean value: 0.9860837438423645
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96551724 1. 1. 0.96551724 0.96551724 0.96428571
|
|
1. 0.96428571 1. 1. ]
|
|
|
|
mean value: 0.982512315270936
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98275862 1. 1. 0.98275862 0.98275862 0.96490148
|
|
0.98275862 0.9476601 1. 1. ]
|
|
|
|
mean value: 0.9843596059113301
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96551724 1. 1. 0.96551724 0.96551724 0.93103448
|
|
0.96551724 0.9 1. 1. ]
|
|
|
|
mean value: 0.9693103448275863
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.84
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.04481077 0.05475616 0.04192662 0.06302857 0.05291224 0.07026553
|
|
0.07563853 0.05653453 0.07711101 0.05302048]
|
|
|
|
mean value: 0.05900044441223144
|
|
|
|
key: score_time
|
|
value: [0.01975703 0.01240492 0.01247978 0.0125854 0.01426888 0.02779078
|
|
0.01338577 0.01489663 0.02277255 0.01260567]
|
|
|
|
mean value: 0.0162947416305542
|
|
|
|
key: test_mcc
|
|
value: [0.9321832 0.82942474 0.78940887 0.89988258 0.8953202 0.85960591
|
|
0.96547546 0.68472906 1. 0.72133224]
|
|
|
|
mean value: 0.857736224960981
|
|
|
|
key: train_mcc
|
|
value: [0.96884072 0.97289533 0.96884072 0.96491975 0.96884072 0.9688108
|
|
0.9611292 0.95717934 0.96883978 0.97277537]
|
|
|
|
mean value: 0.9673071727588148
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.9122807 0.89473684 0.94736842 0.94736842 0.92982456
|
|
0.98245614 0.84210526 1. 0.85964912]
|
|
|
|
mean value: 0.9280701754385965
|
|
|
|
key: train_accuracy
|
|
value: [0.98440546 0.98635478 0.98440546 0.98245614 0.98440546 0.98440546
|
|
0.98050682 0.9785575 0.98440546 0.98635478]
|
|
|
|
mean value: 0.9836257309941521
|
|
|
|
key: test_fscore
|
|
value: [0.96428571 0.90909091 0.89655172 0.94545455 0.94736842 0.92857143
|
|
0.98181818 0.84210526 1. 0.86206897]
|
|
|
|
mean value: 0.9277315153086478
|
|
|
|
key: train_fscore
|
|
value: [0.9844358 0.98646035 0.9844358 0.98245614 0.9844358 0.9844358
|
|
0.98069498 0.9787234 0.98449612 0.98646035]
|
|
|
|
mean value: 0.9837034536318615
|
|
|
|
key: test_precision
|
|
value: [1. 0.96153846 0.89655172 1. 0.96428571 0.92857143
|
|
1. 0.82758621 1. 0.83333333]
|
|
|
|
mean value: 0.9411866868763421
|
|
|
|
key: train_precision
|
|
value: [0.98062016 0.97701149 0.98062016 0.98054475 0.98062016 0.9844358
|
|
0.97318008 0.97307692 0.98069498 0.98076923]
|
|
|
|
mean value: 0.9791573715285722
|
|
|
|
key: test_recall
|
|
value: [0.93103448 0.86206897 0.89655172 0.89655172 0.93103448 0.92857143
|
|
0.96428571 0.85714286 1. 0.89285714]
|
|
|
|
mean value: 0.9160098522167488
|
|
|
|
key: train_recall
|
|
value: [0.98828125 0.99609375 0.98828125 0.984375 0.98828125 0.9844358
|
|
0.98832685 0.9844358 0.98832685 0.9922179 ]
|
|
|
|
mean value: 0.9883055690661479
|
|
|
|
key: test_roc_auc
|
|
value: [0.96551724 0.91317734 0.89470443 0.94827586 0.9476601 0.92980296
|
|
0.98214286 0.84236453 1. 0.86022167]
|
|
|
|
mean value: 0.9283866995073892
|
|
|
|
key: train_roc_auc
|
|
value: [0.984413 0.98637372 0.984413 0.98245987 0.984413 0.9844054
|
|
0.98049155 0.97854602 0.9843978 0.98634332]
|
|
|
|
mean value: 0.983625668774319
|
|
|
|
key: test_jcc
|
|
value: [0.93103448 0.83333333 0.8125 0.89655172 0.9 0.86666667
|
|
0.96428571 0.72727273 1. 0.75757576]
|
|
|
|
mean value: 0.8689220406030751
|
|
|
|
key: train_jcc
|
|
value: [0.96934866 0.97328244 0.96934866 0.96551724 0.96934866 0.96934866
|
|
0.96212121 0.95833333 0.96946565 0.97328244]
|
|
|
|
mean value: 0.9679396957200327
|
|
|
|
MCC on Blind test: 0.72
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01195359 0.01182127 0.01160693 0.01116824 0.01125407 0.01143885
|
|
0.01075578 0.01118302 0.0114491 0.01059365]
|
|
|
|
mean value: 0.011322450637817384
|
|
|
|
key: score_time
|
|
value: [0.01034856 0.01003671 0.00991583 0.00989461 0.00985289 0.00991845
|
|
0.00990295 0.00980353 0.00983691 0.00996399]
|
|
|
|
mean value: 0.009947443008422851
|
|
|
|
key: test_mcc
|
|
value: [0.68434084 0.85960591 0.58076493 0.6166424 0.40394089 0.71921182
|
|
0.68472906 0.50182897 0.7589669 0.65104858]
|
|
|
|
mean value: 0.6461080302281371
|
|
|
|
key: train_mcc
|
|
value: [0.68136015 0.67760831 0.68889059 0.69292741 0.69025624 0.69758394
|
|
0.69286683 0.65887319 0.67126183 0.70503989]
|
|
|
|
mean value: 0.6856668379759446
|
|
|
|
key: test_accuracy
|
|
value: [0.84210526 0.92982456 0.78947368 0.80701754 0.70175439 0.85964912
|
|
0.84210526 0.73684211 0.87719298 0.8245614 ]
|
|
|
|
mean value: 0.8210526315789474
|
|
|
|
key: train_accuracy
|
|
value: [0.84015595 0.83820663 0.84405458 0.8460039 0.84405458 0.84795322
|
|
0.8460039 0.82846004 0.83430799 0.85185185]
|
|
|
|
mean value: 0.8421052631578947
|
|
|
|
key: test_fscore
|
|
value: [0.84745763 0.93103448 0.78571429 0.81967213 0.70175439 0.85714286
|
|
0.84210526 0.76923077 0.88135593 0.82758621]
|
|
|
|
mean value: 0.8263053941335466
|
|
|
|
key: train_fscore
|
|
value: [0.84410646 0.84250474 0.84732824 0.84952381 0.84962406 0.85338346
|
|
0.85009488 0.83520599 0.84171322 0.85660377]
|
|
|
|
mean value: 0.8470088644663055
|
|
|
|
key: test_precision
|
|
value: [0.83333333 0.93103448 0.81481481 0.78125 0.71428571 0.85714286
|
|
0.82758621 0.67567568 0.83870968 0.8 ]
|
|
|
|
mean value: 0.8073832762326922
|
|
|
|
key: train_precision
|
|
value: [0.82222222 0.81918819 0.82835821 0.82899628 0.81884058 0.82545455
|
|
0.82962963 0.80505415 0.80714286 0.83150183]
|
|
|
|
mean value: 0.8216388500650803
|
|
|
|
key: test_recall
|
|
value: [0.86206897 0.93103448 0.75862069 0.86206897 0.68965517 0.85714286
|
|
0.85714286 0.89285714 0.92857143 0.85714286]
|
|
|
|
mean value: 0.8496305418719212
|
|
|
|
key: train_recall
|
|
value: [0.8671875 0.8671875 0.8671875 0.87109375 0.8828125 0.88326848
|
|
0.87159533 0.86770428 0.87937743 0.88326848]
|
|
|
|
mean value: 0.8740682757782101
|
|
|
|
key: test_roc_auc
|
|
value: [0.84174877 0.92980296 0.79002463 0.80603448 0.70197044 0.85960591
|
|
0.84236453 0.73953202 0.87807882 0.82512315]
|
|
|
|
mean value: 0.8214285714285714
|
|
|
|
key: train_roc_auc
|
|
value: [0.84020854 0.83826301 0.84409959 0.84605271 0.84412999 0.84788424
|
|
0.84595392 0.82838339 0.83421997 0.85179049]
|
|
|
|
mean value: 0.8420985834143969
|
|
|
|
key: test_jcc
|
|
value: [0.73529412 0.87096774 0.64705882 0.69444444 0.54054054 0.75
|
|
0.72727273 0.625 0.78787879 0.70588235]
|
|
|
|
mean value: 0.7084339536189631
|
|
|
|
key: train_jcc
|
|
value: [0.73026316 0.72786885 0.73509934 0.7384106 0.73856209 0.7442623
|
|
0.73927393 0.7170418 0.7266881 0.74917492]
|
|
|
|
mean value: 0.7346645079135289
|
|
|
|
MCC on Blind test: 0.84
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02216458 0.02373385 0.02363467 0.02583838 0.0320487 0.0309329
|
|
0.02790904 0.02842832 0.02639103 0.02799988]
|
|
|
|
mean value: 0.026908135414123534
|
|
|
|
key: score_time
|
|
value: [0.01075006 0.01167011 0.01230669 0.01222515 0.01232409 0.01231933
|
|
0.01234722 0.01239252 0.01221395 0.01230645]
|
|
|
|
mean value: 0.012085556983947754
|
|
|
|
key: test_mcc
|
|
value: [0.72242731 0.8953202 0.60819237 0.89988258 0.89988258 0.8953202
|
|
0.72242731 0.9321832 0.96547546 0.79778885]
|
|
|
|
mean value: 0.8338900041513282
|
|
|
|
key: train_mcc
|
|
value: [0.74613112 0.96509685 0.75908545 0.94169697 0.96127477 0.98057338
|
|
0.79621952 0.9844054 0.94645043 0.96147894]
|
|
|
|
mean value: 0.9042412834902326
|
|
|
|
key: test_accuracy
|
|
value: [0.84210526 0.94736842 0.77192982 0.94736842 0.94736842 0.94736842
|
|
0.84210526 0.96491228 0.98245614 0.89473684]
|
|
|
|
mean value: 0.9087719298245613
|
|
|
|
key: train_accuracy
|
|
value: [0.85769981 0.98245614 0.86549708 0.97076023 0.98050682 0.99025341
|
|
0.88888889 0.99220273 0.97270955 0.98050682]
|
|
|
|
mean value: 0.9481481481481481
|
|
|
|
key: test_fscore
|
|
value: [0.81632653 0.94736842 0.81690141 0.94545455 0.94545455 0.94736842
|
|
0.86153846 0.96551724 0.98181818 0.9 ]
|
|
|
|
mean value: 0.9127747756813257
|
|
|
|
key: train_fscore
|
|
value: [0.83371298 0.98259188 0.88123924 0.9704142 0.98023715 0.99032882
|
|
0.89982425 0.9922179 0.97338403 0.98084291]
|
|
|
|
mean value: 0.9484793372602178
|
|
|
|
key: test_precision
|
|
value: [1. 0.96428571 0.69047619 1. 1. 0.93103448
|
|
0.75675676 0.93333333 1. 0.84375 ]
|
|
|
|
mean value: 0.9119636477610615
|
|
|
|
key: train_precision
|
|
value: [1. 0.97318008 0.78769231 0.98007968 0.992 0.98461538
|
|
0.82051282 0.9922179 0.95167286 0.96603774]
|
|
|
|
mean value: 0.9448008767859039
|
|
|
|
key: test_recall
|
|
value: [0.68965517 0.93103448 1. 0.89655172 0.89655172 0.96428571
|
|
1. 1. 0.96428571 0.96428571]
|
|
|
|
mean value: 0.9306650246305419
|
|
|
|
key: train_recall
|
|
value: [0.71484375 0.9921875 1. 0.9609375 0.96875 0.99610895
|
|
0.99610895 0.9922179 0.99610895 0.99610895]
|
|
|
|
mean value: 0.9613372446498054
|
|
|
|
key: test_roc_auc
|
|
value: [0.84482759 0.9476601 0.76785714 0.94827586 0.94827586 0.9476601
|
|
0.84482759 0.96551724 0.98214286 0.89593596]
|
|
|
|
mean value: 0.9092980295566503
|
|
|
|
key: train_roc_auc
|
|
value: [0.85742188 0.98247507 0.86575875 0.97074112 0.98048395 0.99024197
|
|
0.88867947 0.9922027 0.97266385 0.98047635]
|
|
|
|
mean value: 0.9481145124027237
|
|
|
|
key: test_jcc
|
|
value: [0.68965517 0.9 0.69047619 0.89655172 0.89655172 0.9
|
|
0.75675676 0.93333333 0.96428571 0.81818182]
|
|
|
|
mean value: 0.8445792433723468
|
|
|
|
key: train_jcc
|
|
value: [0.71484375 0.96577947 0.78769231 0.94252874 0.96124031 0.98084291
|
|
0.81789137 0.98455598 0.94814815 0.96240602]
|
|
|
|
mean value: 0.9065929004503658
|
|
|
|
MCC on Blind test: 0.88
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0179286 0.02019238 0.01859355 0.02175641 0.02475739 0.01901031
|
|
0.01926279 0.01868582 0.02003837 0.01949239]
|
|
|
|
mean value: 0.019971799850463868
|
|
|
|
key: score_time
|
|
value: [0.01125741 0.01217079 0.01214743 0.01216531 0.01229024 0.01215935
|
|
0.01220012 0.01220322 0.01220989 0.01215482]
|
|
|
|
mean value: 0.01209585666656494
|
|
|
|
key: test_mcc
|
|
value: [0.64058163 0.75808552 0.7589669 0.8615634 0.9321832 0.8953202
|
|
0.79778885 0.82512315 0.92980296 0.85960591]
|
|
|
|
mean value: 0.8259021726142358
|
|
|
|
key: train_mcc
|
|
value: [0.66233052 0.9243158 0.88990958 0.89903527 0.9766081 0.94949987
|
|
0.95367737 0.93407434 0.95367737 0.95712245]
|
|
|
|
mean value: 0.9100250676726173
|
|
|
|
key: test_accuracy
|
|
value: [0.78947368 0.87719298 0.87719298 0.92982456 0.96491228 0.94736842
|
|
0.89473684 0.9122807 0.96491228 0.92982456]
|
|
|
|
mean value: 0.9087719298245613
|
|
|
|
key: train_accuracy
|
|
value: [0.80506823 0.96101365 0.94346979 0.94736842 0.98830409 0.97465887
|
|
0.97660819 0.9668616 0.97660819 0.9785575 ]
|
|
|
|
mean value: 0.9518518518518518
|
|
|
|
key: test_fscore
|
|
value: [0.73913043 0.8852459 0.87272727 0.93333333 0.96428571 0.94736842
|
|
0.9 0.9122807 0.96428571 0.92857143]
|
|
|
|
mean value: 0.9047228922432434
|
|
|
|
key: train_fscore
|
|
value: [0.75728155 0.96226415 0.94093686 0.94972067 0.98828125 0.97445972
|
|
0.97701149 0.96646943 0.97701149 0.9785575 ]
|
|
|
|
mean value: 0.9471994134614119
|
|
|
|
key: test_precision
|
|
value: [1. 0.84375 0.92307692 0.90322581 1. 0.93103448
|
|
0.84375 0.89655172 0.96428571 0.92857143]
|
|
|
|
mean value: 0.9234246079282231
|
|
|
|
key: train_precision
|
|
value: [1. 0.93065693 0.98297872 0.90747331 0.98828125 0.98412698
|
|
0.96226415 0.98 0.96226415 0.98046875]
|
|
|
|
mean value: 0.9678514253333143
|
|
|
|
key: test_recall
|
|
value: [0.5862069 0.93103448 0.82758621 0.96551724 0.93103448 0.96428571
|
|
0.96428571 0.92857143 0.96428571 0.92857143]
|
|
|
|
mean value: 0.8991379310344828
|
|
|
|
key: train_recall
|
|
value: [0.609375 0.99609375 0.90234375 0.99609375 0.98828125 0.96498054
|
|
0.9922179 0.95330739 0.9922179 0.9766537 ]
|
|
|
|
mean value: 0.9371564931906615
|
|
|
|
key: test_roc_auc
|
|
value: [0.79310345 0.87623153 0.87807882 0.92918719 0.96551724 0.9476601
|
|
0.89593596 0.91256158 0.96490148 0.92980296]
|
|
|
|
mean value: 0.9092980295566503
|
|
|
|
key: train_roc_auc
|
|
value: [0.8046875 0.96108189 0.94338977 0.94746322 0.98830405 0.97467777
|
|
0.9765777 0.96688807 0.9765777 0.97856122]
|
|
|
|
mean value: 0.951820890077821
|
|
|
|
key: test_jcc
|
|
value: [0.5862069 0.79411765 0.77419355 0.875 0.93103448 0.9
|
|
0.81818182 0.83870968 0.93103448 0.86666667]
|
|
|
|
mean value: 0.8315145219782726
|
|
|
|
key: train_jcc
|
|
value: [0.609375 0.92727273 0.88846154 0.90425532 0.97683398 0.95019157
|
|
0.95505618 0.9351145 0.95505618 0.95801527]
|
|
|
|
mean value: 0.9059632263141333
|
|
|
|
MCC on Blind test: 0.94
|
|
|
|
Accuracy on Blind test: 0.97
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.18768001 0.18640518 0.19032073 0.1912744 0.18816471 0.18460417
|
|
0.18546081 0.1917305 0.19130564 0.19142318]
|
|
|
|
mean value: 0.188836932182312
|
|
|
|
key: score_time
|
|
value: [0.01549673 0.01700187 0.01743078 0.0170064 0.01694393 0.01691818
|
|
0.01700211 0.01698422 0.01711059 0.01660037]
|
|
|
|
mean value: 0.016849517822265625
|
|
|
|
key: test_mcc
|
|
value: [0.96551724 0.96551724 0.96547546 0.96551724 0.92980296 0.8953202
|
|
0.96551724 0.8953202 1. 0.93202124]
|
|
|
|
mean value: 0.9480009013217541
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.98245614 0.98245614 0.98245614 0.96491228 0.94736842
|
|
0.98245614 0.94736842 1. 0.96491228]
|
|
|
|
mean value: 0.9736842105263157
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98245614 0.98245614 0.98305085 0.98245614 0.96551724 0.94736842
|
|
0.98245614 0.94736842 1. 0.96296296]
|
|
|
|
mean value: 0.9736092455308673
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.96666667 1. 0.96551724 0.93103448
|
|
0.96551724 0.93103448 1. 1. ]
|
|
|
|
mean value: 0.9759770114942529
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96551724 0.96551724 1. 0.96551724 0.96551724 0.96428571
|
|
1. 0.96428571 1. 0.92857143]
|
|
|
|
mean value: 0.9719211822660099
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98275862 0.98275862 0.98214286 0.98275862 0.96490148 0.9476601
|
|
0.98275862 0.9476601 1. 0.96428571]
|
|
|
|
mean value: 0.973768472906404
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96551724 0.96551724 0.96666667 0.96551724 0.93333333 0.9
|
|
0.96551724 0.9 1. 0.92857143]
|
|
|
|
mean value: 0.949064039408867
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.94
|
|
|
|
Accuracy on Blind test: 0.97
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.05879712 0.06589222 0.06732559 0.0758357 0.06974936 0.07213354
|
|
0.07219601 0.08236074 0.0727253 0.08393288]
|
|
|
|
mean value: 0.07209484577178955
|
|
|
|
key: score_time
|
|
value: [0.02332783 0.02187681 0.02655411 0.03225303 0.02155733 0.02742934
|
|
0.03805351 0.03988814 0.02360201 0.04238701]
|
|
|
|
mean value: 0.029692912101745607
|
|
|
|
key: test_mcc
|
|
value: [0.96551724 0.9321832 0.92980296 0.96551724 0.9321832 0.8951918
|
|
1. 0.9321832 1. 1. ]
|
|
|
|
mean value: 0.9552578839217298
|
|
|
|
key: train_mcc
|
|
value: [0.99610889 0.9922027 1. 0.9922027 0.99610889 0.99223298
|
|
0.99610895 0.99610895 0.99223298 0.99610895]
|
|
|
|
mean value: 0.9949415988478287
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.96491228 0.96491228 0.98245614 0.96491228 0.94736842
|
|
1. 0.96491228 1. 1. ]
|
|
|
|
mean value: 0.9771929824561403
|
|
|
|
key: train_accuracy
|
|
value: [0.99805068 0.99610136 1. 0.99610136 0.99805068 0.99610136
|
|
0.99805068 0.99805068 0.99610136 0.99805068]
|
|
|
|
mean value: 0.9974658869395712
|
|
|
|
key: test_fscore
|
|
value: [0.98245614 0.96428571 0.96551724 0.98245614 0.96428571 0.94545455
|
|
1. 0.96551724 1. 1. ]
|
|
|
|
mean value: 0.9769972737486349
|
|
|
|
key: train_fscore
|
|
value: [0.99804305 0.99609375 1. 0.99609375 0.99804305 0.99609375
|
|
0.99805068 0.99805068 0.99609375 0.99805068]
|
|
|
|
mean value: 0.9974613152458772
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.96551724 1. 1. 0.96296296
|
|
1. 0.93333333 1. 1. ]
|
|
|
|
mean value: 0.9861813537675607
|
|
|
|
key: train_precision
|
|
value: [1. 0.99609375 1. 0.99609375 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.99921875
|
|
|
|
key: test_recall
|
|
value: [0.96551724 0.93103448 0.96551724 0.96551724 0.93103448 0.92857143
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9687192118226601
|
|
|
|
key: train_recall
|
|
value: [0.99609375 0.99609375 1. 0.99609375 0.99609375 0.9922179
|
|
0.99610895 0.99610895 0.9922179 0.99610895]
|
|
|
|
mean value: 0.9957137645914397
|
|
|
|
key: test_roc_auc
|
|
value: [0.98275862 0.96551724 0.96490148 0.98275862 0.96551724 0.94704433
|
|
1. 0.96551724 1. 1. ]
|
|
|
|
mean value: 0.9774014778325123
|
|
|
|
key: train_roc_auc
|
|
value: [0.99804688 0.99610135 1. 0.99610135 0.99804688 0.99610895
|
|
0.99805447 0.99805447 0.99610895 0.99805447]
|
|
|
|
mean value: 0.9974677772373541
|
|
|
|
key: test_jcc
|
|
value: [0.96551724 0.93103448 0.93333333 0.96551724 0.93103448 0.89655172
|
|
1. 0.93333333 1. 1. ]
|
|
|
|
mean value: 0.9556321839080459
|
|
|
|
key: train_jcc
|
|
value: [0.99609375 0.9922179 1. 0.9922179 0.99609375 0.9922179
|
|
0.99610895 0.99610895 0.9922179 0.99610895]
|
|
|
|
mean value: 0.9949385943579767
|
|
|
|
MCC on Blind test: 0.84
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.15817857 0.20125985 0.22563338 0.20408607 0.2192142 0.20384526
|
|
0.19948983 0.22740364 0.22744894 0.22111392]
|
|
|
|
mean value: 0.20876736640930177
|
|
|
|
key: score_time
|
|
value: [0.01797915 0.03252816 0.03050971 0.02747774 0.02705717 0.02753019
|
|
0.02618265 0.02748179 0.02658629 0.02650428]
|
|
|
|
mean value: 0.02698371410369873
|
|
|
|
key: test_mcc
|
|
value: [0.89988258 0.89988258 0.68472906 0.72064772 0.61453202 0.79161589
|
|
0.82490815 0.61805122 0.96547546 0.61453202]
|
|
|
|
mean value: 0.7634256700967201
|
|
|
|
key: train_mcc
|
|
value: [0.97663743 0.98443509 0.98831147 0.9766081 0.98443509 0.98051435
|
|
0.9766081 0.98831165 0.98051435 0.97663814]
|
|
|
|
mean value: 0.9813013765941954
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.94736842 0.84210526 0.85964912 0.80701754 0.89473684
|
|
0.9122807 0.80701754 0.98245614 0.80701754]
|
|
|
|
mean value: 0.8807017543859649
|
|
|
|
key: train_accuracy
|
|
value: [0.98830409 0.99220273 0.99415205 0.98830409 0.99220273 0.99025341
|
|
0.98830409 0.99415205 0.99025341 0.98830409]
|
|
|
|
mean value: 0.9906432748538011
|
|
|
|
key: test_fscore
|
|
value: [0.94545455 0.94545455 0.84210526 0.86666667 0.80701754 0.89655172
|
|
0.90909091 0.81355932 0.98181818 0.80701754]
|
|
|
|
mean value: 0.8814736245533871
|
|
|
|
key: train_fscore
|
|
value: [0.98823529 0.99215686 0.99412916 0.98828125 0.99215686 0.99025341
|
|
0.98832685 0.99415205 0.99025341 0.98828125]
|
|
|
|
mean value: 0.9906226395765302
|
|
|
|
key: test_precision
|
|
value: [1. 1. 0.85714286 0.83870968 0.82142857 0.86666667
|
|
0.92592593 0.77419355 1. 0.79310345]
|
|
|
|
mean value: 0.8877170695246335
|
|
|
|
key: train_precision
|
|
value: [0.99212598 0.99606299 0.99607843 0.98828125 0.99606299 0.9921875
|
|
0.98832685 0.99609375 0.9921875 0.99215686]
|
|
|
|
mean value: 0.9929564110870611
|
|
|
|
key: test_recall
|
|
value: [0.89655172 0.89655172 0.82758621 0.89655172 0.79310345 0.92857143
|
|
0.89285714 0.85714286 0.96428571 0.82142857]
|
|
|
|
mean value: 0.8774630541871922
|
|
|
|
key: train_recall
|
|
value: [0.984375 0.98828125 0.9921875 0.98828125 0.98828125 0.98832685
|
|
0.98832685 0.9922179 0.98832685 0.9844358 ]
|
|
|
|
mean value: 0.9883040491245136
|
|
|
|
key: test_roc_auc
|
|
value: [0.94827586 0.94827586 0.84236453 0.85899015 0.80726601 0.8953202
|
|
0.91194581 0.80788177 0.98214286 0.80726601]
|
|
|
|
mean value: 0.8809729064039409
|
|
|
|
key: train_roc_auc
|
|
value: [0.98829645 0.9921951 0.99414822 0.98830405 0.9921951 0.99025717
|
|
0.98830405 0.99415582 0.99025717 0.98831165]
|
|
|
|
mean value: 0.9906424793287938
|
|
|
|
key: test_jcc
|
|
value: [0.89655172 0.89655172 0.72727273 0.76470588 0.67647059 0.8125
|
|
0.83333333 0.68571429 0.96428571 0.67647059]
|
|
|
|
mean value: 0.7933856567705452
|
|
|
|
key: train_jcc
|
|
value: [0.97674419 0.9844358 0.98832685 0.97683398 0.9844358 0.98069498
|
|
0.97692308 0.98837209 0.98069498 0.97683398]
|
|
|
|
mean value: 0.9814295714630527
|
|
|
|
MCC on Blind test: 0.45
|
|
|
|
Accuracy on Blind test: 0.75
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.7267344 0.71088648 0.71327186 0.71637559 0.70127916 0.71965098
|
|
0.7172935 0.713516 0.71736693 0.7136898 ]
|
|
|
|
mean value: 0.7150064706802368
|
|
|
|
key: score_time
|
|
value: [0.00998569 0.00966191 0.00968909 0.00982118 0.00983381 0.01384377
|
|
0.00946999 0.00947881 0.0097661 0.00945377]
|
|
|
|
mean value: 0.010100412368774413
|
|
|
|
key: test_mcc
|
|
value: [0.96551724 0.96551724 1. 0.96551724 0.96551724 0.92980296
|
|
0.96551724 0.8953202 1. 1. ]
|
|
|
|
mean value: 0.9652709359605911
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.98245614 1. 0.98245614 0.98245614 0.96491228
|
|
0.98245614 0.94736842 1. 1. ]
|
|
|
|
mean value: 0.9824561403508771
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98245614 0.98245614 1. 0.98245614 0.98245614 0.96428571
|
|
0.98245614 0.94736842 1. 1. ]
|
|
|
|
mean value: 0.9823934837092732
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 1. 1. 1. 0.96428571
|
|
0.96551724 0.93103448 1. 1. ]
|
|
|
|
mean value: 0.9860837438423645
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96551724 0.96551724 1. 0.96551724 0.96551724 0.96428571
|
|
1. 0.96428571 1. 1. ]
|
|
|
|
mean value: 0.979064039408867
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98275862 0.98275862 1. 0.98275862 0.98275862 0.96490148
|
|
0.98275862 0.9476601 1. 1. ]
|
|
|
|
mean value: 0.9826354679802957
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96551724 0.96551724 1. 0.96551724 0.96551724 0.93103448
|
|
0.96551724 0.9 1. 1. ]
|
|
|
|
mean value: 0.9658620689655173
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.84
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.03299451 0.03226686 0.03665662 0.03213477 0.0319376 0.03235912
|
|
0.03242207 0.03254771 0.04732156 0.06062531]
|
|
|
|
mean value: 0.03712661266326904
|
|
|
|
key: score_time
|
|
value: [0.01289058 0.01308775 0.01283455 0.014925 0.01495719 0.01515841
|
|
0.01587915 0.01514864 0.01835227 0.01392794]
|
|
|
|
mean value: 0.014716148376464844
|
|
|
|
key: test_mcc
|
|
value: [0.76689254 0.65104858 0.6317806 0.61805122 0.6166424 0.61805122
|
|
0.79161589 0.54592083 0.69397486 0.54592083]
|
|
|
|
mean value: 0.6479898980537036
|
|
|
|
key: train_mcc
|
|
value: [0.93577244 0.87861783 0.93865489 0.91759789 0.86838482 0.91188178
|
|
0.90330592 0.91232594 0.87173285 0.92677222]
|
|
|
|
mean value: 0.9065046573444461
|
|
|
|
key: test_accuracy
|
|
value: [0.87719298 0.8245614 0.80701754 0.80701754 0.80701754 0.80701754
|
|
0.89473684 0.75438596 0.84210526 0.75438596]
|
|
|
|
mean value: 0.8175438596491228
|
|
|
|
key: train_accuracy
|
|
value: [0.9668616 0.93567251 0.96881092 0.95711501 0.9337232 0.95516569
|
|
0.94931774 0.95516569 0.93177388 0.96296296]
|
|
|
|
mean value: 0.9516569200779726
|
|
|
|
key: test_fscore
|
|
value: [0.86792453 0.82142857 0.83076923 0.8 0.81967213 0.81355932
|
|
0.89655172 0.78787879 0.82352941 0.78787879]
|
|
|
|
mean value: 0.8249192495341341
|
|
|
|
key: train_fscore
|
|
value: [0.96565657 0.93110647 0.96946565 0.95510204 0.932 0.95652174
|
|
0.94672131 0.9566855 0.92693111 0.96380952]
|
|
|
|
mean value: 0.9503999907089703
|
|
|
|
key: test_precision
|
|
value: [0.95833333 0.85185185 0.75 0.84615385 0.78125 0.77419355
|
|
0.86666667 0.68421053 0.91304348 0.68421053]
|
|
|
|
mean value: 0.8109913777285244
|
|
|
|
key: train_precision
|
|
value: [1. 1. 0.94776119 1. 0.95491803 0.93014706
|
|
1. 0.9270073 1. 0.94402985]
|
|
|
|
mean value: 0.9703863435656607
|
|
|
|
key: test_recall
|
|
value: [0.79310345 0.79310345 0.93103448 0.75862069 0.86206897 0.85714286
|
|
0.92857143 0.92857143 0.75 0.92857143]
|
|
|
|
mean value: 0.8530788177339902
|
|
|
|
key: train_recall
|
|
value: [0.93359375 0.87109375 0.9921875 0.9140625 0.91015625 0.9844358
|
|
0.89883268 0.98832685 0.86381323 0.9844358 ]
|
|
|
|
mean value: 0.9340938107976654
|
|
|
|
key: test_roc_auc
|
|
value: [0.87869458 0.82512315 0.80480296 0.80788177 0.80603448 0.80788177
|
|
0.8953202 0.75738916 0.84051724 0.75738916]
|
|
|
|
mean value: 0.8181034482758621
|
|
|
|
key: train_roc_auc
|
|
value: [0.96679688 0.93554688 0.9688564 0.95703125 0.93367735 0.95510852
|
|
0.94941634 0.95510092 0.93190661 0.96292102]
|
|
|
|
mean value: 0.9516362171692607
|
|
|
|
key: test_jcc
|
|
value: [0.76666667 0.6969697 0.71052632 0.66666667 0.69444444 0.68571429
|
|
0.8125 0.65 0.7 0.65 ]
|
|
|
|
mean value: 0.7033488076251234
|
|
|
|
key: train_jcc
|
|
value: [0.93359375 0.87109375 0.94074074 0.9140625 0.87265918 0.91666667
|
|
0.89883268 0.91696751 0.86381323 0.93014706]
|
|
|
|
mean value: 0.9058577065683058
|
|
|
|
MCC on Blind test: 0.2
|
|
|
|
Accuracy on Blind test: 0.61
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02408338 0.05451012 0.04921699 0.03914237 0.03929567 0.03930926
|
|
0.03907394 0.03979349 0.03902602 0.03887081]
|
|
|
|
mean value: 0.04023220539093018
|
|
|
|
key: score_time
|
|
value: [0.01881361 0.01886582 0.0218544 0.01910973 0.01883149 0.01882291
|
|
0.0191009 0.01888132 0.01903081 0.01887798]
|
|
|
|
mean value: 0.01921889781951904
|
|
|
|
key: test_mcc
|
|
value: [0.96551724 0.8953202 0.8951918 0.96551724 0.92980296 0.9321832
|
|
0.9321832 0.86851042 0.8953202 0.86189955]
|
|
|
|
mean value: 0.9141445996607607
|
|
|
|
key: train_mcc
|
|
value: [0.9610433 0.96907736 0.9610433 0.96127828 0.97289533 0.96892768
|
|
0.96127477 0.96127477 0.96127477 0.97277537]
|
|
|
|
mean value: 0.9650864958366505
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.94736842 0.94736842 0.98245614 0.96491228 0.96491228
|
|
0.96491228 0.92982456 0.94736842 0.92982456]
|
|
|
|
mean value: 0.956140350877193
|
|
|
|
key: train_accuracy
|
|
value: [0.98050682 0.98440546 0.98050682 0.98050682 0.98635478 0.98440546
|
|
0.98050682 0.98050682 0.98050682 0.98635478]
|
|
|
|
mean value: 0.9824561403508771
|
|
|
|
key: test_fscore
|
|
value: [0.98245614 0.94736842 0.94915254 0.98245614 0.96551724 0.96551724
|
|
0.96551724 0.93333333 0.94736842 0.93103448]
|
|
|
|
mean value: 0.9569721205409785
|
|
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_sl.py:148: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
ros_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_sl.py:151: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
ros_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
|
|
key: train_fscore
|
|
value: [0.98054475 0.98455598 0.98054475 0.98069498 0.98646035 0.98455598
|
|
0.98076923 0.98076923 0.98076923 0.98646035]
|
|
|
|
mean value: 0.9826124832603018
|
|
|
|
key: test_precision
|
|
value: [1. 0.96428571 0.93333333 1. 0.96551724 0.93333333
|
|
0.93333333 0.875 0.93103448 0.9 ]
|
|
|
|
mean value: 0.9435837438423645
|
|
|
|
key: train_precision
|
|
value: [0.97674419 0.97328244 0.97674419 0.96946565 0.97701149 0.97701149
|
|
0.96958175 0.96958175 0.96958175 0.98076923]
|
|
|
|
mean value: 0.9739773930119343
|
|
|
|
key: test_recall
|
|
value: [0.96551724 0.93103448 0.96551724 0.96551724 0.96551724 1.
|
|
1. 1. 0.96428571 0.96428571]
|
|
|
|
mean value: 0.972167487684729
|
|
|
|
key: train_recall
|
|
value: [0.984375 0.99609375 0.984375 0.9921875 0.99609375 0.9922179
|
|
0.9922179 0.9922179 0.9922179 0.9922179 ]
|
|
|
|
mean value: 0.9914214494163425
|
|
|
|
key: test_roc_auc
|
|
value: [0.98275862 0.9476601 0.94704433 0.98275862 0.96490148 0.96551724
|
|
0.96551724 0.93103448 0.9476601 0.93041872]
|
|
|
|
mean value: 0.9565270935960591
|
|
|
|
key: train_roc_auc
|
|
value: [0.98051435 0.9844282 0.98051435 0.98052955 0.98637372 0.9843902
|
|
0.98048395 0.98048395 0.98048395 0.98634332]
|
|
|
|
mean value: 0.9824545537451362
|
|
|
|
key: test_jcc
|
|
value: [0.96551724 0.9 0.90322581 0.96551724 0.93333333 0.93333333
|
|
0.93333333 0.875 0.9 0.87096774]
|
|
|
|
mean value: 0.9180228031145717
|
|
|
|
key: train_jcc
|
|
value: [0.96183206 0.96958175 0.96183206 0.96212121 0.97328244 0.96958175
|
|
0.96226415 0.96226415 0.96226415 0.97328244]
|
|
|
|
mean value: 0.9658306170683848
|
|
|
|
MCC on Blind test: 0.82
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.26162863 0.29615593 0.30786967 0.42793369 0.33886909 0.30964303
|
|
0.3056078 0.28437376 0.28493261 0.29579973]
|
|
|
|
mean value: 0.3112813949584961
|
|
|
|
key: score_time
|
|
value: [0.01991153 0.02569556 0.01902437 0.01899886 0.01900196 0.01907182
|
|
0.01891279 0.01901984 0.01902318 0.01902699]
|
|
|
|
mean value: 0.019768691062927245
|
|
|
|
key: test_mcc
|
|
value: [0.96551724 0.8953202 0.8951918 0.89988258 0.92980296 0.9321832
|
|
0.9321832 0.79778885 0.9321832 0.86189955]
|
|
|
|
mean value: 0.9041952772966289
|
|
|
|
key: train_mcc
|
|
value: [0.9610433 0.96907736 0.9610433 0.96884072 0.97289533 0.9688108
|
|
0.96127477 0.9611292 0.9611292 0.97277537]
|
|
|
|
mean value: 0.9658019355184825
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.94736842 0.94736842 0.94736842 0.96491228 0.96491228
|
|
0.96491228 0.89473684 0.96491228 0.92982456]
|
|
|
|
mean value: 0.9508771929824561
|
|
|
|
key: train_accuracy
|
|
value: [0.98050682 0.98440546 0.98050682 0.98440546 0.98635478 0.98440546
|
|
0.98050682 0.98050682 0.98050682 0.98635478]
|
|
|
|
mean value: 0.9828460038986355
|
|
|
|
key: test_fscore
|
|
value: [0.98245614 0.94736842 0.94915254 0.94545455 0.96551724 0.96551724
|
|
0.96551724 0.9 0.96551724 0.93103448]
|
|
|
|
mean value: 0.9517535097506797
|
|
|
|
key: train_fscore
|
|
value: [0.98054475 0.98455598 0.98054475 0.9844358 0.98646035 0.9844358
|
|
0.98076923 0.98069498 0.98069498 0.98646035]
|
|
|
|
mean value: 0.9829596962534292
|
|
|
|
key: test_precision
|
|
value: [1. 0.96428571 0.93333333 1. 0.96551724 0.93333333
|
|
0.93333333 0.84375 0.93333333 0.9 ]
|
|
|
|
mean value: 0.9406886288998358
|
|
|
|
key: train_precision
|
|
value: [0.97674419 0.97328244 0.97674419 0.98062016 0.97701149 0.9844358
|
|
0.96958175 0.97318008 0.97318008 0.98076923]
|
|
|
|
mean value: 0.9765549394873483
|
|
|
|
key: test_recall
|
|
value: [0.96551724 0.93103448 0.96551724 0.89655172 0.96551724 1.
|
|
1. 0.96428571 1. 0.96428571]
|
|
|
|
mean value: 0.9652709359605911
|
|
|
|
key: train_recall
|
|
value: [0.984375 0.99609375 0.984375 0.98828125 0.99609375 0.9844358
|
|
0.9922179 0.98832685 0.98832685 0.9922179 ]
|
|
|
|
mean value: 0.9894744041828794
|
|
|
|
key: test_roc_auc
|
|
value: [0.98275862 0.9476601 0.94704433 0.94827586 0.96490148 0.96551724
|
|
0.96551724 0.89593596 0.96551724 0.93041872]
|
|
|
|
mean value: 0.9513546798029557
|
|
|
|
key: train_roc_auc
|
|
value: [0.98051435 0.9844282 0.98051435 0.984413 0.98637372 0.9844054
|
|
0.98048395 0.98049155 0.98049155 0.98634332]
|
|
|
|
mean value: 0.9828459387159534
|
|
|
|
key: test_jcc
|
|
value: [0.96551724 0.9 0.90322581 0.89655172 0.93333333 0.93333333
|
|
0.93333333 0.81818182 0.93333333 0.87096774]
|
|
|
|
mean value: 0.908777766541949
|
|
|
|
key: train_jcc
|
|
value: [0.96183206 0.96958175 0.96183206 0.96934866 0.97328244 0.96934866
|
|
0.96226415 0.96212121 0.96212121 0.97328244]
|
|
|
|
mean value: 0.9665014649876501
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0320878 0.0302875 0.03224111 0.03226137 0.03130984 0.02569366
|
|
0.0313189 0.03118372 0.0292275 0.03123951]
|
|
|
|
mean value: 0.03068509101867676
|
|
|
|
key: score_time
|
|
value: [0.01438856 0.01213551 0.01232028 0.01176858 0.0118649 0.01178813
|
|
0.01177907 0.01186204 0.01174831 0.0119381 ]
|
|
|
|
mean value: 0.012159347534179688
|
|
|
|
key: test_mcc
|
|
value: [0.87447463 0.6681531 0.53145678 0.87082337 0.65714286 0.65714286
|
|
0.7320658 0.37799476 0.79426746 0.79426746]
|
|
|
|
mean value: 0.695778908240471
|
|
|
|
key: train_mcc
|
|
value: [0.87229005 0.89479986 0.87133216 0.91763327 0.85847606 0.86558426
|
|
0.87926178 0.86403922 0.87198258 0.87926178]
|
|
|
|
mean value: 0.8774661011532328
|
|
|
|
key: test_accuracy
|
|
value: [0.93333333 0.83333333 0.75862069 0.93103448 0.82758621 0.82758621
|
|
0.86206897 0.68965517 0.89655172 0.89655172]
|
|
|
|
mean value: 0.845632183908046
|
|
|
|
key: train_accuracy
|
|
value: [0.9351145 0.94656489 0.93536122 0.9581749 0.92775665 0.93155894
|
|
0.9391635 0.93155894 0.93536122 0.9391635 ]
|
|
|
|
mean value: 0.9379778248628566
|
|
|
|
key: test_fscore
|
|
value: [0.9375 0.83870968 0.77419355 0.93333333 0.82758621 0.82758621
|
|
0.85714286 0.70967742 0.90322581 0.90322581]
|
|
|
|
mean value: 0.851218086233381
|
|
|
|
key: train_fscore
|
|
value: [0.93726937 0.94814815 0.93680297 0.95940959 0.93090909 0.93430657
|
|
0.94029851 0.93283582 0.93680297 0.94029851]
|
|
|
|
mean value: 0.9397081558966259
|
|
|
|
key: test_precision
|
|
value: [0.88235294 0.8125 0.70588235 0.875 0.8 0.8
|
|
0.92307692 0.6875 0.875 0.875 ]
|
|
|
|
mean value: 0.823631221719457
|
|
|
|
key: train_precision
|
|
value: [0.90714286 0.92086331 0.91970803 0.9352518 0.8951049 0.90140845
|
|
0.91970803 0.91240876 0.91304348 0.91970803]
|
|
|
|
mean value: 0.9144347635841845
|
|
|
|
key: test_recall
|
|
value: [1. 0.86666667 0.85714286 1. 0.85714286 0.85714286
|
|
0.8 0.73333333 0.93333333 0.93333333]
|
|
|
|
mean value: 0.8838095238095238
|
|
|
|
key: train_recall
|
|
value: [0.96946565 0.97709924 0.95454545 0.98484848 0.96969697 0.96969697
|
|
0.96183206 0.95419847 0.96183206 0.96183206]
|
|
|
|
mean value: 0.9665047420772612
|
|
|
|
key: test_roc_auc
|
|
value: [0.93333333 0.83333333 0.76190476 0.93333333 0.82857143 0.82857143
|
|
0.86428571 0.68809524 0.8952381 0.8952381 ]
|
|
|
|
mean value: 0.8461904761904763
|
|
|
|
key: train_roc_auc
|
|
value: [0.9351145 0.94656489 0.93528799 0.9580731 0.92759658 0.93141337
|
|
0.93924936 0.93164469 0.93546149 0.93924936]
|
|
|
|
mean value: 0.9379655331945409
|
|
|
|
key: test_jcc
|
|
value: [0.88235294 0.72222222 0.63157895 0.875 0.70588235 0.70588235
|
|
0.75 0.55 0.82352941 0.82352941]
|
|
|
|
mean value: 0.7469977640178879
|
|
|
|
key: train_jcc
|
|
value: [0.88194444 0.90140845 0.88111888 0.92198582 0.8707483 0.87671233
|
|
0.88732394 0.87412587 0.88111888 0.88732394]
|
|
|
|
mean value: 0.8863810862525938
|
|
|
|
MCC on Blind test: 0.89
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.72078919 0.87093186 0.76989698 0.7747066 0.90087986 0.77547503
|
|
0.76327038 0.875139 0.7889235 0.83031797]
|
|
|
|
mean value: 0.8070330381393432
|
|
|
|
key: score_time
|
|
value: [0.01421833 0.01484942 0.01467299 0.01446486 0.01456046 0.01565599
|
|
0.01448941 0.01475024 0.01570296 0.01492667]
|
|
|
|
mean value: 0.014829134941101075
|
|
|
|
key: test_mcc
|
|
value: [0.93541435 0.80178373 0.72380952 0.86965655 0.93333333 0.72954522
|
|
0.7320658 0.58571429 0.79426746 0.93302503]
|
|
|
|
mean value: 0.8038615283681674
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 1. 0.99242424 1. ]
|
|
|
|
mean value: 0.9992424242424243
|
|
|
|
key: test_accuracy
|
|
value: [0.96666667 0.9 0.86206897 0.93103448 0.96551724 0.86206897
|
|
0.86206897 0.79310345 0.89655172 0.96551724]
|
|
|
|
mean value: 0.9004597701149425
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 1. 0.99619772 1. ]
|
|
|
|
mean value: 0.9996197718631179
|
|
|
|
key: test_fscore
|
|
value: [0.96774194 0.90322581 0.85714286 0.92307692 0.96551724 0.84615385
|
|
0.85714286 0.8 0.90322581 0.96774194]
|
|
|
|
mean value: 0.8990969208766761
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 1. 0.99619772 1. ]
|
|
|
|
mean value: 0.9996197718631179
|
|
|
|
key: test_precision
|
|
value: [0.9375 0.875 0.85714286 1. 0.93333333 0.91666667
|
|
0.92307692 0.8 0.875 0.9375 ]
|
|
|
|
mean value: 0.905521978021978
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 1. 0.99242424 1. ]
|
|
|
|
mean value: 0.9992424242424243
|
|
|
|
key: test_recall
|
|
value: [1. 0.93333333 0.85714286 0.85714286 1. 0.78571429
|
|
0.8 0.8 0.93333333 1. ]
|
|
|
|
mean value: 0.8966666666666667
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96666667 0.9 0.86190476 0.92857143 0.96666667 0.85952381
|
|
0.86428571 0.79285714 0.8952381 0.96428571]
|
|
|
|
mean value: 0.9
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 1. 0.99621212 1. ]
|
|
|
|
mean value: 0.9996212121212121
|
|
|
|
key: test_jcc
|
|
value: [0.9375 0.82352941 0.75 0.85714286 0.93333333 0.73333333
|
|
0.75 0.66666667 0.82352941 0.9375 ]
|
|
|
|
mean value: 0.8212535014005602
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 1. 0.99242424 1. ]
|
|
|
|
mean value: 0.9992424242424243
|
|
|
|
MCC on Blind test: 0.67
|
|
|
|
Accuracy on Blind test: 0.83
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01313186 0.00966859 0.00940609 0.00909376 0.00917745 0.00930548
|
|
0.00901055 0.00931048 0.00922346 0.00923944]
|
|
|
|
mean value: 0.009656715393066406
|
|
|
|
key: score_time
|
|
value: [0.01565719 0.00918484 0.00901389 0.0087471 0.00874877 0.00872231
|
|
0.00879788 0.00867724 0.00927162 0.00879645]
|
|
|
|
mean value: 0.009561729431152344
|
|
|
|
key: test_mcc
|
|
value: [0.47087096 0.13608276 0.59628479 0.67156812 0.7952381 0.51675233
|
|
0.7320658 0.44761905 0.75093926 0.60575767]
|
|
|
|
mean value: 0.5723178827436838
|
|
|
|
key: train_mcc
|
|
value: [0.64962264 0.52565029 0.72322987 0.68527926 0.72970342 0.62009723
|
|
0.70257774 0.72484208 0.687232 0.64958558]
|
|
|
|
mean value: 0.6697820100068201
|
|
|
|
key: test_accuracy
|
|
value: [0.73333333 0.56666667 0.75862069 0.82758621 0.89655172 0.75862069
|
|
0.86206897 0.72413793 0.86206897 0.79310345]
|
|
|
|
mean value: 0.7782758620689655
|
|
|
|
key: train_accuracy
|
|
value: [0.82061069 0.73664122 0.85931559 0.8365019 0.85931559 0.80988593
|
|
0.84790875 0.85931559 0.84030418 0.82129278]
|
|
|
|
mean value: 0.8291092212579456
|
|
|
|
key: test_fscore
|
|
value: [0.75 0.60606061 0.8 0.83870968 0.89655172 0.74074074
|
|
0.85714286 0.73333333 0.88235294 0.82352941]
|
|
|
|
mean value: 0.7928421291776
|
|
|
|
key: train_fscore
|
|
value: [0.83392226 0.78369906 0.86738351 0.85121107 0.87108014 0.80769231
|
|
0.85714286 0.86738351 0.85 0.83274021]
|
|
|
|
mean value: 0.8422254936530311
|
|
|
|
key: test_precision
|
|
value: [0.70588235 0.55555556 0.66666667 0.76470588 0.86666667 0.76923077
|
|
0.92307692 0.73333333 0.78947368 0.73684211]
|
|
|
|
mean value: 0.7511433939297716
|
|
|
|
key: train_precision
|
|
value: [0.77631579 0.66489362 0.82312925 0.78343949 0.80645161 0.8203125
|
|
0.80536913 0.81756757 0.79865772 0.78 ]
|
|
|
|
mean value: 0.7876136674749878
|
|
|
|
key: test_recall
|
|
value: [0.8 0.66666667 1. 0.92857143 0.92857143 0.71428571
|
|
0.8 0.73333333 1. 0.93333333]
|
|
|
|
mean value: 0.8504761904761905
|
|
|
|
key: train_recall
|
|
value: [0.90076336 0.95419847 0.91666667 0.93181818 0.9469697 0.79545455
|
|
0.91603053 0.92366412 0.90839695 0.89312977]
|
|
|
|
mean value: 0.9087092297015961
|
|
|
|
key: test_roc_auc
|
|
value: [0.73333333 0.56666667 0.76666667 0.83095238 0.89761905 0.75714286
|
|
0.86428571 0.72380952 0.85714286 0.78809524]
|
|
|
|
mean value: 0.7785714285714286
|
|
|
|
key: train_roc_auc
|
|
value: [0.82061069 0.73664122 0.85909669 0.8361381 0.85898103 0.80994101
|
|
0.84816678 0.85955933 0.84056211 0.82156489]
|
|
|
|
mean value: 0.8291261855193153
|
|
|
|
key: test_jcc
|
|
value: [0.6 0.43478261 0.66666667 0.72222222 0.8125 0.58823529
|
|
0.75 0.57894737 0.78947368 0.7 ]
|
|
|
|
mean value: 0.6642827844333767
|
|
|
|
key: train_jcc
|
|
value: [0.71515152 0.6443299 0.76582278 0.74096386 0.77160494 0.67741935
|
|
0.75 0.76582278 0.73913043 0.71341463]
|
|
|
|
mean value: 0.7283660199139936
|
|
|
|
MCC on Blind test: 0.72
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00948977 0.00949192 0.0095396 0.00949955 0.00931692 0.00925446
|
|
0.00933433 0.0091722 0.00922704 0.00945187]
|
|
|
|
mean value: 0.009377765655517577
|
|
|
|
key: score_time
|
|
value: [0.00895619 0.00884056 0.00899911 0.00888467 0.00869846 0.00875378
|
|
0.00876188 0.00872612 0.00863218 0.00872183]
|
|
|
|
mean value: 0.008797478675842286
|
|
|
|
key: test_mcc
|
|
value: [0.6 0.47087096 0.6555099 0.51675233 0.51904762 0.51904762
|
|
0.47079191 0.37799476 0.58943389 0.59330823]
|
|
|
|
mean value: 0.531275719579118
|
|
|
|
key: train_mcc
|
|
value: [0.6261184 0.69498055 0.62787882 0.68095573 0.65781864 0.65781864
|
|
0.59892463 0.68823734 0.63502806 0.65024005]
|
|
|
|
mean value: 0.6518000864251088
|
|
|
|
key: test_accuracy
|
|
value: [0.8 0.73333333 0.82758621 0.75862069 0.75862069 0.75862069
|
|
0.72413793 0.68965517 0.79310345 0.79310345]
|
|
|
|
mean value: 0.7636781609195402
|
|
|
|
key: train_accuracy
|
|
value: [0.8129771 0.84732824 0.81368821 0.84030418 0.82889734 0.82889734
|
|
0.79847909 0.84410646 0.81749049 0.82509506]
|
|
|
|
mean value: 0.8257263518416393
|
|
|
|
key: test_fscore
|
|
value: [0.8 0.71428571 0.81481481 0.74074074 0.75862069 0.75862069
|
|
0.69230769 0.70967742 0.8125 0.78571429]
|
|
|
|
mean value: 0.7587282046528431
|
|
|
|
key: train_fscore
|
|
value: [0.81081081 0.84962406 0.81081081 0.83846154 0.82889734 0.82889734
|
|
0.78884462 0.84410646 0.81538462 0.82307692]
|
|
|
|
mean value: 0.823891452089343
|
|
|
|
key: test_precision
|
|
value: [0.8 0.76923077 0.84615385 0.76923077 0.73333333 0.73333333
|
|
0.81818182 0.6875 0.76470588 0.84615385]
|
|
|
|
mean value: 0.7767823597970657
|
|
|
|
key: train_precision
|
|
value: [0.8203125 0.83703704 0.82677165 0.8515625 0.83206107 0.83206107
|
|
0.825 0.84090909 0.82170543 0.82945736]
|
|
|
|
mean value: 0.831687770959169
|
|
|
|
key: test_recall
|
|
value: [0.8 0.66666667 0.78571429 0.71428571 0.78571429 0.78571429
|
|
0.6 0.73333333 0.86666667 0.73333333]
|
|
|
|
mean value: 0.7471428571428571
|
|
|
|
key: train_recall
|
|
value: [0.80152672 0.86259542 0.79545455 0.82575758 0.82575758 0.82575758
|
|
0.75572519 0.84732824 0.80916031 0.81679389]
|
|
|
|
mean value: 0.816585704371964
|
|
|
|
key: test_roc_auc
|
|
value: [0.8 0.73333333 0.82619048 0.75714286 0.75952381 0.75952381
|
|
0.72857143 0.68809524 0.79047619 0.7952381 ]
|
|
|
|
mean value: 0.7638095238095238
|
|
|
|
key: train_roc_auc
|
|
value: [0.8129771 0.84732824 0.81375781 0.8403597 0.82890932 0.82890932
|
|
0.79831714 0.84411867 0.81745894 0.82506361]
|
|
|
|
mean value: 0.8257199861207495
|
|
|
|
key: test_jcc
|
|
value: [0.66666667 0.55555556 0.6875 0.58823529 0.61111111 0.61111111
|
|
0.52941176 0.55 0.68421053 0.64705882]
|
|
|
|
mean value: 0.6130860853113176
|
|
|
|
key: train_jcc
|
|
value: [0.68181818 0.73856209 0.68181818 0.7218543 0.70779221 0.70779221
|
|
0.65131579 0.73026316 0.68831169 0.69934641]
|
|
|
|
mean value: 0.7008874216268676
|
|
|
|
MCC on Blind test: 0.43
|
|
|
|
Accuracy on Blind test: 0.72
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.00884247 0.01010442 0.01012564 0.00979137 0.0101862 0.01019549
|
|
0.01009035 0.01044989 0.01014018 0.01014948]
|
|
|
|
mean value: 0.010007548332214355
|
|
|
|
key: score_time
|
|
value: [0.0144124 0.01187205 0.01217246 0.01165318 0.01194263 0.01190805
|
|
0.01191378 0.01192284 0.01200461 0.01685095]
|
|
|
|
mean value: 0.012665295600891113
|
|
|
|
key: test_mcc
|
|
value: [0.40824829 0.06726728 0.51904762 0.37799476 0.31579309 0.44761905
|
|
0.41051346 0.44932255 0.44932255 0.45455066]
|
|
|
|
mean value: 0.38996793082617276
|
|
|
|
key: train_mcc
|
|
value: [0.6261184 0.67938931 0.65024005 0.65831512 0.61981608 0.62737841
|
|
0.65779886 0.71102244 0.62739995 0.62737841]
|
|
|
|
mean value: 0.6484857019359617
|
|
|
|
key: test_accuracy
|
|
value: [0.7 0.53333333 0.75862069 0.68965517 0.65517241 0.72413793
|
|
0.68965517 0.72413793 0.72413793 0.72413793]
|
|
|
|
mean value: 0.6922988505747126
|
|
|
|
key: train_accuracy
|
|
value: [0.8129771 0.83969466 0.82509506 0.82889734 0.80988593 0.81368821
|
|
0.82889734 0.85551331 0.81368821 0.81368821]
|
|
|
|
mean value: 0.8242025367892492
|
|
|
|
key: test_fscore
|
|
value: [0.72727273 0.5625 0.75862069 0.66666667 0.66666667 0.71428571
|
|
0.64 0.75 0.75 0.71428571]
|
|
|
|
mean value: 0.6950298178832661
|
|
|
|
key: train_fscore
|
|
value: [0.81509434 0.83969466 0.82706767 0.82625483 0.81203008 0.81509434
|
|
0.82758621 0.85496183 0.81368821 0.81226054]
|
|
|
|
mean value: 0.8243732694633406
|
|
|
|
key: test_precision
|
|
value: [0.66666667 0.52941176 0.73333333 0.69230769 0.625 0.71428571
|
|
0.8 0.70588235 0.70588235 0.76923077]
|
|
|
|
mean value: 0.6942000646412412
|
|
|
|
key: train_precision
|
|
value: [0.80597015 0.83969466 0.82089552 0.84251969 0.80597015 0.81203008
|
|
0.83076923 0.85496183 0.81060606 0.81538462]
|
|
|
|
mean value: 0.8238801976432387
|
|
|
|
key: test_recall
|
|
value: [0.8 0.6 0.78571429 0.64285714 0.71428571 0.71428571
|
|
0.53333333 0.8 0.8 0.66666667]
|
|
|
|
mean value: 0.7057142857142857
|
|
|
|
key: train_recall
|
|
value: [0.82442748 0.83969466 0.83333333 0.81060606 0.81818182 0.81818182
|
|
0.82442748 0.85496183 0.81679389 0.80916031]
|
|
|
|
mean value: 0.8249768679157993
|
|
|
|
key: test_roc_auc
|
|
value: [0.7 0.53333333 0.75952381 0.68809524 0.65714286 0.72380952
|
|
0.6952381 0.72142857 0.72142857 0.72619048]
|
|
|
|
mean value: 0.6926190476190477
|
|
|
|
key: train_roc_auc
|
|
value: [0.8129771 0.83969466 0.82506361 0.82896715 0.80985427 0.81367106
|
|
0.82888041 0.85551122 0.81369998 0.81367106]
|
|
|
|
mean value: 0.8241990515845478
|
|
|
|
key: test_jcc
|
|
value: [0.57142857 0.39130435 0.61111111 0.5 0.5 0.55555556
|
|
0.47058824 0.6 0.6 0.55555556]
|
|
|
|
mean value: 0.5355543376770998
|
|
|
|
key: train_jcc
|
|
value: [0.68789809 0.72368421 0.70512821 0.70394737 0.6835443 0.68789809
|
|
0.70588235 0.74666667 0.68589744 0.68387097]
|
|
|
|
mean value: 0.7014417689464205
|
|
|
|
MCC on Blind test: 0.32
|
|
|
|
Accuracy on Blind test: 0.67
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01403451 0.01422858 0.01349854 0.01356149 0.01517701 0.01492858
|
|
0.01510477 0.0139873 0.01390624 0.01416183]
|
|
|
|
mean value: 0.014258885383605957
|
|
|
|
key: score_time
|
|
value: [0.0101316 0.01045656 0.01018333 0.01049423 0.01039624 0.01010704
|
|
0.00998545 0.01039577 0.01092601 0.0106802 ]
|
|
|
|
mean value: 0.010375642776489257
|
|
|
|
key: test_mcc
|
|
value: [0.76088591 0.6 0.59330823 0.75522869 0.7320658 0.49891416
|
|
0.6130103 0.52473682 0.6669552 0.72954522]
|
|
|
|
mean value: 0.6474650323376637
|
|
|
|
key: train_mcc
|
|
value: [0.78526617 0.82149863 0.77061242 0.77908214 0.78449388 0.77753667
|
|
0.80694112 0.77773853 0.77636354 0.79584333]
|
|
|
|
mean value: 0.7875376442859248
|
|
|
|
key: test_accuracy
|
|
value: [0.86666667 0.8 0.79310345 0.86206897 0.86206897 0.72413793
|
|
0.79310345 0.75862069 0.82758621 0.86206897]
|
|
|
|
mean value: 0.8149425287356322
|
|
|
|
key: train_accuracy
|
|
value: [0.88931298 0.90839695 0.88212928 0.88593156 0.88973384 0.88593156
|
|
0.90114068 0.88593156 0.88212928 0.8973384 ]
|
|
|
|
mean value: 0.890797608335994
|
|
|
|
key: test_fscore
|
|
value: [0.88235294 0.8 0.8 0.875 0.86666667 0.76470588
|
|
0.76923077 0.78787879 0.84848485 0.875 ]
|
|
|
|
mean value: 0.8269319895790483
|
|
|
|
key: train_fscore
|
|
value: [0.89605735 0.91304348 0.88967972 0.89361702 0.89605735 0.89285714
|
|
0.9057971 0.89208633 0.89122807 0.89962825]
|
|
|
|
mean value: 0.8970051808385671
|
|
|
|
key: test_precision
|
|
value: [0.78947368 0.8 0.75 0.77777778 0.8125 0.65
|
|
0.90909091 0.72222222 0.77777778 0.82352941]
|
|
|
|
mean value: 0.7812371782843919
|
|
|
|
key: train_precision
|
|
value: [0.84459459 0.86896552 0.83892617 0.84 0.85034014 0.84459459
|
|
0.86206897 0.84353741 0.82467532 0.87681159]
|
|
|
|
mean value: 0.8494514316343086
|
|
|
|
key: test_recall
|
|
value: [1. 0.8 0.85714286 1. 0.92857143 0.92857143
|
|
0.66666667 0.86666667 0.93333333 0.93333333]
|
|
|
|
mean value: 0.8914285714285715
|
|
|
|
key: train_recall
|
|
value: [0.95419847 0.96183206 0.9469697 0.95454545 0.9469697 0.9469697
|
|
0.95419847 0.94656489 0.96946565 0.92366412]
|
|
|
|
mean value: 0.9505378209576683
|
|
|
|
key: test_roc_auc
|
|
value: [0.86666667 0.8 0.7952381 0.86666667 0.86428571 0.73095238
|
|
0.79761905 0.7547619 0.82380952 0.85952381]
|
|
|
|
mean value: 0.815952380952381
|
|
|
|
key: train_roc_auc
|
|
value: [0.88931298 0.90839695 0.8818818 0.88566967 0.88951538 0.88569859
|
|
0.90134166 0.88616123 0.8824601 0.89743812]
|
|
|
|
mean value: 0.8907876474670368
|
|
|
|
key: test_jcc
|
|
value: [0.78947368 0.66666667 0.66666667 0.77777778 0.76470588 0.61904762
|
|
0.625 0.65 0.73684211 0.77777778]
|
|
|
|
mean value: 0.7073958179763133
|
|
|
|
key: train_jcc
|
|
value: [0.81168831 0.84 0.80128205 0.80769231 0.81168831 0.80645161
|
|
0.82781457 0.80519481 0.80379747 0.81756757]
|
|
|
|
mean value: 0.8133177005907435
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
List of models:
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MLPClassifier(max_iter=500, random_state=42))])

key: fit_time
value: [1.13563442 1.24090719 1.12091327 1.32892895 1.12880158 1.26187825
1.13260579 1.25287151 1.14153361 1.26136589]

mean value: 1.200544047355652

key: score_time
value: [0.01544094 0.01418996 0.01236367 0.01428485 0.0153091 0.01543164
0.01432538 0.01541853 0.01238394 0.01642776]

mean value: 0.014557576179504395

key: test_mcc
value: [0.93541435 0.6 0.65714286 0.65714286 0.7320658 0.58571429
0.67156812 0.51675233 0.6669552 0.86965655]

mean value: 0.6892412343569269

key: train_mcc
value: [0.99239533 1. 1. 1. 1. 1.
1. 0.99242424 1. 0.99242424]

mean value: 0.9977243811746231

key: test_accuracy
value: [0.96666667 0.8 0.82758621 0.82758621 0.86206897 0.79310345
0.82758621 0.75862069 0.82758621 0.93103448]

mean value: 0.842183908045977

key: train_accuracy
value: [0.99618321 1. 1. 1. 1. 1.
1. 0.99619772 1. 0.99619772]

mean value: 0.9988578643369228

key: test_fscore
value: [0.96774194 0.8 0.82758621 0.82758621 0.86666667 0.78571429
0.81481481 0.77419355 0.84848485 0.9375 ]

mean value: 0.8450288513344687

key: train_fscore
value: [0.99619772 1. 1. 1. 1. 1.
1. 0.99619772 1. 0.99619772]

mean value: 0.9988593155893536

key: test_precision
value: [0.9375 0.8 0.8 0.8 0.8125 0.78571429
0.91666667 0.75 0.77777778 0.88235294]

mean value: 0.8262511671335201

key: train_precision
value: [0.99242424 1. 1. 1. 1. 1.
1. 0.99242424 1. 0.99242424]

mean value: 0.9977272727272727

key: test_recall
value: [1. 0.8 0.85714286 0.85714286 0.92857143 0.78571429
0.73333333 0.8 0.93333333 1. ]

mean value: 0.8695238095238095

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.96666667 0.8 0.82857143 0.82857143 0.86428571 0.79285714
0.83095238 0.75714286 0.82380952 0.92857143]

mean value: 0.8421428571428572

key: train_roc_auc
value: [0.99618321 1. 1. 1. 1. 1.
1. 0.99621212 1. 0.99621212]

mean value: 0.9988607448531113

key: test_jcc
value: [0.9375 0.66666667 0.70588235 0.70588235 0.76470588 0.64705882
0.6875 0.63157895 0.73684211 0.88235294]

mean value: 0.7365970072239422

key: train_jcc
value: [0.99242424 1. 1. 1. 1. 1.
1. 0.99242424 1. 0.99242424]

mean value: 0.9977272727272727

MCC on Blind test: 0.72

Accuracy on Blind test: 0.86

Model_name: Decision Tree
Model func: DecisionTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', DecisionTreeClassifier(random_state=42))])

key: fit_time
value: [0.01959491 0.01731563 0.01389885 0.01488256 0.01310444 0.01357579
0.01422048 0.01348567 0.01311564 0.01370406]

mean value: 0.014689803123474121

key: score_time
value: [0.01209402 0.00925374 0.00893354 0.00895524 0.00898933 0.00864458
0.00868011 0.00873613 0.00873876 0.00870037]

mean value: 0.009172582626342773

key: test_mcc
value: [0.93541435 0.76088591 0.7952381 1. 0.93333333 0.93302503
0.93302503 0.86190476 0.86190476 0.93333333]

mean value: 0.89480646108909

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.96666667 0.86666667 0.89655172 1. 0.96551724 0.96551724
0.96551724 0.93103448 0.93103448 0.96551724]

mean value: 0.9454022988505747

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.96551724 0.88235294 0.89655172 1. 0.96551724 0.96296296
0.96774194 0.93333333 0.93333333 0.96551724]

mean value: 0.9472827954565833

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [1. 0.78947368 0.86666667 1. 0.93333333 1.
0.9375 0.93333333 0.93333333 1. ]

mean value: 0.9393640350877193

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.93333333 1. 0.92857143 1. 1. 0.92857143
1. 0.93333333 0.93333333 0.93333333]

mean value: 0.959047619047619

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.96666667 0.86666667 0.89761905 1. 0.96666667 0.96428571
0.96428571 0.93095238 0.93095238 0.96666667]

mean value: 0.9454761904761905

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.93333333 0.78947368 0.8125 1. 0.93333333 0.92857143
0.9375 0.875 0.875 0.93333333]

mean value: 0.9018045112781955

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.84

Accuracy on Blind test: 0.92

Model_name: Extra Trees
Model func: ExtraTreesClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', ExtraTreesClassifier(random_state=42))])

key: fit_time
value: [0.10119128 0.10048294 0.09955859 0.09904623 0.0998683 0.10016418
0.09969735 0.09930968 0.09993553 0.10012221]

mean value: 0.09993762969970703

key: score_time
value: [0.01779675 0.01739693 0.01740122 0.01741219 0.01740956 0.01755548
0.01746917 0.01769543 0.01732755 0.01740599]

mean value: 0.01748702526092529

key: test_mcc
value: [0.87447463 0.6 0.7952381 0.87082337 0.72380952 0.58571429
0.67156812 0.6555099 0.72954522 0.79426746]

mean value: 0.730095059899352

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.93333333 0.8 0.89655172 0.93103448 0.86206897 0.79310345
0.82758621 0.82758621 0.86206897 0.89655172]

mean value: 0.8629885057471265

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.9375 0.8 0.89655172 0.93333333 0.85714286 0.78571429
0.81481481 0.83870968 0.875 0.90322581]

mean value: 0.8641992499014189

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.88235294 0.8 0.86666667 0.875 0.85714286 0.78571429
0.91666667 0.8125 0.82352941 0.875 ]

mean value: 0.8494572829131652

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 0.8 0.92857143 1. 0.85714286 0.78571429
0.73333333 0.86666667 0.93333333 0.93333333]

mean value: 0.8838095238095238

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.93333333 0.8 0.89761905 0.93333333 0.86190476 0.79285714
0.83095238 0.82619048 0.85952381 0.8952381 ]

mean value: 0.8630952380952381

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.88235294 0.66666667 0.8125 0.875 0.75 0.64705882
0.6875 0.72222222 0.77777778 0.82352941]

mean value: 0.7644607843137254

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.89

Accuracy on Blind test: 0.94

Model_name: Extra Tree
Model func: ExtraTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', ExtraTreeClassifier(random_state=42))])

key: fit_time
value: [0.00918102 0.00931501 0.00934839 0.00934172 0.00923967 0.00923657
0.00935435 0.00937009 0.00967383 0.00934172]

mean value: 0.009340238571166993

key: score_time
value: [0.00858641 0.008672 0.00861835 0.00867772 0.00864005 0.00860858
0.00866485 0.00873733 0.00887012 0.00874567]

mean value: 0.00868210792541504

key: test_mcc
value: [0.62254302 0.3363364 0.51904762 0.1702129 0.44932255 0.44932255
0.17703552 0.25123412 0.51904762 0.37799476]

mean value: 0.387209704925739

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.8 0.66666667 0.75862069 0.5862069 0.72413793 0.72413793
0.5862069 0.62068966 0.75862069 0.68965517]

mean value: 0.6914942528735633

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.76923077 0.6875 0.75862069 0.5 0.69230769 0.69230769
0.57142857 0.59259259 0.75862069 0.70967742]

mean value: 0.6732286116532501

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.90909091 0.64705882 0.73333333 0.6 0.75 0.75
0.61538462 0.66666667 0.78571429 0.6875 ]

mean value: 0.7144748633719222

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.66666667 0.73333333 0.78571429 0.42857143 0.64285714 0.64285714
0.53333333 0.53333333 0.73333333 0.73333333]

mean value: 0.6433333333333333

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.8 0.66666667 0.75952381 0.58095238 0.72142857 0.72142857
0.58809524 0.62380952 0.75952381 0.68809524]

mean value: 0.690952380952381

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.625 0.52380952 0.61111111 0.33333333 0.52941176 0.52941176
0.4 0.42105263 0.61111111 0.55 ]

mean value: 0.5134241240355791

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.29

Accuracy on Blind test: 0.58

Model_name: Random Forest
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
warn(
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
RandomForestClassifier(n_estimators=1000, random_state=42))])

key: fit_time
|
|
value: [1.35667634 1.3484025 1.35595965 1.36112309 1.36566496 1.35094857
|
|
1.36121631 1.35161304 1.35443139 1.35554624]
|
|
|
|
mean value: 1.3561582088470459
|
|
|
|
key: score_time
|
|
value: [0.089463 0.08975649 0.08854628 0.09363389 0.08926678 0.09072804
|
|
0.08889103 0.09302664 0.08930659 0.08826613]
|
|
|
|
mean value: 0.09008848667144775
|
|
|
|
key: test_mcc
|
|
value: [0.86666667 0.73994007 0.87082337 0.87082337 0.93333333 0.93333333
|
|
0.87082337 0.72954522 0.80917359 0.86190476]
|
|
|
|
mean value: 0.8486367077732017
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.93333333 0.86666667 0.93103448 0.93103448 0.96551724 0.96551724
|
|
0.93103448 0.86206897 0.89655172 0.93103448]
|
|
|
|
mean value: 0.9213793103448276
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.93333333 0.875 0.93333333 0.93333333 0.96551724 0.96551724
|
|
0.92857143 0.875 0.90909091 0.93333333]
|
|
|
|
mean value: 0.9252030153754291
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.93333333 0.82352941 0.875 0.875 0.93333333 0.93333333
|
|
1. 0.82352941 0.83333333 0.93333333]
|
|
|
|
mean value: 0.8963725490196078
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.93333333 0.93333333 1. 1. 1. 1.
|
|
0.86666667 0.93333333 1. 0.93333333]
|
|
|
|
mean value: 0.96
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.93333333 0.86666667 0.93333333 0.93333333 0.96666667 0.96666667
|
|
0.93333333 0.85952381 0.89285714 0.93095238]
|
|
|
|
mean value: 0.9216666666666666
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.875 0.77777778 0.875 0.875 0.93333333 0.93333333
|
|
0.86666667 0.77777778 0.83333333 0.875 ]
|
|
|
|
mean value: 0.8622222222222222
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.94
|
|
|
|
Accuracy on Blind test: 0.97
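The per-fold scores in the Random Forest block above can be cross-checked by hand. The following is a minimal sketch, not the pipeline's own code. It assumes that `jcc` denotes the Jaccard score and that the first fold held 30 test samples with 14 true positives, 1 false positive, 1 false negative and 14 true negatives. These counts are not printed in the log; they are inferred from the reported first-fold precision, recall and accuracy (all 0.93333333). The function name `binary_metrics` is hypothetical.

```python
import math


def _ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, fn, tn):
    """Compute common binary-classification scores from confusion-matrix counts."""
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "mcc": _ratio(tp * tn - fp * fn, mcc_denominator),
        "accuracy": _ratio(tp + tn, tp + fp + fn + tn),
        "fscore": _ratio(2 * tp, 2 * tp + fp + fn),
        "precision": _ratio(tp, tp + fp),
        "recall": _ratio(tp, tp + fn),
        "jcc": _ratio(tp, tp + fp + fn),  # assumption: jcc = Jaccard score
    }


# Assumed counts for fold 1 (inferred from the log, not printed in it).
fold1 = binary_metrics(tp=14, fp=1, fn=1, tn=14)
for name, value in fold1.items():
    print(f"{name}: {value:.8f}")
```

Under these assumed counts the sketch gives an MCC of 195/225 and a Jaccard score of 14/16, which agree with the first entries of `test_mcc` (0.86666667) and `test_jcc` (0.875) reported above.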
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...05', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.87463975 0.91463089 0.88259697 0.87200809 0.91593957 0.88284707
|
|
1.00016236 1.01180363 0.92674494 0.95000696]
|
|
|
|
mean value: 0.9231380224227905
|
|
|
|
key: score_time
|
|
value: [0.19464636 0.18248391 0.18350911 0.25170422 0.19830871 0.21641397
|
|
0.23606873 0.16629505 0.22552013 0.22263598]
|
|
|
|
mean value: 0.20775861740112306
|
|
|
|
key: test_mcc
|
|
value: [0.93541435 0.80178373 0.87082337 0.87082337 0.93333333 0.93333333
|
|
0.93302503 0.72954522 0.86965655 0.86190476]
|
|
|
|
mean value: 0.8739643038807166
|
|
|
|
key: train_mcc
|
|
value: [0.96253342 0.9699179 0.97002678 0.9553594 0.96266749 0.96266749
|
|
0.96267809 0.96267809 0.96267809 0.95537456]
|
|
|
|
mean value: 0.9626581323585782
|
|
|
|
key: test_accuracy
|
|
value: [0.96666667 0.9 0.93103448 0.93103448 0.96551724 0.96551724
|
|
0.96551724 0.86206897 0.93103448 0.93103448]
|
|
|
|
mean value: 0.9349425287356322
|
|
|
|
key: train_accuracy
|
|
value: [0.98091603 0.98473282 0.98479087 0.97718631 0.98098859 0.98098859
|
|
0.98098859 0.98098859 0.98098859 0.97718631]
|
|
|
|
mean value: 0.9809755318840159
|
|
|
|
key: test_fscore
|
|
value: [0.96774194 0.90322581 0.93333333 0.93333333 0.96551724 0.96551724
|
|
0.96774194 0.875 0.9375 0.93333333]
|
|
|
|
mean value: 0.9382244160177976
|
|
|
|
key: train_fscore
|
|
value: [0.98127341 0.98496241 0.98507463 0.97777778 0.98141264 0.98141264
|
|
0.98127341 0.98127341 0.98127341 0.97761194]
|
|
|
|
mean value: 0.9813345662726205
|
|
|
|
key: test_precision
|
|
value: [0.9375 0.875 0.875 0.875 0.93333333 0.93333333
|
|
0.9375 0.82352941 0.88235294 0.93333333]
|
|
|
|
mean value: 0.9005882352941177
|
|
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
  warn(
|
|
key: train_precision
|
|
value: [0.96323529 0.97037037 0.97058824 0.95652174 0.96350365 0.96350365
|
|
0.96323529 0.96323529 0.96323529 0.95620438]
|
|
|
|
mean value: 0.9633633200097628
|
|
|
|
key: test_recall
|
|
value: [1. 0.93333333 1. 1. 1. 1.
|
|
1. 0.93333333 1. 0.93333333]
|
|
|
|
mean value: 0.98
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96666667 0.9 0.93333333 0.93333333 0.96666667 0.96666667
|
|
0.96428571 0.85952381 0.92857143 0.93095238]
|
|
|
|
mean value: 0.935
|
|
|
|
key: train_roc_auc
|
|
value: [0.98091603 0.98473282 0.98473282 0.97709924 0.98091603 0.98091603
|
|
0.98106061 0.98106061 0.98106061 0.97727273]
|
|
|
|
mean value: 0.9809767522553783
|
|
|
|
key: test_jcc
|
|
value: [0.9375 0.82352941 0.875 0.875 0.93333333 0.93333333
|
|
0.9375 0.77777778 0.88235294 0.875 ]
|
|
|
|
mean value: 0.8850326797385621
|
|
|
|
key: train_jcc
|
|
value: [0.96323529 0.97037037 0.97058824 0.95652174 0.96350365 0.96350365
|
|
0.96323529 0.96323529 0.96323529 0.95620438]
|
|
|
|
mean value: 0.9633633200097628
|
|
|
|
MCC on Blind test: 0.94
|
|
|
|
Accuracy on Blind test: 0.97
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.0245676 0.01031852 0.01025105 0.01026773 0.00969219 0.01044655
|
|
0.01050472 0.01041341 0.01058054 0.01019526]
|
|
|
|
mean value: 0.011723756790161133
|
|
|
|
key: score_time
|
|
value: [0.01320982 0.00926924 0.00959635 0.00879884 0.0094924 0.0095644
|
|
0.00961423 0.00954556 0.00919628 0.00949216]
|
|
|
|
mean value: 0.00977792739868164
|
|
|
|
key: test_mcc
|
|
value: [0.6 0.47087096 0.6555099 0.51675233 0.51904762 0.51904762
|
|
0.47079191 0.37799476 0.58943389 0.59330823]
|
|
|
|
mean value: 0.531275719579118
|
|
|
|
key: train_mcc
|
|
value: [0.6261184 0.69498055 0.62787882 0.68095573 0.65781864 0.65781864
|
|
0.59892463 0.68823734 0.63502806 0.65024005]
|
|
|
|
mean value: 0.6518000864251088
|
|
|
|
key: test_accuracy
|
|
value: [0.8 0.73333333 0.82758621 0.75862069 0.75862069 0.75862069
|
|
0.72413793 0.68965517 0.79310345 0.79310345]
|
|
|
|
mean value: 0.7636781609195402
|
|
|
|
key: train_accuracy
|
|
value: [0.8129771 0.84732824 0.81368821 0.84030418 0.82889734 0.82889734
|
|
0.79847909 0.84410646 0.81749049 0.82509506]
|
|
|
|
mean value: 0.8257263518416393
|
|
|
|
key: test_fscore
|
|
value: [0.8 0.71428571 0.81481481 0.74074074 0.75862069 0.75862069
|
|
0.69230769 0.70967742 0.8125 0.78571429]
|
|
|
|
mean value: 0.7587282046528431
|
|
|
|
key: train_fscore
|
|
value: [0.81081081 0.84962406 0.81081081 0.83846154 0.82889734 0.82889734
|
|
0.78884462 0.84410646 0.81538462 0.82307692]
|
|
|
|
mean value: 0.823891452089343
|
|
|
|
key: test_precision
|
|
value: [0.8 0.76923077 0.84615385 0.76923077 0.73333333 0.73333333
|
|
0.81818182 0.6875 0.76470588 0.84615385]
|
|
|
|
mean value: 0.7767823597970657
|
|
|
|
key: train_precision
|
|
value: [0.8203125 0.83703704 0.82677165 0.8515625 0.83206107 0.83206107
|
|
0.825 0.84090909 0.82170543 0.82945736]
|
|
|
|
mean value: 0.831687770959169
|
|
|
|
key: test_recall
|
|
value: [0.8 0.66666667 0.78571429 0.71428571 0.78571429 0.78571429
|
|
0.6 0.73333333 0.86666667 0.73333333]
|
|
|
|
mean value: 0.7471428571428571
|
|
|
|
key: train_recall
|
|
value: [0.80152672 0.86259542 0.79545455 0.82575758 0.82575758 0.82575758
|
|
0.75572519 0.84732824 0.80916031 0.81679389]
|
|
|
|
mean value: 0.816585704371964
|
|
|
|
key: test_roc_auc
|
|
value: [0.8 0.73333333 0.82619048 0.75714286 0.75952381 0.75952381
|
|
0.72857143 0.68809524 0.79047619 0.7952381 ]
|
|
|
|
mean value: 0.7638095238095238
|
|
|
|
key: train_roc_auc
|
|
value: [0.8129771 0.84732824 0.81375781 0.8403597 0.82890932 0.82890932
|
|
0.79831714 0.84411867 0.81745894 0.82506361]
|
|
|
|
mean value: 0.8257199861207495
|
|
|
|
key: test_jcc
|
|
value: [0.66666667 0.55555556 0.6875 0.58823529 0.61111111 0.61111111
|
|
0.52941176 0.55 0.68421053 0.64705882]
|
|
|
|
mean value: 0.6130860853113176
|
|
|
|
key: train_jcc
|
|
value: [0.68181818 0.73856209 0.68181818 0.7218543 0.70779221 0.70779221
|
|
0.65131579 0.73026316 0.68831169 0.69934641]
|
|
|
|
mean value: 0.7008874216268676
|
|
|
|
MCC on Blind test: 0.43
|
|
|
|
Accuracy on Blind test: 0.72
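Each `mean value` line appears to be the arithmetic mean of the ten per-fold scores listed under `value`. The sketch below checks that assumption for the Naive Bayes `test_mcc` scores above and also reports the fold-to-fold spread, which the log does not print. The values are copied from the log, where they are rounded to eight decimals, so the recomputed mean can only match to roughly that precision.

```python
from statistics import fmean, stdev

# Per-fold test_mcc values copied from the Naive Bayes block of the log.
test_mcc = [
    0.6, 0.47087096, 0.6555099, 0.51675233, 0.51904762,
    0.51904762, 0.47079191, 0.37799476, 0.58943389, 0.59330823,
]

mean_mcc = fmean(test_mcc)         # compare with the log's "mean value"
spread_mcc = stdev(test_mcc)       # sample standard deviation across folds
range_mcc = max(test_mcc) - min(test_mcc)

print(f"mean test_mcc:  {mean_mcc:.8f}")
print(f"stdev test_mcc: {spread_mcc:.8f}")
print(f"range test_mcc: {range_mcc:.8f}")
```

Reporting the spread alongside the mean shows how much a single fold can move the summary when each fold holds only about 30 samples.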
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.23225641 0.0465076 0.05317879 0.05914593 0.05356336 0.05773616
|
|
0.07073283 0.05968189 0.07531691 0.04838347]
|
|
|
|
mean value: 0.07565033435821533
|
|
|
|
key: score_time
|
|
value: [0.01092958 0.01095676 0.01085782 0.01044011 0.01059103 0.01085639
|
|
0.01082778 0.01170135 0.01140642 0.01048422]
|
|
|
|
mean value: 0.010905146598815918
|
|
|
|
key: test_mcc
|
|
value: [0.93541435 0.87447463 0.93333333 1. 0.93333333 1.
|
|
0.93333333 0.93302503 0.93302503 0.93333333]
|
|
|
|
mean value: 0.9409272380452472
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.96666667 0.93333333 0.96551724 1. 0.96551724 1.
|
|
0.96551724 0.96551724 0.96551724 0.96551724]
|
|
|
|
mean value: 0.9693103448275863
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.96551724 0.9375 0.96551724 1. 0.96551724 1.
|
|
0.96551724 0.96774194 0.96774194 0.96551724]
|
|
|
|
mean value: 0.9700570077864294
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.88235294 0.93333333 1. 0.93333333 1.
|
|
1. 0.9375 0.9375 1. ]
|
|
|
|
mean value: 0.9624019607843137
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.93333333 1. 1. 1. 1. 1.
|
|
0.93333333 1. 1. 0.93333333]
|
|
|
|
mean value: 0.98
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96666667 0.93333333 0.96666667 1. 0.96666667 1.
|
|
0.96666667 0.96428571 0.96428571 0.96666667]
|
|
|
|
mean value: 0.9695238095238096
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.93333333 0.88235294 0.93333333 1. 0.93333333 1.
|
|
0.93333333 0.9375 0.9375 0.93333333]
|
|
|
|
mean value: 0.9424019607843137
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.84
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.04960895 0.06460238 0.06222391 0.05709147 0.06891847 0.05664587
|
|
0.06179833 0.0570817 0.0564754 0.05665326]
|
|
|
|
mean value: 0.059109973907470706
|
|
|
|
key: score_time
|
|
value: [0.02435279 0.02223825 0.02524686 0.02369213 0.02095008 0.02401018
|
|
0.02549863 0.02384377 0.02049375 0.0244801 ]
|
|
|
|
mean value: 0.023480653762817383
|
|
|
|
key: test_mcc
|
|
value: [0.86666667 0.73994007 0.7952381 0.93333333 0.6555099 0.59330823
|
|
0.45455066 0.7952381 0.65714286 0.72380952]
|
|
|
|
mean value: 0.7214737425055385
|
|
|
|
key: train_mcc
|
|
value: [0.97735555 0.99239533 0.98490371 0.98490371 0.98490371 0.96969173
|
|
0.98490544 0.98490544 0.97744232 0.97744232]
|
|
|
|
mean value: 0.9818849266007134
|
|
|
|
key: test_accuracy
|
|
value: [0.93333333 0.86666667 0.89655172 0.96551724 0.82758621 0.79310345
|
|
0.72413793 0.89655172 0.82758621 0.86206897]
|
|
|
|
mean value: 0.8593103448275862
|
|
|
|
key: train_accuracy
|
|
value: [0.98854962 0.99618321 0.99239544 0.99239544 0.99239544 0.98479087
|
|
0.99239544 0.99239544 0.98859316 0.98859316]
|
|
|
|
mean value: 0.9908687197051056
|
|
|
|
key: test_fscore
|
|
value: [0.93333333 0.875 0.89655172 0.96551724 0.81481481 0.8
|
|
0.71428571 0.89655172 0.82758621 0.86666667]
|
|
|
|
mean value: 0.8590307425652253
|
|
|
|
key: train_fscore
|
|
value: [0.98867925 0.99619772 0.9924812 0.9924812 0.9924812 0.98496241
|
|
0.99242424 0.99242424 0.98867925 0.98867925]
|
|
|
|
mean value: 0.9909489954366314
|
|
|
|
key: test_precision
|
|
value: [0.93333333 0.82352941 0.86666667 0.93333333 0.84615385 0.75
|
|
0.76923077 0.92857143 0.85714286 0.86666667]
|
|
|
|
mean value: 0.8574628312863607
|
|
|
|
key: train_precision
|
|
value: [0.97761194 0.99242424 0.98507463 0.98507463 0.98507463 0.97761194
|
|
0.98496241 0.98496241 0.97761194 0.97761194]
|
|
|
|
mean value: 0.9828020696245362
|
|
|
|
key: test_recall
|
|
value: [0.93333333 0.93333333 0.92857143 1. 0.78571429 0.85714286
|
|
0.66666667 0.86666667 0.8 0.86666667]
|
|
|
|
mean value: 0.8638095238095238
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 0.99242424
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9992424242424243
|
|
|
|
key: test_roc_auc
|
|
value: [0.93333333 0.86666667 0.89761905 0.96666667 0.82619048 0.7952381
|
|
0.72619048 0.89761905 0.82857143 0.86190476]
|
|
|
|
mean value: 0.86
|
|
|
|
key: train_roc_auc
|
|
value: [0.98854962 0.99618321 0.99236641 0.99236641 0.99236641 0.98476174
|
|
0.99242424 0.99242424 0.98863636 0.98863636]
|
|
|
|
mean value: 0.9908715012722646
|
|
|
|
key: test_jcc
|
|
value: [0.875 0.77777778 0.8125 0.93333333 0.6875 0.66666667
|
|
0.55555556 0.8125 0.70588235 0.76470588]
|
|
|
|
mean value: 0.7591421568627451
|
|
|
|
key: train_jcc
|
|
value: [0.97761194 0.99242424 0.98507463 0.98507463 0.98507463 0.97037037
|
|
0.98496241 0.98496241 0.97761194 0.97761194]
|
|
|
|
mean value: 0.9820779126317225
|
|
|
|
MCC on Blind test: 0.84
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01477504 0.00951862 0.00931025 0.00919795 0.00936079 0.00953674
|
|
0.01026082 0.01047182 0.00936103 0.00997782]
|
|
|
|
mean value: 0.010177087783813477
|
|
|
|
key: score_time
|
|
value: [0.00942469 0.00893283 0.00876451 0.00868773 0.00890875 0.00949097
|
|
0.00881505 0.00911379 0.00960183 0.00871181]
|
|
|
|
mean value: 0.00904519557952881
|
|
|
|
key: test_mcc
|
|
value: [0.81649658 0.60540551 0.59330823 0.67156812 0.7320658 0.44188962
|
|
0.59330823 0.51675233 0.72954522 0.51675233]
|
|
|
|
mean value: 0.621709195862531
|
|
|
|
key: train_mcc
|
|
value: [0.65259237 0.64668979 0.67853804 0.68974549 0.67619108 0.65131111
|
|
0.69883248 0.68267177 0.6481901 0.63570088]
|
|
|
|
mean value: 0.6660463116903754
|
|
|
|
key: test_accuracy
|
|
value: [0.9 0.8 0.79310345 0.82758621 0.86206897 0.68965517
|
|
0.79310345 0.75862069 0.86206897 0.75862069]
|
|
|
|
mean value: 0.8044827586206896
|
|
|
|
key: train_accuracy
|
|
value: [0.82442748 0.82061069 0.8365019 0.84410646 0.8365019 0.82509506
|
|
0.84790875 0.84030418 0.82129278 0.81749049]
|
|
|
|
mean value: 0.8314239688851479
|
|
|
|
key: test_fscore
|
|
value: [0.90909091 0.78571429 0.8 0.83870968 0.86666667 0.74285714
|
|
0.78571429 0.77419355 0.875 0.77419355]
|
|
|
|
mean value: 0.8152140064236838
|
|
|
|
key: train_fscore
|
|
value: [0.83333333 0.83154122 0.84697509 0.84981685 0.84476534 0.83088235
|
|
0.8540146 0.84558824 0.83154122 0.82089552]
|
|
|
|
mean value: 0.8389353761517929
|
|
|
|
key: test_precision
|
|
value: [0.83333333 0.84615385 0.75 0.76470588 0.8125 0.61904762
|
|
0.84615385 0.75 0.82352941 0.75 ]
|
|
|
|
mean value: 0.7795423938806292
|
|
|
|
key: train_precision
|
|
value: [0.79310345 0.78378378 0.79865772 0.82269504 0.80689655 0.80714286
|
|
0.81818182 0.81560284 0.78378378 0.80291971]
|
|
|
|
mean value: 0.803276754138267
|
|
|
|
key: test_recall
|
|
value: [1. 0.73333333 0.85714286 0.92857143 0.92857143 0.92857143
|
|
0.73333333 0.8 0.93333333 0.8 ]
|
|
|
|
mean value: 0.8642857142857143
|
|
|
|
key: train_recall
|
|
value: [0.8778626 0.88549618 0.90151515 0.87878788 0.88636364 0.85606061
|
|
0.89312977 0.8778626 0.88549618 0.83969466]
|
|
|
|
mean value: 0.8782269257460097
|
|
|
|
key: test_roc_auc
|
|
value: [0.9 0.8 0.7952381 0.83095238 0.86428571 0.69761905
|
|
0.7952381 0.75714286 0.85952381 0.75714286]
|
|
|
|
mean value: 0.8057142857142857
|
|
|
|
key: train_roc_auc
|
|
value: [0.82442748 0.82061069 0.83625376 0.84397409 0.83631159 0.82497687
|
|
0.84808004 0.84044645 0.82153597 0.8175746 ]
|
|
|
|
mean value: 0.8314191533657183
|
|
|
|
key: test_jcc
|
|
value: [0.83333333 0.64705882 0.66666667 0.72222222 0.76470588 0.59090909
|
|
0.64705882 0.63157895 0.77777778 0.63157895]
|
|
|
|
mean value: 0.6912890515057698
|
|
|
|
key: train_jcc
|
|
value: [0.71428571 0.71165644 0.7345679 0.7388535 0.73125 0.71069182
|
|
0.74522293 0.73248408 0.71165644 0.69620253]
|
|
|
|
mean value: 0.7226871364054945
|
|
|
|
MCC on Blind test: 0.84
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01238084 0.02012634 0.0197072 0.01957273 0.01937509 0.02072191
0.02095866 0.01990652 0.01836991 0.01708722]

mean value: 0.018820643424987793

key: score_time
value: [0.0095036 0.01113105 0.01119828 0.01179147 0.01183867 0.01185751
0.01195765 0.01185346 0.01179409 0.01182723]

mean value: 0.011475300788879395

key: test_mcc
value: [0.76088591 0.60540551 0.67156812 0.86965655 0.87082337 0.72954522
0.7952381 0.65714286 0.79426746 0.86965655]

mean value: 0.7624189650461067

key: train_mcc
value: [0.82856162 0.99239533 0.97721358 0.95447974 0.84450885 0.94745994
0.9337642 0.93286731 0.97744232 0.91766613]

mean value: 0.9306359023287414

key: test_accuracy
value: [0.86666667 0.8 0.82758621 0.93103448 0.93103448 0.86206897
0.89655172 0.82758621 0.89655172 0.93103448]

mean value: 0.8770114942528735

key: train_accuracy
value: [0.90839695 0.99618321 0.98859316 0.97718631 0.91634981 0.97338403
0.96577947 0.96577947 0.98859316 0.9581749 ]

mean value: 0.9638420456854265

key: test_fscore
value: [0.88235294 0.78571429 0.83870968 0.92307692 0.93333333 0.84615385
0.89655172 0.82758621 0.90322581 0.9375 ]

mean value: 0.8774204744360309

key: train_fscore
value: [0.91549296 0.99619772 0.98867925 0.97744361 0.92307692 0.97297297
0.96678967 0.96470588 0.98867925 0.95910781]

mean value: 0.9653146028957218

key: test_precision
value: [0.78947368 0.84615385 0.76470588 1. 0.875 0.91666667
0.92857143 0.85714286 0.875 0.88235294]

mean value: 0.8735067306274736

key: train_precision
value: [0.8496732 0.99242424 0.98496241 0.97014925 0.85714286 0.99212598
0.93571429 0.99193548 0.97761194 0.93478261]

mean value: 0.9486522264759241

key: test_recall
value: [1. 0.73333333 0.92857143 0.85714286 1. 0.78571429
0.86666667 0.8 0.93333333 1. ]

mean value: 0.8904761904761904

key: train_recall
value: [0.99236641 1. 0.99242424 0.98484848 1. 0.95454545
1. 0.9389313 1. 0.98473282]

mean value: 0.9847848716169327

key: test_roc_auc
value: [0.86666667 0.8 0.83095238 0.92857143 0.93333333 0.85952381
0.89761905 0.82857143 0.8952381 0.92857143]

mean value: 0.876904761904762

key: train_roc_auc
value: [0.90839695 0.99618321 0.98857853 0.97715707 0.91603053 0.97345593
0.96590909 0.96567777 0.98863636 0.9582755 ]

mean value: 0.9638300948415452

key: test_jcc
value: [0.78947368 0.64705882 0.72222222 0.85714286 0.875 0.73333333
0.8125 0.70588235 0.82352941 0.88235294]

mean value: 0.7848495626320704

key: train_jcc
value: [0.84415584 0.99242424 0.97761194 0.95588235 0.85714286 0.94736842
0.93571429 0.93181818 0.97761194 0.92142857]

mean value: 0.9341158637274806

MCC on Blind test: 0.67

Accuracy on Blind test: 0.81

Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.0153439 0.01487064 0.01535463 0.01554227 0.01502013 0.01618838
0.01572561 0.01596141 0.01561093 0.0169456 ]

mean value: 0.015656352043151855

key: score_time
value: [0.00961018 0.01186347 0.01176333 0.01177406 0.01178432 0.01180696
0.01184416 0.01181364 0.01184845 0.01183987]

mean value: 0.011594843864440919

key: test_mcc
value: [1. 0.6681531 0.59628479 0.86190476 0.86190476 0.79426746
0.72954522 0.6130103 0.7952381 0.86965655]

mean value: 0.7789965054376262

key: train_mcc
value: [0.90935126 0.92636711 0.90177727 0.95447974 0.90885432 0.96222382
0.85796431 0.8003837 0.68576928 0.96223033]

mean value: 0.8869401136471388

key: test_accuracy
value: [1. 0.83333333 0.75862069 0.93103448 0.93103448 0.89655172
0.86206897 0.79310345 0.89655172 0.93103448]

mean value: 0.8833333333333333

key: train_accuracy
value: [0.95419847 0.96183206 0.95057034 0.97718631 0.95437262 0.98098859
0.92395437 0.89353612 0.82509506 0.98098859]

mean value: 0.9402722549560271

key: test_fscore
value: [1. 0.83870968 0.8 0.92857143 0.92857143 0.88888889
0.875 0.76923077 0.89655172 0.9375 ]

mean value: 0.8863023916819801

key: train_fscore
value: [0.953125 0.96323529 0.95167286 0.97744361 0.95419847 0.98127341
0.92907801 0.88235294 0.79090909 0.98113208]

mean value: 0.9364420768857535

key: test_precision
value: [1. 0.8125 0.66666667 0.92857143 0.92857143 0.92307692
0.82352941 0.90909091 0.92857143 0.88235294]

mean value: 0.8802931137489961

key: train_precision
value: [0.976 0.92907801 0.93430657 0.97014925 0.96153846 0.97037037
0.86754967 0.98130841 0.97752809 0.97014925]

mean value: 0.9537978092875747

key: test_recall
value: [1. 0.86666667 1. 0.92857143 0.92857143 0.85714286
0.93333333 0.66666667 0.86666667 1. ]

mean value: 0.9047619047619048

key: train_recall
value: [0.93129771 1. 0.96969697 0.98484848 0.9469697 0.99242424
1. 0.80152672 0.66412214 0.99236641]

mean value: 0.928325237103863

key: test_roc_auc
value: [1. 0.83333333 0.76666667 0.93095238 0.93095238 0.8952381
0.85952381 0.79761905 0.89761905 0.92857143]

mean value: 0.8840476190476191

key: train_roc_auc
value: [0.95419847 0.96183206 0.95049734 0.97715707 0.95440088 0.98094495
0.92424242 0.8931876 0.82448531 0.98103169]

mean value: 0.9401977793199168

key: test_jcc
value: [1. 0.72222222 0.66666667 0.86666667 0.86666667 0.8
0.77777778 0.625 0.8125 0.88235294]

mean value: 0.8019852941176471

key: train_jcc
value: [0.91044776 0.92907801 0.90780142 0.95588235 0.91240876 0.96323529
0.86754967 0.78947368 0.65413534 0.96296296]

mean value: 0.885297525439458

MCC on Blind test: 0.77

Accuracy on Blind test: 0.89

Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', AdaBoostClassifier(random_state=42))])

key: fit_time
value: [0.13885784 0.13189292 0.13409328 0.13466954 0.1360693 0.13970566
0.13426328 0.13068724 0.13306546 0.13250685]

mean value: 0.1345811367034912

key: score_time
value: [0.01640296 0.01646113 0.01672888 0.01593828 0.01671743 0.01641703
0.01637411 0.01561213 0.01619744 0.01557469]

mean value: 0.016242408752441408

key: test_mcc
value: [0.93541435 0.87447463 0.86190476 1. 0.87082337 0.79426746
0.93333333 0.93302503 0.80917359 0.93333333]

mean value: 0.8945749865401353

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.96666667 0.93333333 0.93103448 1. 0.93103448 0.89655172
0.96551724 0.96551724 0.89655172 0.96551724]

mean value: 0.9451724137931035

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.96551724 0.9375 0.92857143 1. 0.93333333 0.88888889
0.96551724 0.96774194 0.90909091 0.96551724]

mean value: 0.9461678219506362

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [1. 0.88235294 0.92857143 1. 0.875 0.92307692
1. 0.9375 0.83333333 1. ]

mean value: 0.9379834626158156

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.93333333 1. 0.92857143 1. 1. 0.85714286
0.93333333 1. 1. 0.93333333]

mean value: 0.9585714285714286

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.96666667 0.93333333 0.93095238 1. 0.93333333 0.8952381
0.96666667 0.96428571 0.89285714 0.96666667]

mean value: 0.9450000000000001

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.93333333 0.88235294 0.86666667 1. 0.875 0.8
0.93333333 0.9375 0.83333333 0.93333333]

mean value: 0.8994852941176471

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.84

Accuracy on Blind test: 0.92

Model_name: Bagging Classifier
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
BaggingClassifier(n_jobs=10, oob_score=True,
random_state=42))])

key: fit_time
value: [0.04823947 0.04498172 0.04811335 0.04562259 0.04412913 0.0470643
0.05953717 0.04916573 0.05571675 0.05847645]

mean value: 0.05010466575622559

key: score_time
value: [0.02403498 0.02468944 0.02403665 0.02459049 0.03906131 0.03024817
0.0207746 0.03200412 0.03876305 0.02441049]

mean value: 0.028261327743530275

key: test_mcc
value: [0.93541435 0.68041382 0.86190476 0.93302503 0.93333333 0.86965655
1. 0.93302503 0.93302503 0.93333333]

mean value: 0.9013131248529029

key: train_mcc
value: [0.98473282 1. 1. 0.98479065 0.9772149 0.98490544
0.98490371 0.96200555 0.95448499 0.98490544]

mean value: 0.9817943520005022

key: test_accuracy
value: [0.96666667 0.83333333 0.93103448 0.96551724 0.96551724 0.93103448
1. 0.96551724 0.96551724 0.96551724]

mean value: 0.9489655172413793

key: train_accuracy
value: [0.99236641 1. 1. 0.99239544 0.98859316 0.99239544
0.99239544 0.98098859 0.97718631 0.99239544]

mean value: 0.9908716222099672

key: test_fscore
value: [0.96551724 0.84848485 0.92857143 0.96296296 0.96551724 0.92307692
1. 0.96774194 0.96774194 0.96551724]

mean value: 0.9495131758201836

key: train_fscore
value: [0.99236641 1. 1. 0.99242424 0.98859316 0.99236641
0.99230769 0.98098859 0.97727273 0.99242424]

mean value: 0.9908743477905815

key: test_precision
value: [1. 0.77777778 0.92857143 1. 0.93333333 1.
1. 0.9375 0.9375 1. ]

mean value: 0.951468253968254

key: train_precision
value: [0.99236641 1. 1. 0.99242424 0.99236641 1.
1. 0.97727273 0.96992481 0.98496241]

mean value: 0.9909317012169563

key: test_recall
value: [0.93333333 0.93333333 0.92857143 0.92857143 1. 0.85714286
1. 1. 1. 0.93333333]

mean value: 0.9514285714285714

key: train_recall
value: [0.99236641 1. 1. 0.99242424 0.98484848 0.98484848
0.98473282 0.98473282 0.98473282 1. ]

mean value: 0.9908686097617395

key: test_roc_auc
value: [0.96666667 0.83333333 0.93095238 0.96428571 0.96666667 0.92857143
1. 0.96428571 0.96428571 0.96666667]

mean value: 0.9485714285714286

key: train_roc_auc
value: [0.99236641 1. 1. 0.99239533 0.98860745 0.99242424
0.99236641 0.98100278 0.9772149 0.99242424]

mean value: 0.9908801758038399

key: test_jcc
value: [0.93333333 0.73684211 0.86666667 0.92857143 0.93333333 0.85714286
1. 0.9375 0.9375 0.93333333]

mean value: 0.9064223057644111

key: train_jcc
value: [0.98484848 1. 1. 0.98496241 0.97744361 0.98484848
0.98473282 0.96268657 0.95555556 0.98496241]

mean value: 0.9820040337896817

MCC on Blind test: 0.84

Accuracy on Blind test: 0.92

Model_name: Gaussian Process
Model func: GaussianProcessClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GaussianProcessClassifier(random_state=42))])

key: fit_time
value: [0.08211589 0.08095455 0.06087279 0.03729415 0.048522 0.07543254
0.06906867 0.04290318 0.03936195 0.07277274]

mean value: 0.06092984676361084

key: score_time
value: [0.0236876 0.02488613 0.01395512 0.01358104 0.02060533 0.02655029
0.01827741 0.0167532 0.02412868 0.02458143]

mean value: 0.020700621604919433

key: test_mcc
value: [0.46666667 0.34585723 0.24285714 0.67156812 0.59330823 0.45455066
0.34975426 0.37799476 0.44932255 0.58943389]

mean value: 0.4541313498424294

key: train_mcc
value: [0.98473282 0.97712771 0.98479065 0.98479065 0.98479065 0.98490371
0.96958131 0.9772149 0.9772149 0.96958131]

mean value: 0.979472861832087

key: test_accuracy
value: [0.73333333 0.66666667 0.62068966 0.82758621 0.79310345 0.72413793
0.65517241 0.68965517 0.72413793 0.79310345]

mean value: 0.7227586206896551

key: train_accuracy
value: [0.99236641 0.98854962 0.99239544 0.99239544 0.99239544 0.99239544
0.98479087 0.98859316 0.98859316 0.98479087]

mean value: 0.9897265840420283

key: test_fscore
value: [0.73333333 0.61538462 0.62068966 0.83870968 0.8 0.73333333
0.58333333 0.70967742 0.75 0.8125 ]

mean value: 0.7196961367331223

key: train_fscore
value: [0.99236641 0.98859316 0.99242424 0.99242424 0.99242424 0.9924812
0.98473282 0.98859316 0.98859316 0.98473282]

mean value: 0.9897365459029557

key: test_precision
value: [0.73333333 0.72727273 0.6 0.76470588 0.75 0.6875
0.77777778 0.6875 0.70588235 0.76470588]

mean value: 0.7198677956030897

key: train_precision
value: [0.99236641 0.98484848 0.99242424 0.99242424 0.99242424 0.98507463
0.98473282 0.98484848 0.98484848 0.98473282]

mean value: 0.9878724869752555

key: test_recall
value: [0.73333333 0.53333333 0.64285714 0.92857143 0.85714286 0.78571429
0.46666667 0.73333333 0.8 0.86666667]

mean value: 0.7347619047619047

key: train_recall
value: [0.99236641 0.99236641 0.99242424 0.99242424 0.99242424 1.
0.98473282 0.99236641 0.99236641 0.98473282]

mean value: 0.9916204024982651

key: test_roc_auc
value: [0.73333333 0.66666667 0.62142857 0.83095238 0.7952381 0.72619048
0.66190476 0.68809524 0.72142857 0.79047619]

mean value: 0.7235714285714286

key: train_roc_auc
value: [0.99236641 0.98854962 0.99239533 0.99239533 0.99239533 0.99236641
0.98479065 0.98860745 0.98860745 0.98479065]

mean value: 0.9897264631043257

key: test_jcc
value: [0.57894737 0.44444444 0.45 0.72222222 0.66666667 0.57894737
0.41176471 0.55 0.6 0.68421053]

mean value: 0.5687203302373581

key: train_jcc
value: [0.98484848 0.97744361 0.98496241 0.98496241 0.98496241 0.98507463
0.96992481 0.97744361 0.97744361 0.96992481]

mean value: 0.9796990780887088

MCC on Blind test: 0.39

Accuracy on Blind test: 0.69

Model_name: Gradient Boosting
Model func: GradientBoostingClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GradientBoostingClassifier(random_state=42))])

key: fit_time
value: [0.44738221 0.46150136 0.46548247 0.45794606 0.42596221 0.4200623
0.44993877 0.42784166 0.42434502 0.43154621]

mean value: 0.44120082855224607

key: score_time
value: [0.0106287 0.01016116 0.01001334 0.0097239 0.00935102 0.00991964
0.00951171 0.01013899 0.01051497 0.00930452]

mean value: 0.009926795959472656

key: test_mcc
value: [0.87447463 0.76088591 0.86190476 1. 0.93333333 1.
1. 0.86965655 0.86965655 0.93333333]

mean value: 0.9103245077976663

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.93333333 0.86666667 0.93103448 1. 0.96551724 1.
1. 0.93103448 0.93103448 0.96551724]

mean value: 0.9524137931034483

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.92857143 0.88235294 0.92857143 1. 0.96551724 1.
1. 0.9375 0.9375 0.96551724]

mean value: 0.9545530281077949

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [1. 0.78947368 0.92857143 1. 0.93333333 1.
1. 0.88235294 0.88235294 1. ]

mean value: 0.9416084328468229

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.86666667 1. 0.92857143 1. 1. 1.
1. 1. 1. 0.93333333]

mean value: 0.9728571428571429

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.93333333 0.86666667 0.93095238 1. 0.96666667 1.
1. 0.92857143 0.92857143 0.96666667]

mean value: 0.9521428571428572

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.86666667 0.78947368 0.86666667 1. 0.93333333 1.
1. 0.88235294 0.88235294 0.93333333]

mean value: 0.9154179566563467

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.84

Accuracy on Blind test: 0.92

Model_name: QDA
Model func: QuadraticDiscriminantAnalysis()
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.02211094 0.03654742 0.03356147 0.04863787 0.05131888 0.02375007
|
|
0.02474117 0.02334833 0.02405477 0.0240171 ]
|
|
|
|
mean value: 0.03120880126953125
|
|
|
|
key: score_time
|
|
value: [0.01956391 0.01965404 0.01273465 0.0126214 0.0166223 0.01554108
|
|
0.01730466 0.01577425 0.01530719 0.0165956 ]
|
|
|
|
mean value: 0.016171908378601073
|
|
|
|
key: test_mcc
|
|
value: [0.26726124 0.15075567 0.38368877 0.26533187 0.28749445 0.17703552
|
|
0.38095238 0.02898855 0.24688536 0.46057608]
|
|
|
|
mean value: 0.26489698985225885
|
|
|
|
key: train_mcc
|
|
value: [0.87085548 0.84404875 0.97743845 0.81830918 0.81183809 0.83790995
|
|
0.71853328 0.89179531 0.65867038 0.96267809]
|
|
|
|
mean value: 0.8392076957241592
|
|
|
|
key: test_accuracy
|
|
value: [0.63333333 0.56666667 0.65517241 0.62068966 0.62068966 0.5862069
|
|
0.68965517 0.51724138 0.62068966 0.72413793]
|
|
|
|
mean value: 0.623448275862069
|
|
|
|
key: train_accuracy
|
|
value: [0.93129771 0.91603053 0.98859316 0.90114068 0.8973384 0.91254753
|
|
0.84030418 0.94296578 0.80228137 0.98098859]
|
|
|
|
mean value: 0.91134879400923
|
|
|
|
key: test_fscore
|
|
value: [0.64516129 0.64864865 0.72222222 0.66666667 0.68571429 0.6
|
|
0.68965517 0.5625 0.68571429 0.76470588]
|
|
|
|
mean value: 0.6670988454055424
|
|
|
|
key: train_fscore
|
|
value: [0.93571429 0.92253521 0.98876404 0.91034483 0.90721649 0.91986063
|
|
0.86184211 0.94584838 0.8343949 0.98127341]
|
|
|
|
mean value: 0.92077942849477
|
|
|
|
key: test_precision
|
|
value: [0.625 0.54545455 0.59090909 0.57894737 0.57142857 0.5625
|
|
0.71428571 0.52941176 0.6 0.68421053]
|
|
|
|
mean value: 0.6002147581520647
|
|
|
|
key: train_precision
|
|
value: [0.87919463 0.85620915 0.97777778 0.83544304 0.83018868 0.8516129
|
|
0.75722543 0.89726027 0.71584699 0.96323529]
|
|
|
|
mean value: 0.8563994175574612
|
|
|
|
key: test_recall
|
|
value: [0.66666667 0.8 0.92857143 0.78571429 0.85714286 0.64285714
|
|
0.66666667 0.6 0.8 0.86666667]
|
|
|
|
mean value: 0.7614285714285715
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.63333333 0.56666667 0.66428571 0.62619048 0.62857143 0.58809524
|
|
0.69047619 0.51428571 0.61428571 0.71904762]
|
|
|
|
mean value: 0.6245238095238095
|
|
|
|
key: train_roc_auc
|
|
value: [0.93129771 0.91603053 0.98854962 0.90076336 0.89694656 0.91221374
|
|
0.84090909 0.94318182 0.8030303 0.98106061]
|
|
|
|
mean value: 0.9113983344899376
|
|
|
|
key: test_jcc
|
|
value: [0.47619048 0.48 0.56521739 0.5 0.52173913 0.42857143
|
|
0.52631579 0.39130435 0.52173913 0.61904762]
|
|
|
|
mean value: 0.5030125313283208
|
|
|
|
key: train_jcc
|
|
value: [0.87919463 0.85620915 0.97777778 0.83544304 0.83018868 0.8516129
|
|
0.75722543 0.89726027 0.71584699 0.96323529]
|
|
|
|
mean value: 0.8563994175574612
|
|
|
|
MCC on Blind test: 0.14
|
|
|
|
Accuracy on Blind test: 0.64
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01498985 0.01466179 0.03298283 0.03653455 0.01458478 0.01477385
|
|
0.01454568 0.01460361 0.03366661 0.03188801]
|
|
|
|
mean value: 0.02232315540313721
|
|
|
|
key: score_time
|
|
value: [0.01229572 0.01209068 0.01755023 0.02199912 0.01208687 0.01216507
|
|
0.01210165 0.01222658 0.02346396 0.02577353]
|
|
|
|
mean value: 0.016175341606140137
|
|
|
|
key: test_mcc
|
|
value: [0.93541435 0.73994007 0.86190476 1. 0.93333333 0.65714286
|
|
0.81167945 0.7952381 0.79426746 0.93302503]
|
|
|
|
mean value: 0.8461945416676242
|
|
|
|
key: train_mcc
|
|
value: [0.94791916 0.96253342 0.9553594 0.94810134 0.9553594 0.9553594
|
|
0.95537456 0.95537456 0.95537456 0.94812183]
|
|
|
|
mean value: 0.9538877616323723
|
|
|
|
key: test_accuracy
|
|
value: [0.96666667 0.86666667 0.93103448 1. 0.96551724 0.82758621
|
|
0.89655172 0.89655172 0.89655172 0.96551724]
|
|
|
|
mean value: 0.921264367816092
|
|
|
|
key: train_accuracy
|
|
value: [0.97328244 0.98091603 0.97718631 0.97338403 0.97718631 0.97718631
|
|
0.97718631 0.97718631 0.97718631 0.97338403]
|
|
|
|
mean value: 0.9764084404841378
|
|
|
|
key: test_fscore
|
|
value: [0.96774194 0.875 0.92857143 1. 0.96551724 0.82758621
|
|
0.88888889 0.89655172 0.90322581 0.96774194]
|
|
|
|
mean value: 0.9220825167293466
|
|
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_sl.py:168: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rus_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_sl.py:171: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rus_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
key: train_fscore
|
|
value: [0.9739777 0.98127341 0.97777778 0.97416974 0.97777778 0.97777778
|
|
0.97761194 0.97761194 0.97761194 0.9739777 ]
|
|
|
|
mean value: 0.9769567694500546
|
|
|
|
key: test_precision
|
|
value: [0.9375 0.82352941 0.92857143 1. 0.93333333 0.8
|
|
1. 0.92857143 0.875 0.9375 ]
|
|
|
|
mean value: 0.9164005602240897
|
|
|
|
key: train_precision
|
|
value: [0.94927536 0.96323529 0.95652174 0.94964029 0.95652174 0.95652174
|
|
0.95620438 0.95620438 0.95620438 0.94927536]
|
|
|
|
mean value: 0.9549604662602549
|
|
|
|
key: test_recall
|
|
value: [1. 0.93333333 0.92857143 1. 1. 0.85714286
|
|
0.8 0.86666667 0.93333333 1. ]
|
|
|
|
mean value: 0.9319047619047619
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96666667 0.86666667 0.93095238 1. 0.96666667 0.82857143
|
|
0.9 0.89761905 0.8952381 0.96428571]
|
|
|
|
mean value: 0.9216666666666667
|
|
|
|
key: train_roc_auc
|
|
value: [0.97328244 0.98091603 0.97709924 0.97328244 0.97709924 0.97709924
|
|
0.97727273 0.97727273 0.97727273 0.97348485]
|
|
|
|
mean value: 0.9764081656257229
|
|
|
|
key: test_jcc
|
|
value: [0.9375 0.77777778 0.86666667 1. 0.93333333 0.70588235
|
|
0.8 0.8125 0.82352941 0.9375 ]
|
|
|
|
mean value: 0.859468954248366
|
|
|
|
key: train_jcc
|
|
value: [0.94927536 0.96323529 0.95652174 0.94964029 0.95652174 0.95652174
|
|
0.95620438 0.95620438 0.95620438 0.94927536]
|
|
|
|
mean value: 0.9549604662602549
|
|
|
|
MCC on Blind test: 0.89
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.128407 0.19808292 0.22386646 0.24572563 0.35035658 0.2539382
|
|
0.24668813 0.2884469 0.25687814 0.14638782]
|
|
|
|
mean value: 0.2338777780532837
|
|
|
|
key: score_time
|
|
value: [0.01221538 0.01226711 0.02225232 0.02302647 0.02310872 0.02319145
|
|
0.02282047 0.02049422 0.02153277 0.01216483]
|
|
|
|
mean value: 0.019307374954223633
|
|
|
|
key: test_mcc
|
|
value: [0.93541435 0.73994007 0.86190476 1. 0.93333333 0.65714286
|
|
0.81167945 0.7952381 0.79426746 0.93302503]
|
|
|
|
mean value: 0.8461945416676242
|
|
|
|
key: train_mcc
|
|
value: [0.94791916 0.96253342 0.9553594 0.94810134 0.9553594 0.9553594
|
|
0.95537456 0.95537456 0.95537456 0.94812183]
|
|
|
|
mean value: 0.9538877616323723
|
|
|
|
key: test_accuracy
|
|
value: [0.96666667 0.86666667 0.93103448 1. 0.96551724 0.82758621
|
|
0.89655172 0.89655172 0.89655172 0.96551724]
|
|
|
|
mean value: 0.921264367816092
|
|
|
|
key: train_accuracy
|
|
value: [0.97328244 0.98091603 0.97718631 0.97338403 0.97718631 0.97718631
|
|
0.97718631 0.97718631 0.97718631 0.97338403]
|
|
|
|
mean value: 0.9764084404841378
|
|
|
|
key: test_fscore
|
|
value: [0.96774194 0.875 0.92857143 1. 0.96551724 0.82758621
|
|
0.88888889 0.89655172 0.90322581 0.96774194]
|
|
|
|
mean value: 0.9220825167293466
|
|
|
|
key: train_fscore
|
|
value: [0.9739777 0.98127341 0.97777778 0.97416974 0.97777778 0.97777778
|
|
0.97761194 0.97761194 0.97761194 0.9739777 ]
|
|
|
|
mean value: 0.9769567694500546
|
|
|
|
key: test_precision
|
|
value: [0.9375 0.82352941 0.92857143 1. 0.93333333 0.8
|
|
1. 0.92857143 0.875 0.9375 ]
|
|
|
|
mean value: 0.9164005602240897
|
|
|
|
key: train_precision
|
|
value: [0.94927536 0.96323529 0.95652174 0.94964029 0.95652174 0.95652174
|
|
0.95620438 0.95620438 0.95620438 0.94927536]
|
|
|
|
mean value: 0.9549604662602549
|
|
|
|
key: test_recall
|
|
value: [1. 0.93333333 0.92857143 1. 1. 0.85714286
|
|
0.8 0.86666667 0.93333333 1. ]
|
|
|
|
mean value: 0.9319047619047619
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96666667 0.86666667 0.93095238 1. 0.96666667 0.82857143
|
|
0.9 0.89761905 0.8952381 0.96428571]
|
|
|
|
mean value: 0.9216666666666667
|
|
|
|
key: train_roc_auc
|
|
value: [0.97328244 0.98091603 0.97709924 0.97328244 0.97709924 0.97709924
|
|
0.97727273 0.97727273 0.97727273 0.97348485]
|
|
|
|
mean value: 0.9764081656257229
|
|
|
|
key: test_jcc
|
|
value: [0.9375 0.77777778 0.86666667 1. 0.93333333 0.70588235
|
|
0.8 0.8125 0.82352941 0.9375 ]
|
|
|
|
mean value: 0.859468954248366
|
|
|
|
key: train_jcc
|
|
value: [0.94927536 0.96323529 0.95652174 0.94964029 0.95652174 0.95652174
|
|
0.95620438 0.95620438 0.95620438 0.94927536]
|
|
|
|
mean value: 0.9549604662602549
|
|
|
|
MCC on Blind test: 0.89
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02885795 0.04506516 0.03932476 0.03861547 0.06617665 0.04848981
|
|
0.07225323 0.03943682 0.03743815 0.03896904]
|
|
|
|
mean value: 0.04546270370483398
|
|
|
|
key: score_time
|
|
value: [0.01223993 0.02304626 0.01439619 0.01436543 0.02252102 0.01312566
|
|
0.02620292 0.0147469 0.01469874 0.01533699]
|
|
|
|
mean value: 0.017068004608154295
|
|
|
|
key: test_mcc
|
|
value: [0.8951918 0.86189955 0.75808552 0.82512315 0.7589669 0.75808552
|
|
0.82512315 0.85960591 0.8951918 0.85960591]
|
|
|
|
mean value: 0.8296879220906809
|
|
|
|
key: train_mcc
|
|
value: [0.90270158 0.92982429 0.9259873 0.88314434 0.90686795 0.90687923
|
|
0.91033728 0.90687923 0.9299395 0.93765105]
|
|
|
|
mean value: 0.9140211735228061
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.92982456 0.87719298 0.9122807 0.87719298 0.87719298
|
|
0.9122807 0.92982456 0.94736842 0.92982456]
|
|
|
|
mean value: 0.9140350877192982
|
|
|
|
key: train_accuracy
|
|
value: [0.95126706 0.96491228 0.96296296 0.94152047 0.95321637 0.95321637
|
|
0.95516569 0.95321637 0.96491228 0.96881092]
|
|
|
|
mean value: 0.9569200779727095
|
|
|
|
key: test_fscore
|
|
value: [0.94545455 0.93103448 0.86792453 0.9122807 0.88135593 0.8852459
|
|
0.9122807 0.93103448 0.94915254 0.93103448]
|
|
|
|
mean value: 0.9146798301756682
|
|
|
|
key: train_fscore
|
|
value: [0.95183044 0.96498054 0.96324952 0.94208494 0.95402299 0.95384615
|
|
0.95499022 0.95384615 0.96511628 0.9688716 ]
|
|
|
|
mean value: 0.9572838832295703
|
|
|
|
key: test_precision
|
|
value: [0.96296296 0.9 0.92 0.89655172 0.83870968 0.84375
|
|
0.92857143 0.93103448 0.93333333 0.93103448]
|
|
|
|
mean value: 0.9085948091942252
|
|
|
|
key: train_precision
|
|
value: [0.94274809 0.96498054 0.95769231 0.9348659 0.93962264 0.93939394
|
|
0.95686275 0.93939394 0.95769231 0.96511628]
|
|
|
|
mean value: 0.9498368696583012
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.96428571 0.82142857 0.92857143 0.92857143 0.93103448
|
|
0.89655172 0.93103448 0.96551724 0.93103448]
|
|
|
|
mean value: 0.9226600985221675
|
|
|
|
key: train_recall
|
|
value: [0.96108949 0.96498054 0.9688716 0.94941634 0.9688716 0.96875
|
|
0.953125 0.96875 0.97265625 0.97265625]
|
|
|
|
mean value: 0.9649167071984436
|
|
|
|
key: test_roc_auc
|
|
value: [0.94704433 0.93041872 0.87623153 0.91256158 0.87807882 0.87623153
|
|
0.91256158 0.92980296 0.94704433 0.92980296]
|
|
|
|
mean value: 0.9139778325123153
|
|
|
|
key: train_roc_auc
|
|
value: [0.95124787 0.96491215 0.96295142 0.94150505 0.9531858 0.9532466
|
|
0.95516172 0.9532466 0.96492735 0.9688184 ]
|
|
|
|
mean value: 0.9569202942607005
|
|
|
|
key: test_jcc
|
|
value: [0.89655172 0.87096774 0.76666667 0.83870968 0.78787879 0.79411765
|
|
0.83870968 0.87096774 0.90322581 0.87096774]
|
|
|
|
mean value: 0.8438763212838983
|
|
|
|
key: train_jcc
|
|
value: [0.90808824 0.93233083 0.92910448 0.89051095 0.91208791 0.91176471
|
|
0.91385768 0.91176471 0.93258427 0.93962264]
|
|
|
|
mean value: 0.9181716401806431
|
|
|
|
MCC on Blind test: 0.89
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegressionCV(random_state=42))])
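The repr above describes a `ColumnTransformer` that applies `MinMaxScaler` to the numerical columns and `OneHotEncoder` to the categorical ones, with the remaining columns passed through, followed by the model. A minimal sketch of how such a pipeline could be assembled follows. The toy data frame, its values, and the category labels are made up for illustration (only the column names are borrowed from the repr), and `LogisticRegression` from the model list stands in for the `model` step.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Toy data (assumption): two numerical columns and one categorical column.
df = pd.DataFrame({
    "ligand_distance": [3.2, 8.1, 12.5, 5.0, 9.9, 2.4, 7.7, 11.0],
    "ddg_foldx": [-0.5, 1.2, 0.3, 2.1, -1.0, 0.8, 1.9, -0.2],
    "ss_class": ["helix", "strand", "loop", "helix",
                 "loop", "strand", "helix", "loop"],
})
y = [0, 1, 0, 1, 0, 1, 1, 0]

numerical_cols = ["ligand_distance", "ddg_foldx"]
categorical_cols = ["ss_class"]

# Scale numerical features to [0, 1]; one-hot encode categorical features.
prep = ColumnTransformer(
    remainder="passthrough",
    transformers=[
        ("num", MinMaxScaler(), numerical_cols),
        ("cat", OneHotEncoder(), categorical_cols),
    ],
)

pipe = Pipeline(steps=[
    ("prep", prep),
    ("model", LogisticRegression(random_state=42)),
])
pipe.fit(df, y)

# 2 scaled numerical columns + 3 one-hot columns (helix, loop, strand).
transformed = pipe.named_steps["prep"].transform(df)
print(transformed.shape)  # (8, 5)
```

Keeping the preprocessing inside the pipeline means the scaler and encoder are refitted on each training fold during cross-validation, so no information from the held-out fold leaks into the transform.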
key: fit_time
value: [0.90133214 0.98307157 0.94539571 0.88999653 1.07535005 0.92793036
1.03368664 0.91905904 0.98550344 0.9788034 ]

mean value: 0.964012885093689

key: score_time
value: [0.01478505 0.01508641 0.01475453 0.01491857 0.01558185 0.0156281
0.01624322 0.01675773 0.01718378 0.01678681]

mean value: 0.01577260494232178

key: test_mcc
value: [0.86789789 0.9321832 0.8615634 0.93202124 0.8951918 0.96551724
0.82512315 0.89988258 1. 0.85960591]

mean value: 0.9038986422320657

key: train_mcc
value: [0.98831147 0.9922027 1. 0.98443509 0.98831147 1.
0.98831165 1. 0.98443556 0.99610889]

mean value: 0.9922116831378749

key: test_accuracy
value: [0.92982456 0.96491228 0.92982456 0.96491228 0.94736842 0.98245614
0.9122807 0.94736842 1. 0.92982456]

mean value: 0.9508771929824561

key: train_accuracy
value: [0.99415205 0.99610136 1. 0.99220273 0.99415205 1.
0.99415205 1. 0.99220273 0.99805068]

mean value: 0.9961013645224172

key: test_fscore
value: [0.92307692 0.96551724 0.92592593 0.96296296 0.94545455 0.98245614
0.9122807 0.94545455 1. 0.93103448]

mean value: 0.9494163469118098

key: train_fscore
value: [0.99417476 0.99610895 1. 0.99224806 0.99417476 1.
0.99415205 1. 0.9922179 0.99804305]

mean value: 0.9961119524448837

key: test_precision
value: [1. 0.93333333 0.96153846 1. 0.96296296 1.
0.92857143 1. 1. 0.93103448]

mean value: 0.9717440669164807

key: train_precision
value: [0.99224806 0.99610895 1. 0.98841699 0.99224806 1.
0.9922179 1. 0.98837209 1. ]

mean value: 0.9949612053720279

key: test_recall
value: [0.85714286 1. 0.89285714 0.92857143 0.92857143 0.96551724
0.89655172 0.89655172 1. 0.93103448]

mean value: 0.929679802955665

key: train_recall
value: [0.99610895 0.99610895 1. 0.99610895 0.99610895 1.
0.99609375 1. 0.99609375 0.99609375]

mean value: 0.9972717047665369

key: test_roc_auc
value: [0.92857143 0.96551724 0.92918719 0.96428571 0.94704433 0.98275862
0.91256158 0.94827586 1. 0.92980296]

mean value: 0.9508004926108374

key: train_roc_auc
value: [0.99414822 0.99610135 1. 0.9921951 0.99414822 1.
0.99415582 1. 0.9922103 0.99804688]

mean value: 0.996100589737354

key: test_jcc
value: [0.85714286 0.93333333 0.86206897 0.92857143 0.89655172 0.96551724
0.83870968 0.89655172 1. 0.87096774]

mean value: 0.9049414693574872

key: train_jcc
value: [0.98841699 0.99224806 1. 0.98461538 0.98841699 1.
0.98837209 1. 0.98455598 0.99609375]

mean value: 0.9922719251044105

MCC on Blind test: 0.77

Accuracy on Blind test: 0.89
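The `fit_time`, `score_time`, `test_*` and `train_*` keys above match what scikit-learn's `cross_validate` returns when it is given a dictionary of scorers and `return_train_score=True`, and each array holds ten per-fold values. A hedged sketch of how output in this shape could be produced is below. The synthetic data, the scorer definitions, the 10-fold `StratifiedKFold`, and the 70/30 blind split are assumptions for illustration and are not confirmed by the log.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, jaccard_score,
                             make_scorer, matthews_corrcoef,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import (StratifiedKFold, cross_validate,
                                     train_test_split)
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the real data (assumption).
X, y = make_classification(n_samples=400, n_features=20, random_state=42)
X_train, X_blind, y_train, y_blind = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

# Scorer names chosen to reproduce the key suffixes seen in the log.
scoring = {
    "mcc": make_scorer(matthews_corrcoef),
    "accuracy": make_scorer(accuracy_score),
    "fscore": make_scorer(f1_score),
    "precision": make_scorer(precision_score),
    "recall": make_scorer(recall_score),
    "roc_auc": make_scorer(roc_auc_score),
    "jcc": make_scorer(jaccard_score),
}

pipe = Pipeline(steps=[
    ("prep", MinMaxScaler()),
    ("model", LogisticRegression(max_iter=1000, random_state=42)),
])

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_validate(pipe, X_train, y_train, cv=skf,
                        scoring=scoring, return_train_score=True)

# Print per-fold values and their mean, one metric at a time.
for key, value in scores.items():
    print("key:", key)
    print("value:", value)
    print("mean value:", np.mean(value))

# Refit on the full training split, then score the held-out blind set.
pipe.fit(X_train, y_train)
y_pred = pipe.predict(X_blind)
print("MCC on Blind test:", round(matthews_corrcoef(y_blind, y_pred), 2))
print("Accuracy on Blind test:", round(accuracy_score(y_blind, y_pred), 2))
```

Comparing the mean `train_mcc` with the mean `test_mcc`, and both with the blind-test MCC, gives a quick read on how much a model overfits.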
Model_name: Gaussian NB
Model func: GaussianNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', GaussianNB())])

key: fit_time
value: [0.01561379 0.01096916 0.01066613 0.01034665 0.01063323 0.01133728
0.01132131 0.01226974 0.01155472 0.01094699]

mean value: 0.011565899848937989

key: score_time
value: [0.01313066 0.00961709 0.00921512 0.0092473 0.00938988 0.01334214
0.0097208 0.00976133 0.00919127 0.00969601]

mean value: 0.01023116111755371

key: test_mcc
value: [0.40394089 0.57536175 0.55091314 0.57881773 0.65104858 0.57073542
0.55091314 0.61589458 0.58358651 0.47348988]

mean value: 0.5554701616909862

key: train_mcc
value: [0.59070488 0.60335508 0.59951056 0.64777118 0.67228322 0.61705269
0.65940526 0.63436328 0.63208504 0.59569245]

mean value: 0.625222364018471

key: test_accuracy
value: [0.70175439 0.77192982 0.77192982 0.78947368 0.8245614 0.77192982
0.77192982 0.78947368 0.78947368 0.73684211]

mean value: 0.7719298245614035

key: train_accuracy
value: [0.79337232 0.79922027 0.79922027 0.82261209 0.83430799 0.80506823
0.82846004 0.81481481 0.8128655 0.79532164]

mean value: 0.8105263157894737

key: test_fscore
value: [0.70175439 0.8 0.74509804 0.78571429 0.82758621 0.80597015
0.79365079 0.82352941 0.80645161 0.74576271]

mean value: 0.78355175972283

key: train_fscore
value: [0.80514706 0.81170018 0.79358717 0.83054004 0.84288355 0.81818182
0.83520599 0.82504604 0.82481752 0.80733945]

mean value: 0.819444882121119

key: test_precision
value: [0.68965517 0.7027027 0.82608696 0.78571429 0.8 0.71052632
0.73529412 0.71794872 0.75757576 0.73333333]

mean value: 0.7458837359646862

key: train_precision
value: [0.7630662 0.76551724 0.81818182 0.79642857 0.8028169 0.76530612
0.80215827 0.7804878 0.7739726 0.76124567]

mean value: 0.7829181212677276

key: test_recall
value: [0.71428571 0.92857143 0.67857143 0.78571429 0.85714286 0.93103448
0.86206897 0.96551724 0.86206897 0.75862069]

mean value: 0.8343596059113301

key: train_recall
value: [0.85214008 0.86381323 0.77042802 0.86770428 0.88715953 0.87890625
0.87109375 0.875 0.8828125 0.859375 ]

mean value: 0.860843263618677

key: test_roc_auc
value: [0.70197044 0.77463054 0.7703202 0.78940887 0.82512315 0.76908867
0.7703202 0.78633005 0.78817734 0.7364532 ]

mean value: 0.7711822660098522

key: train_roc_auc
value: [0.79325754 0.79909411 0.79927651 0.82252402 0.83420477 0.80521188
0.82854298 0.81493191 0.81300158 0.79544625]

mean value: 0.8105491549124514

key: test_jcc
value: [0.54054054 0.66666667 0.59375 0.64705882 0.70588235 0.675
0.65789474 0.7 0.67567568 0.59459459]

mean value: 0.6457063390790171

key: train_jcc
value: [0.67384615 0.68307692 0.65780731 0.71019108 0.7284345 0.69230769
0.7170418 0.70219436 0.70186335 0.67692308]

mean value: 0.6943686254765951

MCC on Blind test: 0.72

Accuracy on Blind test: 0.86
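Because every model block repeats the same `key:` / `value:` / `mean value:` layout, the per-fold numbers can be pulled back out of the log with a small parser. The sketch below is a hypothetical helper (`parse_cv_log` is not part of the original scripts); it assumes each `value:` array ends with a closing bracket and may wrap over several lines, and it ignores lines that contain only `|`. The sample text is the Gaussian NB `test_mcc` entry from above.

```python
import re

# Matches plain and scientific-notation floats such as 0.57, 1., 1e-05.
NUMBER = re.compile(r"[-+]?\d*\.?\d+(?:[eE][-+]?\d+)?")


def parse_cv_log(text):
    """Collect per-fold values and the reported mean for each 'key:' entry."""
    results = {}
    key = None
    collecting = False  # True while inside a (possibly wrapped) value array
    for raw in text.splitlines():
        line = raw.strip().strip("|").strip()
        if not line:
            continue
        if line.startswith("key:"):
            key = line.split(":", 1)[1].strip()
            results[key] = {"values": [], "mean": None}
            collecting = False
        elif line.startswith("mean value:") and key is not None:
            results[key]["mean"] = float(line.split(":", 1)[1])
            collecting = False
        elif line.startswith("value:") and key is not None:
            collecting = True
            line = line.split(":", 1)[1]
        if collecting:
            results[key]["values"] += [float(t) for t in NUMBER.findall(line)]
            if "]" in line:
                collecting = False
    return results


sample = """
key: test_mcc
value: [0.40394089 0.57536175 0.55091314 0.57881773 0.65104858 0.57073542
0.55091314 0.61589458 0.58358651 0.47348988]

mean value: 0.5554701616909862
"""

parsed = parse_cv_log(sample)
print(len(parsed["test_mcc"]["values"]))  # 10
print(parsed["test_mcc"]["mean"])         # 0.5554701616909862
```

Recomputing the mean from the parsed values and comparing it with the logged `mean value` is a cheap consistency check when the log has passed through several tools.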
Model_name: Naive Bayes
Model func: BernoulliNB()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', BernoulliNB())])

key: fit_time
value: [0.011518 0.01054788 0.01172829 0.01120687 0.0108285 0.01070571
0.0116632 0.01078939 0.01071167 0.01073694]

mean value: 0.011043643951416016

key: score_time
value: [0.00911999 0.00915504 0.00916815 0.00972795 0.00929093 0.00973797
0.00984049 0.00906992 0.00910163 0.00909853]

mean value: 0.009331059455871583

key: test_mcc
value: [0.65104858 0.58076493 0.7257422 0.65018988 0.61453202 0.54377353
0.44019762 0.43881637 0.61453202 0.68736396]

mean value: 0.594696112208837

key: train_mcc
value: [0.653245 0.66081987 0.64199455 0.65105088 0.66139765 0.67260512
0.66473119 0.64928315 0.64523042 0.6456446 ]

mean value: 0.6546002422198144

key: test_accuracy
value: [0.8245614 0.78947368 0.85964912 0.8245614 0.80701754 0.77192982
0.71929825 0.71929825 0.80701754 0.84210526]

mean value: 0.7964912280701755

key: train_accuracy
value: [0.82651072 0.83040936 0.82066277 0.8245614 0.83040936 0.83625731
0.83235867 0.8245614 0.82261209 0.82261209]

mean value: 0.8270955165692008

key: test_fscore
value: [0.82758621 0.79310345 0.84615385 0.81481481 0.80701754 0.77966102
0.71428571 0.73333333 0.80701754 0.85245902]

mean value: 0.7975432484822016

key: train_fscore
value: [0.82917466 0.83106796 0.82509506 0.83146067 0.83428571 0.8372093
0.83137255 0.82213439 0.82261209 0.82533589]

mean value: 0.8289748287731117

key: test_precision
value: [0.8 0.76666667 0.91666667 0.84615385 0.79310345 0.76666667
0.74074074 0.70967742 0.82142857 0.8125 ]

mean value: 0.7973604025953859

key: train_precision
value: [0.81818182 0.82945736 0.80669145 0.80144404 0.81716418 0.83076923
0.83464567 0.832 0.82101167 0.81132075]

mean value: 0.8202686182692108

key: test_recall
value: [0.85714286 0.82142857 0.78571429 0.78571429 0.82142857 0.79310345
0.68965517 0.75862069 0.79310345 0.89655172]

mean value: 0.8002463054187192

key: train_recall
value: [0.84046693 0.83268482 0.84435798 0.86381323 0.85214008 0.84375
0.828125 0.8125 0.82421875 0.83984375]

mean value: 0.8381900535019455

key: test_roc_auc
value: [0.82512315 0.79002463 0.85837438 0.82389163 0.80726601 0.77155172
0.71982759 0.71859606 0.80726601 0.841133 ]

mean value: 0.7963054187192118

key: train_roc_auc
value: [0.82648346 0.83040491 0.82061649 0.82448474 0.83036691 0.83627189
0.83235044 0.82453794 0.82261521 0.82264561]

mean value: 0.8270777602140078

key: test_jcc
value: [0.70588235 0.65714286 0.73333333 0.6875 0.67647059 0.63888889
0.55555556 0.57894737 0.67647059 0.74285714]

mean value: 0.6653048675610596

key: train_jcc
value: [0.70819672 0.71096346 0.70226537 0.71153846 0.71568627 0.72
0.7114094 0.69798658 0.6986755 0.70261438]

mean value: 0.7079336133605598

MCC on Blind test: 0.47

Accuracy on Blind test: 0.72

Model_name: K-Nearest Neighbors
Model func: KNeighborsClassifier()
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', KNeighborsClassifier())])

key: fit_time
value: [0.01020646 0.01080585 0.01096845 0.0110321 0.01112604 0.01121211
0.01174545 0.01151824 0.01136279 0.01082182]

mean value: 0.011079931259155273

key: score_time
value: [0.01764226 0.01442289 0.01384616 0.01482534 0.01364756 0.01321912
0.01372838 0.01377702 0.01924062 0.01841283]

mean value: 0.015276217460632324

key: test_mcc
value: [0.43842365 0.47348988 0.47348988 0.51250867 0.47519927 0.58562417
0.38672631 0.33292257 0.44418104 0.37345948]

mean value: 0.4496024911170981

key: train_mcc
value: [0.67328032 0.68809535 0.68846183 0.68084694 0.67723397 0.68271507
0.69060158 0.68277413 0.674131 0.69135529]

mean value: 0.6829495483259833

key: test_accuracy
value: [0.71929825 0.73684211 0.73684211 0.75438596 0.73684211 0.78947368
0.68421053 0.66666667 0.71929825 0.68421053]

mean value: 0.7228070175438597

key: train_accuracy
value: [0.83625731 0.84210526 0.84405458 0.83820663 0.83625731 0.83820663
0.84405458 0.84015595 0.83625731 0.84210526]

mean value: 0.839766081871345

key: test_fscore
value: [0.71428571 0.72727273 0.72727273 0.73076923 0.71698113 0.77777778
0.64 0.6779661 0.7037037 0.66666667]

mean value: 0.7082695781518935

key: train_fscore
value: [0.83266932 0.83367556 0.84189723 0.82886598 0.82644628 0.82599581
0.83673469 0.83265306 0.82995951 0.82947368]

mean value: 0.8318371141576139

key: test_precision
value: [0.71428571 0.74074074 0.74074074 0.79166667 0.76 0.84
0.76190476 0.66666667 0.76 0.72 ]

mean value: 0.7496005291005291

key: train_precision
value: [0.85306122 0.8826087 0.85542169 0.88157895 0.88105727 0.89140271
0.87606838 0.87179487 0.86134454 0.89954338]

mean value: 0.8753881702585781

key: test_recall
value: [0.71428571 0.71428571 0.71428571 0.67857143 0.67857143 0.72413793
0.55172414 0.68965517 0.65517241 0.62068966]

mean value: 0.6741379310344828

key: train_recall
value: [0.81322957 0.78988327 0.82879377 0.78210117 0.77821012 0.76953125
0.80078125 0.796875 0.80078125 0.76953125]

mean value: 0.7929717898832684

key: test_roc_auc
value: [0.71921182 0.7364532 0.7364532 0.75307882 0.73583744 0.79064039
0.68657635 0.66625616 0.72044335 0.68534483]

mean value: 0.7230295566502463

key: train_roc_auc
value: [0.83630229 0.84220726 0.84408439 0.83831621 0.83637068 0.83807302
0.84397039 0.84007174 0.83618829 0.84196407]

mean value: 0.8397548334143968

key: test_jcc
value: [0.55555556 0.57142857 0.57142857 0.57575758 0.55882353 0.63636364
0.47058824 0.51282051 0.54285714 0.5 ]

mean value: 0.5495623330917448

key: train_jcc
value: [0.71331058 0.71478873 0.72696246 0.70774648 0.70422535 0.70357143
0.71929825 0.71328671 0.70934256 0.70863309]

mean value: 0.7121165642473934

MCC on Blind test: 0.12

Accuracy on Blind test: 0.56

Model_name: SVM
Model func: SVC(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_ppi2_affinity', 'interface_dist',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=168)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SVC(random_state=42))])

key: fit_time
value: [0.02768993 0.02304482 0.02387285 0.02293777 0.02385044 0.02296257
0.0241065 0.02292609 0.02482152 0.0247345 ]

mean value: 0.024094700813293457

key: score_time
value: [0.01283813 0.0125947 0.01315451 0.01290488 0.01273608 0.01245785
0.01373458 0.01328564 0.0125618 0.01255894]

mean value: 0.012882709503173828

key: test_mcc
value: [0.72706729 0.70694956 0.68472906 0.75462449 0.68850906 0.79682005
0.68434084 0.7257422 0.80685836 0.68736396]

mean value: 0.7263004883826937

key: train_mcc
value: [0.80629722 0.81344801 0.7991838 0.8284734 0.82487701 0.81647956
0.82136234 0.82736541 0.82494487 0.81710876]

mean value: 0.8179540383128404

key: test_accuracy
value: [0.85964912 0.84210526 0.84210526 0.87719298 0.84210526 0.89473684
0.84210526 0.85964912 0.89473684 0.84210526]

mean value: 0.8596491228070176

key: train_accuracy
value: [0.9005848 0.90448343 0.89668616 0.9122807 0.91033138 0.90643275
0.90838207 0.9122807 0.91033138 0.90643275]

mean value: 0.90682261208577

key: test_fscore
value: [0.86666667 0.85714286 0.84210526 0.87272727 0.84745763 0.90322581
0.84745763 0.87096774 0.90625 0.85245902]

mean value: 0.8666459878712518

key: train_fscore
value: [0.90607735 0.90942699 0.90275229 0.91651206 0.91481481 0.91044776
0.91280148 0.91557223 0.91449814 0.91078067]

mean value: 0.9113683791367706

key: test_precision
value: [0.8125 0.77142857 0.82758621 0.88888889 0.80645161 0.84848485
0.83333333 0.81818182 0.82857143 0.8125 ]

mean value: 0.8247926708688667

key: train_precision
value: [0.86013986 0.86619718 0.85416667 0.87588652 0.87279152 0.87142857
0.86925795 0.88086643 0.87234043 0.86879433]

mean value: 0.8691869453886878

key: test_recall
value: [0.92857143 0.96428571 0.85714286 0.85714286 0.89285714 0.96551724
0.86206897 0.93103448 1. 0.89655172]

mean value: 0.9155172413793103

key: train_recall
value: [0.95719844 0.95719844 0.95719844 0.96108949 0.96108949 0.953125
0.9609375 0.953125 0.9609375 0.95703125]

mean value: 0.9578930569066149

key: test_roc_auc
value: [0.86083744 0.84421182 0.84236453 0.87684729 0.8429803 0.89347291
0.84174877 0.85837438 0.89285714 0.841133 ]

mean value: 0.8594827586206897

key: train_roc_auc
value: [0.90047422 0.90438047 0.89656797 0.91218537 0.91023225 0.90652359
0.90848431 0.91236017 0.91042984 0.90653119]

mean value: 0.9068169382295721

key: test_jcc
value: [0.76470588 0.75 0.72727273 0.77419355 0.73529412 0.82352941
0.73529412 0.77142857 0.82857143 0.74285714]

mean value: 0.7653146947928732

key: train_jcc
value: [0.82828283 0.83389831 0.82274247 0.84589041 0.84300341 0.83561644
0.83959044 0.84429066 0.84246575 0.83617747]

mean value: 0.8371958199521154

MCC on Blind test: 0.82

Accuracy on Blind test: 0.92

Model_name: MLP
Model func: MLPClassifier(max_iter=500, random_state=42)
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
warnings.warn(
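The warning above reports that the MLP's stochastic optimizer used all 500 iterations without converging. Two common responses are a larger `max_iter` or early stopping on a held-out validation fraction. A minimal sketch of the second option follows. The synthetic data and the `early_stopping=True` and `n_iter_no_change=10` settings are illustrative assumptions and are not the configuration used in the logged run.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the real feature matrix (assumption).
X, y = make_classification(n_samples=300, n_features=20, random_state=42)

# Stop training once the validation score has not improved for
# n_iter_no_change consecutive epochs, rather than running to max_iter.
mlp = MLPClassifier(max_iter=500, early_stopping=True,
                    n_iter_no_change=10, random_state=42)
pipe = Pipeline(steps=[("prep", MinMaxScaler()), ("model", mlp)])

# Record any ConvergenceWarning instead of printing it to stderr.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always", category=ConvergenceWarning)
    pipe.fit(X, y)

epochs_run = pipe.named_steps["model"].n_iter_
n_convergence_warnings = sum(
    issubclass(w.category, ConvergenceWarning) for w in caught
)
print("epochs run:", epochs_run)
print("ConvergenceWarnings raised:", n_convergence_warnings)
```

Early stopping sets aside part of each training fold for validation (10% by default), so on small folds a larger `max_iter` may be the safer choice.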
[('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [2.05961466 2.23099542 2.19438291 2.01152205 2.39882874 2.11831498
|
|
2.21130466 2.09625578 2.12573075 2.13453245]
|
|
|
|
mean value: 2.158148241043091
|
|
|
|
key: score_time
|
|
value: [0.01481295 0.01604438 0.01651335 0.01263881 0.01380038 0.01449418
|
|
0.01461935 0.01480651 0.01514316 0.02394223]
|
|
|
|
mean value: 0.015681529045104982
|
|
|
|
key: test_mcc
|
|
value: [0.8615634 0.92980296 0.96547546 0.92980296 0.8951918 0.96551724
|
|
0.8953202 0.86189955 0.96547546 0.78940887]
|
|
|
|
mean value: 0.9059457880082712
|
|
|
|
key: train_mcc
|
|
value: [0.99610895 0.99610895 1. 1. 1. 1.
|
|
0.99610889 0.99610889 1. 0.99610889]
|
|
|
|
mean value: 0.9980544569999117
|
|
|
|
key: test_accuracy
|
|
value: [0.92982456 0.96491228 0.98245614 0.96491228 0.94736842 0.98245614
|
|
0.94736842 0.92982456 0.98245614 0.89473684]
|
|
|
|
mean value: 0.9526315789473684
|
|
|
|
key: train_accuracy
|
|
value: [0.99805068 0.99805068 1. 1. 1. 1.
|
|
0.99805068 0.99805068 1. 0.99805068]
|
|
|
|
mean value: 0.9990253411306043
|
|
|
|
key: test_fscore
|
|
value: [0.92592593 0.96428571 0.98181818 0.96428571 0.94545455 0.98245614
|
|
0.94736842 0.92857143 0.98305085 0.89655172]
|
|
|
|
mean value: 0.9519768643340577
|
|
|
|
key: train_fscore
|
|
value: [0.99805068 0.99805068 1. 1. 1. 1.
|
|
0.99804305 0.99804305 1. 0.99804305]
|
|
|
|
mean value: 0.9990230523035137
|
|
|
|
key: test_precision
|
|
value: [0.96153846 0.96428571 1. 0.96428571 0.96296296 1.
|
|
0.96428571 0.96296296 0.96666667 0.89655172]
|
|
|
|
mean value: 0.9643539921126127
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.89285714 0.96428571 0.96428571 0.96428571 0.92857143 0.96551724
|
|
0.93103448 0.89655172 1. 0.89655172]
|
|
|
|
mean value: 0.9403940886699508
|
|
|
|
key: train_recall
|
|
value: [0.99610895 0.99610895 1. 1. 1. 1.
|
|
0.99609375 0.99609375 1. 0.99609375]
|
|
|
|
mean value: 0.9980499148832684
|
|
|
|
key: test_roc_auc
|
|
value: [0.92918719 0.96490148 0.98214286 0.96490148 0.94704433 0.98275862
|
|
0.9476601 0.93041872 0.98214286 0.89470443]
|
|
|
|
mean value: 0.9525862068965518
|
|
|
|
key: train_roc_auc
|
|
value: [0.99805447 0.99805447 1. 1. 1. 1.
|
|
0.99804688 0.99804688 1. 0.99804688]
|
|
|
|
mean value: 0.9990249574416342
|
|
|
|
key: test_jcc
|
|
value: [0.86206897 0.93103448 0.96428571 0.93103448 0.89655172 0.96551724
|
|
0.9 0.86666667 0.96666667 0.8125 ]
|
|
|
|
mean value: 0.9096325944170772
|
|
|
|
key: train_jcc
|
|
value: [0.99610895 0.99610895 1. 1. 1. 1.
|
|
0.99609375 0.99609375 1. 0.99609375]
|
|
|
|
mean value: 0.9980499148832684
|
|
|
|
MCC on Blind test: 0.77
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02568269 0.02369571 0.01899219 0.01885223 0.02013445 0.0189333
|
|
0.01874685 0.01869583 0.01888299 0.01789403]
|
|
|
|
mean value: 0.020051026344299318
|
|
|
|
key: score_time
|
|
value: [0.01234221 0.00959468 0.00914192 0.00899601 0.00909853 0.00905037
|
|
0.00905776 0.00910354 0.00909448 0.00924468]
|
|
|
|
mean value: 0.009472417831420898
|
|
|
|
key: test_mcc
|
|
value: [0.92980296 0.96551724 0.96551724 0.96547546 0.8951918 0.96551724
|
|
0.96551724 0.9321832 0.92980296 0.85960591]
|
|
|
|
mean value: 0.937413124669358
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.98245614 0.98245614 0.98245614 0.94736842 0.98245614
|
|
0.98245614 0.96491228 0.96491228 0.92982456]
|
|
|
|
mean value: 0.968421052631579
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.96428571 0.98245614 0.98245614 0.98181818 0.94545455 0.98245614
|
|
0.98245614 0.96428571 0.96551724 0.93103448]
|
|
|
|
mean value: 0.9682220441385595
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96428571 0.96551724 0.96551724 1. 0.96296296 1.
|
|
1. 1. 0.96551724 0.93103448]
|
|
|
|
mean value: 0.9754834884145229
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96428571 1. 1. 0.96428571 0.92857143 0.96551724
|
|
0.96551724 0.93103448 0.96551724 0.93103448]
|
|
|
|
mean value: 0.961576354679803
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96490148 0.98275862 0.98275862 0.98214286 0.94704433 0.98275862
|
|
0.98275862 0.96551724 0.96490148 0.92980296]
|
|
|
|
mean value: 0.9685344827586208
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.93103448 0.96551724 0.96551724 0.96428571 0.89655172 0.96551724
|
|
0.96551724 0.93103448 0.93333333 0.87096774]
|
|
|
|
mean value: 0.9389276444726945
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.84
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.12201738 0.12300944 0.12244582 0.12890625 0.13370728 0.12423348
|
|
0.12228918 0.12689996 0.13040471 0.1344378 ]
|
|
|
|
mean value: 0.12683513164520263
|
|
|
|
key: score_time
|
|
value: [0.01851535 0.01844716 0.01833797 0.02005148 0.01993847 0.01833677
|
|
0.0182426 0.01870298 0.01829839 0.01937294]
|
|
|
|
mean value: 0.0188244104385376
|
|
|
|
key: test_mcc
|
|
value: [0.86189955 0.96551724 0.96551724 0.93202124 0.82490815 0.92980296
|
|
0.92980296 0.8953202 0.93202124 0.79110556]
|
|
|
|
mean value: 0.9027916331150521
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.92982456 0.98245614 0.98245614 0.96491228 0.9122807 0.96491228
|
|
0.96491228 0.94736842 0.96491228 0.89473684]
|
|
|
|
mean value: 0.9508771929824561
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.93103448 0.98245614 0.98245614 0.96296296 0.90909091 0.96551724
|
|
0.96551724 0.94736842 0.96666667 0.9 ]
|
|
|
|
mean value: 0.9513070205992166
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.9 0.96551724 0.96551724 1. 0.92592593 0.96551724
|
|
0.96551724 0.96428571 0.93548387 0.87096774]
|
|
|
|
mean value: 0.9458732218632108
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96428571 1. 1. 0.92857143 0.89285714 0.96551724
|
|
0.96551724 0.93103448 1. 0.93103448]
|
|
|
|
mean value: 0.9578817733990148
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.93041872 0.98275862 0.98275862 0.96428571 0.91194581 0.96490148
|
|
0.96490148 0.9476601 0.96428571 0.89408867]
|
|
|
|
mean value: 0.9508004926108374
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.87096774 0.96551724 0.96551724 0.92857143 0.83333333 0.93333333
|
|
0.93333333 0.9 0.93548387 0.81818182]
|
|
|
|
mean value: 0.9084239342415094
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.7
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01195002 0.01176 0.01078629 0.01067519 0.0106318 0.01081967
|
|
0.01062989 0.01064992 0.01090121 0.01058793]
|
|
|
|
mean value: 0.010939192771911622
|
|
|
|
key: score_time
|
|
value: [0.00949717 0.00912499 0.00906086 0.00918531 0.00918293 0.00911808
|
|
0.00913429 0.00925946 0.00899553 0.00907516]
|
|
|
|
mean value: 0.009163379669189453
|
|
|
|
key: test_mcc
|
|
value: [0.38056438 0.75808552 0.65466436 0.7257422 0.61453202 0.72706729
|
|
0.72242731 0.79778885 0.56277738 0.65104858]
|
|
|
|
mean value: 0.659469790472868
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.68421053 0.87719298 0.8245614 0.85964912 0.80701754 0.85964912
|
|
0.84210526 0.89473684 0.77192982 0.8245614 ]
|
|
|
|
mean value: 0.8245614035087719
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.625 0.86792453 0.80769231 0.84615385 0.80701754 0.85185185
|
|
0.81632653 0.88888889 0.74509804 0.82142857]
|
|
|
|
mean value: 0.8077382108004934
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.75 0.92 0.875 0.91666667 0.79310345 0.92
|
|
1. 0.96 0.86363636 0.85185185]
|
|
|
|
mean value: 0.8850258330430745
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.53571429 0.82142857 0.75 0.78571429 0.82142857 0.79310345
|
|
0.68965517 0.82758621 0.65517241 0.79310345]
|
|
|
|
mean value: 0.7472906403940887
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.68165025 0.87623153 0.82327586 0.85837438 0.80726601 0.86083744
|
|
0.84482759 0.89593596 0.77401478 0.82512315]
|
|
|
|
mean value: 0.8247536945812808
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.45454545 0.76666667 0.67741935 0.73333333 0.67647059 0.74193548
|
|
0.68965517 0.8 0.59375 0.6969697 ]
|
|
|
|
mean value: 0.6830745750873917
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.48
|
|
|
|
Accuracy on Blind test: 0.78
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.81883883 1.82968402 1.83166051 1.8007412 1.78529477 1.97610807
|
|
1.88550544 1.8663559 1.89368105 1.85617685]
|
|
|
|
mean value: 1.854404664039612
|
|
|
|
key: score_time
|
|
value: [0.0972929 0.09529185 0.09653831 0.09236741 0.10142279 0.10048246
|
|
0.09244108 0.09494305 0.10010195 0.09162307]
|
|
|
|
mean value: 0.09625048637390136
|
|
|
|
key: test_mcc
|
|
value: [0.96551724 0.96551724 1. 1. 0.92980296 0.96547546
|
|
0.96547546 0.8953202 0.96547546 0.93202124]
|
|
|
|
mean value: 0.9584605246454952
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.98245614 1. 1. 0.96491228 0.98245614
|
|
0.98245614 0.94736842 0.98245614 0.96491228]
|
|
|
|
mean value: 0.9789473684210526
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98245614 0.98245614 1. 1. 0.96428571 0.98305085
|
|
0.98305085 0.94736842 0.98305085 0.96666667]
|
|
|
|
mean value: 0.9792385625079648
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96551724 0.96551724 1. 1. 0.96428571 0.96666667
|
|
0.96666667 0.96428571 0.96666667 0.93548387]
|
|
|
|
mean value: 0.9695089782297791
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 1. 0.96428571 1.
|
|
1. 0.93103448 1. 1. ]
|
|
|
|
mean value: 0.9895320197044335
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98275862 0.98275862 1. 1. 0.96490148 0.98214286
|
|
0.98214286 0.9476601 0.98214286 0.96428571]
|
|
|
|
mean value: 0.9788793103448277
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96551724 0.96551724 1. 1. 0.93103448 0.96666667
|
|
0.96666667 0.9 0.96666667 0.93548387]
|
|
|
|
mean value: 0.9597552836484984
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.84
|
|
|
|
Accuracy on Blind test: 0.92
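The per-fold scores in the block that ends here can be cross-checked by hand. The sketch below is a worked example, not part of the logged pipeline: the confusion-matrix counts (TP=28, FP=1, FN=0, TN=28) are an assumption inferred from the first fold's logged precision, recall and accuracy on a 57-sample fold, and the function name `fold_metrics` is hypothetical. It also assumes that roc_auc was scored on hard class predictions, in which case it equals balanced accuracy; the logged value is consistent with that.

```python
from math import sqrt

def fold_metrics(tp, fp, fn, tn):
    """Compute the logged metric set from confusion-matrix counts (hypothetical helper)."""
    total = tp + fp + fn + tn
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Denominator of the Matthews correlation coefficient.
    mcc_den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": recall,
        "fscore": 2 * tp / (2 * tp + fp + fn),
        "mcc": (tp * tn - fp * fn) / mcc_den,
        # Balanced accuracy; matches ROC AUC only when scoring hard labels (assumption).
        "roc_auc": (recall + specificity) / 2,
        "jcc": tp / (tp + fp + fn),
    }

# Assumed counts for the first test fold (inferred, not logged).
metrics = fold_metrics(tp=28, fp=1, fn=0, tn=28)
for name, value in metrics.items():
    print(f"test_{name}: {value:.8f}")
```

With these assumed counts the sketch gives 28/29 for precision, MCC and Jaccard, 56/57 for accuracy and F-score, and (1 + 28/29)/2 for roc_auc, which agrees with the first entry of each test_* array above.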
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...05', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.95496798 0.96763968 0.97145748 0.96551633 0.96309352 1.08149958
|
|
0.96972513 0.97088814 1.0209651 1.06677747]
|
|
|
|
mean value: 0.9932530403137207
|
|
|
|
key: score_time
|
|
value: [0.24825764 0.25175595 0.20534968 0.27622366 0.27167201 0.19710779
|
|
0.2121675 0.23378062 0.21063542 0.2486136 ]
|
|
|
|
mean value: 0.23555638790130615
|
|
|
|
key: test_mcc
|
|
value: [0.96551724 0.89988258 1. 0.92980296 0.92980296 0.96547546
|
|
0.93202124 0.8953202 0.96547546 0.8951918 ]
|
|
|
|
mean value: 0.9378489884746699
|
|
|
|
key: train_mcc
|
|
value: [0.96907457 0.97289329 0.97672617 0.96907457 0.98069236 0.97289533
|
|
0.97672758 0.98057426 0.96907736 0.9922027 ]
|
|
|
|
mean value: 0.9759938179304501
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.94736842 1. 0.96491228 0.96491228 0.98245614
|
|
0.96491228 0.94736842 0.98245614 0.94736842]
|
|
|
|
mean value: 0.968421052631579
|
|
|
|
key: train_accuracy
|
|
value: [0.98440546 0.98635478 0.98830409 0.98440546 0.99025341 0.98635478
|
|
0.98830409 0.99025341 0.98440546 0.99610136]
|
|
|
|
mean value: 0.9879142300194932
|
|
|
|
key: test_fscore
|
|
value: [0.98245614 0.94915254 1. 0.96428571 0.96428571 0.98305085
|
|
0.96666667 0.94736842 0.98305085 0.94915254]
|
|
|
|
mean value: 0.9689469436302621
|
|
|
|
key: train_fscore
|
|
value: [0.98461538 0.98651252 0.98841699 0.98461538 0.99036609 0.98646035
|
|
0.98837209 0.99029126 0.98455598 0.99609375]
|
|
|
|
mean value: 0.9880299808242159
|
|
|
|
key: test_precision
|
|
value: [0.96551724 0.90322581 1. 0.96428571 0.96428571 0.96666667
|
|
0.93548387 0.96428571 0.96666667 0.93333333]
|
|
|
|
mean value: 0.9563750728322474
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
|
|
key: train_precision
|
|
value: [0.97338403 0.97709924 0.98084291 0.97338403 0.98091603 0.97701149
|
|
0.98076923 0.98455598 0.97328244 0.99609375]
|
|
|
|
mean value: 0.979733914221565
|
|
|
|
key: test_recall
|
|
value: [1. 1. 1. 0.96428571 0.96428571 1.
|
|
1. 0.93103448 1. 0.96551724]
|
|
|
|
mean value: 0.982512315270936
|
|
|
|
key: train_recall
|
|
value: [0.99610895 0.99610895 0.99610895 0.99610895 1. 0.99609375
|
|
0.99609375 0.99609375 0.99609375 0.99609375]
|
|
|
|
mean value: 0.996490454766537
|
|
|
|
key: test_roc_auc
|
|
value: [0.98275862 0.94827586 1. 0.96490148 0.96490148 0.98214286
|
|
0.96428571 0.9476601 0.98214286 0.94704433]
|
|
|
|
mean value: 0.9684113300492612
|
|
|
|
key: train_roc_auc
|
|
value: [0.9843826 0.98633572 0.98828885 0.9843826 0.99023438 0.98637372
|
|
0.98831925 0.99026477 0.9844282 0.99610135]
|
|
|
|
mean value: 0.9879111442120623
|
|
|
|
key: test_jcc
|
|
value: [0.96551724 0.90322581 1. 0.93103448 0.93103448 0.96666667
|
|
0.93548387 0.9 0.96666667 0.90322581]
|
|
|
|
mean value: 0.9402855024100852
|
|
|
|
key: train_jcc
|
|
value: [0.96969697 0.97338403 0.97709924 0.96969697 0.98091603 0.97328244
|
|
0.97701149 0.98076923 0.96958175 0.9922179 ]
|
|
|
|
mean value: 0.9763656052640073
|
|
|
|
MCC on Blind test: 0.89
|
|
|
|
Accuracy on Blind test: 0.94
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02656937 0.01063704 0.01065159 0.01167321 0.01107359 0.0106473
|
|
0.01130176 0.01063609 0.01054382 0.01125193]
|
|
|
|
mean value: 0.012498569488525391
|
|
|
|
key: score_time
|
|
value: [0.00993228 0.00957012 0.01008916 0.00943708 0.00952315 0.00921941
|
|
0.00988579 0.00899458 0.00932455 0.00942326]
|
|
|
|
mean value: 0.009539937973022461
|
|
|
|
key: test_mcc
|
|
value: [0.65104858 0.58076493 0.7257422 0.65018988 0.61453202 0.54377353
|
|
0.44019762 0.43881637 0.61453202 0.68736396]
|
|
|
|
mean value: 0.594696112208837
|
|
|
|
key: train_mcc
|
|
value: [0.653245 0.66081987 0.64199455 0.65105088 0.66139765 0.67260512
|
|
0.66473119 0.64928315 0.64523042 0.6456446 ]
|
|
|
|
mean value: 0.6546002422198144
|
|
|
|
key: test_accuracy
|
|
value: [0.8245614 0.78947368 0.85964912 0.8245614 0.80701754 0.77192982
|
|
0.71929825 0.71929825 0.80701754 0.84210526]
|
|
|
|
mean value: 0.7964912280701755
|
|
|
|
key: train_accuracy
|
|
value: [0.82651072 0.83040936 0.82066277 0.8245614 0.83040936 0.83625731
|
|
0.83235867 0.8245614 0.82261209 0.82261209]
|
|
|
|
mean value: 0.8270955165692008
|
|
|
|
key: test_fscore
|
|
value: [0.82758621 0.79310345 0.84615385 0.81481481 0.80701754 0.77966102
|
|
0.71428571 0.73333333 0.80701754 0.85245902]
|
|
|
|
mean value: 0.7975432484822016
|
|
|
|
key: train_fscore
|
|
value: [0.82917466 0.83106796 0.82509506 0.83146067 0.83428571 0.8372093
|
|
0.83137255 0.82213439 0.82261209 0.82533589]
|
|
|
|
mean value: 0.8289748287731117
|
|
|
|
key: test_precision
|
|
value: [0.8 0.76666667 0.91666667 0.84615385 0.79310345 0.76666667
|
|
0.74074074 0.70967742 0.82142857 0.8125 ]
|
|
|
|
mean value: 0.7973604025953859
|
|
|
|
key: train_precision
|
|
value: [0.81818182 0.82945736 0.80669145 0.80144404 0.81716418 0.83076923
|
|
0.83464567 0.832 0.82101167 0.81132075]
|
|
|
|
mean value: 0.8202686182692108
|
|
|
|
key: test_recall
|
|
value: [0.85714286 0.82142857 0.78571429 0.78571429 0.82142857 0.79310345
|
|
0.68965517 0.75862069 0.79310345 0.89655172]
|
|
|
|
mean value: 0.8002463054187192
|
|
|
|
key: train_recall
|
|
value: [0.84046693 0.83268482 0.84435798 0.86381323 0.85214008 0.84375
|
|
0.828125 0.8125 0.82421875 0.83984375]
|
|
|
|
mean value: 0.8381900535019455
|
|
|
|
key: test_roc_auc
|
|
value: [0.82512315 0.79002463 0.85837438 0.82389163 0.80726601 0.77155172
|
|
0.71982759 0.71859606 0.80726601 0.841133 ]
|
|
|
|
mean value: 0.7963054187192118
|
|
|
|
key: train_roc_auc
|
|
value: [0.82648346 0.83040491 0.82061649 0.82448474 0.83036691 0.83627189
|
|
0.83235044 0.82453794 0.82261521 0.82264561]
|
|
|
|
mean value: 0.8270777602140078
|
|
|
|
key: test_jcc
|
|
value: [0.70588235 0.65714286 0.73333333 0.6875 0.67647059 0.63888889
|
|
0.55555556 0.57894737 0.67647059 0.74285714]
|
|
|
|
mean value: 0.6653048675610596
|
|
|
|
key: train_jcc
|
|
value: [0.70819672 0.71096346 0.70226537 0.71153846 0.71568627 0.72
|
|
0.7114094 0.69798658 0.6986755 0.70261438]
|
|
|
|
mean value: 0.7079336133605598
|
|
|
|
MCC on Blind test: 0.47
|
|
|
|
Accuracy on Blind test: 0.72
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.12568951 0.07826042 0.09557486 0.22482085 0.07972693 0.08724403
|
|
0.07112598 0.08922219 0.62100506 0.09891844]
|
|
|
|
mean value: 0.15715882778167725
|
|
|
|
key: score_time
|
|
value: [0.01179385 0.01150036 0.01145864 0.01169968 0.01133823 0.01128864
|
|
0.01081276 0.01294708 0.01268101 0.01360011]
|
|
|
|
mean value: 0.011912035942077636
|
|
|
|
key: test_mcc
|
|
value: [0.96547546 0.96551724 1. 0.96547546 0.92980296 0.96551724
|
|
0.96551724 0.86189955 0.96547546 0.93202124]
|
|
|
|
mean value: 0.9516701836265801
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.98245614 0.98245614 1. 0.98245614 0.96491228 0.98245614
|
|
0.98245614 0.92982456 0.98245614 0.96491228]
|
|
|
|
mean value: 0.975438596491228
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.98181818 0.98245614 1. 0.98181818 0.96428571 0.98245614
|
|
0.98245614 0.92857143 0.98305085 0.96666667]
|
|
|
|
mean value: 0.9753579441670431
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.96551724 1. 1. 0.96428571 1.
|
|
1. 0.96296296 0.96666667 0.93548387]
|
|
|
|
mean value: 0.9794916456262396
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96428571 1. 1. 0.96428571 0.96428571 0.96551724
|
|
0.96551724 0.89655172 1. 1. ]
|
|
|
|
mean value: 0.9720443349753695
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.98214286 0.98275862 1. 0.98214286 0.96490148 0.98275862
|
|
0.98275862 0.93041872 0.98214286 0.96428571]
|
|
|
|
mean value: 0.9754310344827587
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.96428571 0.96551724 1. 0.96428571 0.93103448 0.96551724
|
|
0.96551724 0.86666667 0.96666667 0.93548387]
|
|
|
|
mean value: 0.9524974839769056
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.84
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.0796392 0.06554699 0.07161856 0.04404068 0.07942057 0.05319357
|
|
0.07177877 0.09217787 0.07455873 0.07560158]
|
|
|
|
mean value: 0.07075765132904052
|
|
|
|
key: score_time
|
|
value: [0.01967144 0.0233593 0.01252985 0.01237559 0.01997495 0.01243973
|
|
0.02639699 0.02463031 0.02326989 0.01278424]
|
|
|
|
mean value: 0.018743228912353516
|
|
|
|
key: test_mcc
|
|
value: [0.85960591 0.86189955 0.8953202 0.80685836 0.8615634 1.
|
|
0.86189955 0.85960591 0.96547546 0.8615634 ]
|
|
|
|
mean value: 0.8833791728556848
|
|
|
|
key: train_mcc
|
|
value: [0.97277537 0.9688108 0.96497735 0.96883978 0.96883978 0.96884072
|
|
0.9688108 0.96509685 0.9610433 0.9610433 ]
|
|
|
|
mean value: 0.9669078047012456
|
|
|
|
key: test_accuracy
|
|
value: [0.92982456 0.92982456 0.94736842 0.89473684 0.92982456 1.
|
|
0.92982456 0.92982456 0.98245614 0.92982456]
|
|
|
|
mean value: 0.9403508771929825
|
|
|
|
key: train_accuracy
|
|
value: [0.98635478 0.98440546 0.98245614 0.98440546 0.98440546 0.98440546
|
|
0.98440546 0.98245614 0.98050682 0.98050682]
|
|
|
|
mean value: 0.9834307992202729
|
|
|
|
key: test_fscore
|
|
value: [0.92857143 0.93103448 0.94736842 0.88 0.92592593 1.
|
|
0.92857143 0.93103448 0.98305085 0.93333333]
|
|
|
|
mean value: 0.9388890350429616
|
|
|
|
key: train_fscore
|
|
value: [0.98646035 0.9844358 0.98259188 0.98449612 0.98449612 0.9844358
|
|
0.984375 0.98259188 0.98054475 0.98054475]
|
|
|
|
mean value: 0.9834972438136449
|
|
|
|
key: test_precision
|
|
value: [0.92857143 0.9 0.93103448 1. 0.96153846 1.
|
|
0.96296296 0.93103448 0.96666667 0.90322581]
|
|
|
|
mean value: 0.9485034291708374
|
|
|
|
key: train_precision
|
|
value: [0.98076923 0.9844358 0.97692308 0.98069498 0.98069498 0.98062016
|
|
0.984375 0.97318008 0.97674419 0.97674419]
|
|
|
|
mean value: 0.9795181670507774
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.96428571 0.96428571 0.78571429 0.89285714 1.
|
|
0.89655172 0.93103448 1. 0.96551724]
|
|
|
|
mean value: 0.9328817733990148
|
|
|
|
key: train_recall
|
|
value: [0.9922179 0.9844358 0.98832685 0.98832685 0.98832685 0.98828125
|
|
0.984375 0.9921875 0.984375 0.984375 ]
|
|
|
|
mean value: 0.9875227991245137
|
|
|
|
key: test_roc_auc
|
|
value: [0.92980296 0.93041872 0.9476601 0.89285714 0.92918719 1.
|
|
0.93041872 0.92980296 0.98214286 0.92918719]
|
|
|
|
mean value: 0.9401477832512316
|
|
|
|
key: train_roc_auc
|
|
value: [0.98634332 0.9844054 0.98244467 0.9843978 0.9843978 0.984413
|
|
0.9844054 0.98247507 0.98051435 0.98051435]
|
|
|
|
mean value: 0.9834311162451362
|
|
|
|
key: test_jcc
|
|
value: [0.86666667 0.87096774 0.9 0.78571429 0.86206897 1.
|
|
0.86666667 0.87096774 0.96666667 0.875 ]
|
|
|
|
mean value: 0.8864718735102495
|
|
|
|
key: train_jcc
|
|
value: [0.97328244 0.96934866 0.96577947 0.96946565 0.96946565 0.96934866
|
|
0.96923077 0.96577947 0.96183206 0.96183206]
|
|
|
|
mean value: 0.9675364885195069
|
|
|
|
MCC on Blind test: 0.72
|
|
|
|
Accuracy on Blind test: 0.86
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02528572 0.01048207 0.01013875 0.01027489 0.01146173 0.01026607
|
|
0.01069283 0.01048994 0.01134014 0.01128221]
|
|
|
|
mean value: 0.012171435356140136
|
|
|
|
key: score_time
|
|
value: [0.0208075 0.00920558 0.00884104 0.00893402 0.01020646 0.00880122
|
|
0.00880194 0.00962806 0.00974655 0.00974631]
|
|
|
|
mean value: 0.010471868515014648
|
|
|
|
key: test_mcc
|
|
value: [0.65018988 0.66511153 0.54433498 0.72064772 0.68472906 0.7257422
|
|
0.57881773 0.68736396 0.79682005 0.75462449]
|
|
|
|
mean value: 0.6808381620716417
|
|
|
|
key: train_mcc
|
|
value: [0.66541423 0.69722643 0.67753494 0.674131 0.69108402 0.68465245
|
|
0.69245402 0.70478447 0.70511142 0.68889059]
|
|
|
|
mean value: 0.6881283571497584
|
|
|
|
key: test_accuracy
|
|
value: [0.8245614 0.8245614 0.77192982 0.85964912 0.84210526 0.85964912
|
|
0.78947368 0.84210526 0.89473684 0.87719298]
|
|
|
|
mean value: 0.8385964912280701
|
|
|
|
key: train_accuracy
|
|
value: [0.83235867 0.84795322 0.83820663 0.83625731 0.84405458 0.84210526
|
|
0.8460039 0.85185185 0.85185185 0.84405458]
|
|
|
|
mean value: 0.8434697855750487
|
|
|
|
key: test_fscore
|
|
value: [0.81481481 0.83870968 0.77192982 0.85185185 0.84210526 0.87096774
|
|
0.79310345 0.85245902 0.90322581 0.88135593]
|
|
|
|
mean value: 0.8420523377065111
|
|
|
|
key: train_fscore
|
|
value: [0.8365019 0.85283019 0.84310019 0.84210526 0.85130112 0.84452975
|
|
0.84836852 0.85551331 0.85606061 0.84732824]
|
|
|
|
mean value: 0.8477639088128366
|
|
|
|
key: test_precision
|
|
value: [0.84615385 0.76470588 0.75862069 0.88461538 0.82758621 0.81818182
|
|
0.79310345 0.8125 0.84848485 0.86666667]
|
|
|
|
mean value: 0.8220618791283092
|
|
|
|
key: train_precision
|
|
value: [0.81784387 0.82783883 0.81985294 0.81454545 0.81494662 0.83018868
|
|
0.83396226 0.83333333 0.83088235 0.82835821]
|
|
|
|
mean value: 0.8251752547574799
|
|
|
|
key: test_recall
|
|
value: [0.78571429 0.92857143 0.78571429 0.82142857 0.85714286 0.93103448
|
|
0.79310345 0.89655172 0.96551724 0.89655172]
|
|
|
|
mean value: 0.8661330049261083
|
|
|
|
key: train_recall
|
|
value: [0.85603113 0.87937743 0.86770428 0.87159533 0.89105058 0.859375
|
|
0.86328125 0.87890625 0.8828125 0.8671875 ]
|
|
|
|
mean value: 0.8717321254863813
|
|
|
|
key: test_roc_auc
|
|
value: [0.82389163 0.82635468 0.77216749 0.85899015 0.84236453 0.85837438
|
|
0.78940887 0.841133 0.89347291 0.87684729]
|
|
|
|
mean value: 0.8383004926108375
|
|
|
|
key: train_roc_auc
|
|
value: [0.83231244 0.84789184 0.83814902 0.83618829 0.84396279 0.84213886
|
|
0.84603751 0.85190449 0.85191209 0.84409959]
|
|
|
|
mean value: 0.8434596911478599
|
|
|
|
key: test_jcc
|
|
value: [0.6875 0.72222222 0.62857143 0.74193548 0.72727273 0.77142857
|
|
0.65714286 0.74285714 0.82352941 0.78787879]
|
|
|
|
mean value: 0.7290338633009411
|
|
|
|
key: train_jcc
|
|
value: [0.71895425 0.74342105 0.72875817 0.72727273 0.74110032 0.73089701
|
|
0.73666667 0.74750831 0.74834437 0.73509934]
|
|
|
|
mean value: 0.7358022212720111
|
|
|
|
MCC on Blind test: 0.84
|
|
|
|
Accuracy on Blind test: 0.92
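To compare the models reported so far in this part of the log, the mean cross-validated test_mcc can be set against the blind-test MCC. The sketch below copies both numbers from the entries above; the variable names and the idea of ranking by the gap are illustrative assumptions, not something the pipeline itself computes.

```python
# (model name, mean CV test_mcc, blind-test MCC), copied from the log above.
results = [
    ("Random Forest2", 0.9378489884746699, 0.89),
    ("Naive Bayes", 0.594696112208837, 0.47),
    ("XGBoost", 0.9516701836265801, 0.84),
    ("LDA", 0.8833791728556848, 0.72),
    ("Multinomial", 0.6808381620716417, 0.84),
]

# Gap = CV mean minus blind-test score; a large positive gap means the
# cross-validated estimate was optimistic relative to the blind test.
rows = sorted(
    ((name, cv, blind, cv - blind) for name, cv, blind in results),
    key=lambda row: row[3],
    reverse=True,
)

for name, cv, blind, gap in rows:
    print(f"{name:<15} cv_mcc={cv:.3f} blind_mcc={blind:.2f} gap={gap:+.3f}")
```

On these figures LDA shows the largest drop from cross-validation to the blind test, while Multinomial is the only model here whose blind-test MCC exceeds its cross-validated mean.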
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02382302 0.02296519 0.02932572 0.02991176 0.02761936 0.02511883
|
|
0.02549958 0.02771902 0.02718163 0.02515793]
|
|
|
|
mean value: 0.026432204246520995
|
|
|
|
key: score_time
|
|
value: [0.01161408 0.01145077 0.01257825 0.01203704 0.01204038 0.012815
|
|
0.01315665 0.01242328 0.01236939 0.01229191]
|
|
|
|
mean value: 0.012277674674987794
|
|
|
|
key: test_mcc
|
|
value: [0.80685836 0.86189955 0.93202124 0.9321832 0.8951918 0.96551724
|
|
0.79161589 0.86189955 0.96551724 0.89952865]
|
|
|
|
mean value: 0.8912232712036543
|
|
|
|
key: train_mcc
|
|
value: [0.92752109 0.96884072 0.97271705 0.95393799 0.97271663 0.96491975
|
|
0.95729513 0.96907736 0.94209644 0.94273737]
|
|
|
|
mean value: 0.9571859535336262
|
|
|
|
key: test_accuracy
|
|
value: [0.89473684 0.92982456 0.96491228 0.96491228 0.94736842 0.98245614
|
|
0.89473684 0.92982456 0.98245614 0.94736842]
|
|
|
|
mean value: 0.943859649122807
|
|
|
|
key: train_accuracy
|
|
value: [0.96296296 0.98440546 0.98635478 0.97660819 0.98635478 0.98245614
|
|
0.9785575 0.98440546 0.97076023 0.97076023]
|
|
|
|
mean value: 0.9783625730994152
|
|
|
|
key: test_fscore
|
|
value: [0.88 0.93103448 0.96296296 0.96551724 0.94545455 0.98245614
|
|
0.89285714 0.92857143 0.98245614 0.95081967]
|
|
|
|
mean value: 0.9422129756816913
|
|
|
|
key: train_fscore
|
|
value: [0.96192385 0.984375 0.98635478 0.97709924 0.98640777 0.98245614
|
|
0.97830375 0.98455598 0.97017893 0.97142857]
|
|
|
|
mean value: 0.9783083997466665
|
|
|
|
key: test_precision
|
|
value: [1. 0.9 1. 0.93333333 0.96296296 1.
|
|
0.92592593 0.96296296 1. 0.90625 ]
|
|
|
|
mean value: 0.9591435185185185
|
|
|
|
key: train_precision
|
|
value: [0.99173554 0.98823529 0.98828125 0.9588015 0.98449612 0.98054475
|
|
0.98804781 0.97328244 0.98785425 0.94795539]
|
|
|
|
mean value: 0.978923434340754
|
|
|
|
key: test_recall
|
|
value: [0.78571429 0.96428571 0.92857143 1. 0.92857143 0.96551724
|
|
0.86206897 0.89655172 0.96551724 1. ]
|
|
|
|
mean value: 0.929679802955665
|
|
|
|
key: train_recall
|
|
value: [0.93385214 0.98054475 0.9844358 0.99610895 0.98832685 0.984375
|
|
0.96875 0.99609375 0.953125 0.99609375]
|
|
|
|
mean value: 0.9781705982490272
|
|
|
|
key: test_roc_auc
|
|
value: [0.89285714 0.93041872 0.96428571 0.96551724 0.94704433 0.98275862
|
|
0.8953202 0.93041872 0.98275862 0.94642857]
|
|
|
|
mean value: 0.9437807881773399
|
|
|
|
key: train_roc_auc
|
|
value: [0.96301982 0.984413 0.98635852 0.9765701 0.98635092 0.98245987
|
|
0.97853842 0.9844282 0.97072592 0.97080952]
|
|
|
|
mean value: 0.9783674306906615
|
|
|
|
key: test_jcc
|
|
value: [0.78571429 0.87096774 0.92857143 0.93333333 0.89655172 0.96551724
|
|
0.80645161 0.86666667 0.96551724 0.90625 ]
|
|
|
|
mean value: 0.8925541276020976
|
|
|
|
key: train_jcc
|
|
value: [0.92664093 0.96923077 0.97307692 0.95522388 0.97318008 0.96551724
|
|
0.95752896 0.96958175 0.94208494 0.94444444]
|
|
|
|
mean value: 0.9576509910661071
|
|
|
|
MCC on Blind test: 0.84
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01926589 0.01759005 0.02008891 0.02133679 0.01895285 0.02249408
|
|
0.01945829 0.02094722 0.02044463 0.0228014 ]
|
|
|
|
mean value: 0.020338010787963868
|
|
|
|
key: score_time
|
|
value: [0.01128817 0.01206374 0.01268244 0.01210356 0.01209688 0.01208711
|
|
0.01209283 0.0120666 0.01214123 0.01211452]
|
|
|
|
mean value: 0.012073707580566407
|
|
|
|
key: test_mcc
|
|
value: [0.9321832 0.77903565 0.86189955 0.92980296 0.76550573 0.89988258
|
|
0.79161589 0.77903565 0.77903565 0.79161589]
|
|
|
|
mean value: 0.8309612752998973
|
|
|
|
key: train_mcc
|
|
value: [0.95018762 0.86475876 0.8926403 0.9454189 0.79780407 0.90819008
|
|
0.97271663 0.81742544 0.71471052 0.97307046]
|
|
|
|
mean value: 0.8836922787721689
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.87719298 0.92982456 0.96491228 0.87719298 0.94736842
|
|
0.89473684 0.87719298 0.87719298 0.89473684]
|
|
|
|
mean value: 0.9105263157894736
|
|
|
|
key: train_accuracy
|
|
value: [0.97465887 0.92787524 0.94346979 0.97270955 0.88888889 0.95321637
|
|
0.98635478 0.9005848 0.83820663 0.98635478]
|
|
|
|
mean value: 0.9372319688109161
|
|
|
|
key: test_fscore
|
|
value: [0.96551724 0.88888889 0.93103448 0.96428571 0.8627451 0.94545455
|
|
0.89285714 0.8627451 0.8627451 0.89285714]
|
|
|
|
mean value: 0.9069130452599012
|
|
|
|
key: train_fscore
|
|
value: [0.9752381 0.93284936 0.946593 0.97276265 0.87527352 0.9516129
|
|
0.98630137 0.88937093 0.80652681 0.98613861]
|
|
|
|
mean value: 0.9322667256993225
|
|
|
|
key: test_precision
|
|
value: [0.93333333 0.8 0.9 0.96428571 0.95652174 1.
|
|
0.92592593 1. 1. 0.92592593]
|
|
|
|
mean value: 0.9405992638601335
|
|
|
|
key: train_precision
|
|
value: [0.95522388 0.87414966 0.8986014 0.97276265 1. 0.98333333
|
|
0.98823529 1. 1. 1. ]
|
|
|
|
mean value: 0.9672306212427736
|
|
|
|
key: test_recall
|
|
value: [1. 1. 0.96428571 0.96428571 0.78571429 0.89655172
|
|
0.86206897 0.75862069 0.75862069 0.86206897]
|
|
|
|
mean value: 0.8852216748768473
|
|
|
|
key: train_recall
|
|
value: [0.99610895 1. 1. 0.97276265 0.77821012 0.921875
|
|
0.984375 0.80078125 0.67578125 0.97265625]
|
|
|
|
mean value: 0.9102550462062257
|
|
|
|
key: test_roc_auc
|
|
value: [0.96551724 0.87931034 0.93041872 0.96490148 0.87561576 0.94827586
|
|
0.8953202 0.87931034 0.87931034 0.8953202 ]
|
|
|
|
mean value: 0.9113300492610837
|
|
|
|
key: train_roc_auc
|
|
value: [0.97461697 0.92773438 0.94335938 0.97270945 0.88910506 0.9531554
|
|
0.98635092 0.90039062 0.83789062 0.98632812]
|
|
|
|
mean value: 0.9371640928988327
|
|
|
|
key: test_jcc
|
|
value: [0.93333333 0.8 0.87096774 0.93103448 0.75862069 0.89655172
|
|
0.80645161 0.75862069 0.75862069 0.80645161]
|
|
|
|
mean value: 0.8320652576937337
|
|
|
|
key: train_jcc
|
|
value: [0.95167286 0.87414966 0.8986014 0.9469697 0.77821012 0.90769231
|
|
0.97297297 0.80078125 0.67578125 0.97265625]
|
|
|
|
mean value: 0.8779487765285371
|
|
|
|
MCC on Blind test: 0.84
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.19932365 0.18372822 0.17802 0.17789721 0.17805314 0.17863345
|
|
0.17918444 0.17962122 0.17842484 0.17965174]
|
|
|
|
mean value: 0.18125379085540771
|
|
|
|
key: score_time
|
|
value: [0.01686788 0.0161252 0.01570821 0.01544094 0.01524329 0.01548839
|
|
0.0152936 0.0156951 0.01531601 0.01533294]
|
|
|
|
mean value: 0.01565115451812744
|
|
|
|
key: test_mcc
|
|
value: [0.8951918 0.96551724 1. 0.96547546 0.92980296 0.96551724
|
|
0.96551724 0.92980296 0.92980296 0.93202124]
|
|
|
|
mean value: 0.9478649091676867
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.98245614 1. 0.98245614 0.96491228 0.98245614
|
|
0.98245614 0.96491228 0.96491228 0.96491228]
|
|
|
|
mean value: 0.9736842105263157
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.94545455 0.98245614 1. 0.98181818 0.96428571 0.98245614
|
|
0.98245614 0.96551724 0.96551724 0.96666667]
|
|
|
|
mean value: 0.973662801203636
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96296296 0.96551724 1. 1. 0.96428571 1.
|
|
1. 0.96551724 0.96551724 0.93548387]
|
|
|
|
mean value: 0.975928427235435
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.92857143 1. 1. 0.96428571 0.96428571 0.96551724
|
|
0.96551724 0.96551724 0.96551724 1. ]
|
|
|
|
mean value: 0.9719211822660099
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.94704433 0.98275862 1. 0.98214286 0.96490148 0.98275862
|
|
0.98275862 0.96490148 0.96490148 0.96428571]
|
|
|
|
mean value: 0.9736453201970444
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.89655172 0.96551724 1. 0.96428571 0.93103448 0.96551724
|
|
0.96551724 0.93333333 0.93333333 0.93548387]
|
|
|
|
mean value: 0.9490574182954605
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.94
|
|
|
|
Accuracy on Blind test: 0.97
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.05743885 0.07335186 0.06502676 0.06323075 0.06822038 0.06512856
|
|
0.06676459 0.08469748 0.08225036 0.09285426]
|
|
|
|
mean value: 0.07189638614654541
|
|
|
|
key: score_time
|
|
value: [0.01954103 0.01935101 0.02997541 0.0232482 0.01905513 0.02338171
|
|
0.02898955 0.04204869 0.02914691 0.03998756]
|
|
|
|
mean value: 0.027472519874572755
|
|
|
|
key: test_mcc
|
|
value: [0.92980296 1. 1. 0.93202124 0.8951918 1.
|
|
0.92980296 0.89988258 0.92980296 0.8951918 ]
|
|
|
|
mean value: 0.9411696291544969
|
|
|
|
key: train_mcc
|
|
value: [0.99610895 0.99610895 1. 0.99223298 0.99223298 1.
|
|
0.99610895 1. 0.99610889 1. ]
|
|
|
|
mean value: 0.9968901699256791
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 1. 1. 0.96491228 0.94736842 1.
|
|
0.96491228 0.94736842 0.96491228 0.94736842]
|
|
|
|
mean value: 0.9701754385964912
|
|
|
|
key: train_accuracy
|
|
value: [0.99805068 0.99805068 1. 0.99610136 0.99610136 1.
|
|
0.99805068 1. 0.99805068 1. ]
|
|
|
|
mean value: 0.9984405458089668
|
|
|
|
key: test_fscore
|
|
value: [0.96428571 1. 1. 0.96296296 0.94545455 1.
|
|
0.96551724 0.94545455 0.96551724 0.94915254]
|
|
|
|
mean value: 0.9698344793289271
|
|
|
|
key: train_fscore
|
|
value: [0.99805068 0.99805068 1. 0.99609375 0.99609375 1.
|
|
0.99805068 1. 0.99804305 1. ]
|
|
|
|
mean value: 0.9984382599621199
|
|
|
|
key: test_precision
|
|
value: [0.96428571 1. 1. 1. 0.96296296 1.
|
|
0.96551724 1. 0.96551724 0.93333333]
|
|
|
|
mean value: 0.9791616493340631
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
0.99610895 1. 1. 1. ]
|
|
|
|
mean value: 0.9996108949416342
|
|
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:747: UserWarning: Some inputs do not have OOB scores. This probably means too few estimators were used to compute any reliable oob estimates.
|
|
warn(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_bagging.py:753: RuntimeWarning: invalid value encountered in true_divide
|
|
oob_decision_function = predictions / predictions.sum(axis=1)[:, np.newaxis]
|
|
key: test_recall
|
|
value: [0.96428571 1. 1. 0.92857143 0.92857143 1.
|
|
0.96551724 0.89655172 0.96551724 0.96551724]
|
|
|
|
mean value: 0.9614532019704434
|
|
|
|
key: train_recall
|
|
value: [0.99610895 0.99610895 1. 0.9922179 0.9922179 1.
|
|
1. 1. 0.99609375 1. ]
|
|
|
|
mean value: 0.9972747446498055
|
|
|
|
key: test_roc_auc
|
|
value: [0.96490148 1. 1. 0.96428571 0.94704433 1.
|
|
0.96490148 0.94827586 0.96490148 0.94704433]
|
|
|
|
mean value: 0.9701354679802956
|
|
|
|
key: train_roc_auc
|
|
value: [0.99805447 0.99805447 1. 0.99610895 0.99610895 1.
|
|
0.99805447 1. 0.99804688 1. ]
|
|
|
|
mean value: 0.9984428197957198
|
|
|
|
key: test_jcc
|
|
value: [0.93103448 1. 1. 0.92857143 0.89655172 1.
|
|
0.93333333 0.89655172 0.93333333 0.90322581]
|
|
|
|
mean value: 0.9422601832724191
|
|
|
|
key: train_jcc
|
|
value: [0.99610895 0.99610895 1. 0.9922179 0.9922179 1.
|
|
0.99610895 1. 0.99609375 1. ]
|
|
|
|
mean value: 0.9968856395914397
|
|
|
|
MCC on Blind test: 0.79
|
|
|
|
Accuracy on Blind test: 0.89
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.18004179 0.19584107 0.23010325 0.21314287 0.2015121 0.20719767
|
|
0.16908312 0.20237637 0.17533731 0.1983788 ]
|
|
|
|
mean value: 0.19730143547058104
|
|
|
|
key: score_time
|
|
value: [0.02584887 0.02601671 0.02588081 0.02573323 0.02579021 0.02653241
|
|
0.01559448 0.01567507 0.02591038 0.02642345]
|
|
|
|
mean value: 0.023940563201904297
|
|
|
|
key: test_mcc
|
|
value: [0.78940887 0.92980296 0.82880708 0.82490815 0.75492611 0.82942474
|
|
0.79778885 0.75492611 0.78940887 0.68472906]
|
|
|
|
mean value: 0.7984130796385527
|
|
|
|
key: train_mcc
|
|
value: [0.9766081 0.97663814 0.98057426 0.96883978 0.98831165 0.98069236
|
|
0.9844054 0.97289329 0.98443509 0.98051405]
|
|
|
|
mean value: 0.979391211760016
|
|
|
|
key: test_accuracy
|
|
value: [0.89473684 0.96491228 0.9122807 0.9122807 0.87719298 0.9122807
|
|
0.89473684 0.87719298 0.89473684 0.84210526]
|
|
|
|
mean value: 0.8982456140350877
|
|
|
|
key: train_accuracy
|
|
value: [0.98830409 0.98830409 0.99025341 0.98440546 0.99415205 0.99025341
|
|
0.99220273 0.98635478 0.99220273 0.99025341]
|
|
|
|
mean value: 0.9896686159844055
|
|
|
|
key: test_fscore
|
|
value: [0.89285714 0.96428571 0.90566038 0.90909091 0.87719298 0.90909091
|
|
0.88888889 0.87719298 0.89655172 0.84210526]
|
|
|
|
mean value: 0.8962916893780161
|
|
|
|
key: train_fscore
|
|
value: [0.98832685 0.98828125 0.99021526 0.98449612 0.99415205 0.99013807
|
|
0.9921875 0.98619329 0.99215686 0.99021526]
|
|
|
|
mean value: 0.9896362521131238
|
|
|
|
key: test_precision
|
|
value: [0.89285714 0.96428571 0.96 0.92592593 0.86206897 0.96153846
|
|
0.96 0.89285714 0.89655172 0.85714286]
|
|
|
|
mean value: 0.9173227934262417
|
|
|
|
key: train_precision
|
|
value: [0.98832685 0.99215686 0.99606299 0.98069498 0.99609375 1.
|
|
0.9921875 0.99601594 0.99606299 0.99215686]
|
|
|
|
mean value: 0.9929758724941152
|
|
|
|
key: test_recall
|
|
value: [0.89285714 0.96428571 0.85714286 0.89285714 0.89285714 0.86206897
|
|
0.82758621 0.86206897 0.89655172 0.82758621]
|
|
|
|
mean value: 0.8775862068965518
|
|
|
|
key: train_recall
|
|
value: [0.98832685 0.9844358 0.9844358 0.98832685 0.9922179 0.98046875
|
|
0.9921875 0.9765625 0.98828125 0.98828125]
|
|
|
|
mean value: 0.9863524440661479
|
|
|
|
key: test_roc_auc
|
|
value: [0.89470443 0.96490148 0.91133005 0.91194581 0.87746305 0.91317734
|
|
0.89593596 0.87746305 0.89470443 0.84236453]
|
|
|
|
mean value: 0.8983990147783252
|
|
|
|
key: train_roc_auc
|
|
value: [0.98830405 0.98831165 0.99026477 0.9843978 0.99415582 0.99023438
|
|
0.9922027 0.98633572 0.9921951 0.99024957]
|
|
|
|
mean value: 0.9896651568579766
|
|
|
|
key: test_jcc
|
|
value: [0.80645161 0.93103448 0.82758621 0.83333333 0.78125 0.83333333
|
|
0.8 0.78125 0.8125 0.72727273]
|
|
|
|
mean value: 0.8134011696497793
|
|
|
|
key: train_jcc
|
|
value: [0.97692308 0.97683398 0.98062016 0.96946565 0.98837209 0.98046875
|
|
0.98449612 0.97276265 0.9844358 0.98062016]
|
|
|
|
mean value: 0.9794998423323565
|
|
|
|
MCC on Blind test: 0.45
|
|
|
|
Accuracy on Blind test: 0.75
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.73218632 0.71230316 0.71463799 0.71179533 0.70079851 0.72155094
|
|
0.72182274 0.72095418 0.71527052 0.71961784]
|
|
|
|
mean value: 0.7170937538146973
|
|
|
|
key: score_time
|
|
value: [0.00951099 0.00945258 0.01003933 0.01020122 0.00973988 0.00968409
|
|
0.00953937 0.00969291 0.01006198 0.01011348]
|
|
|
|
mean value: 0.009803581237792968
|
|
|
|
key: test_mcc
|
|
value: [0.92980296 0.96551724 1. 0.96547546 0.92980296 1.
|
|
0.96551724 0.89988258 0.96547546 0.93202124]
|
|
|
|
mean value: 0.9553495126484416
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.96491228 0.98245614 1. 0.98245614 0.96491228 1.
|
|
0.98245614 0.94736842 0.98245614 0.96491228]
|
|
|
|
mean value: 0.9771929824561403
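Each "mean value" line appears to be the arithmetic mean of the ten per-fold values printed just before it. The sketch below checks this for the Gradient Boosting test_accuracy values above. It is a minimal illustration, not the script's own code; the variable names are hypothetical and the inputs are the 8-decimal values as printed.

```python
from statistics import fmean

# Per-fold test_accuracy values copied from the log lines above
# (rounded to 8 decimals, as printed).
test_accuracy = [0.96491228, 0.98245614, 1.0, 0.98245614, 0.96491228,
                 1.0, 0.98245614, 0.94736842, 0.98245614, 0.96491228]

# Arithmetic mean over the ten cross-validation folds.
mean_accuracy = fmean(test_accuracy)
print(f"mean value: {mean_accuracy:.6f}")
```

Because the inputs are rounded, the result agrees with the logged mean only to about the printed precision of the fold values.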
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.96428571 0.98245614 1. 0.98181818 0.96428571 1.
|
|
0.98245614 0.94545455 0.98305085 0.96666667]
|
|
|
|
mean value: 0.9770473950670204
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.96428571 0.96551724 1. 1. 0.96428571 1.
|
|
1. 1. 0.96666667 0.93548387]
|
|
|
|
mean value: 0.9796239207585148
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.96428571 1. 1. 0.96428571 0.96428571 1.
|
|
0.96551724 0.89655172 1. 1. ]
|
|
|
|
mean value: 0.9754926108374384
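The test_mcc, test_accuracy, test_precision and test_recall values above are all functions of one confusion matrix per fold. The sketch below is a pure-Python illustration, not the script's own code. The counts (TP=28, FP=1, FN=0, TN=28) are an assumption: the log prints no confusion matrix, and they were inferred from the second fold's logged precision, recall and accuracy with a fold size of 57. The function name `mcc` is hypothetical.

```python
from math import sqrt

def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 when a row or column of the matrix is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Assumed counts for fold 2 of the Gradient Boosting block (57 test samples);
# inferred from the logged metrics, not printed by the log itself.
tp, fp, fn, tn = 28, 1, 0, 28

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(f"accuracy={accuracy:.8f} precision={precision:.8f} "
      f"recall={recall:.8f} mcc={mcc(tp, fp, fn, tn):.8f}")
```

With these assumed counts the four metrics agree with the second-fold entries logged above (0.98245614, 0.96551724, 1.0 and 0.96551724) to the printed precision.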
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.96490148 0.98275862 1. 0.98214286 0.96490148 1.
|
|
0.98275862 0.94827586 0.98214286 0.96428571]
|
|
|
|
mean value: 0.977216748768473
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.93103448 0.96551724 1. 0.96428571 0.93103448 1.
|
|
0.96551724 0.89655172 0.96666667 0.93548387]
|
|
|
|
mean value: 0.9556091424333916
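Assuming "jcc" denotes the Jaccard index of the positive class (the log does not spell this out), each test_jcc value is determined by the matching test_fscore value through J = F / (2 - F). The sketch below checks that relation against the Gradient Boosting values logged above; the function and variable names are hypothetical.

```python
def jaccard_from_f1(f1):
    """Jaccard index implied by an F1 score for the same binary predictions."""
    return f1 / (2.0 - f1)

# Per-fold values copied from the test_fscore and test_jcc lines above.
test_fscore = [0.96428571, 0.98245614, 1.0, 0.98181818, 0.96428571,
               1.0, 0.98245614, 0.94545455, 0.98305085, 0.96666667]
test_jcc = [0.93103448, 0.96551724, 1.0, 0.96428571, 0.93103448,
            1.0, 0.96551724, 0.89655172, 0.96666667, 0.93548387]

# Derive the Jaccard index from each F1 value and compare with the log.
derived = [jaccard_from_f1(f) for f in test_fscore]
max_gap = max(abs(d - j) for d, j in zip(derived, test_jcc))
print(f"largest difference from logged test_jcc: {max_gap:.2e}")
```

The identity follows from F = 2TP / (2TP + FP + FN) and J = TP / (TP + FP + FN), so the two metrics carry the same information and rank the folds identically.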
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.84
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.03368616 0.03474903 0.03299999 0.04718065 0.03263092 0.03254867
|
|
0.0505898 0.04946899 0.05832887 0.04263043]
|
|
|
|
mean value: 0.04148135185241699
|
|
|
|
key: score_time
|
|
value: [0.01290917 0.01676774 0.01531792 0.02634883 0.01512575 0.01687789
|
|
0.03070521 0.01287818 0.01288342 0.01452231]
|
|
|
|
mean value: 0.017433643341064453
|
|
|
|
key: test_mcc
|
|
value: [0.60497779 0.75808552 0.78940887 0.553659 0.65104858 0.54592083
|
|
0.69397486 0.58562417 0.75462449 0.54433498]
|
|
|
|
mean value: 0.6481659094831543
|
|
|
|
key: train_mcc
|
|
value: [0.89961107 0.92485373 0.910898 0.85981783 0.9321313 0.88066244
|
|
0.82869394 0.81410875 0.95018762 0.89587763]
|
|
|
|
mean value: 0.8896842312566976
|
|
|
|
key: test_accuracy
|
|
value: [0.78947368 0.87719298 0.89473684 0.77192982 0.8245614 0.77192982
|
|
0.84210526 0.78947368 0.87719298 0.77192982]
|
|
|
|
mean value: 0.8210526315789474
|
|
|
|
key: train_accuracy
|
|
value: [0.94931774 0.96101365 0.95516569 0.92787524 0.96491228 0.93957115
|
|
0.91423002 0.89863548 0.97465887 0.94736842]
|
|
|
|
mean value: 0.9432748538011696
|
|
|
|
key: test_fscore
|
|
value: [0.8125 0.86792453 0.89285714 0.78688525 0.82758621 0.78688525
|
|
0.85714286 0.77777778 0.88135593 0.77192982]
|
|
|
|
mean value: 0.8262844761544288
|
|
|
|
key: train_fscore
|
|
value: [0.95057034 0.95951417 0.95445545 0.93135436 0.96370968 0.94117647
|
|
0.91505792 0.88695652 0.9740519 0.94589178]
|
|
|
|
mean value: 0.9422738582295507
|
|
|
|
key: test_precision
|
|
value: [0.72222222 0.92 0.89285714 0.72727273 0.8 0.75
|
|
0.79411765 0.84 0.86666667 0.78571429]
|
|
|
|
mean value: 0.8098850691791868
|
|
|
|
key: train_precision
|
|
value: [0.92936803 1. 0.97177419 0.89007092 1. 0.91512915
|
|
0.90458015 1. 0.99591837 0.97119342]
|
|
|
|
mean value: 0.9578034232222047
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.82142857 0.89285714 0.85714286 0.85714286 0.82758621
|
|
0.93103448 0.72413793 0.89655172 0.75862069]
|
|
|
|
mean value: 0.8495073891625615
|
|
|
|
key: train_recall
|
|
value: [0.97276265 0.92217899 0.93774319 0.9766537 0.92996109 0.96875
|
|
0.92578125 0.796875 0.953125 0.921875 ]
|
|
|
|
mean value: 0.9305705860894942
|
|
|
|
key: test_roc_auc
|
|
value: [0.79187192 0.87623153 0.89470443 0.77339901 0.82512315 0.77093596
|
|
0.84051724 0.79064039 0.87684729 0.77216749]
|
|
|
|
mean value: 0.8212438423645321
|
|
|
|
key: train_roc_auc
|
|
value: [0.94927195 0.96108949 0.95519972 0.92777997 0.96498054 0.93962792
|
|
0.91425249 0.8984375 0.97461697 0.94731882]
|
|
|
|
mean value: 0.9432575389105058
|
|
|
|
key: test_jcc
|
|
value: [0.68421053 0.76666667 0.80645161 0.64864865 0.70588235 0.64864865
|
|
0.75 0.63636364 0.78787879 0.62857143]
|
|
|
|
mean value: 0.7063322308938008
|
|
|
|
key: train_jcc
|
|
value: [0.9057971 0.92217899 0.91287879 0.87152778 0.92996109 0.88888889
|
|
0.84341637 0.796875 0.94941634 0.8973384 ]
|
|
|
|
mean value: 0.891827874937678
|
|
|
|
MCC on Blind test: 0.16
|
|
|
|
Accuracy on Blind test: 0.58
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01642084 0.016325 0.02404594 0.03962636 0.03637481 0.03999496
|
|
0.04035878 0.02521992 0.02431512 0.03965163]
|
|
|
|
mean value: 0.030233335494995118
|
|
|
|
key: score_time
|
|
value: [0.01417303 0.01231766 0.0191412 0.01912379 0.01904559 0.0190649
|
|
0.01907992 0.01245284 0.01895809 0.01929522]
|
|
|
|
mean value: 0.01726522445678711
|
|
|
|
key: test_mcc
|
|
value: [0.8951918 0.89988258 0.96551724 0.96547546 0.92980296 0.96551724
|
|
0.8951918 0.85960591 0.96547546 0.8615634 ]
|
|
|
|
mean value: 0.9203223848137981
|
|
|
|
key: train_mcc
|
|
value: [0.96127477 0.97672617 0.9611292 0.96892768 0.96509421 0.96127828
|
|
0.94967392 0.95770742 0.95718129 0.95718129]
|
|
|
|
mean value: 0.9616174236606619
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.94736842 0.98245614 0.98245614 0.96491228 0.98245614
|
|
0.94736842 0.92982456 0.98245614 0.92982456]
|
|
|
|
mean value: 0.9596491228070175
|
|
|
|
key: train_accuracy
|
|
value: [0.98050682 0.98830409 0.98050682 0.98440546 0.98245614 0.98050682
|
|
0.97465887 0.9785575 0.9785575 0.9785575 ]
|
|
|
|
mean value: 0.9807017543859649
|
|
|
|
key: test_fscore
|
|
value: [0.94545455 0.94915254 0.98245614 0.98181818 0.96428571 0.98245614
|
|
0.94915254 0.93103448 0.98305085 0.93333333]
|
|
|
|
mean value: 0.960219447055554
|
|
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_sl.py:188: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rouC_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./katg_sl.py:191: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
rouC_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
|
|
|
|
key: train_fscore
|
|
value: [0.98076923 0.98841699 0.98069498 0.98455598 0.98265896 0.98069498
|
|
0.97495183 0.97888676 0.97864078 0.97864078]
|
|
|
|
mean value: 0.980891126474896
|
|
|
|
key: test_precision
|
|
value: [0.96296296 0.90322581 0.96551724 1. 0.96428571 1.
|
|
0.93333333 0.93103448 0.96666667 0.90322581]
|
|
|
|
mean value: 0.9530252014289834
|
|
|
|
key: train_precision
|
|
value: [0.96958175 0.98084291 0.97318008 0.97701149 0.97328244 0.96946565
|
|
0.96197719 0.96226415 0.97297297 0.97297297]
|
|
|
|
mean value: 0.9713551606612233
|
|
|
|
key: test_recall
|
|
value: [0.92857143 1. 1. 0.96428571 0.96428571 0.96551724
|
|
0.96551724 0.93103448 1. 0.96551724]
|
|
|
|
mean value: 0.9684729064039409
|
|
|
|
key: train_recall
|
|
value: [0.9922179 0.99610895 0.98832685 0.9922179 0.9922179 0.9921875
|
|
0.98828125 0.99609375 0.984375 0.984375 ]
|
|
|
|
mean value: 0.9906401994163424
|
|
|
|
key: test_roc_auc
|
|
value: [0.94704433 0.94827586 0.98275862 0.98214286 0.96490148 0.98275862
|
|
0.94704433 0.92980296 0.98214286 0.92918719]
|
|
|
|
mean value: 0.9596059113300494
|
|
|
|
key: train_roc_auc
|
|
value: [0.98048395 0.98828885 0.98049155 0.9843902 0.98243707 0.98052955
|
|
0.97468537 0.97859162 0.97856882 0.97856882]
|
|
|
|
mean value: 0.9807035809824902
|
|
|
|
key: test_jcc
|
|
value: [0.89655172 0.90322581 0.96551724 0.96428571 0.93103448 0.96551724
|
|
0.90322581 0.87096774 0.96666667 0.875 ]
|
|
|
|
mean value: 0.9241992425446263
|
|
|
|
key: train_jcc
|
|
value: [0.96226415 0.97709924 0.96212121 0.96958175 0.96590909 0.96212121
|
|
0.95112782 0.95864662 0.9581749 0.9581749 ]
|
|
|
|
mean value: 0.9625220897761719
|
|
|
|
MCC on Blind test: 0.82
|
|
|
|
Accuracy on Blind test: 0.92
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_ppi2_affinity', 'interface_dist',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=168)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.15240073 0.28842759 0.29213166 0.31818414 0.29039454 0.24698877
|
|
0.31169581 0.2405808 0.31258917 0.31262183]
|
|
|
|
mean value: 0.276601505279541
|
|
|
|
key: score_time
|
|
value: [0.01224947 0.0193646 0.02494502 0.01928473 0.02496862 0.02136517
|
|
0.01929235 0.01257014 0.02258611 0.01934028]
|
|
|
|
mean value: 0.01959664821624756
|
|
|
|
key: test_mcc
|
|
value: [0.8951918 0.8953202 0.96551724 0.96547546 0.92980296 0.96551724
|
|
0.8951918 0.85960591 0.96547546 0.8615634 ]
|
|
|
|
mean value: 0.9198661467253358
|
|
|
|
key: train_mcc
|
|
value: [0.96127477 0.98051435 0.9611292 0.96892768 0.96509421 0.96127828
|
|
0.94967392 0.95770742 0.95718129 0.95718129]
|
|
|
|
mean value: 0.961996241747861
|
|
|
|
key: test_accuracy
|
|
value: [0.94736842 0.94736842 0.98245614 0.98245614 0.96491228 0.98245614
|
|
0.94736842 0.92982456 0.98245614 0.92982456]
|
|
|
|
mean value: 0.9596491228070175
|
|
|
|
key: train_accuracy
|
|
value: [0.98050682 0.99025341 0.98050682 0.98440546 0.98245614 0.98050682
|
|
0.97465887 0.9785575 0.9785575 0.9785575 ]
|
|
|
|
mean value: 0.980896686159844
|
|
|
|
key: test_fscore
|
|
value: [0.94545455 0.94736842 0.98245614 0.98181818 0.96428571 0.98245614
|
|
0.94915254 0.93103448 0.98305085 0.93333333]
|
|
|
|
mean value: 0.960041034923529
|
|
|
|
key: train_fscore
|
|
value: [0.98076923 0.99025341 0.98069498 0.98455598 0.98265896 0.98069498
|
|
0.97495183 0.97888676 0.97864078 0.97864078]
|
|
|
|
mean value: 0.9810747687638014
|
|
|
|
key: test_precision
|
|
value: [0.96296296 0.93103448 0.96551724 1. 0.96428571 1.
|
|
0.93333333 0.93103448 0.96666667 0.90322581]
|
|
|
|
mean value: 0.9558060690596841
|
|
|
|
key: train_precision
|
|
value: [0.96958175 0.9921875 0.97318008 0.97701149 0.97328244 0.96946565
|
|
0.96197719 0.96226415 0.97297297 0.97297297]
|
|
|
|
mean value: 0.9724896194734839
|
|
|
|
key: test_recall
|
|
value: [0.92857143 0.96428571 1. 0.96428571 0.96428571 0.96551724
|
|
0.96551724 0.93103448 1. 0.96551724]
|
|
|
|
mean value: 0.9649014778325123
|
|
|
|
key: train_recall
|
|
value: [0.9922179 0.98832685 0.98832685 0.9922179 0.9922179 0.9921875
|
|
0.98828125 0.99609375 0.984375 0.984375 ]
|
|
|
|
mean value: 0.9898619892996109
|
|
|
|
key: test_roc_auc
|
|
value: [0.94704433 0.9476601 0.98275862 0.98214286 0.96490148 0.98275862
|
|
0.94704433 0.92980296 0.98214286 0.92918719]
|
|
|
|
mean value: 0.9595443349753695
|
|
|
|
key: train_roc_auc
|
|
value: [0.98048395 0.99025717 0.98049155 0.9843902 0.98243707 0.98052955
|
|
0.97468537 0.97859162 0.97856882 0.97856882]
|
|
|
|
mean value: 0.9809004134241245
|
|
|
|
key: test_jcc
|
|
value: [0.89655172 0.9 0.96551724 0.96428571 0.93103448 0.96551724
|
|
0.90322581 0.87096774 0.96666667 0.875 ]
|
|
|
|
mean value: 0.923876661899465
|
|
|
|
key: train_jcc
|
|
value: [0.96226415 0.98069498 0.96212121 0.96958175 0.96590909 0.96212121
|
|
0.95112782 0.95864662 0.9581749 0.9581749 ]
|
|
|
|
mean value: 0.9628816641815479
|
|
|
|
MCC on Blind test: 0.82
|
|
|
|
Accuracy on Blind test: 0.92
|
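The warnings from katg_sl.py earlier in this log show the script sorting its result tables by 'test_mcc' and 'bts_mcc'. The sketch below does a comparable ranking in plain Python for the four models reported in this part of the log. It is an illustration, not the script's code: the numbers are copied from the "mean value" lines under test_mcc and from the "MCC on Blind test" lines, and the key name `cv_mcc` is hypothetical.

```python
# Summary values copied from the four model blocks in this part of the log.
# cv_mcc is the logged mean of test_mcc; bts_mcc is "MCC on Blind test".
results = [
    {"model": "Gradient Boosting", "cv_mcc": 0.9553495126484416, "bts_mcc": 0.84},
    {"model": "QDA", "cv_mcc": 0.6481659094831543, "bts_mcc": 0.16},
    {"model": "Ridge Classifier", "cv_mcc": 0.9203223848137981, "bts_mcc": 0.82},
    {"model": "Ridge ClassifierCV", "cv_mcc": 0.9198661467253358, "bts_mcc": 0.82},
]

# Sort by blind-test MCC, breaking ties with the cross-validated MCC.
ranked = sorted(results, key=lambda r: (r["bts_mcc"], r["cv_mcc"]), reverse=True)

for r in ranked:
    # A large gap between cross-validated and blind-test MCC suggests
    # the model generalises less well than the CV score implies.
    gap = r["cv_mcc"] - r["bts_mcc"]
    print(f'{r["model"]:<20} cv_mcc={r["cv_mcc"]:.2f} '
          f'bts_mcc={r["bts_mcc"]:.2f} gap={gap:.2f}')
```

On these figures QDA shows by far the largest drop from cross-validation to the blind test, while Gradient Boosting and the two ridge classifiers stay within roughly 0.10 to 0.12 of their cross-validated MCC.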