18779 lines
894 KiB
Text
/home/tanu/git/LSHTM_analysis/scripts/ml/ml_data_sl.py:549: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
mask_check.sort_values(by = ['ligand_distance'], ascending = True, inplace = True)
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/xgboost/compat.py:36: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.
from pandas import MultiIndex, Int64Index
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
1.22.4
1.4.1

aaindex_df contains non-numerical data

Total no. of non-numerial columns: 2

Selecting numerical data only

PASS: successfully selected numerical columns only for aaindex_df

Now checking for NA in the remaining aaindex_cols

Counting aaindex_df cols with NA
ncols with NA: 4 columns
Dropping these...
Original ncols: 127

Revised df ncols: 123

Checking NA in revised df...

PASS: cols with NA successfully dropped from aaindex_df
Proceeding with combining aa_df with other features_df

PASS: ncols match
Expected ncols: 123
Got: 123

Total no. of columns in clean aa_df: 123

Proceeding to merge, expected nrows in merged_df: 531

PASS: my_features_df and aa_df successfully combined
nrows: 531
ncols: 286
count of NULL values before imputation

or_mychisq 263
log10_or_mychisq 263
dtype: int64
count of NULL values AFTER imputation

mutationinformation 0
or_rawI 0
logorI 0
dtype: int64

PASS: OR values imputed, data ready for ML

Total no. of features for aaindex: 123

No. of numerical features: 167
No. of categorical features: 7

PASS: x_features has no target variable

No. of columns for x_features: 174

-------------------------------------------------------------
Successfully split data according to scaling law: 1/np.sqrt(x_ncols)
Train data size: (109, 174)
Test data size: 0.07580980435789034 (10, 174)
y_train numbers: Counter({0: 70, 1: 39})
y_train ratio: 1.794871794871795

y_test_numbers: Counter({0: 6, 1: 4})
y_test ratio: 1.5
-------------------------------------------------------------

Simple Random OverSampling
Counter({0: 70, 1: 70})
(140, 174)

Simple Random UnderSampling
Counter({0: 39, 1: 39})
(78, 174)

Simple Combined Over and UnderSampling
Counter({0: 70, 1: 70})
(140, 174)

SMOTE_NC OverSampling
Counter({0: 70, 1: 70})
(140, 174)

#####################################################################

Running ML analysis: scaling law split
Gene name: gid
Drug name: streptomycin

Output directory: /home/tanu/git/Data/streptomycin/output/ml/tts_sl/
Sanity checks:
ML source data size: (119, 174)
Total input features: (109, 174)
Target feature numbers: Counter({0: 70, 1: 39})
Target features ratio: 1.794871794871795

#####################################################################


================================================================

Strucutral features (n): 35
These are:
Common stablity features: ['ligand_distance', 'ligand_affinity_change', 'duet_stability_change', 'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts', 'mcsm_na_affinity']
FoldX columns: ['electro_rr', 'electro_mm', 'electro_sm', 'electro_ss', 'disulfide_rr', 'disulfide_mm', 'disulfide_sm', 'disulfide_ss', 'hbonds_rr', 'hbonds_mm', 'hbonds_sm', 'hbonds_ss', 'partcov_rr', 'partcov_mm', 'partcov_sm', 'partcov_ss', 'vdwclashes_rr', 'vdwclashes_mm', 'vdwclashes_sm', 'vdwclashes_ss', 'volumetric_rr', 'volumetric_mm', 'volumetric_ss']
Other struc columns: ['rsa', 'kd_values', 'rd_values']
================================================================

AAindex features (n): 123
================================================================

Evolutionary features (n): 3
These are:
['consurf_score', 'snap2_score', 'provean_score']
================================================================

Genomic features (n): 6
These are:
['maf', 'logorI']
['lineage_proportion', 'dist_lineage_proportion', 'lineage_count_all', 'lineage_count_unique']
================================================================

Categorical features (n): 7
These are:
['ss_class', 'aa_prop_change', 'electrostatics_change', 'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site']
================================================================


Pass: No. of features match

#####################################################################


Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=167)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegression(random_state=42))])

key: fit_time
value: [0.02453518 0.030195 0.02754903 0.02711487 0.0307312 0.02705741
0.02809882 0.02859855 0.04155993 0.03637624]

mean value: 0.030181622505187987

key: score_time
value: [0.01295424 0.01227617 0.01204991 0.01202321 0.01229239 0.01206994
0.01218534 0.01203108 0.01242352 0.01243353]

mean value: 0.012273931503295898

key: test_mcc
value: [0. 0.21428571 0.21428571 0.21428571 1. 0.38575837
0.44854261 0.62360956 0.13363062 0.52380952]

mean value: 0.37582078405629626

key: train_mcc
value: [0.86978258 0.86680492 0.88878629 0.86680492 0.8449259 0.86680492
0.89113279 0.8449259 0.84852814 0.86933707]

mean value: 0.8657833441636803

key: test_accuracy
value: [0.63636364 0.63636364 0.63636364 0.63636364 1. 0.72727273
0.72727273 0.81818182 0.63636364 0.8 ]

mean value: 0.7254545454545455

key: train_accuracy
value: [0.93877551 0.93877551 0.94897959 0.93877551 0.92857143 0.93877551
0.94897959 0.92857143 0.92857143 0.93939394]

mean value: 0.9378169449598022
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
|
|
key: test_fscore
|
|
value: [0. 0.5 0.5 0.5 1. 0.57142857
|
|
0.66666667 0.66666667 0.33333333 0.66666667]
|
|
|
|
mean value: 0.5404761904761904
|
|
|
|
key: train_fscore
|
|
value: [0.90625 0.90909091 0.92537313 0.90909091 0.89230769 0.90909091
|
|
0.92307692 0.89230769 0.88888889 0.91176471]
|
|
|
|
mean value: 0.9067241764064634
|
|
|
|
key: test_precision
|
|
value: [0. 0.5 0.5 0.5 1. 0.66666667
|
|
0.6 1. 0.5 0.66666667]
|
|
|
|
mean value: 0.5933333333333333
|
|
|
|
key: train_precision
|
|
value: [1. 0.96774194 0.96875 0.96774194 0.96666667 0.96774194
|
|
1. 0.96666667 1. 0.96875 ]
|
|
|
|
mean value: 0.9774059139784946
|
|
|
|
key: test_recall
|
|
value: [0. 0.5 0.5 0.5 1. 0.5
|
|
0.75 0.5 0.25 0.66666667]
|
|
|
|
mean value: 0.5166666666666666
|
|
|
|
key: train_recall
|
|
value: [0.82857143 0.85714286 0.88571429 0.85714286 0.82857143 0.85714286
|
|
0.85714286 0.82857143 0.8 0.86111111]
|
|
|
|
mean value: 0.8461111111111111
|
|
|
|
key: test_roc_auc
|
|
value: [0.5 0.60714286 0.60714286 0.60714286 1. 0.67857143
|
|
0.73214286 0.75 0.55357143 0.76190476]
|
|
|
|
mean value: 0.6797619047619048
|
|
|
|
key: train_roc_auc
|
|
value: [0.91428571 0.92063492 0.93492063 0.92063492 0.90634921 0.92063492
|
|
0.92857143 0.90634921 0.9 0.92261905]
|
|
|
|
mean value: 0.9175000000000001
|
|
|
|
key: test_jcc
|
|
value: [0. 0.33333333 0.33333333 0.33333333 1. 0.4
|
|
0.5 0.5 0.2 0.5 ]
|
|
|
|
mean value: 0.41
|
|
|
|
key: train_jcc
|
|
value: [0.82857143 0.83333333 0.86111111 0.83333333 0.80555556 0.83333333
|
|
0.85714286 0.80555556 0.8 0.83783784]
|
|
|
|
mean value: 0.8295774345774346
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.6318655 0.6620369 0.63318753 0.63215232 0.62510443 0.76653695
|
|
0.64627314 0.66265655 1.02622437 0.67012167]
|
|
|
|
mean value: 0.6956159353256226
|
|
|
|
key: score_time
|
|
value: [0.01334572 0.0123713 0.0124464 0.01250243 0.0123589 0.01255822
|
|
0.01497769 0.01635933 0.01404405 0.01240587]
|
|
|
|
mean value: 0.013336992263793946
|
|
|
|
key: test_mcc
|
|
value: [0.41833001 0.38575837 0.21428571 0.21428571 0.82807867 0.38575837
|
|
0.69006556 0.81009259 0.38575837 0.52380952]
|
|
|
|
mean value: 0.48562229082178415
|
|
|
|
key: train_mcc
|
|
value: [0.95595301 0.93314074 0.91089125 0.75799004 0.91089125 0.88878629
|
|
0.95595301 1. 1. 0.91253747]
|
|
|
|
mean value: 0.9226143059319446
|
|
|
|
key: test_accuracy
|
|
value: [0.72727273 0.72727273 0.63636364 0.63636364 0.90909091 0.72727273
|
|
0.81818182 0.90909091 0.72727273 0.8 ]
|
|
|
|
mean value: 0.7618181818181818
|
|
|
|
key: train_accuracy
|
|
value: [0.97959184 0.96938776 0.95918367 0.8877551 0.95918367 0.94897959
|
|
0.97959184 1. 1. 0.95959596]
|
|
|
|
mean value: 0.9643269428983714
|
|
|
|
key: test_fscore
|
|
value: [0.4 0.57142857 0.5 0.5 0.88888889 0.57142857
|
|
0.8 0.85714286 0.57142857 0.66666667]
|
|
|
|
mean value: 0.6326984126984128
|
|
|
|
key: train_fscore
|
|
value: [0.97058824 0.95652174 0.94117647 0.81967213 0.94117647 0.92537313
|
|
0.97058824 1. 1. 0.94285714]
|
|
|
|
mean value: 0.9467953559228183
|
|
|
|
key: test_precision
|
|
value: [1. 0.66666667 0.5 0.5 0.8 0.66666667
|
|
0.66666667 1. 0.66666667 0.66666667]
|
|
|
|
mean value: 0.7133333333333334
|
|
|
|
key: train_precision
|
|
value: [1. 0.97058824 0.96969697 0.96153846 0.96969697 0.96875
|
|
1. 1. 1. 0.97058824]
|
|
|
|
mean value: 0.9810858871520636
|
|
|
|
key: test_recall
|
|
value: [0.25 0.5 0.5 0.5 1. 0.5
|
|
1. 0.75 0.5 0.66666667]
|
|
|
|
mean value: 0.6166666666666667
|
|
|
|
key: train_recall
|
|
value: [0.94285714 0.94285714 0.91428571 0.71428571 0.91428571 0.88571429
|
|
0.94285714 1. 1. 0.91666667]
|
|
|
|
mean value: 0.9173809523809524
|
|
|
|
key: test_roc_auc
|
|
value: [0.625 0.67857143 0.60714286 0.60714286 0.92857143 0.67857143
|
|
0.85714286 0.875 0.67857143 0.76190476]
|
|
|
|
mean value: 0.7297619047619047
|
|
|
|
key: train_roc_auc
|
|
value: [0.97142857 0.96349206 0.94920635 0.84920635 0.94920635 0.93492063
|
|
0.97142857 1. 1. 0.95039683]
|
|
|
|
mean value: 0.9539285714285715
|
|
|
|
key: test_jcc
|
|
value: [0.25 0.4 0.33333333 0.33333333 0.8 0.4
|
|
0.66666667 0.75 0.4 0.5 ]
|
|
|
|
mean value: 0.48333333333333334
|
|
|
|
key: train_jcc
|
|
value: [0.94285714 0.91666667 0.88888889 0.69444444 0.88888889 0.86111111
|
|
0.94285714 1. 1. 0.89189189]
|
|
|
|
mean value: 0.9027606177606178
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01464987 0.00926733 0.00862837 0.00862694 0.0085156 0.00847125
|
|
0.00865388 0.0086422 0.00871873 0.00910592]
|
|
|
|
mean value: 0.009328007698059082
|
|
|
|
key: score_time
|
|
value: [0.0125277 0.00906873 0.00880456 0.00855637 0.00875044 0.0086751
|
|
0.00854564 0.00900102 0.00869846 0.00857162]
|
|
|
|
mean value: 0.009119963645935059
|
|
|
|
key: test_mcc
|
|
value: [-0.06900656 -0.13363062 0.03857584 0.06900656 0.3105295 -0.21428571
|
|
0.17857143 0.21428571 0.21428571 0.32732684]
|
|
|
|
mean value: 0.09356586964495016
|
|
|
|
key: train_mcc
|
|
value: [0.30219324 0.40233433 0.47469288 0.66470299 0.35901099 0.38729833
|
|
0.34426519 0.47336463 0.32995002 0.42431986]
|
|
|
|
mean value: 0.41621324666625875
|
|
|
|
key: test_accuracy
|
|
value: [0.45454545 0.36363636 0.45454545 0.54545455 0.63636364 0.36363636
|
|
0.54545455 0.63636364 0.63636364 0.5 ]
|
|
|
|
mean value: 0.5136363636363637
|
|
|
|
key: train_accuracy
|
|
value: [0.63265306 0.68367347 0.7244898 0.84693878 0.64285714 0.66326531
|
|
0.64285714 0.70408163 0.63265306 0.6969697 ]
|
|
|
|
mean value: 0.68704390847248
|
|
|
|
key: test_fscore
|
|
value: [0.4 0.46153846 0.5 0.44444444 0.6 0.36363636
|
|
0.54545455 0.5 0.5 0.54545455]
|
|
|
|
mean value: 0.48605283605283606
|
|
|
|
key: train_fscore
|
|
value: [0.59090909 0.64367816 0.68235294 0.7826087 0.62365591 0.63736264
|
|
0.61538462 0.68131868 0.60869565 0.65909091]
|
|
|
|
mean value: 0.6525057297966527
|
|
|
|
key: test_precision
|
|
value: [0.33333333 0.33333333 0.375 0.4 0.5 0.28571429
|
|
0.42857143 0.5 0.5 0.375 ]
|
|
|
|
mean value: 0.40309523809523806
|
|
|
|
key: train_precision
|
|
value: [0.49056604 0.53846154 0.58 0.79411765 0.5 0.51785714
|
|
0.5 0.55357143 0.49122807 0.55769231]
|
|
|
|
mean value: 0.5523494172552529
|
|
|
|
key: test_recall
|
|
value: [0.5 0.75 0.75 0.5 0.75 0.5 0.75 0.5 0.5 1. ]
|
|
|
|
mean value: 0.65
|
|
|
|
key: train_recall
|
|
value: [0.74285714 0.8 0.82857143 0.77142857 0.82857143 0.82857143
|
|
0.8 0.88571429 0.8 0.80555556]
|
|
|
|
mean value: 0.8091269841269841
|
|
|
|
key: test_roc_auc
|
|
value: [0.46428571 0.44642857 0.51785714 0.53571429 0.66071429 0.39285714
|
|
0.58928571 0.60714286 0.60714286 0.64285714]
|
|
|
|
mean value: 0.5464285714285714
|
|
|
|
key: train_roc_auc
|
|
value: [0.65714286 0.70952381 0.74761905 0.83015873 0.68412698 0.7
|
|
0.67777778 0.74444444 0.66984127 0.7202381 ]
|
|
|
|
mean value: 0.7140873015873016
|
|
|
|
key: test_jcc
|
|
value: [0.25 0.3 0.33333333 0.28571429 0.42857143 0.22222222
|
|
0.375 0.33333333 0.33333333 0.375 ]
|
|
|
|
mean value: 0.32365079365079363
|
|
|
|
key: train_jcc
|
|
value: [0.41935484 0.47457627 0.51785714 0.64285714 0.453125 0.46774194
|
|
0.44444444 0.51666667 0.4375 0.49152542]
|
|
|
|
mean value: 0.48656488659341995
|
|
|
|
MCC on Blind test: 0.67
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00878239 0.00921321 0.00866842 0.00868893 0.00864553 0.00862384
|
|
0.00863862 0.0087378 0.00851679 0.00869918]
|
|
|
|
mean value: 0.008721470832824707
|
|
|
|
key: score_time
|
|
value: [0.00925446 0.00853109 0.00853133 0.00859833 0.00865149 0.00865412
|
|
0.00859571 0.00850534 0.0085783 0.00853562]
|
|
|
|
mean value: 0.008643579483032227
|
|
|
|
key: test_mcc
|
|
value: [ 0.41833001 -0.03857584 -0.35634832 0.13363062 0.81009259 0.13363062
|
|
0.44854261 -0.35634832 -0.23904572 0.04761905]
|
|
|
|
mean value: 0.10015272992148226
|
|
|
|
key: train_mcc
|
|
value: [0.58972429 0.61549072 0.62652663 0.56504712 0.54087139 0.5890183
|
|
0.64246907 0.61976141 0.58972429 0.58367897]
|
|
|
|
mean value: 0.5962312182242981
|
|
|
|
key: test_accuracy
|
|
value: [0.72727273 0.54545455 0.45454545 0.63636364 0.90909091 0.63636364
|
|
0.72727273 0.45454545 0.54545455 0.6 ]
|
|
|
|
mean value: 0.6236363636363637
|
|
|
|
key: train_accuracy
|
|
value: [0.81632653 0.82653061 0.82653061 0.80612245 0.79591837 0.81632653
|
|
0.83673469 0.82653061 0.81632653 0.80808081]
|
|
|
|
mean value: 0.8175427746856319
|
|
|
|
key: test_fscore
|
|
value: [0.4 0.28571429 0. 0.33333333 0.85714286 0.33333333
|
|
0.66666667 0. 0. 0.33333333]
|
|
|
|
mean value: 0.32095238095238093
|
|
|
|
key: train_fscore
|
|
value: [0.7 0.71186441 0.69090909 0.68852459 0.66666667 0.70967742
|
|
0.72413793 0.70175439 0.7 0.66666667]
|
|
|
|
mean value: 0.6960201157540253
|
|
|
|
key: test_precision
|
|
value: [1. 0.33333333 0. 0.5 1. 0.5
|
|
0.6 0. 0. 0.33333333]
|
|
|
|
mean value: 0.42666666666666664
|
|
|
|
key: train_precision
|
|
value: [0.84 0.875 0.95 0.80769231 0.8 0.81481481
|
|
0.91304348 0.90909091 0.84 0.9047619 ]
|
|
|
|
mean value: 0.8654403414620806
|
|
|
|
key: test_recall
|
|
value: [0.25 0.25 0. 0.25 0.75 0.25
|
|
0.75 0. 0. 0.33333333]
|
|
|
|
mean value: 0.2833333333333333
|
|
|
|
key: train_recall
|
|
value: [0.6 0.6 0.54285714 0.6 0.57142857 0.62857143
|
|
0.6 0.57142857 0.6 0.52777778]
|
|
|
|
mean value: 0.5842063492063492
|
|
|
|
key: test_roc_auc
|
|
value: [0.625 0.48214286 0.35714286 0.55357143 0.875 0.55357143
|
|
0.73214286 0.35714286 0.42857143 0.52380952]
|
|
|
|
mean value: 0.5488095238095239
|
|
|
|
key: train_roc_auc
|
|
value: [0.76825397 0.77619048 0.76349206 0.76031746 0.74603175 0.77460317
|
|
0.78412698 0.76984127 0.76825397 0.74801587]
|
|
|
|
mean value: 0.7659126984126984
|
|
|
|
key: test_jcc
|
|
value: [0.25 0.16666667 0. 0.2 0.75 0.2
|
|
0.5 0. 0. 0.2 ]
|
|
|
|
mean value: 0.22666666666666668
|
|
|
|
key: train_jcc
|
|
value: [0.53846154 0.55263158 0.52777778 0.525 0.5 0.55
|
|
0.56756757 0.54054054 0.53846154 0.5 ]
|
|
|
|
mean value: 0.5340440541756332
|
|
|
|
MCC on Blind test: 0.36
|
|
|
|
Accuracy on Blind test: 0.7
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.00848818 0.01176858 0.00879884 0.00929117 0.0093472 0.00941753
|
|
0.0095644 0.00954175 0.00954604 0.0095284 ]
|
|
|
|
mean value: 0.009529209136962891
|
|
|
|
key: score_time
|
|
value: [0.05046916 0.03067589 0.01002026 0.01012683 0.0101347 0.0102005
|
|
0.01021075 0.01025558 0.01041079 0.01048422]
|
|
|
|
mean value: 0.016298866271972655
|
|
|
|
key: test_mcc
|
|
value: [-0.23904572 0.13363062 0.38575837 -0.03857584 0.81009259 0.13363062
|
|
-0.03857584 0.38575837 -0.03857584 0.21821789]
|
|
|
|
mean value: 0.1712315234921411
|
|
|
|
key: train_mcc
|
|
value: [0.49201849 0.44316559 0.36232865 0.44248774 0.36787951 0.49201849
|
|
0.41560471 0.46664388 0.49368586 0.45456865]
|
|
|
|
mean value: 0.4430401554934932
|
|
|
|
key: test_accuracy
|
|
value: [0.54545455 0.63636364 0.72727273 0.54545455 0.90909091 0.63636364
|
|
0.54545455 0.72727273 0.54545455 0.7 ]
|
|
|
|
mean value: 0.6518181818181819
|
|
|
|
key: train_accuracy
|
|
value: [0.7755102 0.75510204 0.7244898 0.75510204 0.7244898 0.7755102
|
|
0.74489796 0.76530612 0.7755102 0.75757576]
|
|
|
|
mean value: 0.7553494124922696
|
|
|
|
key: test_fscore
|
|
value: [0. 0.33333333 0.57142857 0.28571429 0.85714286 0.33333333
|
|
0.28571429 0.57142857 0.28571429 0.4 ]
|
|
|
|
mean value: 0.3923809523809524
|
|
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
|
|
_warn_prf(average, modifier, msg_start, len(result))
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
|
|
_warn_prf(average, modifier, msg_start, len(result))
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
|
|
_warn_prf(average, modifier, msg_start, len(result))
|
|
key: train_fscore
|
|
value: [0.63333333 0.6 0.50909091 0.55555556 0.54237288 0.63333333
|
|
0.56140351 0.59649123 0.64516129 0.5862069 ]
|
|
|
|
mean value: 0.5862948936385473
|
|
|
|
key: test_precision
|
|
value: [0. 0.5 0.66666667 0.33333333 1. 0.5
|
|
0.33333333 0.66666667 0.33333333 0.5 ]
|
|
|
|
mean value: 0.48333333333333334
|
|
|
|
key: train_precision
|
|
value: [0.76 0.72 0.7 0.78947368 0.66666667 0.76
|
|
0.72727273 0.77272727 0.74074074 0.77272727]
|
|
|
|
mean value: 0.7409608364345206
|
|
|
|
key: test_recall
|
|
value: [0. 0.25 0.5 0.25 0.75 0.25
|
|
0.25 0.5 0.25 0.33333333]
|
|
|
|
mean value: 0.3333333333333333
|
|
|
|
key: train_recall
|
|
value: [0.54285714 0.51428571 0.4 0.42857143 0.45714286 0.54285714
|
|
0.45714286 0.48571429 0.57142857 0.47222222]
|
|
|
|
mean value: 0.4872222222222222
|
|
|
|
key: test_roc_auc
|
|
value: [0.42857143 0.55357143 0.67857143 0.48214286 0.875 0.55357143
|
|
0.48214286 0.67857143 0.48214286 0.5952381 ]
|
|
|
|
mean value: 0.580952380952381
|
|
|
|
key: train_roc_auc
|
|
value: [0.72380952 0.7015873 0.65238095 0.68253968 0.66507937 0.72380952
|
|
0.68095238 0.7031746 0.73015873 0.69642857]
|
|
|
|
mean value: 0.6959920634920634
|
|
|
|
key: test_jcc
|
|
value: [0. 0.2 0.4 0.16666667 0.75 0.2
|
|
0.16666667 0.4 0.16666667 0.25 ]
|
|
|
|
mean value: 0.27
|
|
|
|
key: train_jcc
|
|
value: [0.46341463 0.42857143 0.34146341 0.38461538 0.37209302 0.46341463
|
|
0.3902439 0.425 0.47619048 0.41463415]
|
|
|
|
mean value: 0.415964104434042
|
|
|
|
MCC on Blind test: 0.17
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01114202 0.01079059 0.01085377 0.01060557 0.01079392 0.00984693
|
|
0.01081705 0.01012373 0.01036024 0.01075292]
|
|
|
|
mean value: 0.010608673095703125
|
|
|
|
key: score_time
|
|
value: [0.00997567 0.0099051 0.01002312 0.00994086 0.010077 0.01012874
|
|
0.00993395 0.00938058 0.00973964 0.00926208]
|
|
|
|
mean value: 0.009836673736572266
|
|
|
|
key: test_mcc
|
|
value: [0. 0.13363062 0.13363062 0.41833001 0. 0.41833001
|
|
0.41833001 0. 0.41833001 0.21821789]
|
|
|
|
mean value: 0.21587991852165678
|
|
|
|
key: train_mcc
|
|
value: [0.57035183 0.6363961 0.6146363 0.59263776 0.59263776 0.54772256
|
|
0.57035183 0.54772256 0.6146363 0.55901699]
|
|
|
|
mean value: 0.5846109973682458
|
|
|
|
key: test_accuracy
|
|
value: [0.63636364 0.63636364 0.63636364 0.72727273 0.63636364 0.72727273
|
|
0.72727273 0.63636364 0.72727273 0.7 ]
|
|
|
|
mean value: 0.6790909090909091
|
|
|
|
key: train_accuracy
|
|
value: [0.79591837 0.82653061 0.81632653 0.80612245 0.80612245 0.78571429
|
|
0.79591837 0.78571429 0.81632653 0.78787879]
|
|
|
|
mean value: 0.8022572665429808
|
|
|
|
key: test_fscore
|
|
value: [0. 0.33333333 0.33333333 0.4 0. 0.4
|
|
0.4 0. 0.4 0.4 ]
|
|
|
|
mean value: 0.26666666666666666
|
|
|
|
key: train_fscore
|
|
value: [0.6 0.67924528 0.65384615 0.62745098 0.62745098 0.57142857
|
|
0.6 0.57142857 0.65384615 0.58823529]
|
|
|
|
mean value: 0.6172931988470279
|
|
|
|
key: test_precision
|
|
value: [0. 0.5 0.5 1. 0. 1. 1. 0. 1. 0.5]
|
|
|
|
mean value: 0.55
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0. 0.25 0.25 0.25 0. 0.25
|
|
0.25 0. 0.25 0.33333333]
|
|
|
|
mean value: 0.18333333333333332
|
|
|
|
key: train_recall
|
|
value: [0.42857143 0.51428571 0.48571429 0.45714286 0.45714286 0.4
|
|
0.42857143 0.4 0.48571429 0.41666667]
|
|
|
|
mean value: 0.4473809523809524
|
|
|
|
key: test_roc_auc
|
|
value: [0.5 0.55357143 0.55357143 0.625 0.5 0.625
|
|
0.625 0.5 0.625 0.5952381 ]
|
|
|
|
mean value: 0.5702380952380952
|
|
|
|
key: train_roc_auc
|
|
value: [0.71428571 0.75714286 0.74285714 0.72857143 0.72857143 0.7
|
|
0.71428571 0.7 0.74285714 0.70833333]
|
|
|
|
mean value: 0.7236904761904762
|
|
|
|
key: test_jcc
|
|
value: [0. 0.2 0.2 0.25 0. 0.25 0.25 0. 0.25 0.25]
|
|
|
|
mean value: 0.165
|
|
|
|
key: train_jcc
|
|
value: [0.42857143 0.51428571 0.48571429 0.45714286 0.45714286 0.4
|
|
0.42857143 0.4 0.48571429 0.41666667]
|
|
|
|
mean value: 0.4473809523809524
|
|
|
|
MCC on Blind test: 0.61
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py:702: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (500) reached and the optimization hasn't converged yet.
|
|
warnings.warn(
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.52748394 0.58608747 0.57734966 0.51142216 0.66457558 0.65659547
|
|
0.45914388 0.5262475 0.48919296 0.69721627]
|
|
|
|
mean value: 0.5695314884185791
|
|
|
|
key: score_time
|
|
value: [0.01240587 0.01868153 0.0124526 0.01238823 0.01570678 0.01239204
|
|
0.01241803 0.01240945 0.01244521 0.01239586]
|
|
|
|
mean value: 0.013369560241699219
|
|
|
|
key: test_mcc
|
|
value: [0.60714286 0.13363062 0.21428571 0.21428571 0.60714286 0.44854261
|
|
0.46291005 0.60714286 0.38575837 0.21821789]
|
|
|
|
mean value: 0.389905954955624
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.81818182 0.63636364 0.63636364 0.63636364 0.81818182 0.72727273
|
|
0.63636364 0.81818182 0.72727273 0.7 ]
|
|
|
|
mean value: 0.7154545454545455
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.75 0.33333333 0.5 0.5 0.75 0.66666667
|
|
0.66666667 0.75 0.57142857 0.4 ]
|
|
|
|
mean value: 0.5888095238095238
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.75 0.5 0.5 0.5 0.75 0.6
|
|
0.5 0.75 0.66666667 0.5 ]
|
|
|
|
mean value: 0.6016666666666667
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.75 0.25 0.5 0.5 0.75 0.75
|
|
1. 0.75 0.5 0.33333333]
|
|
|
|
mean value: 0.6083333333333333
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.80357143 0.55357143 0.60714286 0.60714286 0.80357143 0.73214286
|
|
0.71428571 0.80357143 0.67857143 0.5952381 ]
|
|
|
|
mean value: 0.6898809523809524
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.6 0.2 0.33333333 0.33333333 0.6 0.5
|
|
0.5 0.6 0.4 0.25 ]
|
|
|
|
mean value: 0.43166666666666664
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0163362 0.01244402 0.01193404 0.01061082 0.01098156 0.0116725
|
|
0.01134181 0.01184058 0.01182842 0.01150846]
|
|
|
|
mean value: 0.01204984188079834
|
|
|
|
key: score_time
|
|
value: [0.01191998 0.0099411 0.00993514 0.0090301 0.0094378 0.00917912
|
|
0.00951123 0.00942707 0.00954127 0.01003218]
|
|
|
|
mean value: 0.009795498847961426
|
|
|
|
key: test_mcc
|
|
value: [0.38575837 0.82807867 0.82807867 1. 1. 0.60714286
|
|
0.60714286 0.82807867 0.60714286 0.76376262]
|
|
|
|
mean value: 0.745518557579225
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.72727273 0.90909091 0.90909091 1. 1. 0.81818182
|
|
0.81818182 0.90909091 0.81818182 0.9 ]
|
|
|
|
mean value: 0.8809090909090909
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.57142857 0.88888889 0.88888889 1. 1. 0.75
|
|
0.75 0.88888889 0.75 0.8 ]
|
|
|
|
mean value: 0.8288095238095239
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.66666667 0.8 0.8 1. 1. 0.75
|
|
0.75 0.8 0.75 1. ]
|
|
|
|
mean value: 0.8316666666666667
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.5 1. 1. 1. 1. 0.75
|
|
0.75 1. 0.75 0.66666667]
|
|
|
|
mean value: 0.8416666666666667
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.67857143 0.92857143 0.92857143 1. 1. 0.80357143
|
|
0.80357143 0.92857143 0.80357143 0.83333333]
|
|
|
|
mean value: 0.8708333333333333
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.4 0.8 0.8 1. 1. 0.6
|
|
0.6 0.8 0.6 0.66666667]
|
|
|
|
mean value: 0.7266666666666667
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.67
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.09402609 0.09229231 0.09156442 0.08829975 0.08816385 0.08814311
|
|
0.08624172 0.08659744 0.0860436 0.08706236]
|
|
|
|
mean value: 0.0888434648513794
|
|
|
|
key: score_time
|
|
value: [0.01869106 0.01854372 0.01791668 0.01841903 0.01806092 0.01795483
|
|
0.01827717 0.01738906 0.01821613 0.01852036]
|
|
|
|
mean value: 0.018198895454406738
|
|
|
|
key: test_mcc
|
|
value: [0.62360956 0.06900656 0.38575837 0.44854261 0.38575837 0.21428571
|
|
0.21428571 0.62360956 0.62360956 0.50917508]
|
|
|
|
mean value: 0.409764111849294
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.81818182 0.54545455 0.72727273 0.72727273 0.72727273 0.63636364
|
|
0.63636364 0.81818182 0.81818182 0.8 ]
|
|
|
|
mean value: 0.7254545454545455
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.44444444 0.57142857 0.66666667 0.57142857 0.5
|
|
0.5 0.66666667 0.66666667 0.5 ]
|
|
|
|
mean value: 0.5753968253968254
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.4 0.66666667 0.6 0.66666667 0.5
|
|
0.5 1. 1. 1. ]
|
|
|
|
mean value: 0.7333333333333333
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.5 0.5 0.5 0.75 0.5 0.5
|
|
0.5 0.5 0.5 0.33333333]
|
|
|
|
mean value: 0.5083333333333333
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.75 0.53571429 0.67857143 0.73214286 0.67857143 0.60714286
|
|
0.60714286 0.75 0.75 0.66666667]
|
|
|
|
mean value: 0.6755952380952381
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.28571429 0.4 0.5 0.4 0.33333333
|
|
0.33333333 0.5 0.5 0.33333333]
|
|
|
|
mean value: 0.4085714285714286
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00997877 0.00966382 0.00917029 0.009238 0.0097146 0.00969219
|
|
0.00935483 0.00956774 0.00983381 0.0093317 ]
|
|
|
|
mean value: 0.009554576873779298
|
|
|
|
key: score_time
|
|
value: [0.00899959 0.008461 0.00906372 0.00885987 0.00935435 0.00913429
|
|
0.00945234 0.0089314 0.0090282 0.00912428]
|
|
|
|
mean value: 0.009040904045104981
|
|
|
|
key: test_mcc
|
|
value: [ 0.62360956 -0.3105295 0.38575837 0.57142857 0.82807867 0.13363062
|
|
0.21428571 1. -0.03857584 -0.08908708]
|
|
|
|
mean value: 0.3318599097416819
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.81818182 0.36363636 0.72727273 0.72727273 0.90909091 0.63636364
|
|
0.63636364 1. 0.54545455 0.5 ]
|
|
|
|
mean value: 0.6863636363636364
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.22222222 0.57142857 0.72727273 0.88888889 0.33333333
|
|
0.5 1. 0.28571429 0.28571429]
|
|
|
|
mean value: 0.5481240981240981
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.2 0.66666667 0.57142857 0.8 0.5
|
|
0.5 1. 0.33333333 0.25 ]
|
|
|
|
mean value: 0.5821428571428572
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.5 0.25 0.5 1. 1. 0.25
|
|
0.5 1. 0.25 0.33333333]
|
|
|
|
mean value: 0.5583333333333333
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.75 0.33928571 0.67857143 0.78571429 0.92857143 0.55357143
|
|
0.60714286 1. 0.48214286 0.45238095]
|
|
|
|
mean value: 0.6577380952380952
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.125 0.4 0.57142857 0.8 0.2
|
|
0.33333333 1. 0.16666667 0.16666667]
|
|
|
|
mean value: 0.4263095238095238
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.1
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.1096015 1.08083749 1.08970332 1.12753987 1.06541085 1.056283
|
|
1.06098294 1.083395 1.07819152 1.06367922]
|
|
|
|
mean value: 1.0815624713897705
|
|
|
|
key: score_time
|
|
value: [0.09377718 0.09102583 0.0940125 0.09139752 0.08611059 0.08667684
|
|
0.08917809 0.09408355 0.08660865 0.09382653]
|
|
|
|
mean value: 0.09066972732543946
|
|
|
|
key: test_mcc
|
|
value: [0.41833001 0.60714286 0.62360956 0.81009259 0.60714286 0.38575837
|
|
0.44854261 1. 0.41833001 0.50917508]
|
|
|
|
mean value: 0.5828123958278172
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.72727273 0.81818182 0.81818182 0.90909091 0.81818182 0.72727273
|
|
0.72727273 1. 0.72727273 0.8 ]
|
|
|
|
mean value: 0.8072727272727273
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.4 0.75 0.66666667 0.85714286 0.75 0.57142857
|
|
0.66666667 1. 0.4 0.5 ]
|
|
|
|
mean value: 0.6561904761904762
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.75 1. 1. 0.75 0.66666667
|
|
0.6 1. 1. 1. ]
|
|
|
|
mean value: 0.8766666666666667
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.25 0.75 0.5 0.75 0.75 0.5
|
|
0.75 1. 0.25 0.33333333]
|
|
|
|
mean value: 0.5833333333333334
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.625 0.80357143 0.75 0.875 0.80357143 0.67857143
|
|
0.73214286 1. 0.625 0.66666667]
|
|
|
|
mean value: 0.7559523809523809
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.25 0.6 0.5 0.75 0.6 0.4
|
|
0.5 1. 0.25 0.33333333]
|
|
|
|
mean value: 0.5183333333333333
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000...05', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
|
|
_warn_prf(average, modifier, msg_start, len(result))
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
|
|
key: fit_time
|
|
value: [1.7521224 0.92771101 0.87955499 0.98863387 0.87796545 0.96173191
|
|
0.81847405 0.84688282 0.88875556 0.87859821]
|
|
|
|
mean value: 0.9820430278778076
|
|
|
|
key: score_time
|
|
value: [0.22265458 0.22081852 0.21812773 0.23905349 0.11643434 0.23435545
|
|
0.21912909 0.1910212 0.23325014 0.12879086]
|
|
|
|
mean value: 0.20236353874206542
|
|
|
|
key: test_mcc
|
|
value: [0. 0.81009259 0.62360956 0.81009259 0.60714286 0.62360956
|
|
0.41833001 0.81009259 0.41833001 0.50917508]
|
|
|
|
mean value: 0.5630474851721843
|
|
|
|
key: train_mcc
|
|
value: [0.91259839 0.86680492 0.93419873 0.89113279 0.89113279 0.91259839
|
|
0.86978258 0.91259839 0.93419873 0.89319321]
|
|
|
|
mean value: 0.901823893018008
|
|
|
|
key: test_accuracy
|
|
value: [0.63636364 0.90909091 0.81818182 0.90909091 0.81818182 0.81818182
|
|
0.72727273 0.90909091 0.72727273 0.8 ]
|
|
|
|
mean value: 0.8072727272727273
|
|
|
|
key: train_accuracy
|
|
value: [0.95918367 0.93877551 0.96938776 0.94897959 0.94897959 0.95918367
|
|
0.93877551 0.95918367 0.96938776 0.94949495]
|
|
|
|
mean value: 0.9541331684188827
|
|
|
|
key: test_fscore
|
|
value: [0. 0.85714286 0.66666667 0.85714286 0.75 0.66666667
|
|
0.4 0.85714286 0.4 0.5 ]
|
|
|
|
mean value: 0.5954761904761905
|
|
|
|
key: train_fscore
|
|
value: [0.93939394 0.90909091 0.95522388 0.92307692 0.92307692 0.93939394
|
|
0.90625 0.93939394 0.95522388 0.92537313]
|
|
|
|
mean value: 0.9315497468948961
|
|
|
|
key: test_precision
|
|
value: [0. 1. 1. 1. 0.75 1. 1. 1. 1. 1. ]
|
|
|
|
mean value: 0.875
|
|
|
|
key: train_precision
|
|
value: [1. 0.96774194 1. 1. 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9967741935483871
|
|
|
|
key: test_recall
|
|
value: [0. 0.75 0.5 0.75 0.75 0.5
|
|
0.25 0.75 0.25 0.33333333]
|
|
|
|
mean value: 0.48333333333333334
|
|
|
|
key: train_recall
|
|
value: [0.88571429 0.85714286 0.91428571 0.85714286 0.85714286 0.88571429
|
|
0.82857143 0.88571429 0.91428571 0.86111111]
|
|
|
|
mean value: 0.8746825396825396
|
|
|
|
key: test_roc_auc
|
|
value: [0.5 0.875 0.75 0.875 0.80357143 0.75
|
|
0.625 0.875 0.625 0.66666667]
|
|
|
|
mean value: 0.7345238095238096
|
|
|
|
key: train_roc_auc
|
|
value: [0.94285714 0.92063492 0.95714286 0.92857143 0.92857143 0.94285714
|
|
0.91428571 0.94285714 0.95714286 0.93055556]
|
|
|
|
mean value: 0.9365476190476191
|
|
|
|
key: test_jcc
|
|
value: [0. 0.75 0.5 0.75 0.6 0.5
|
|
0.25 0.75 0.25 0.33333333]
|
|
|
|
mean value: 0.4683333333333333
|
|
|
|
key: train_jcc
|
|
value: [0.88571429 0.83333333 0.91428571 0.85714286 0.85714286 0.88571429
|
|
0.82857143 0.88571429 0.91428571 0.86111111]
|
|
|
|
mean value: 0.8723015873015872
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.9
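
Note on the first fold of Random Forest2 above: test_precision, test_recall, test_fscore, test_jcc and test_mcc are all 0 while test_accuracy is 0.63636364 and test_roc_auc is 0.5. That pattern is what one would expect if no sample in that fold was predicted positive, which is the situation the UndefinedMetricWarning describes. The sketch below reproduces those numbers from confusion-matrix counts. The counts (4 positives, 7 negatives) are an inference from 0.63636364 = 7/11, not something the log states, and `binary_metrics` is a hypothetical helper rather than the script's own code. The roc_auc line assumes the score was computed from hard 0/1 predictions, which the logged values are consistent with.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute the per-fold metrics named in the log from confusion-matrix counts.

    Ratios with a zero denominator are reported as 0.0, mirroring the
    behaviour described in the UndefinedMetricWarning.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "fscore": ratio(2 * precision * recall, precision + recall),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
        "jcc": ratio(tp, tp + fp + fn),
        # With hard 0/1 predictions, ROC AUC is the mean of recall and specificity.
        "roc_auc": (recall + specificity) / 2,
    }


# Assumed fold: 4 positives, 7 negatives, every sample predicted negative.
fold = binary_metrics(tp=0, fp=0, tn=7, fn=4)
for name, value in fold.items():
    print(f"{name}: {value:.8f}")
```

Under the same assumptions, counts of tp=1, fp=0, tn=7, fn=3 give an MCC of about 0.41833, which matches several other per-fold test_mcc entries in this log.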
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02311182 0.00977755 0.00971484 0.00987244 0.00976491 0.00973606
|
|
0.00970364 0.00972319 0.01002288 0.00982809]
|
|
|
|
mean value: 0.011125540733337403
|
|
|
|
key: score_time
|
|
value: [0.01044178 0.00948119 0.00961089 0.00950241 0.00949287 0.0095408
|
|
0.00925827 0.00945497 0.00956225 0.0095191 ]
|
|
|
|
mean value: 0.009586453437805176
|
|
|
|
key: test_mcc
|
|
value: [ 0.41833001 -0.03857584 -0.35634832 0.13363062 0.81009259 0.13363062
|
|
0.44854261 -0.35634832 -0.23904572 0.04761905]
|
|
|
|
mean value: 0.10015272992148226
|
|
|
|
key: train_mcc
|
|
value: [0.58972429 0.61549072 0.62652663 0.56504712 0.54087139 0.5890183
|
|
0.64246907 0.61976141 0.58972429 0.58367897]
|
|
|
|
mean value: 0.5962312182242981
|
|
|
|
key: test_accuracy
|
|
value: [0.72727273 0.54545455 0.45454545 0.63636364 0.90909091 0.63636364
|
|
0.72727273 0.45454545 0.54545455 0.6 ]
|
|
|
|
mean value: 0.6236363636363637
|
|
|
|
key: train_accuracy
|
|
value: [0.81632653 0.82653061 0.82653061 0.80612245 0.79591837 0.81632653
|
|
0.83673469 0.82653061 0.81632653 0.80808081]
|
|
|
|
mean value: 0.8175427746856319
|
|
|
|
key: test_fscore
|
|
value: [0.4 0.28571429 0. 0.33333333 0.85714286 0.33333333
|
|
0.66666667 0. 0. 0.33333333]
|
|
|
|
mean value: 0.32095238095238093
|
|
|
|
key: train_fscore
|
|
value: [0.7 0.71186441 0.69090909 0.68852459 0.66666667 0.70967742
|
|
0.72413793 0.70175439 0.7 0.66666667]
|
|
|
|
mean value: 0.6960201157540253
|
|
|
|
key: test_precision
|
|
value: [1. 0.33333333 0. 0.5 1. 0.5
|
|
0.6 0. 0. 0.33333333]
|
|
|
|
mean value: 0.42666666666666664
|
|
|
|
key: train_precision
|
|
value: [0.84 0.875 0.95 0.80769231 0.8 0.81481481
|
|
0.91304348 0.90909091 0.84 0.9047619 ]
|
|
|
|
mean value: 0.8654403414620806
|
|
|
|
key: test_recall
|
|
value: [0.25 0.25 0. 0.25 0.75 0.25
|
|
0.75 0. 0. 0.33333333]
|
|
|
|
mean value: 0.2833333333333333
|
|
|
|
key: train_recall
|
|
value: [0.6 0.6 0.54285714 0.6 0.57142857 0.62857143
|
|
0.6 0.57142857 0.6 0.52777778]
|
|
|
|
mean value: 0.5842063492063492
|
|
|
|
key: test_roc_auc
|
|
value: [0.625 0.48214286 0.35714286 0.55357143 0.875 0.55357143
|
|
0.73214286 0.35714286 0.42857143 0.52380952]
|
|
|
|
mean value: 0.5488095238095239
|
|
|
|
key: train_roc_auc
|
|
value: [0.76825397 0.77619048 0.76349206 0.76031746 0.74603175 0.77460317
|
|
0.78412698 0.76984127 0.76825397 0.74801587]
|
|
|
|
mean value: 0.7659126984126984
|
|
|
|
key: test_jcc
|
|
value: [0.25 0.16666667 0. 0.2 0.75 0.2
|
|
0.5 0. 0. 0.2 ]
|
|
|
|
mean value: 0.22666666666666668
|
|
|
|
key: train_jcc
|
|
value: [0.53846154 0.55263158 0.52777778 0.525 0.5 0.55
|
|
0.56756757 0.54054054 0.53846154 0.5 ]
|
|
|
|
mean value: 0.5340440541756332
|
|
|
|
MCC on Blind test: 0.36
|
|
|
|
Accuracy on Blind test: 0.7
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.07823944 0.05303431 0.04373765 0.04487467 0.04556298 0.04342461
|
|
0.06485868 0.04147911 0.04427385 0.04476452]
|
|
|
|
mean value: 0.05042498111724854
|
|
|
|
key: score_time
|
|
value: [0.01064777 0.01035261 0.01131892 0.01060677 0.01130295 0.01109171
|
|
0.01039028 0.01056647 0.01087427 0.01112318]
|
|
|
|
mean value: 0.01082749366760254
|
|
|
|
key: test_mcc
|
|
value: [0.38575837 0.82807867 1. 1. 1. 0.60714286
|
|
0.81009259 0.82807867 0.81009259 1. ]
|
|
|
|
mean value: 0.8269243749071702
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.72727273 0.90909091 1. 1. 1. 0.81818182
|
|
0.90909091 0.90909091 0.90909091 1. ]
|
|
|
|
mean value: 0.9181818181818182
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.57142857 0.88888889 1. 1. 1. 0.75
|
|
0.85714286 0.88888889 0.85714286 1. ]
|
|
|
|
mean value: 0.8813492063492063
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.66666667 0.8 1. 1. 1. 0.75
|
|
1. 0.8 1. 1. ]
|
|
|
|
mean value: 0.9016666666666666
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.5 1. 1. 1. 1. 0.75 0.75 1. 0.75 1. ]
|
|
|
|
mean value: 0.875
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.67857143 0.92857143 1. 1. 1. 0.80357143
|
|
0.875 0.92857143 0.875 1. ]
|
|
|
|
mean value: 0.9089285714285714
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.4 0.8 1. 1. 1. 0.6 0.75 0.8 0.75 1. ]
|
|
|
|
mean value: 0.81
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 1.0
|
|
|
|
Accuracy on Blind test: 1.0
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.03964543 0.0346694 0.04520273 0.07001829 0.05058432 0.05238247
|
|
0.04535532 0.04525828 0.04547024 0.04568648]
|
|
|
|
mean value: 0.04742729663848877
|
|
|
|
key: score_time
|
|
value: [0.0121448 0.02086496 0.02063799 0.02355075 0.02194834 0.02153063
|
|
0.02106428 0.02080369 0.02024055 0.02344513]
|
|
|
|
mean value: 0.020623111724853517
|
|
|
|
key: test_mcc
|
|
value: [ 0.38575837 -0.06900656 -0.06900656 -0.17857143 0.3105295 0.21428571
|
|
0.3105295 -0.03857584 0.38575837 0.08908708]
|
|
|
|
mean value: 0.1340788170211345
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.72727273 0.45454545 0.45454545 0.45454545 0.63636364 0.63636364
|
|
0.63636364 0.54545455 0.72727273 0.5 ]
|
|
|
|
mean value: 0.5772727272727273
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.57142857 0.4 0.4 0.25 0.6 0.5
|
|
0.6 0.28571429 0.57142857 0.44444444]
|
|
|
|
mean value: 0.4623015873015873
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.66666667 0.33333333 0.33333333 0.25 0.5 0.5
|
|
0.5 0.33333333 0.66666667 0.33333333]
|
|
|
|
mean value: 0.44166666666666665
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.5 0.5 0.5 0.25 0.75 0.5
|
|
0.75 0.25 0.5 0.66666667]
|
|
|
|
mean value: 0.5166666666666666
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.67857143 0.46428571 0.46428571 0.41071429 0.66071429 0.60714286
|
|
0.66071429 0.48214286 0.67857143 0.54761905]
|
|
|
|
mean value: 0.5654761904761905
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.4 0.25 0.25 0.14285714 0.42857143 0.33333333
|
|
0.42857143 0.16666667 0.4 0.28571429]
|
|
|
|
mean value: 0.30857142857142855
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.1
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01844668 0.00899911 0.0086031 0.008533 0.008605 0.00961709
|
|
0.00861883 0.00964332 0.00905418 0.00862718]
|
|
|
|
mean value: 0.009874749183654784
|
|
|
|
key: score_time
|
|
value: [0.00925779 0.00960422 0.0085597 0.00856972 0.00883079 0.00896478
|
|
0.00923276 0.00896192 0.00885677 0.00844169]
|
|
|
|
mean value: 0.008928012847900391
|
|
|
|
key: test_mcc
|
|
value: [ 0.62360956 0.44854261 -0.17857143 -0.06900656 0.38575837 0.62360956
|
|
0.38575837 -0.23904572 0.38575837 0.04761905]
|
|
|
|
mean value: 0.24140322084593716
|
|
|
|
key: train_mcc
|
|
value: [0.35970916 0.34202233 0.45466371 0.47527082 0.40887025 0.34383703
|
|
0.36307678 0.46857566 0.38924947 0.39514539]
|
|
|
|
mean value: 0.40004206061519487
|
|
|
|
key: test_accuracy
|
|
value: [0.81818182 0.72727273 0.45454545 0.45454545 0.72727273 0.81818182
|
|
0.72727273 0.54545455 0.72727273 0.6 ]
|
|
|
|
mean value: 0.66
|
|
|
|
key: train_accuracy
|
|
value: [0.70408163 0.69387755 0.75510204 0.76530612 0.73469388 0.70408163
|
|
0.71428571 0.76530612 0.7244898 0.72727273]
|
|
|
|
mean value: 0.7288497217068646
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.66666667 0.25 0.4 0.57142857 0.66666667
|
|
0.57142857 0. 0.57142857 0.33333333]
|
|
|
|
mean value: 0.46976190476190477
|
|
|
|
key: train_fscore
|
|
value: [0.5915493 0.58333333 0.63636364 0.64615385 0.60606061 0.56716418
|
|
0.57575758 0.62295082 0.59701493 0.59701493]
|
|
|
|
mean value: 0.6023363142966522
|
|
|
|
key: test_precision
|
|
value: [1. 0.6 0.25 0.33333333 0.66666667 1.
|
|
0.66666667 0. 0.66666667 0.33333333]
|
|
|
|
mean value: 0.5516666666666666
|
|
|
|
key: train_precision
|
|
value: [0.58333333 0.56756757 0.67741935 0.7 0.64516129 0.59375
|
|
0.61290323 0.73076923 0.625 0.64516129]
|
|
|
|
mean value: 0.6381065292960454
|
|
|
|
key: test_recall
|
|
value: [0.5 0.75 0.25 0.5 0.5 0.5
|
|
0.5 0. 0.5 0.33333333]
|
|
|
|
mean value: 0.43333333333333335
|
|
|
|
key: train_recall
|
|
value: [0.6 0.6 0.6 0.6 0.57142857 0.54285714
|
|
0.54285714 0.54285714 0.57142857 0.55555556]
|
|
|
|
mean value: 0.5726984126984127
|
|
|
|
key: test_roc_auc
|
|
value: [0.75 0.73214286 0.41071429 0.46428571 0.67857143 0.75
|
|
0.67857143 0.42857143 0.67857143 0.52380952]
|
|
|
|
mean value: 0.6095238095238095
|
|
|
|
key: train_roc_auc
|
|
value: [0.68095238 0.67301587 0.72063492 0.72857143 0.6984127 0.66825397
|
|
0.67619048 0.71587302 0.69047619 0.69047619]
|
|
|
|
mean value: 0.6942857142857143
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.5 0.14285714 0.25 0.4 0.5
|
|
0.4 0. 0.4 0.2 ]
|
|
|
|
mean value: 0.3292857142857143
|
|
|
|
key: train_jcc
|
|
value: [0.42 0.41176471 0.46666667 0.47727273 0.43478261 0.39583333
|
|
0.40425532 0.45238095 0.42553191 0.42553191]
|
|
|
|
mean value: 0.4314020143167855
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0098691 0.01330066 0.01350284 0.01457024 0.01449943 0.01446772
|
|
0.01424551 0.01335931 0.01429367 0.01426768]
|
|
|
|
mean value: 0.013637614250183106
|
|
|
|
key: score_time
|
|
value: [0.0086205 0.01121116 0.01108789 0.01153731 0.01164913 0.01202822
|
|
0.01170802 0.01162171 0.011621 0.01158166]
|
|
|
|
mean value: 0.011266660690307618
|
|
|
|
key: test_mcc
|
|
value: [-0.03857584 0.13363062 0.21428571 0.69006556 0.82807867 0.38575837
|
|
0.57142857 0.81009259 0.38575837 0.76376262]
|
|
|
|
mean value: 0.474428525267057
|
|
|
|
key: train_mcc
|
|
value: [0.95703496 0.78513588 0.93314074 0.93419873 0.95555556 0.95555556
|
|
0.97788036 0.88878629 1. 0.93435698]
|
|
|
|
mean value: 0.932164505675344
|
|
|
|
key: test_accuracy
|
|
value: [0.54545455 0.63636364 0.63636364 0.81818182 0.90909091 0.72727273
|
|
0.72727273 0.90909091 0.72727273 0.9 ]
|
|
|
|
mean value: 0.7536363636363637
|
|
|
|
key: train_accuracy
|
|
value: [0.97959184 0.89795918 0.96938776 0.96938776 0.97959184 0.97959184
|
|
0.98979592 0.94897959 1. 0.96969697]
|
|
|
|
mean value: 0.9683982683982684
|
|
|
|
key: test_fscore
|
|
value: [0.28571429 0.33333333 0.5 0.8 0.88888889 0.57142857
|
|
0.72727273 0.85714286 0.57142857 0.8 ]
|
|
|
|
mean value: 0.6335209235209236
|
|
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
|
|
  _warn_prf(average, modifier, msg_start, len(result))
|
|
|
|
key: train_fscore
|
|
value: [0.97222222 0.83333333 0.95652174 0.95522388 0.97142857 0.97142857
|
|
0.98550725 0.92537313 1. 0.95774648]
|
|
|
|
mean value: 0.9528785177718557
|
|
|
|
key: test_precision
|
|
value: [0.33333333 0.5 0.5 0.66666667 0.8 0.66666667
|
|
0.57142857 1. 0.66666667 1. ]
|
|
|
|
mean value: 0.6704761904761904
|
|
|
|
key: train_precision
|
|
value: [0.94594595 1. 0.97058824 1. 0.97142857 0.97142857
|
|
1. 0.96875 1. 0.97142857]
|
|
|
|
mean value: 0.9799569895525778
|
|
|
|
key: test_recall
|
|
value: [0.25 0.25 0.5 1. 1. 0.5
|
|
1. 0.75 0.5 0.66666667]
|
|
|
|
mean value: 0.6416666666666666
|
|
|
|
key: train_recall
|
|
value: [1. 0.71428571 0.94285714 0.91428571 0.97142857 0.97142857
|
|
0.97142857 0.88571429 1. 0.94444444]
|
|
|
|
mean value: 0.9315873015873015
|
|
|
|
key: test_roc_auc
|
|
value: [0.48214286 0.55357143 0.60714286 0.85714286 0.92857143 0.67857143
|
|
0.78571429 0.875 0.67857143 0.83333333]
|
|
|
|
mean value: 0.7279761904761904
|
|
|
|
key: train_roc_auc
|
|
value: [0.98412698 0.85714286 0.96349206 0.95714286 0.97777778 0.97777778
|
|
0.98571429 0.93492063 1. 0.96428571]
|
|
|
|
mean value: 0.9602380952380953
|
|
|
|
key: test_jcc
|
|
value: [0.16666667 0.2 0.33333333 0.66666667 0.8 0.4
|
|
0.57142857 0.75 0.4 0.66666667]
|
|
|
|
mean value: 0.49547619047619046
|
|
|
|
key: train_jcc
|
|
value: [0.94594595 0.71428571 0.91666667 0.91428571 0.94444444 0.94444444
|
|
0.97142857 0.86111111 1. 0.91891892]
|
|
|
|
mean value: 0.9131531531531532
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01281071 0.01247859 0.01289988 0.01285005 0.01277208 0.01324821
|
|
0.01254749 0.01382518 0.012918 0.01293874]
|
|
|
|
mean value: 0.0129288911819458
|
|
|
|
key: score_time
|
|
value: [0.01064682 0.01162243 0.01160169 0.01163006 0.01156688 0.01157141
|
|
0.01158237 0.01171994 0.01172638 0.01154923]
|
|
|
|
mean value: 0.011521720886230468
|
|
|
|
key: test_mcc
|
|
value: [0. 0.13363062 0.38575837 0.35634832 0.69006556 0.3105295
|
|
0.44854261 0.81009259 0.21428571 0.35634832]
|
|
|
|
mean value: 0.3705601617166881
|
|
|
|
key: train_mcc
|
|
value: [0.72183975 0.84366149 0.84852814 0.4361866 0.6876961 0.68680282
|
|
0.91259839 0.84852814 0.84296756 0.87139937]
|
|
|
|
mean value: 0.7700208366419465
|
|
|
|
key: test_accuracy
|
|
value: [0.63636364 0.63636364 0.72727273 0.54545455 0.81818182 0.63636364
|
|
0.72727273 0.90909091 0.63636364 0.7 ]
|
|
|
|
mean value: 0.6972727272727273
|
|
|
|
key: train_accuracy
|
|
value: [0.86734694 0.92857143 0.92857143 0.6122449 0.82653061 0.81632653
|
|
0.95918367 0.92857143 0.91836735 0.93939394]
|
|
|
|
mean value: 0.8725108225108226
|
|
|
|
key: test_fscore
|
|
value: [0. 0.33333333 0.57142857 0.61538462 0.8 0.6
|
|
0.66666667 0.85714286 0.5 0.57142857]
|
|
|
|
mean value: 0.5515384615384615
|
|
|
|
key: train_fscore
|
|
value: [0.77192982 0.89855072 0.88888889 0.64814815 0.8 0.79545455
|
|
0.93939394 0.88888889 0.8974359 0.91891892]
|
|
|
|
mean value: 0.8447609776328312
|
|
|
|
key: test_precision
|
|
value: [0. 0.5 0.66666667 0.44444444 0.66666667 0.5
|
|
0.6 1. 0.5 0.5 ]
|
|
|
|
mean value: 0.5377777777777778
|
|
|
|
key: train_precision
|
|
value: [1. 0.91176471 1. 0.47945205 0.68 0.66037736
|
|
1. 1. 0.81395349 0.89473684]
|
|
|
|
mean value: 0.8440284449644796
|
|
|
|
key: test_recall
|
|
value: [0. 0.25 0.5 1. 1. 0.75
|
|
0.75 0.75 0.5 0.66666667]
|
|
|
|
mean value: 0.6166666666666667
|
|
|
|
key: train_recall
|
|
value: [0.62857143 0.88571429 0.8 1. 0.97142857 1.
|
|
0.88571429 0.8 1. 0.94444444]
|
|
|
|
mean value: 0.8915873015873016
|
|
|
|
key: test_roc_auc
|
|
value: [0.5 0.55357143 0.67857143 0.64285714 0.85714286 0.66071429
|
|
0.73214286 0.875 0.60714286 0.69047619]
|
|
|
|
mean value: 0.6797619047619048
|
|
|
|
key: train_roc_auc
|
|
value: [0.81428571 0.91904762 0.9 0.6984127 0.85873016 0.85714286
|
|
0.94285714 0.9 0.93650794 0.94047619]
|
|
|
|
mean value: 0.8767460317460317
|
|
|
|
key: test_jcc
|
|
value: [0. 0.2 0.4 0.44444444 0.66666667 0.42857143
|
|
0.5 0.75 0.33333333 0.4 ]
|
|
|
|
mean value: 0.4123015873015873
|
|
|
|
key: train_jcc
|
|
value: [0.62857143 0.81578947 0.8 0.47945205 0.66666667 0.66037736
|
|
0.88571429 0.8 0.81395349 0.85 ]
|
|
|
|
mean value: 0.7400524756293771
|
|
|
|
MCC on Blind test: 0.61
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.09375739 0.08521581 0.08716512 0.08840275 0.09140325 0.08998203
|
|
0.08835912 0.08615708 0.08548617 0.08902049]
|
|
|
|
mean value: 0.08849492073059081
|
|
|
|
key: score_time
|
|
value: [0.01481247 0.0147016 0.01527953 0.01528096 0.0153563 0.01549649
|
|
0.01509285 0.01497459 0.01454949 0.01461101]
|
|
|
|
mean value: 0.015015530586242675
|
|
|
|
key: test_mcc
|
|
value: [0.38575837 0.82807867 1. 1. 1. 0.44854261
|
|
0.60714286 0.82807867 0.81009259 0.76376262]
|
|
|
|
mean value: 0.7671456391169224
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.72727273 0.90909091 1. 1. 1. 0.72727273
|
|
0.81818182 0.90909091 0.90909091 0.9 ]
|
|
|
|
mean value: 0.89
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.57142857 0.88888889 1. 1. 1. 0.66666667
|
|
0.75 0.88888889 0.85714286 0.8 ]
|
|
|
|
mean value: 0.8423015873015873
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.66666667 0.8 1. 1. 1. 0.6
|
|
0.75 0.8 1. 1. ]
|
|
|
|
mean value: 0.8616666666666667
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.5 1. 1. 1. 1. 0.75
|
|
0.75 1. 0.75 0.66666667]
|
|
|
|
mean value: 0.8416666666666667
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.67857143 0.92857143 1. 1. 1. 0.73214286
|
|
0.80357143 0.92857143 0.875 0.83333333]
|
|
|
|
mean value: 0.8779761904761905
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.4 0.8 1. 1. 1. 0.5
|
|
0.6 0.8 0.75 0.66666667]
|
|
|
|
mean value: 0.7516666666666667
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 1.0
|
|
|
|
Accuracy on Blind test: 1.0
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.04171443 0.03725171 0.0328362 0.03903294 0.05751514 0.0520432
|
|
0.04804182 0.04658794 0.04486704 0.03777409]
|
|
|
|
mean value: 0.043766450881958005
|
|
|
|
key: score_time
|
|
value: [0.02665997 0.02416825 0.02867103 0.02760959 0.03714466 0.03697157
|
|
0.03046608 0.02323723 0.01972747 0.03126526]
|
|
|
|
mean value: 0.02859210968017578
|
|
|
|
key: test_mcc
|
|
value: [0.38575837 0.82807867 1. 1. 1. 0.60714286
|
|
1. 0.82807867 0.81009259 0.76376262]
|
|
|
|
mean value: 0.8222913777596693
|
|
|
|
key: train_mcc
|
|
value: [0.97788036 1. 1. 1. 1. 1.
|
|
1. 0.95555556 0.97788036 1. ]
|
|
|
|
mean value: 0.991131627711635
|
|
|
|
key: test_accuracy
|
|
value: [0.72727273 0.90909091 1. 1. 1. 0.81818182
|
|
1. 0.90909091 0.90909091 0.9 ]
|
|
|
|
mean value: 0.9172727272727272
|
|
|
|
key: train_accuracy
|
|
value: [0.98979592 1. 1. 1. 1. 1.
|
|
1. 0.97959184 0.98979592 1. ]
|
|
|
|
mean value: 0.9959183673469387
|
|
|
|
key: test_fscore
|
|
value: [0.57142857 0.88888889 1. 1. 1. 0.75
|
|
1. 0.88888889 0.85714286 0.8 ]
|
|
|
|
mean value: 0.8756349206349207
|
|
|
|
key: train_fscore
|
|
value: [0.98550725 1. 1. 1. 1. 1.
|
|
1. 0.97142857 0.98550725 1. ]
|
|
|
|
mean value: 0.9942443064182195
|
|
|
|
key: test_precision
|
|
value: [0.66666667 0.8 1. 1. 1. 0.75
|
|
1. 0.8 1. 1. ]
|
|
|
|
mean value: 0.9016666666666666
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 0.97142857 1. 1. ]
|
|
|
|
mean value: 0.9971428571428571
|
|
|
|
key: test_recall
|
|
value: [0.5 1. 1. 1. 1. 0.75
|
|
1. 1. 0.75 0.66666667]
|
|
|
|
mean value: 0.8666666666666667
|
|
|
|
key: train_recall
|
|
value: [0.97142857 1. 1. 1. 1. 1.
|
|
1. 0.97142857 0.97142857 1. ]
|
|
|
|
mean value: 0.9914285714285714
|
|
|
|
key: test_roc_auc
|
|
value: [0.67857143 0.92857143 1. 1. 1. 0.80357143
|
|
1. 0.92857143 0.875 0.83333333]
|
|
|
|
mean value: 0.9047619047619048
|
|
|
|
key: train_roc_auc
|
|
value: [0.98571429 1. 1. 1. 1. 1.
|
|
1. 0.97777778 0.98571429 1. ]
|
|
|
|
mean value: 0.994920634920635
|
|
|
|
key: test_jcc
|
|
value: [0.4 0.8 1. 1. 1. 0.6
|
|
1. 0.8 0.75 0.66666667]
|
|
|
|
mean value: 0.8016666666666666
|
|
|
|
key: train_jcc
|
|
value: [0.97142857 1. 1. 1. 1. 1.
|
|
1. 0.94444444 0.97142857 1. ]
|
|
|
|
mean value: 0.9887301587301587
|
|
|
|
MCC on Blind test: 1.0
|
|
|
|
Accuracy on Blind test: 1.0
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02844262 0.05032802 0.04449534 0.04859734 0.04044652 0.04043579
|
|
0.04094768 0.04056787 0.04082155 0.04117346]
|
|
|
|
mean value: 0.04162561893463135
|
|
|
|
key: score_time
|
|
value: [0.02062154 0.02379227 0.02072287 0.02167654 0.02285767 0.02385473
|
|
0.02424192 0.0208075 0.02163959 0.02190232]
|
|
|
|
mean value: 0.022211694717407228
|
|
|
|
key: test_mcc
|
|
value: [0.62360956 0.06900656 0.13363062 0.13363062 0.41833001 0.21428571
|
|
0.21428571 0.62360956 0.41833001 0.52380952]
|
|
|
|
mean value: 0.3372527905686335
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.81818182 0.54545455 0.63636364 0.63636364 0.72727273 0.63636364
|
|
0.63636364 0.81818182 0.72727273 0.8 ]
|
|
|
|
mean value: 0.6981818181818182
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.44444444 0.33333333 0.33333333 0.4 0.5
|
|
0.5 0.66666667 0.4 0.66666667]
|
|
|
|
mean value: 0.4911111111111111
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.4 0.5 0.5 1. 0.5
|
|
0.5 1. 1. 0.66666667]
|
|
|
|
mean value: 0.7066666666666667
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.5 0.5 0.25 0.25 0.25 0.5
|
|
0.5 0.5 0.25 0.66666667]
|
|
|
|
mean value: 0.4166666666666667
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.75 0.53571429 0.55357143 0.55357143 0.625 0.60714286
|
|
0.60714286 0.75 0.625 0.76190476]
|
|
|
|
mean value: 0.6369047619047619
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.28571429 0.2 0.2 0.25 0.33333333
|
|
0.33333333 0.5 0.25 0.5 ]
|
|
|
|
mean value: 0.3352380952380952
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.61
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.17393565 0.20081019 0.19978285 0.19986486 0.19875121 0.21868134
|
|
0.20449281 0.22600913 0.20545721 0.20233464]
|
|
|
|
mean value: 0.20301198959350586
|
|
|
|
key: score_time
|
|
value: [0.00916028 0.0093646 0.00908971 0.00901794 0.00926399 0.00972033
|
|
0.0092566 0.00972772 0.0093832 0.00969172]
|
|
|
|
mean value: 0.009367609024047851
|
|
|
|
key: test_mcc
|
|
value: [0.38575837 0.82807867 1. 1. 1. 0.60714286
|
|
0.81009259 0.82807867 0.81009259 0.76376262]
|
|
|
|
mean value: 0.8033006364897676
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.72727273 0.90909091 1. 1. 1. 0.81818182
|
|
0.90909091 0.90909091 0.90909091 0.9 ]
|
|
|
|
mean value: 0.9081818181818182
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.57142857 0.88888889 1. 1. 1. 0.75
|
|
0.85714286 0.88888889 0.85714286 0.8 ]
|
|
|
|
mean value: 0.8613492063492063
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.66666667 0.8 1. 1. 1. 0.75
|
|
1. 0.8 1. 1. ]
|
|
|
|
mean value: 0.9016666666666666
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.5 1. 1. 1. 1. 0.75
|
|
0.75 1. 0.75 0.66666667]
|
|
|
|
mean value: 0.8416666666666667
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.67857143 0.92857143 1. 1. 1. 0.80357143
|
|
0.875 0.92857143 0.875 0.83333333]
|
|
|
|
mean value: 0.8922619047619048
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.4 0.8 1. 1. 1. 0.6
|
|
0.75 0.8 0.75 0.66666667]
|
|
|
|
mean value: 0.7766666666666666
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 1.0
|
|
|
|
Accuracy on Blind test: 1.0
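Note: the key / value / mean value blocks above have the shape of the dictionary returned by scikit-learn's cross_validate when it is given a dictionary of named scorers and return_train_score=True. The following is a minimal sketch of how output of that shape could be produced, not the script that wrote this log. The synthetic data, the scorer definitions and the fold settings are all assumptions. In particular, roc_auc is computed here on hard predictions via make_scorer(roc_auc_score), which is only a guess based on the logged test_roc_auc values being consistent with the logged recall and accuracy.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import (jaccard_score, make_scorer, matthews_corrcoef,
                             roc_auc_score)
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in data (assumption): the real feature table is not in the log.
X, y = make_classification(n_samples=120, n_features=10, random_state=42)

# Scorer names are chosen so that the result keys read test_mcc, train_jcc, etc.
scoring = {
    "mcc": make_scorer(matthews_corrcoef),
    "accuracy": "accuracy",
    "fscore": "f1",
    "precision": "precision",
    "recall": "recall",
    "roc_auc": make_scorer(roc_auc_score),  # assumption: AUC on predicted labels
    "jcc": make_scorer(jaccard_score),
}

# Simplified pipeline: scaling only, no categorical encoding step.
pipe = Pipeline(steps=[
    ("prep", MinMaxScaler()),
    ("model", GradientBoostingClassifier(random_state=42)),
])

# 10 folds, matching the 10 values per key in the log; shuffling is an assumption.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_validate(pipe, X, y, cv=cv, scoring=scoring,
                        return_train_score=True)

# Print one block per key: the per-fold array, then its mean.
for key, value in scores.items():
    print("key:", key)
    print("value:", value)
    print("mean value:", np.mean(value))
```

Each scorer name yields a test_ and a train_ entry, and cross_validate adds fit_time and score_time on its own, which would account for the 16 keys printed per model.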
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.01376772 0.01510024 0.01516438 0.0148809 0.01546669 0.01493096
|
|
0.01515436 0.014992 0.01509452 0.01527882]
|
|
|
|
mean value: 0.014983057975769043
|
|
|
|
key: score_time
|
|
value: [0.01197147 0.01196742 0.01193738 0.01205683 0.0138576 0.01202655
|
|
0.01751709 0.01541352 0.01546621 0.01520991]
|
|
|
|
mean value: 0.013742399215698243
|
|
|
|
key: test_mcc
|
|
value: [ 0.21428571 -0.03857584 -0.46291005 0.60714286 0.13363062 0.06900656
|
|
-0.06900656 0.06900656 -0.46291005 -0.42857143]
|
|
|
|
mean value: -0.0368901617515484
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.63636364 0.54545455 0.36363636 0.81818182 0.63636364 0.54545455
|
|
0.45454545 0.54545455 0.36363636 0.4 ]
|
|
|
|
mean value: 0.5309090909090909
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.5 0.28571429 0. 0.75 0.33333333 0.44444444
|
|
0.4 0.44444444 0. 0. ]
|
|
|
|
mean value: 0.3157936507936508
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.5 0.33333333 0. 0.75 0.5 0.4
|
|
0.33333333 0.4 0. 0. ]
|
|
|
|
mean value: 0.32166666666666666
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.5 0.25 0. 0.75 0.25 0.5 0.5 0.5 0. 0. ]
|
|
|
|
mean value: 0.325
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.60714286 0.48214286 0.28571429 0.80357143 0.55357143 0.53571429
|
|
0.46428571 0.53571429 0.28571429 0.28571429]
|
|
|
|
mean value: 0.48392857142857143
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.33333333 0.16666667 0. 0.6 0.2 0.28571429
|
|
0.25 0.28571429 0. 0. ]
|
|
|
|
mean value: 0.21214285714285713
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: -0.17
|
|
|
|
Accuracy on Blind test: 0.4
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02822781 0.01487136 0.0290575 0.033108 0.01291656 0.01286721
|
|
0.03267503 0.03277683 0.03245783 0.03252196]
|
|
|
|
mean value: 0.026148009300231933
|
|
|
|
key: score_time
|
|
value: [0.01211548 0.01180744 0.0202291 0.01994181 0.01185703 0.0116508
|
|
0.02012253 0.02056456 0.02143598 0.02186871]
|
|
|
|
mean value: 0.017159342765808105
|
|
|
|
key: test_mcc
|
|
value: [0.38575837 0.3105295 0.38575837 0.44854261 0.82807867 0.44854261
|
|
0.82807867 0.60714286 0.38575837 0.52380952]
|
|
|
|
mean value: 0.515199957693884
|
|
|
|
key: train_mcc
|
|
value: [0.97788036 0.93314074 0.93314074 0.93314074 0.95555556 0.93314074
|
|
0.97788036 0.93314074 1. 0.93435698]
|
|
|
|
mean value: 0.9511376935583599
|
|
|
|
key: test_accuracy
|
|
value: [0.72727273 0.63636364 0.72727273 0.72727273 0.90909091 0.72727273
|
|
0.90909091 0.81818182 0.72727273 0.8 ]
|
|
|
|
mean value: 0.7709090909090909
|
|
|
|
key: train_accuracy
|
|
value: [0.98979592 0.96938776 0.96938776 0.96938776 0.97959184 0.96938776
|
|
0.98979592 0.96938776 1. 0.96969697]
|
|
|
|
mean value: 0.9775819418676561
|
|
|
|
key: test_fscore
|
|
value: [0.57142857 0.6 0.57142857 0.66666667 0.88888889 0.66666667
|
|
0.88888889 0.75 0.57142857 0.66666667]
|
|
|
|
mean value: 0.6842063492063493
|
|
|
|
key: train_fscore
|
|
value: [0.98550725 0.95652174 0.95652174 0.95652174 0.97142857 0.95652174
|
|
0.98550725 0.95652174 1. 0.95774648]
|
|
|
|
mean value: 0.9682798238707608
|
|
|
|
key: test_precision
|
|
value: [0.66666667 0.5 0.66666667 0.6 0.8 0.6
|
|
0.8 0.75 0.66666667 0.66666667]
|
|
|
|
mean value: 0.6716666666666666
|
|
|
|
key: train_precision
|
|
value: [1. 0.97058824 0.97058824 0.97058824 0.97142857 0.97058824
|
|
1. 0.97058824 1. 0.97142857]
|
|
|
|
mean value: 0.9795798319327731
|
|
|
|
key: test_recall
|
|
value: [0.5 0.75 0.5 0.75 1. 0.75
|
|
1. 0.75 0.5 0.66666667]
|
|
|
|
mean value: 0.7166666666666667
|
|
|
|
key: train_recall
|
|
value: [0.97142857 0.94285714 0.94285714 0.94285714 0.97142857 0.94285714
|
|
0.97142857 0.94285714 1. 0.94444444]
|
|
|
|
mean value: 0.9573015873015873
|
|
|
|
key: test_roc_auc
|
|
value: [0.67857143 0.66071429 0.67857143 0.73214286 0.92857143 0.73214286
|
|
0.92857143 0.80357143 0.67857143 0.76190476]
|
|
|
|
mean value: 0.7583333333333333
|
|
|
|
key: train_roc_auc
|
|
value: [0.98571429 0.96349206 0.96349206 0.96349206 0.97777778 0.96349206
|
|
0.98571429 0.96349206 1. 0.96428571]
|
|
|
|
mean value: 0.9730952380952382
|
|
|
|
key: test_jcc
|
|
value: [0.4 0.42857143 0.4 0.5 0.8 0.5
|
|
0.8 0.6 0.4 0.5 ]
|
|
|
|
mean value: 0.5328571428571429
|
|
|
|
key: train_jcc
|
|
value: [0.97142857 0.91666667 0.91666667 0.91666667 0.94444444 0.91666667
|
|
0.97142857 0.91666667 1. 0.91891892]
|
|
|
|
mean value: 0.9389553839553839
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.16137671 0.18871665 0.19555187 0.18758488 0.1897881 0.20349956
|
|
0.2113049 0.19966245 0.18951869 0.24313569]
|
|
|
|
mean value: 0.1970139503479004
|
|
|
|
key: score_time
|
|
value: [0.02137899 0.02143884 0.02272177 0.01185441 0.02316952 0.02063107
|
|
0.02193546 0.02300262 0.02248096 0.02047634]
|
|
|
|
mean value: 0.0209089994430542
|
|
|
|
key: test_mcc
|
|
value: [0.38575837 0.3105295 0.38575837 0.44854261 0.82807867 0.44854261
|
|
0.21428571 0.60714286 0.38575837 0.52380952]
|
|
|
|
mean value: 0.45382066200137294
|
|
|
|
key: train_mcc
|
|
value: [0.97788036 0.93314074 0.93314074 0.93314074 0.95555556 0.93314074
|
|
0.78513588 0.93314074 1. 0.93435698]
|
|
|
|
mean value: 0.9318632458688522
|
|
|
|
key: test_accuracy
|
|
value: [0.72727273 0.63636364 0.72727273 0.72727273 0.90909091 0.72727273
|
|
0.63636364 0.81818182 0.72727273 0.8 ]
|
|
|
|
mean value: 0.7436363636363637
|
|
|
|
key: train_accuracy
|
|
value: [0.98979592 0.96938776 0.96938776 0.96938776 0.97959184 0.96938776
|
|
0.89795918 0.96938776 1. 0.96969697]
|
|
|
|
mean value: 0.9683982683982684
|
|
|
|
key: test_fscore
|
|
value: [0.57142857 0.6 0.57142857 0.66666667 0.88888889 0.66666667
|
|
0.5 0.75 0.57142857 0.66666667]
|
|
|
|
mean value: 0.6453174603174603
|
|
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./gid_sl.py:107: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
baseline_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./gid_sl.py:110: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
baseline_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
key: train_fscore
|
|
value: [0.98550725 0.95652174 0.95652174 0.95652174 0.97142857 0.95652174
|
|
0.83333333 0.95652174 1. 0.95774648]
|
|
|
|
mean value: 0.953062432566413
|
|
|
|
key: test_precision
|
|
value: [0.66666667 0.5 0.66666667 0.6 0.8 0.6
|
|
0.5 0.75 0.66666667 0.66666667]
|
|
|
|
mean value: 0.6416666666666666
|
|
|
|
key: train_precision
|
|
value: [1. 0.97058824 0.97058824 0.97058824 0.97142857 0.97058824
|
|
1. 0.97058824 1. 0.97142857]
|
|
|
|
mean value: 0.9795798319327731
|
|
|
|
key: test_recall
|
|
value: [0.5 0.75 0.5 0.75 1. 0.75
|
|
0.5 0.75 0.5 0.66666667]
|
|
|
|
mean value: 0.6666666666666666
|
|
|
|
key: train_recall
|
|
value: [0.97142857 0.94285714 0.94285714 0.94285714 0.97142857 0.94285714
|
|
0.71428571 0.94285714 1. 0.94444444]
|
|
|
|
mean value: 0.9315873015873015
|
|
|
|
key: test_roc_auc
|
|
value: [0.67857143 0.66071429 0.67857143 0.73214286 0.92857143 0.73214286
|
|
0.60714286 0.80357143 0.67857143 0.76190476]
|
|
|
|
mean value: 0.7261904761904762
|
|
|
|
key: train_roc_auc
|
|
value: [0.98571429 0.96349206 0.96349206 0.96349206 0.97777778 0.96349206
|
|
0.85714286 0.96349206 1. 0.96428571]
|
|
|
|
mean value: 0.9602380952380953
|
|
|
|
key: test_jcc
|
|
value: [0.4 0.42857143 0.4 0.5 0.8 0.5
|
|
0.33333333 0.6 0.4 0.5 ]
|
|
|
|
mean value: 0.4861904761904762
|
|
|
|
key: train_jcc
|
|
value: [0.97142857 0.91666667 0.91666667 0.91666667 0.94444444 0.91666667
|
|
0.71428571 0.91666667 1. 0.91891892]
|
|
|
|
mean value: 0.9132410982410982
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02637744 0.0271399 0.01950192 0.02762055 0.02170396 0.02873182
|
|
0.02581763 0.02583599 0.02582002 0.0221951 ]
|
|
|
|
mean value: 0.02507443428039551
|
|
|
|
key: score_time
|
|
value: [0.01182771 0.01178241 0.01171708 0.01184201 0.01175046 0.01182175
|
|
0.01181841 0.01186657 0.01179719 0.01182365]
|
|
|
|
mean value: 0.011804723739624023
|
|
|
|
key: test_mcc
|
|
value: [0.74535599 0.57735027 0.74535599 0.74535599 0.71428571 0.8660254
|
|
0.63245553 0.52223297 0.57735027 0.4472136 ]
|
|
|
|
mean value: 0.6572981729349922
|
|
|
|
key: train_mcc
|
|
value: [0.92075092 0.90521816 0.88900089 0.88989842 0.88989842 0.90659109
|
|
0.90521816 0.87345612 0.92075092 0.87478088]
|
|
|
|
mean value: 0.8975563989339166
|
|
|
|
key: test_accuracy
|
|
value: [0.85714286 0.78571429 0.85714286 0.85714286 0.85714286 0.92857143
|
|
0.78571429 0.71428571 0.78571429 0.71428571]
|
|
|
|
mean value: 0.8142857142857143
|
|
|
|
key: train_accuracy
|
|
value: [0.96031746 0.95238095 0.94444444 0.94444444 0.94444444 0.95238095
|
|
0.95238095 0.93650794 0.96031746 0.93650794]
|
|
|
|
mean value: 0.9484126984126984
|
|
|
|
key: test_fscore
|
|
value: [0.83333333 0.8 0.875 0.875 0.85714286 0.93333333
|
|
0.82352941 0.6 0.76923077 0.66666667]
|
|
|
|
mean value: 0.8033236371471666
|
|
|
|
key: train_fscore
|
|
value: [0.96 0.9516129 0.944 0.94308943 0.94308943 0.95081967
|
|
0.9516129 0.93548387 0.96 0.93442623]
|
|
|
|
mean value: 0.9474134440847317
|
|
|
|
key: test_precision
|
|
value: [1. 0.75 0.77777778 0.77777778 0.85714286 0.875
|
|
0.7 1. 0.83333333 0.8 ]
|
|
|
|
mean value: 0.8371031746031746
|
|
|
|
key: train_precision
|
|
value: [0.96774194 0.96721311 0.9516129 0.96666667 0.96666667 0.98305085
|
|
0.96721311 0.95081967 0.96774194 0.96610169]
|
|
|
|
mean value: 0.9654828551539107
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.85714286 1. 1. 0.85714286 1.
|
|
1. 0.42857143 0.71428571 0.57142857]
|
|
|
|
mean value: 0.8142857142857143
|
|
|
|
key: train_recall
|
|
value: [0.95238095 0.93650794 0.93650794 0.92063492 0.92063492 0.92063492
|
|
0.93650794 0.92063492 0.95238095 0.9047619 ]
|
|
|
|
mean value: 0.9301587301587302
|
|
|
|
key: test_roc_auc
|
|
value: [0.85714286 0.78571429 0.85714286 0.85714286 0.85714286 0.92857143
|
|
0.78571429 0.71428571 0.78571429 0.71428571]
|
|
|
|
mean value: 0.8142857142857144
|
|
|
|
key: train_roc_auc
|
|
value: [0.96031746 0.95238095 0.94444444 0.94444444 0.94444444 0.95238095
|
|
0.95238095 0.93650794 0.96031746 0.93650794]
|
|
|
|
mean value: 0.9484126984126984
|
|
|
|
key: test_jcc
|
|
value: [0.71428571 0.66666667 0.77777778 0.77777778 0.75 0.875
|
|
0.7 0.42857143 0.625 0.5 ]
|
|
|
|
mean value: 0.6815079365079365
|
|
|
|
key: train_jcc
|
|
value: [0.92307692 0.90769231 0.89393939 0.89230769 0.89230769 0.90625
|
|
0.90769231 0.87878788 0.92307692 0.87692308]
|
|
|
|
mean value: 0.9002054195804196
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
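Note: the ConvergenceWarning above suggests raising max_iter or scaling the data. The following is a minimal sketch of the first option for LogisticRegressionCV inside a scaled pipeline, not the configuration used in this run. The synthetic data and the value max_iter=5000 are assumptions chosen for illustration, and whether a higher limit is enough on the real feature set is not something the log can confirm.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegressionCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in data (assumption): the real feature table is not in the log.
X, y = make_classification(n_samples=140, n_features=20, random_state=42)

# max_iter is raised well above the default of 100; 5000 is an arbitrary choice.
pipe = Pipeline(steps=[
    ("prep", MinMaxScaler()),
    ("model", LogisticRegressionCV(max_iter=5000, random_state=42)),
])

# Record warnings instead of printing them, so they can be counted afterwards.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    pipe.fit(X, y)

n_convergence_warnings = sum(
    issubclass(w.category, ConvergenceWarning) for w in caught
)
max_iterations_used = int(pipe.named_steps["model"].n_iter_.max())

print("ConvergenceWarning count:", n_convergence_warnings)
print("largest lbfgs iteration count:", max_iterations_used)
```

Comparing the largest iteration count against max_iter shows how close the solver came to the limit, which is a more direct check than waiting for the warning to reappear.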
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
[('Logistic Regression', LogisticRegression(random_state=42)),
 ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)),
 ('Gaussian NB', GaussianNB()),
 ('Naive Bayes', BernoulliNB()),
 ('K-Nearest Neighbors', KNeighborsClassifier()),
 ('SVM', SVC(random_state=42)),
 ('MLP', MLPClassifier(max_iter=500, random_state=42)),
 ('Decision Tree', DecisionTreeClassifier(random_state=42)),
 ('Extra Trees', ExtraTreesClassifier(random_state=42)),
 ('Extra Tree', ExtraTreeClassifier(random_state=42)),
 ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)),
 ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
                                           n_estimators=1000, n_jobs=10, oob_score=True,
                                           random_state=42)),
 ('Naive Bayes', BernoulliNB()),
 ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
                           colsample_bynode=None, colsample_bytree=None,
                           enable_categorical=False, gamma=None, gpu_id=None,
                           importance_type=None, interaction_constraints=None,
                           learning_rate=None, max_delta_step=None, max_depth=None,
                           min_child_weight=None, missing=nan, monotone_constraints=None,
                           n_estimators=100, n_jobs=None, num_parallel_tree=None,
                           predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
                           scale_pos_weight=None, subsample=None, tree_method=None,
                           use_label_encoder=False, validate_parameters=None, verbosity=0)),
 ('LDA', LinearDiscriminantAnalysis()),
 ('Multinomial', MultinomialNB()),
 ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)),
 ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)),
 ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)),
 ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)),
 ('Gaussian Process', GaussianProcessClassifier(random_state=42)),
 ('Gradient Boosting', GradientBoostingClassifier(random_state=42)),
 ('QDA', QuadraticDiscriminantAnalysis()),
 ('Ridge Classifier', RidgeClassifier(random_state=42)),
 ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_na_affinity', 'rsa',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=167)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.64419675 0.75507832 0.65785503 0.85170484 0.89698648 0.7391665
|
|
0.64110851 0.8319416 0.65485907 0.68134093]
|
|
|
|
mean value: 0.7354238033294678
|
|
|
|
key: score_time
|
|
value: [0.01294661 0.02320194 0.0123992 0.02117777 0.01676965 0.01436758
|
|
0.01205087 0.01219463 0.01197934 0.01209545]
|
|
|
|
mean value: 0.014918303489685059
|
|
|
|
key: test_mcc
|
|
value: [0.71428571 0.4472136 0.74535599 0.74535599 0.71428571 0.74535599
|
|
0.63245553 0.74535599 0.74535599 0.57735027]
|
|
|
|
mean value: 0.6812370787794337
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 0.96825397 1. 1. 1.
|
|
0.98425098 0.95250095 0.96825397 0.96825397]
|
|
|
|
mean value: 0.984151384151481
|
|
|
|
key: test_accuracy
|
|
value: [0.85714286 0.71428571 0.85714286 0.85714286 0.85714286 0.85714286
|
|
0.78571429 0.85714286 0.85714286 0.78571429]
|
|
|
|
mean value: 0.8285714285714285
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 0.98412698 1. 1. 1.
|
|
0.99206349 0.97619048 0.98412698 0.98412698]
|
|
|
|
mean value: 0.9920634920634921
|
|
|
|
key: test_fscore
|
|
value: [0.85714286 0.75 0.875 0.875 0.85714286 0.875
|
|
0.82352941 0.83333333 0.83333333 0.76923077]
|
|
|
|
mean value: 0.8348712561947856
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 0.98412698 1. 1. 1.
|
|
0.992 0.976 0.98412698 0.98412698]
|
|
|
|
mean value: 0.9920380952380952
|
|
|
|
key: test_precision
|
|
value: [0.85714286 0.66666667 0.77777778 0.77777778 0.85714286 0.77777778
|
|
0.7 1. 1. 0.83333333]
|
|
|
|
mean value: 0.8247619047619048
|
|
|
|
key: train_precision
|
|
value: [1. 1. 0.98412698 1. 1. 1.
|
|
1. 0.98387097 0.98412698 0.98412698]
|
|
|
|
mean value: 0.9936251920122887
|
|
|
|
key: test_recall
|
|
value: [0.85714286 0.85714286 1. 1. 0.85714286 1.
|
|
1. 0.71428571 0.71428571 0.71428571]
|
|
|
|
mean value: 0.8714285714285714
|
|
|
|
key: train_recall
|
|
value: [1. 1. 0.98412698 1. 1. 1.
|
|
0.98412698 0.96825397 0.98412698 0.98412698]
|
|
|
|
mean value: 0.9904761904761905
|
|
|
|
key: test_roc_auc
|
|
value: [0.85714286 0.71428571 0.85714286 0.85714286 0.85714286 0.85714286
|
|
0.78571429 0.85714286 0.85714286 0.78571429]
|
|
|
|
mean value: 0.8285714285714286
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 0.98412698 1. 1. 1.
|
|
0.99206349 0.97619048 0.98412698 0.98412698]
|
|
|
|
mean value: 0.9920634920634921
|
|
|
|
key: test_jcc
|
|
value: [0.75 0.6 0.77777778 0.77777778 0.75 0.77777778
|
|
0.7 0.71428571 0.71428571 0.625 ]
|
|
|
|
mean value: 0.7186904761904762
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 0.96875 1. 1. 1.
|
|
0.98412698 0.953125 0.96875 0.96875 ]
|
|
|
|
mean value: 0.9843501984126984
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.9
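Note on the per-fold scores above: the train accuracies are multiples of 1/126 and the test accuracies multiples of 1/14, which is consistent with 10-fold CV on 140 samples with 7 positives and 7 negatives per test fold. That split is inferred from the numbers and is not stated in the log. Under that assumption, the sketch below recomputes the reported scores from confusion counts. The counts are hypothetical values chosen to match folds 1 and 2 of LogisticRegressionCV, and `binary_metrics` is an illustrative helper, not a function from the original script.

```python
import math

def binary_metrics(tp, fp, fn, tn):
    """Recompute the scores named in the log from confusion counts.

    Assumes tp + fp > 0 and tp + fn > 0 (no empty prediction or class).
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "fscore": 2 * precision * recall / (precision + recall),
        "jcc": tp / (tp + fp + fn),  # Jaccard index of the positive class
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }

# Hypothetical counts for a 14-sample fold (7 positives, 7 negatives).
fold1 = binary_metrics(tp=6, fp=1, fn=1, tn=6)
fold2 = binary_metrics(tp=6, fp=3, fn=1, tn=4)

for name, scores in (("fold1", fold1), ("fold2", fold2)):
    print(name, {key: round(value, 8) for key, value in scores.items()})
```

With these counts the helper gives test_mcc 0.71428571 and test_jcc 0.75 for fold 1, and test_mcc 0.4472136 and test_jcc 0.6 for fold 2, matching the logged values.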
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01220131 0.01074219 0.00907183 0.0087285 0.00857234 0.008811
|
|
0.00867271 0.00884128 0.00853682 0.00868058]
|
|
|
|
mean value: 0.009285855293273925
|
|
|
|
key: score_time
|
|
value: [0.01339674 0.00990987 0.0088923 0.0088439 0.00893879 0.00867295
|
|
0.0085187 0.00884485 0.00906515 0.00866961]
|
|
|
|
mean value: 0.009375286102294923
|
|
|
|
key: test_mcc
|
|
value: [ 0. -0.17407766 0.4472136 0.31622777 0.4472136 0.40824829
|
|
0.40824829 0.63245553 0.28867513 -0.28867513]
|
|
|
|
mean value: 0.24855294140224576
|
|
|
|
key: train_mcc
|
|
value: [0.50468255 0.40406102 0.50124029 0.55555556 0.43956222 0.44905021
|
|
0.4662283 0.53975054 0.37745516 0.45421998]
|
|
|
|
mean value: 0.4691805826659309
|
|
|
|
key: test_accuracy
|
|
value: [0.5 0.42857143 0.71428571 0.64285714 0.71428571 0.64285714
|
|
0.64285714 0.78571429 0.64285714 0.35714286]
|
|
|
|
mean value: 0.6071428571428572
|
|
|
|
key: train_accuracy
|
|
value: [0.74603175 0.69047619 0.74603175 0.77777778 0.71428571 0.72222222
|
|
0.73015873 0.76984127 0.68253968 0.72222222]
|
|
|
|
mean value: 0.7301587301587301
|
|
|
|
key: test_fscore
|
|
value: [0.53333333 0.55555556 0.75 0.70588235 0.75 0.73684211
|
|
0.73684211 0.72727273 0.66666667 0.4 ]
|
|
|
|
mean value: 0.6562394846295775
|
|
|
|
key: train_fscore
|
|
value: [0.77142857 0.73469388 0.76811594 0.77777778 0.74285714 0.74074074
|
|
0.75 0.77165354 0.71830986 0.74820144]
|
|
|
|
mean value: 0.7523778893695175
|
|
|
|
key: test_precision
|
|
value: [0.5 0.45454545 0.66666667 0.6 0.66666667 0.58333333
|
|
0.58333333 1. 0.625 0.375 ]
|
|
|
|
mean value: 0.6054545454545455
|
|
|
|
key: train_precision
|
|
value: [0.7012987 0.64285714 0.70666667 0.77777778 0.67532468 0.69444444
|
|
0.69863014 0.765625 0.64556962 0.68421053]
|
|
|
|
mean value: 0.6992404691924664
|
|
|
|
key: test_recall
|
|
value: [0.57142857 0.71428571 0.85714286 0.85714286 0.85714286 1.
|
|
1. 0.57142857 0.71428571 0.42857143]
|
|
|
|
mean value: 0.7571428571428571
|
|
|
|
key: train_recall
|
|
value: [0.85714286 0.85714286 0.84126984 0.77777778 0.82539683 0.79365079
|
|
0.80952381 0.77777778 0.80952381 0.82539683]
|
|
|
|
mean value: 0.8174603174603174
|
|
|
|
key: test_roc_auc
|
|
value: [0.5 0.42857143 0.71428571 0.64285714 0.71428571 0.64285714
|
|
0.64285714 0.78571429 0.64285714 0.35714286]
|
|
|
|
mean value: 0.6071428571428571
|
|
|
|
key: train_roc_auc
|
|
value: [0.74603175 0.69047619 0.74603175 0.77777778 0.71428571 0.72222222
|
|
0.73015873 0.76984127 0.68253968 0.72222222]
|
|
|
|
mean value: 0.7301587301587301
|
|
|
|
key: test_jcc
|
|
value: [0.36363636 0.38461538 0.6 0.54545455 0.6 0.58333333
|
|
0.58333333 0.57142857 0.5 0.25 ]
|
|
|
|
mean value: 0.4981801531801532
|
|
|
|
key: train_jcc
|
|
value: [0.62790698 0.58064516 0.62352941 0.63636364 0.59090909 0.58823529
|
|
0.6 0.62820513 0.56043956 0.59770115]
|
|
|
|
mean value: 0.6033935409259565
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.8
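Note on test_roc_auc: in every fold above it is identical to test_accuracy. One possible explanation, not confirmed by the log, is that the AUC was scored on hard class predictions rather than probabilities. An AUC computed from 0/1 predictions reduces to balanced accuracy, which equals plain accuracy when the fold has equal class counts. The sketch below shows this in plain Python. The labels and predictions are made up for illustration, and both helper functions are hypothetical.

```python
def auc_from_scores(y_true, scores):
    """ROC AUC as the share of (positive, negative) pairs ranked correctly.

    Tied pairs count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

def balanced_accuracy(y_true, y_pred):
    """Mean of the true-positive rate and the true-negative rate."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    return 0.5 * (tp / n_pos + tn / n_neg)

# Made-up balanced fold: 7 positives followed by 7 negatives.
y_true = [1] * 7 + [0] * 7
y_pred = [1] * 6 + [0] + [1] * 3 + [0] * 4  # hard 0/1 predictions

auc_hard = auc_from_scores(y_true, y_pred)
bal_acc = balanced_accuracy(y_true, y_pred)
accuracy = sum(1 for y, p in zip(y_true, y_pred) if y == p) / len(y_true)

print(auc_hard, bal_acc, accuracy)
```

All three values agree up to floating-point rounding for this fold. If ranking quality is what matters, the AUC would need continuous scores such as predicted probabilities or decision-function values.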
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00905013 0.00918555 0.00909519 0.00984001 0.00886965 0.00883627
|
|
0.00905323 0.00987196 0.00902081 0.00882959]
|
|
|
|
mean value: 0.009165239334106446
|
|
|
|
key: score_time
|
|
value: [0.00879502 0.008955 0.00891137 0.00917125 0.00870037 0.00867844
|
|
0.00872588 0.00947595 0.0086875 0.0088346 ]
|
|
|
|
mean value: 0.008893537521362304
|
|
|
|
key: test_mcc
|
|
value: [0.28867513 0.1490712 0.57735027 0.52223297 0.57735027 0.31622777
|
|
0.17407766 0.71428571 0.4472136 0.1490712 ]
|
|
|
|
mean value: 0.39155557695993376
|
|
|
|
key: train_mcc
|
|
value: [0.51346746 0.55566238 0.54156609 0.50874391 0.50874391 0.52297636
|
|
0.50874391 0.53724272 0.58399712 0.51890567]
|
|
|
|
mean value: 0.5300049520306789
|
|
|
|
key: test_accuracy
|
|
value: [0.64285714 0.57142857 0.78571429 0.71428571 0.78571429 0.64285714
|
|
0.57142857 0.85714286 0.71428571 0.57142857]
|
|
|
|
mean value: 0.6857142857142857
|
|
|
|
key: train_accuracy
|
|
value: [0.74603175 0.76984127 0.76190476 0.74603175 0.74603175 0.75396825
|
|
0.74603175 0.76190476 0.78571429 0.74603175]
|
|
|
|
mean value: 0.7563492063492063
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.625 0.8 0.77777778 0.8 0.70588235
|
|
0.66666667 0.85714286 0.75 0.625 ]
|
|
|
|
mean value: 0.7274136321195145
|
|
|
|
key: train_fscore
|
|
value: [0.77777778 0.79432624 0.78873239 0.77464789 0.77464789 0.78014184
|
|
0.77464789 0.78571429 0.8057554 0.78082192]
|
|
|
|
mean value: 0.7837213518428147
|
|
|
|
key: test_precision
|
|
value: [0.625 0.55555556 0.75 0.63636364 0.75 0.6
|
|
0.54545455 0.85714286 0.66666667 0.55555556]
|
|
|
|
mean value: 0.6541738816738817
|
|
|
|
key: train_precision
|
|
value: [0.69135802 0.71794872 0.70886076 0.69620253 0.69620253 0.70512821
|
|
0.69620253 0.71428571 0.73684211 0.68674699]
|
|
|
|
mean value: 0.7049778109699341
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.71428571 0.85714286 1. 0.85714286 0.85714286
|
|
0.85714286 0.85714286 0.85714286 0.71428571]
|
|
|
|
mean value: 0.8285714285714285
|
|
|
|
key: train_recall
|
|
value: [0.88888889 0.88888889 0.88888889 0.87301587 0.87301587 0.87301587
|
|
0.87301587 0.87301587 0.88888889 0.9047619 ]
|
|
|
|
mean value: 0.8825396825396825
|
|
|
|
key: test_roc_auc
|
|
value: [0.64285714 0.57142857 0.78571429 0.71428571 0.78571429 0.64285714
|
|
0.57142857 0.85714286 0.71428571 0.57142857]
|
|
|
|
mean value: 0.6857142857142857
|
|
|
|
key: train_roc_auc
|
|
value: [0.74603175 0.76984127 0.76190476 0.74603175 0.74603175 0.75396825
|
|
0.74603175 0.76190476 0.78571429 0.74603175]
|
|
|
|
mean value: 0.7563492063492063
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.45454545 0.66666667 0.63636364 0.66666667 0.54545455
|
|
0.5 0.75 0.6 0.45454545]
|
|
|
|
mean value: 0.5774242424242424
|
|
|
|
key: train_jcc
|
|
value: [0.63636364 0.65882353 0.65116279 0.63218391 0.63218391 0.63953488
|
|
0.63218391 0.64705882 0.6746988 0.64044944]
|
|
|
|
mean value: 0.6444643621244319
|
|
|
|
MCC on Blind test: 0.0
|
|
|
|
Accuracy on Blind test: 0.5
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.00863433 0.00944114 0.00866365 0.00854492 0.00874066 0.00937176
|
|
0.00906944 0.00859618 0.00928903 0.00939155]
|
|
|
|
mean value: 0.008974266052246094
|
|
|
|
key: score_time
|
|
value: [0.01479411 0.01020455 0.00980353 0.00954485 0.01014256 0.00992489
|
|
0.01043534 0.00978708 0.01040745 0.01025319]
|
|
|
|
mean value: 0.010529756546020508
|
|
|
|
key: test_mcc
|
|
value: [0.17407766 0.4472136 0.8660254 0. 0.42857143 0.4472136
|
|
0.52223297 0.71428571 0.4472136 0.14285714]
|
|
|
|
mean value: 0.41896910998213893
|
|
|
|
key: train_mcc
|
|
value: [0.62699668 0.57438828 0.59484301 0.68565632 0.63059263 0.6415003
|
|
0.67082039 0.65915036 0.69526879 0.58834841]
|
|
|
|
mean value: 0.6367565156928385
|
|
|
|
key: test_accuracy
|
|
value: [0.57142857 0.71428571 0.92857143 0.5 0.71428571 0.71428571
|
|
0.71428571 0.85714286 0.71428571 0.57142857]
|
|
|
|
mean value: 0.7
|
|
|
|
key: train_accuracy
|
|
value: [0.80952381 0.77777778 0.79365079 0.84126984 0.80952381 0.81746032
|
|
0.83333333 0.82539683 0.84126984 0.78571429]
|
|
|
|
mean value: 0.8134920634920635
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.75 0.93333333 0.58823529 0.71428571 0.75
|
|
0.77777778 0.85714286 0.75 0.57142857]
|
|
|
|
mean value: 0.7358870214752568
|
|
|
|
key: train_fscore
|
|
value: [0.82352941 0.8028169 0.80882353 0.84848485 0.82608696 0.82962963
|
|
0.84210526 0.83823529 0.85507246 0.80851064]
|
|
|
|
mean value: 0.8283294936562668
|
|
|
|
key: test_precision
|
|
value: [0.54545455 0.66666667 0.875 0.5 0.71428571 0.66666667
|
|
0.63636364 0.85714286 0.66666667 0.57142857]
|
|
|
|
mean value: 0.6699675324675325
|
|
|
|
key: train_precision
|
|
value: [0.76712329 0.72151899 0.75342466 0.8115942 0.76 0.77777778
|
|
0.8 0.78082192 0.78666667 0.73076923]
|
|
|
|
mean value: 0.7689696728467696
|
|
|
|
key: test_recall
|
|
value: [0.85714286 0.85714286 1. 0.71428571 0.71428571 0.85714286
|
|
1. 0.85714286 0.85714286 0.57142857]
|
|
|
|
mean value: 0.8285714285714285
|
|
|
|
key: train_recall
|
|
value: [0.88888889 0.9047619 0.87301587 0.88888889 0.9047619 0.88888889
|
|
0.88888889 0.9047619 0.93650794 0.9047619 ]
|
|
|
|
mean value: 0.8984126984126984
|
|
|
|
key: test_roc_auc
|
|
value: [0.57142857 0.71428571 0.92857143 0.5 0.71428571 0.71428571
|
|
0.71428571 0.85714286 0.71428571 0.57142857]
|
|
|
|
mean value: 0.7
|
|
|
|
key: train_roc_auc
|
|
value: [0.80952381 0.77777778 0.79365079 0.84126984 0.80952381 0.81746032
|
|
0.83333333 0.82539683 0.84126984 0.78571429]
|
|
|
|
mean value: 0.8134920634920635
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.6 0.875 0.41666667 0.55555556 0.6
|
|
0.63636364 0.75 0.6 0.4 ]
|
|
|
|
mean value: 0.5933585858585858
|
|
|
|
key: train_jcc
|
|
value: [0.7 0.67058824 0.67901235 0.73684211 0.7037037 0.70886076
|
|
0.72727273 0.72151899 0.74683544 0.67857143]
|
|
|
|
mean value: 0.7073205735657565
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01068568 0.01109004 0.0109911 0.01106501 0.0109973 0.01087308
|
|
0.00957727 0.01096416 0.01100779 0.01103806]
|
|
|
|
mean value: 0.010828948020935059
|
|
|
|
key: score_time
|
|
value: [0.00986814 0.00978184 0.00979042 0.00876927 0.00898647 0.00972056
|
|
0.00877142 0.00978684 0.00980353 0.00994635]
|
|
|
|
mean value: 0.009522485733032226
|
|
|
|
key: test_mcc
|
|
value: [0.74535599 0.57735027 0.31622777 0.71428571 0.4472136 0.71428571
|
|
0.74535599 0.31622777 0.42857143 0. ]
|
|
|
|
mean value: 0.5004874238865976
|
|
|
|
key: train_mcc
|
|
value: [0.92075092 0.93650794 0.85811633 0.88989842 0.85985517 0.88989842
|
|
0.9369802 0.88900089 0.88900089 0.87301587]
|
|
|
|
mean value: 0.89430250462966
|
|
|
|
key: test_accuracy
|
|
value: [0.85714286 0.78571429 0.64285714 0.85714286 0.71428571 0.85714286
|
|
0.85714286 0.64285714 0.71428571 0.5 ]
|
|
|
|
mean value: 0.7428571428571429
|
|
|
|
key: train_accuracy
|
|
value: [0.96031746 0.96825397 0.92857143 0.94444444 0.92857143 0.94444444
|
|
0.96825397 0.94444444 0.94444444 0.93650794]
|
|
|
|
mean value: 0.9468253968253968
|
|
|
|
key: test_fscore
|
|
value: [0.83333333 0.8 0.70588235 0.85714286 0.66666667 0.85714286
|
|
0.875 0.54545455 0.71428571 0.53333333]
|
|
|
|
mean value: 0.7388241660300483
|
|
|
|
key: train_fscore
|
|
value: [0.96 0.96825397 0.92682927 0.94308943 0.92561983 0.94308943
|
|
0.96774194 0.944 0.944 0.93650794]
|
|
|
|
mean value: 0.945913180503782
|
|
|
|
key: test_precision
|
|
value: [1. 0.75 0.6 0.85714286 0.8 0.85714286
|
|
0.77777778 0.75 0.71428571 0.5 ]
|
|
|
|
mean value: 0.7606349206349207
|
|
|
|
key: train_precision
|
|
value: [0.96774194 0.96825397 0.95 0.96666667 0.96551724 0.96666667
|
|
0.98360656 0.9516129 0.9516129 0.93650794]
|
|
|
|
mean value: 0.9608186778787081
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.85714286 0.85714286 0.85714286 0.57142857 0.85714286
|
|
1. 0.42857143 0.71428571 0.57142857]
|
|
|
|
mean value: 0.7428571428571429
|
|
|
|
key: train_recall
|
|
value: [0.95238095 0.96825397 0.9047619 0.92063492 0.88888889 0.92063492
|
|
0.95238095 0.93650794 0.93650794 0.93650794]
|
|
|
|
mean value: 0.9317460317460318
|
|
|
|
key: test_roc_auc
|
|
value: [0.85714286 0.78571429 0.64285714 0.85714286 0.71428571 0.85714286
|
|
0.85714286 0.64285714 0.71428571 0.5 ]
|
|
|
|
mean value: 0.7428571428571429
|
|
|
|
key: train_roc_auc
|
|
value: [0.96031746 0.96825397 0.92857143 0.94444444 0.92857143 0.94444444
|
|
0.96825397 0.94444444 0.94444444 0.93650794]
|
|
|
|
mean value: 0.9468253968253968
|
|
|
|
key: test_jcc
|
|
value: [0.71428571 0.66666667 0.54545455 0.75 0.5 0.75
|
|
0.77777778 0.375 0.55555556 0.36363636]
|
|
|
|
mean value: 0.5998376623376623
|
|
|
|
key: train_jcc
|
|
value: [0.92307692 0.93846154 0.86363636 0.89230769 0.86153846 0.89230769
|
|
0.9375 0.89393939 0.89393939 0.88059701]
|
|
|
|
mean value: 0.8977304474132832
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.65259933 0.55132651 0.52340722 0.52382731 0.71559811 0.55424976
|
|
0.4957695 0.56733274 0.7447772 0.56227398]
|
|
|
|
mean value: 0.5891161680221557
|
|
|
|
key: score_time
|
|
value: [0.01228452 0.01230955 0.01233864 0.01230621 0.0232439 0.01236844
|
|
0.01233506 0.01241755 0.01222134 0.0124197 ]
|
|
|
|
mean value: 0.013424491882324219
|
|
|
|
key: test_mcc
|
|
value: [0.57735027 0.1490712 0.74535599 0.74535599 0.71428571 0.4472136
|
|
0.52223297 0.8660254 0.74535599 0.57735027]
|
|
|
|
mean value: 0.6089597395816232
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.78571429 0.57142857 0.85714286 0.85714286 0.85714286 0.71428571
|
|
0.71428571 0.92857143 0.85714286 0.78571429]
|
|
|
|
mean value: 0.7928571428571428
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.8 0.625 0.875 0.875 0.85714286 0.75
|
|
0.77777778 0.92307692 0.83333333 0.76923077]
|
|
|
|
mean value: 0.8085561660561661
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.75 0.55555556 0.77777778 0.77777778 0.85714286 0.66666667
|
|
0.63636364 1. 1. 0.83333333]
|
|
|
|
mean value: 0.7854617604617604
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.85714286 0.71428571 1. 1. 0.85714286 0.85714286
|
|
1. 0.85714286 0.71428571 0.71428571]
|
|
|
|
mean value: 0.8571428571428571
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.78571429 0.57142857 0.85714286 0.85714286 0.85714286 0.71428571
|
|
0.71428571 0.92857143 0.85714286 0.78571429]
|
|
|
|
mean value: 0.7928571428571429
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.66666667 0.45454545 0.77777778 0.77777778 0.75 0.6
|
|
0.63636364 0.85714286 0.71428571 0.625 ]
|
|
|
|
mean value: 0.6859559884559885
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.9
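Note on the MLP scores: every train metric is 1.0 while the test folds vary widely, so the mean alone hides both the train/test gap and the fold-to-fold spread. The sketch below summarises the test_mcc and train_mcc arrays printed above using only the standard library. The variable names are illustrative and the values are copied from the log as printed (rounded to 8 decimals).

```python
import statistics

# Per-fold MCC values copied from the MLP block of the log.
test_mcc = [
    0.57735027, 0.1490712, 0.74535599, 0.74535599, 0.71428571,
    0.4472136, 0.52223297, 0.8660254, 0.74535599, 0.57735027,
]
train_mcc = [1.0] * 10

mean_test = statistics.mean(test_mcc)
sd_test = statistics.stdev(test_mcc)  # sample standard deviation across folds
gap = statistics.mean(train_mcc) - mean_test  # train minus test

print(f"test MCC mean={mean_test:.4f} sd={sd_test:.4f} train-test gap={gap:.4f}")
```

The mean reproduces the logged value of about 0.609. A train-minus-test gap of roughly 0.39 with perfect training scores is the usual signature of overfitting on a small dataset, though the log alone does not establish the cause.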
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01516175 0.01404548 0.01218319 0.0120542 0.01173067 0.01125073
|
|
0.01186585 0.0115962 0.01139712 0.01205778]
|
|
|
|
mean value: 0.012334299087524415
|
|
|
|
key: score_time
|
|
value: [0.01173472 0.009022 0.00886035 0.00956607 0.00870371 0.00868201
|
|
0.00949168 0.00859475 0.00865388 0.00905395]
|
|
|
|
mean value: 0.009236311912536621
|
|
|
|
key: test_mcc
|
|
value: [1. 0.71428571 1. 0.71428571 1. 0.71428571
|
|
0.74535599 0.71428571 0.74535599 0.8660254 ]
|
|
|
|
mean value: 0.8213880245927155
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.85714286 1. 0.85714286 1. 0.85714286
|
|
0.85714286 0.85714286 0.85714286 0.92857143]
|
|
|
|
mean value: 0.9071428571428571
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.85714286 1. 0.85714286 1. 0.85714286
|
|
0.875 0.85714286 0.83333333 0.93333333]
|
|
|
|
mean value: 0.9070238095238095
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.85714286 1. 0.85714286 1. 0.85714286
|
|
0.77777778 0.85714286 1. 0.875 ]
|
|
|
|
mean value: 0.9081349206349206
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.85714286 1. 0.85714286 1. 0.85714286
|
|
1. 0.85714286 0.71428571 1. ]
|
|
|
|
mean value: 0.9142857142857143
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.85714286 1. 0.85714286 1. 0.85714286
|
|
0.85714286 0.85714286 0.85714286 0.92857143]
|
|
|
|
mean value: 0.9071428571428571
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.75 1. 0.75 1. 0.75
|
|
0.77777778 0.75 0.71428571 0.875 ]
|
|
|
|
mean value: 0.8367063492063492
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.67
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.08879137 0.08702278 0.08695221 0.0866487 0.08714247 0.08788347
|
|
0.08967614 0.08844662 0.09147382 0.09041929]
|
|
|
|
mean value: 0.08844568729400634
|
|
|
|
key: score_time
|
|
value: [0.01770091 0.01726842 0.01739073 0.017308 0.01802349 0.0170784
|
|
0.01703453 0.01725245 0.01726747 0.01859736]
|
|
|
|
mean value: 0.017492175102233887
|
|
|
|
key: test_mcc
|
|
value: [0.74535599 0.57735027 0.74535599 0.4472136 0.57735027 0.74535599
|
|
0.31622777 0.74535599 0.8660254 0.42857143]
|
|
|
|
mean value: 0.6194162702251634
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.85714286 0.78571429 0.85714286 0.71428571 0.78571429 0.85714286
|
|
0.64285714 0.85714286 0.92857143 0.71428571]
|
|
|
|
mean value: 0.8
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.83333333 0.8 0.875 0.75 0.76923077 0.875
|
|
0.70588235 0.83333333 0.92307692 0.71428571]
|
|
|
|
mean value: 0.8079142426201249
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.75 0.77777778 0.66666667 0.83333333 0.77777778
|
|
0.6 1. 1. 0.71428571]
|
|
|
|
mean value: 0.811984126984127
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.85714286 1. 0.85714286 0.71428571 1.
|
|
0.85714286 0.71428571 0.85714286 0.71428571]
|
|
|
|
mean value: 0.8285714285714285
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.85714286 0.78571429 0.85714286 0.71428571 0.78571429 0.85714286
|
|
0.64285714 0.85714286 0.92857143 0.71428571]
|
|
|
|
mean value: 0.8
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.71428571 0.66666667 0.77777778 0.6 0.625 0.77777778
|
|
0.54545455 0.71428571 0.85714286 0.55555556]
|
|
|
|
mean value: 0.6833946608946608
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00893784 0.00919008 0.00887132 0.0088284 0.00957584 0.00873303
|
|
0.00881243 0.00870514 0.00895 0.00870538]
|
|
|
|
mean value: 0.00893094539642334
|
|
|
|
key: score_time
|
|
value: [0.00892377 0.0087471 0.00869346 0.00878525 0.00857949 0.00855255
|
|
0.00867772 0.00869751 0.00853705 0.00853515]
|
|
|
|
mean value: 0.00867290496826172
|
|
|
|
key: test_mcc
|
|
value: [0.4472136 0. 0.31622777 0.57735027 0.28867513 0.
|
|
0.28867513 0.1490712 0.8660254 0.63245553]
|
|
|
|
mean value: 0.3565694034214148
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.71428571 0.5 0.64285714 0.78571429 0.64285714 0.5
|
|
0.64285714 0.57142857 0.92857143 0.78571429]
|
|
|
|
mean value: 0.6714285714285715
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.75 0.53333333 0.70588235 0.8 0.66666667 0.36363636
|
|
0.61538462 0.625 0.93333333 0.72727273]
|
|
|
|
mean value: 0.6720509392568216
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.66666667 0.5 0.6 0.75 0.625 0.5
|
|
0.66666667 0.55555556 0.875 1. ]
|
|
|
|
mean value: 0.6738888888888889
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.85714286 0.57142857 0.85714286 0.85714286 0.71428571 0.28571429
|
|
0.57142857 0.71428571 1. 0.57142857]
|
|
|
|
mean value: 0.7
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.71428571 0.5 0.64285714 0.78571429 0.64285714 0.5
|
|
0.64285714 0.57142857 0.92857143 0.78571429]
|
|
|
|
mean value: 0.6714285714285715
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.6 0.36363636 0.54545455 0.66666667 0.5 0.22222222
|
|
0.44444444 0.45454545 0.875 0.57142857]
|
|
|
|
mean value: 0.5243398268398268
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.53
|
|
|
|
Accuracy on Blind test: 0.7
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.13994312 1.12470388 1.12413239 1.10539031 1.11974645 1.11329842
|
|
1.13287234 1.18959332 1.16923952 1.10620642]
|
|
|
|
mean value: 1.132512617111206
|
|
|
|
key: score_time
|
|
value: [0.09226322 0.09364581 0.08707714 0.08611917 0.08741522 0.08654284
|
|
0.09422922 0.09448171 0.08700848 0.09361315]
|
|
|
|
mean value: 0.09023959636688232
|
|
|
|
key: test_mcc
|
|
value: [0.63245553 0.57735027 0.8660254 0.8660254 0.71428571 0.74535599
|
|
0.31622777 0.8660254 0.8660254 0.57735027]
|
|
|
|
mean value: 0.7027127158353165
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.78571429 0.78571429 0.92857143 0.92857143 0.85714286 0.85714286
|
|
0.64285714 0.92857143 0.92857143 0.78571429]
|
|
|
|
mean value: 0.8428571428571429
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.72727273 0.8 0.92307692 0.93333333 0.85714286 0.875
|
|
0.70588235 0.92307692 0.92307692 0.76923077]
|
|
|
|
mean value: 0.8437092809151633
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.75 1. 0.875 0.85714286 0.77777778
|
|
0.6 1. 1. 0.83333333]
|
|
|
|
mean value: 0.8693253968253968
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.57142857 0.85714286 0.85714286 1. 0.85714286 1.
|
|
0.85714286 0.85714286 0.85714286 0.71428571]
|
|
|
|
mean value: 0.8428571428571429
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.78571429 0.78571429 0.92857143 0.92857143 0.85714286 0.85714286
|
|
0.64285714 0.92857143 0.92857143 0.78571429]
|
|
|
|
mean value: 0.8428571428571429
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
  warn(
|
|
value: [0.57142857 0.66666667 0.85714286 0.875 0.75 0.77777778
|
|
0.54545455 0.85714286 0.85714286 0.625 ]
|
|
|
|
mean value: 0.7382756132756132
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000...05', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.82786345 0.85628772 0.86232376 0.87840986 0.8767724 0.84112525
|
|
0.86769676 0.88985515 0.93869185 0.93964982]
|
|
|
|
mean value: 0.877867603302002
|
|
|
|
key: score_time
|
|
value: [0.21467733 0.18964672 0.21193528 0.17583776 0.15622592 0.23833346
|
|
0.22759962 0.23133421 0.20800018 0.16814971]
|
|
|
|
mean value: 0.2021740198135376
|
|
|
|
key: test_mcc
|
|
value: [0.74535599 0.57735027 0.8660254 0.74535599 0.71428571 0.74535599
|
|
0.4472136 0.8660254 0.8660254 0.63245553]
|
|
|
|
mean value: 0.720544929986208
|
|
|
|
key: train_mcc
|
|
value: [0.93650794 0.95250095 0.95250095 0.95250095 0.95346259 0.9216805
|
|
0.95250095 0.98425098 0.96825397 0.9216805 ]
|
|
|
|
mean value: 0.94958402941395
|
|
|
|
key: test_accuracy
|
|
value: [0.85714286 0.78571429 0.92857143 0.85714286 0.85714286 0.85714286
|
|
0.71428571 0.92857143 0.92857143 0.78571429]
|
|
|
|
mean value: 0.85
|
|
|
|
key: train_accuracy
|
|
value: [0.96825397 0.97619048 0.97619048 0.97619048 0.97619048 0.96031746
|
|
0.97619048 0.99206349 0.98412698 0.96031746]
|
|
|
|
mean value: 0.9746031746031746
|
|
|
|
key: test_fscore
|
|
value: [0.83333333 0.8 0.92307692 0.875 0.85714286 0.875
|
|
0.75 0.92307692 0.92307692 0.72727273]
|
|
|
|
mean value: 0.8486979686979687
|
|
|
|
key: train_fscore
|
|
value: [0.96825397 0.976 0.976 0.976 0.97560976 0.95934959
|
|
0.976 0.992 0.98412698 0.95934959]
|
|
|
|
mean value: 0.9742689895470383
|
|
|
|
key: test_precision
|
|
value: [1. 0.75 1. 0.77777778 0.85714286 0.77777778
|
|
0.66666667 1. 1. 1. ]
|
|
|
|
mean value: 0.8829365079365079
|
|
|
|
key: train_precision
|
|
value: [0.96825397 0.98387097 0.98387097 0.98387097 1. 0.98333333
|
|
0.98387097 1. 0.98412698 0.98333333]
|
|
|
|
mean value: 0.9854531490015361
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.85714286 0.85714286 1. 0.85714286 1.
|
|
0.85714286 0.85714286 0.85714286 0.57142857]
|
|
|
|
mean value: 0.8428571428571429
|
|
|
|
key: train_recall
|
|
value: [0.96825397 0.96825397 0.96825397 0.96825397 0.95238095 0.93650794
|
|
0.96825397 0.98412698 0.98412698 0.93650794]
|
|
|
|
mean value: 0.9634920634920635
|
|
|
|
key: test_roc_auc
|
|
value: [0.85714286 0.78571429 0.92857143 0.85714286 0.85714286 0.85714286
|
|
0.71428571 0.92857143 0.92857143 0.78571429]
|
|
|
|
mean value: 0.8500000000000001
|
|
|
|
key: train_roc_auc
|
|
value: [0.96825397 0.97619048 0.97619048 0.97619048 0.97619048 0.96031746
|
|
0.97619048 0.99206349 0.98412698 0.96031746]
|
|
|
|
mean value: 0.9746031746031747
|
|
|
|
key: test_jcc
|
|
value: [0.71428571 0.66666667 0.85714286 0.77777778 0.75 0.77777778
|
|
0.6 0.85714286 0.85714286 0.57142857]
|
|
|
|
mean value: 0.7429365079365079
|
|
|
|
key: train_jcc
|
|
value: [0.93846154 0.953125 0.953125 0.953125 0.95238095 0.921875
|
|
0.953125 0.98412698 0.96875 0.921875 ]
|
|
|
|
mean value: 0.9499969474969475
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=167)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', BernoulliNB())])
key: fit_time
value: [0.01653957 0.01010847 0.00984597 0.00914907 0.0092411 0.009938
0.01003265 0.00987983 0.00996757 0.00906849]
mean value: 0.010377073287963867
key: score_time
value: [0.00953102 0.00949621 0.00962806 0.00897574 0.00871825 0.00933528
0.00961423 0.00881052 0.00943184 0.00919628]
mean value: 0.009273743629455567
key: test_mcc
value: [0.28867513 0.1490712 0.57735027 0.52223297 0.57735027 0.31622777
0.17407766 0.71428571 0.4472136 0.1490712 ]
mean value: 0.39155557695993376
key: train_mcc
value: [0.51346746 0.55566238 0.54156609 0.50874391 0.50874391 0.52297636
0.50874391 0.53724272 0.58399712 0.51890567]
mean value: 0.5300049520306789
key: test_accuracy
value: [0.64285714 0.57142857 0.78571429 0.71428571 0.78571429 0.64285714
0.57142857 0.85714286 0.71428571 0.57142857]
mean value: 0.6857142857142857
key: train_accuracy
value: [0.74603175 0.76984127 0.76190476 0.74603175 0.74603175 0.75396825
0.74603175 0.76190476 0.78571429 0.74603175]
mean value: 0.7563492063492063
key: test_fscore
value: [0.66666667 0.625 0.8 0.77777778 0.8 0.70588235
0.66666667 0.85714286 0.75 0.625 ]
mean value: 0.7274136321195145
key: train_fscore
value: [0.77777778 0.79432624 0.78873239 0.77464789 0.77464789 0.78014184
0.77464789 0.78571429 0.8057554 0.78082192]
mean value: 0.7837213518428147
key: test_precision
value: [0.625 0.55555556 0.75 0.63636364 0.75 0.6
0.54545455 0.85714286 0.66666667 0.55555556]
mean value: 0.6541738816738817
key: train_precision
value: [0.69135802 0.71794872 0.70886076 0.69620253 0.69620253 0.70512821
0.69620253 0.71428571 0.73684211 0.68674699]
mean value: 0.7049778109699341
key: test_recall
value: [0.71428571 0.71428571 0.85714286 1. 0.85714286 0.85714286
0.85714286 0.85714286 0.85714286 0.71428571]
mean value: 0.8285714285714285
key: train_recall
value: [0.88888889 0.88888889 0.88888889 0.87301587 0.87301587 0.87301587
0.87301587 0.87301587 0.88888889 0.9047619 ]
mean value: 0.8825396825396825
key: test_roc_auc
value: [0.64285714 0.57142857 0.78571429 0.71428571 0.78571429 0.64285714
0.57142857 0.85714286 0.71428571 0.57142857]
mean value: 0.6857142857142857
key: train_roc_auc
value: [0.74603175 0.76984127 0.76190476 0.74603175 0.74603175 0.75396825
0.74603175 0.76190476 0.78571429 0.74603175]
mean value: 0.7563492063492063
key: test_jcc
value: [0.5 0.45454545 0.66666667 0.63636364 0.66666667 0.54545455
0.5 0.75 0.6 0.45454545]
mean value: 0.5774242424242424
key: train_jcc
value: [0.63636364 0.65882353 0.65116279 0.63218391 0.63218391 0.63953488
0.63218391 0.64705882 0.6746988 0.64044944]
mean value: 0.6444643621244319
MCC on Blind test: 0.0
Accuracy on Blind test: 0.5
Model_name: XGBoost
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000...
interaction_constraints=None, learning_rate=None,
max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan,
monotone_constraints=None, n_estimators=100,
n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None,
reg_lambda=None, scale_pos_weight=None,
subsample=None, tree_method=None,
use_label_encoder=False,
validate_parameters=None, verbosity=0))])
key: fit_time
value: [0.07304168 0.06317854 0.06117392 0.05121827 0.05126238 0.22504473
0.04057932 0.04057169 0.0457325 0.04536366]
mean value: 0.06971666812896729
key: score_time
value: [0.01124525 0.01068401 0.01110744 0.01069069 0.01071715 0.01070666
0.0110786 0.01121998 0.01117444 0.01086068]
mean value: 0.010948491096496583
key: test_mcc
value: [0.8660254 0.71428571 1. 0.8660254 1. 0.8660254
0.8660254 0.71428571 0.8660254 1. ]
mean value: 0.8758698447493622
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.92857143 0.85714286 1. 0.92857143 1. 0.92857143
0.92857143 0.85714286 0.92857143 1. ]
mean value: 0.9357142857142857
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.93333333 0.85714286 1. 0.92307692 1. 0.93333333
0.93333333 0.85714286 0.92307692 1. ]
mean value: 0.9360439560439561
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.875 0.85714286 1. 1. 1. 0.875
0.875 0.85714286 1. 1. ]
mean value: 0.9339285714285714
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [1. 0.85714286 1. 0.85714286 1. 1.
1. 0.85714286 0.85714286 1. ]
mean value: 0.9428571428571428
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.92857143 0.85714286 1. 0.92857143 1. 0.92857143
0.92857143 0.85714286 0.92857143 1. ]
mean value: 0.9357142857142857
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.875 0.75 1. 0.85714286 1. 0.875
0.875 0.75 0.85714286 1. ]
mean value: 0.8839285714285714
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 1.0
Accuracy on Blind test: 1.0
Model_name: LDA
Model func: LinearDiscriminantAnalysis()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=167)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LinearDiscriminantAnalysis())])
key: fit_time
value: [0.02651215 0.04846931 0.04752612 0.04747295 0.04773426 0.05400276
0.04773259 0.04830027 0.04831672 0.04797649]
mean value: 0.046404361724853516
key: score_time
value: [0.02096605 0.02244663 0.02062106 0.02118969 0.02212191 0.02337909
0.0236187 0.02378559 0.02339005 0.02385402]
mean value: 0.02253727912902832
key: test_mcc
value: [0.4472136 0.42857143 0.63245553 0.40824829 0.74535599 0.42857143
0.8660254 0.57735027 0.74535599 0.63245553]
mean value: 0.5911603465147954
key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_accuracy
value: [0.71428571 0.71428571 0.78571429 0.64285714 0.85714286 0.71428571
0.92857143 0.78571429 0.85714286 0.78571429]
mean value: 0.7785714285714286
key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_fscore
value: [0.75 0.71428571 0.82352941 0.73684211 0.875 0.71428571
0.93333333 0.8 0.875 0.82352941]
mean value: 0.8045805690697332
key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_precision
value: [0.66666667 0.71428571 0.7 0.58333333 0.77777778 0.71428571
0.875 0.75 0.77777778 0.7 ]
mean value: 0.7259126984126985
key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_recall
value: [0.85714286 0.71428571 1. 1. 1. 0.71428571
1. 0.85714286 1. 1. ]
mean value: 0.9142857142857143
key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_roc_auc
value: [0.71428571 0.71428571 0.78571429 0.64285714 0.85714286 0.71428571
0.92857143 0.78571429 0.85714286 0.78571429]
mean value: 0.7785714285714286
key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
key: test_jcc
value: [0.6 0.55555556 0.7 0.58333333 0.77777778 0.55555556
0.875 0.66666667 0.77777778 0.7 ]
mean value: 0.6791666666666667
key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0
MCC on Blind test: 0.58
Accuracy on Blind test: 0.8
Model_name: Multinomial
Model func: MultinomialNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=167)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MultinomialNB())])
key: fit_time
value: [0.01848125 0.00909686 0.00875235 0.00871468 0.00875378 0.00881767
0.00874782 0.00951767 0.00960565 0.00958204]
mean value: 0.010006976127624512
key: score_time
value: [0.00905585 0.00891948 0.00856829 0.0086031 0.00858045 0.00861192
0.00858426 0.00927019 0.0089469 0.00931072]
mean value: 0.008845114707946777
key: test_mcc
value: [0. 0. 0. 0.31622777 0.28867513 0.71428571
0.1490712 0.42857143 0.57735027 0. ]
mean value: 0.24741815111584053
key: train_mcc
value: [0.39562828 0.40700206 0.43656413 0.42177569 0.37444189 0.33296358
0.37444189 0.49838198 0.35954625 0.31803896]
mean value: 0.39187847170747214
key: test_accuracy
value: [0.5 0.5 0.5 0.64285714 0.64285714 0.85714286
0.57142857 0.71428571 0.78571429 0.5 ]
mean value: 0.6214285714285714
key: train_accuracy
value: [0.69047619 0.6984127 0.71428571 0.70634921 0.68253968 0.65873016
0.68253968 0.74603175 0.67460317 0.65079365]
mean value: 0.6904761904761905
key: test_fscore
value: [0.58823529 0.63157895 0.58823529 0.70588235 0.66666667 0.85714286
0.625 0.71428571 0.8 0.53333333]
mean value: 0.6710360459973462
key: train_fscore
value: [0.72727273 0.72857143 0.73913043 0.73381295 0.71428571 0.70344828
0.71428571 0.76470588 0.70921986 0.69863014]
mean value: 0.7233363122195821
key: test_precision
value: [0.5 0.5 0.5 0.6 0.625 0.85714286
0.55555556 0.71428571 0.75 0.5 ]
mean value: 0.6101984126984127
key: train_precision
value: [0.65 0.66233766 0.68 0.67105263 0.64935065 0.62195122
0.64935065 0.71232877 0.64102564 0.61445783]
mean value: 0.6551855051604334
key: test_recall
value: [0.71428571 0.85714286 0.71428571 0.85714286 0.71428571 0.85714286
0.71428571 0.71428571 0.85714286 0.57142857]
mean value: 0.7571428571428571
key: train_recall
value: [0.82539683 0.80952381 0.80952381 0.80952381 0.79365079 0.80952381
0.79365079 0.82539683 0.79365079 0.80952381]
mean value: 0.807936507936508
key: test_roc_auc
value: [0.5 0.5 0.5 0.64285714 0.64285714 0.85714286
0.57142857 0.71428571 0.78571429 0.5 ]
mean value: 0.6214285714285714
key: train_roc_auc
value: [0.69047619 0.6984127 0.71428571 0.70634921 0.68253968 0.65873016
0.68253968 0.74603175 0.67460317 0.65079365]
mean value: 0.6904761904761905
key: test_jcc
value: [0.41666667 0.46153846 0.41666667 0.54545455 0.5 0.75
0.45454545 0.55555556 0.66666667 0.36363636]
mean value: 0.5130730380730381
key: train_jcc
value: [0.57142857 0.57303371 0.5862069 0.57954545 0.55555556 0.54255319
0.55555556 0.61904762 0.54945055 0.53684211]
mean value: 0.5669219206752718
MCC on Blind test: 0.8
Accuracy on Blind test: 0.9
Model_name: Passive Aggresive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=167)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model',
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
key: fit_time
value: [0.01188612 0.01601815 0.01570535 0.01414609 0.01447415 0.0142417
0.01480913 0.01494217 0.01382351 0.01519036]
mean value: 0.014523673057556152
key: score_time
value: [0.00955725 0.01170158 0.01173687 0.01169157 0.01166677 0.01166749
0.0117631 0.01186204 0.01185417 0.01204991]
mean value: 0.011555075645446777
key: test_mcc
value: [0.57735027 0.28867513 0.74535599 0.52223297 0.71428571 0.8660254
0.71428571 0.8660254 0.8660254 0.42857143]
mean value: 0.6588833432647635
key: train_mcc
value: [0.96825397 1. 0.96825397 0.75828754 0.95346259 0.92354815
0.88014083 0.95250095 0.95250095 0.98425098]
mean value: 0.934119993839683
key: test_accuracy
value: [0.78571429 0.64285714 0.85714286 0.71428571 0.85714286 0.92857143
0.85714286 0.92857143 0.92857143 0.71428571]
mean value: 0.8214285714285714
key: train_accuracy
value: [0.98412698 1. 0.98412698 0.86507937 0.97619048 0.96031746
0.93650794 0.97619048 0.97619048 0.99206349]
mean value: 0.9650793650793651
key: test_fscore
value: [0.76923077 0.66666667 0.875 0.77777778 0.85714286 0.93333333
0.85714286 0.92307692 0.92307692 0.71428571]
mean value: 0.8296733821733822
key: train_fscore
value: [0.98412698 1. 0.98412698 0.88111888 0.97560976 0.95867769
0.93220339 0.97637795 0.976 0.99212598]
mean value: 0.9660367618259206
key: test_precision
value: [0.83333333 0.625 0.77777778 0.63636364 0.85714286 0.875
0.85714286 1. 1. 0.71428571]
mean value: 0.8176046176046176
key: train_precision
value: [0.98412698 1. 0.98412698 0.7875 1. 1.
1. 0.96875 0.98387097 0.984375 ]
mean value: 0.9692749935995904
key: test_recall
value: [0.71428571 0.71428571 1. 1. 0.85714286 1.
0.85714286 0.85714286 0.85714286 0.71428571]
mean value: 0.8571428571428571
key: train_recall
value: [0.98412698 1. 0.98412698 1. 0.95238095 0.92063492
0.87301587 0.98412698 0.96825397 1. ]
mean value: 0.9666666666666667
key: test_roc_auc
value: [0.78571429 0.64285714 0.85714286 0.71428571 0.85714286 0.92857143
0.85714286 0.92857143 0.92857143 0.71428571]
mean value: 0.8214285714285715
key: train_roc_auc
value: [0.98412698 1. 0.98412698 0.86507937 0.97619048 0.96031746
0.93650794 0.97619048 0.97619048 0.99206349]
mean value: 0.9650793650793651
key: test_jcc
value: [0.625 0.5 0.77777778 0.63636364 0.75 0.875
0.75 0.85714286 0.85714286 0.55555556]
mean value: 0.7183982683982684
key: train_jcc
value: [0.96875 1. 0.96875 0.7875 0.95238095 0.92063492
0.87301587 0.95384615 0.953125 0.984375 ]
mean value: 0.93623778998779
MCC on Blind test: 0.25
Accuracy on Blind test: 0.6
Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=167)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SGDClassifier(n_jobs=10, random_state=42))])
key: fit_time
value: [0.01435566 0.01289821 0.01348329 0.01348233 0.01295853 0.01291037
0.01326108 0.01356745 0.01387191 0.01369071]
mean value: 0.013447952270507813
key: score_time
value: [0.00999284 0.01170158 0.01200676 0.011693 0.01173377 0.01174426
0.01169348 0.0117228 0.01174641 0.0116868 ]
mean value: 0.011572170257568359
key: test_mcc
value: [0.52223297 0.42857143 0.74535599 0.74535599 0.52223297 0.63245553
0.52223297 0.63245553 0.8660254 0.57735027]
mean value: 0.6194269054213986
key: train_mcc
value: [0.81110711 0.88014083 0.96874225 0.92354815 0.64477154 0.70710678
0.63245553 0.85207241 0.96825397 0.96825397]
mean value: 0.8356452531817429
key: test_accuracy
value: [0.71428571 0.71428571 0.85714286 0.85714286 0.71428571 0.78571429
0.71428571 0.78571429 0.92857143 0.78571429]
mean value: 0.7857142857142857
key: train_accuracy
value: [0.8968254 0.93650794 0.98412698 0.96031746 0.79365079 0.83333333
0.78571429 0.92063492 0.98412698 0.98412698]
mean value: 0.9079365079365079
key: test_fscore
value: [0.77777778 0.71428571 0.875 0.875 0.6 0.82352941
0.77777778 0.72727273 0.92307692 0.76923077]
mean value: 0.7862951101186395
key: train_fscore
value: [0.90647482 0.93220339 0.984375 0.96183206 0.74 0.85714286
0.82352941 0.9137931 0.98412698 0.98412698]
mean value: 0.9087604611652902
key: test_precision
value: [0.63636364 0.71428571 0.77777778 0.77777778 1. 0.7
0.63636364 1. 1. 0.83333333]
mean value: 0.8075901875901876
key: train_precision
|
|
value: [0.82894737 1. 0.96923077 0.92647059 1. 0.75
|
|
0.7 1. 0.98412698 0.98412698]
|
|
|
|
mean value: 0.9142902694141084
|
|
|
|
key: test_recall
|
|
value: [1. 0.71428571 1. 1. 0.42857143 1.
|
|
1. 0.57142857 0.85714286 0.71428571]
|
|
|
|
mean value: 0.8285714285714285
|
|
|
|
key: train_recall
|
|
value: [1. 0.87301587 1. 1. 0.58730159 1.
|
|
1. 0.84126984 0.98412698 0.98412698]
|
|
|
|
mean value: 0.926984126984127
|
|
|
|
key: test_roc_auc
|
|
value: [0.71428571 0.71428571 0.85714286 0.85714286 0.71428571 0.78571429
|
|
0.71428571 0.78571429 0.92857143 0.78571429]
|
|
|
|
mean value: 0.7857142857142857
|
|
|
|
key: train_roc_auc
|
|
value: [0.8968254 0.93650794 0.98412698 0.96031746 0.79365079 0.83333333
|
|
0.78571429 0.92063492 0.98412698 0.98412698]
|
|
|
|
mean value: 0.9079365079365079
|
|
|
|
key: test_jcc
|
|
value: [0.63636364 0.55555556 0.77777778 0.77777778 0.42857143 0.7
|
|
0.63636364 0.57142857 0.85714286 0.625 ]
|
|
|
|
mean value: 0.6565981240981241
|
|
|
|
key: train_jcc
|
|
value: [0.82894737 0.87301587 0.96923077 0.92647059 0.58730159 0.75
|
|
0.7 0.84126984 0.96875 0.96875 ]
|
|
|
|
mean value: 0.8413736027474418
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.11607385 0.1076138 0.10608673 0.10151887 0.10518885 0.10832644
|
|
0.10317087 0.10290027 0.10078859 0.10589075]
|
|
|
|
mean value: 0.10575590133666993
|
|
|
|
key: score_time
|
|
value: [0.01610017 0.01603413 0.01495147 0.01501298 0.01622701 0.0165329
|
|
0.01495528 0.01502419 0.01493001 0.0160799 ]
|
|
|
|
mean value: 0.015584802627563477
|
|
|
|
key: test_mcc
|
|
value: [0.8660254 0.4472136 1. 0.71428571 1. 0.57735027
|
|
0.8660254 0.71428571 0.8660254 1. ]
|
|
|
|
mean value: 0.8051211504614328
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.92857143 0.71428571 1. 0.85714286 1. 0.78571429
|
|
0.92857143 0.85714286 0.92857143 1. ]
|
|
|
|
mean value: 0.9
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.93333333 0.75 1. 0.85714286 1. 0.8
|
|
0.93333333 0.85714286 0.92307692 1. ]
|
|
|
|
mean value: 0.9054029304029304
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.875 0.66666667 1. 0.85714286 1. 0.75
|
|
0.875 0.85714286 1. 1. ]
|
|
|
|
mean value: 0.888095238095238
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.85714286 1. 0.85714286 1. 0.85714286
|
|
1. 0.85714286 0.85714286 1. ]
|
|
|
|
mean value: 0.9285714285714286
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.92857143 0.71428571 1. 0.85714286 1. 0.78571429
|
|
0.92857143 0.85714286 0.92857143 1. ]
|
|
|
|
mean value: 0.9
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.875 0.6 1. 0.75 1. 0.66666667
|
|
0.875 0.75 0.85714286 1. ]
|
|
|
|
mean value: 0.8373809523809523
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 1.0
|
|
|
|
Accuracy on Blind test: 1.0
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03131676 0.04441977 0.04395151 0.03373003 0.03642392 0.04966235
|
|
0.04267502 0.03279471 0.038481 0.04732513]
|
|
|
|
mean value: 0.040078020095825194
|
|
|
|
key: score_time
|
|
value: [0.01685858 0.02886391 0.02688169 0.03952265 0.03935456 0.02259636
|
|
0.0331192 0.0186348 0.02615976 0.02439404]
|
|
|
|
mean value: 0.027638554573059082
|
|
|
|
key: test_mcc
|
|
value: [0.8660254 0.71428571 1. 0.71428571 1. 0.8660254
|
|
1. 0.71428571 0.74535599 1. ]
|
|
|
|
mean value: 0.862026394292595
|
|
|
|
key: train_mcc
|
|
value: [0.98425098 0.98425098 0.98425098 1. 0.98425098 1.
|
|
1. 1. 0.96874225 1. ]
|
|
|
|
mean value: 0.9905746182632438
|
|
|
|
key: test_accuracy
|
|
value: [0.92857143 0.85714286 1. 0.85714286 1. 0.92857143
|
|
1. 0.85714286 0.85714286 1. ]
|
|
|
|
mean value: 0.9285714285714286
|
|
|
|
key: train_accuracy
|
|
value: [0.99206349 0.99206349 0.99206349 1. 0.99206349 1.
|
|
1. 1. 0.98412698 1. ]
|
|
|
|
mean value: 0.9952380952380953
|
|
|
|
key: test_fscore
|
|
value: [0.93333333 0.85714286 1. 0.85714286 1. 0.93333333
|
|
1. 0.85714286 0.83333333 1. ]
|
|
|
|
mean value: 0.9271428571428572
|
|
|
|
key: train_fscore
|
|
value: [0.99212598 0.99212598 0.992 1. 0.99212598 1.
|
|
1. 1. 0.98387097 1. ]
|
|
|
|
mean value: 0.995224892049784
|
|
|
|
key: test_precision
|
|
value: [0.875 0.85714286 1. 0.85714286 1. 0.875
|
|
1. 0.85714286 1. 1. ]
|
|
|
|
mean value: 0.9321428571428572
|
|
|
|
key: train_precision
|
|
value: [0.984375 0.984375 1. 1. 0.984375 1. 1. 1.
|
|
1. 1. ]
|
|
|
|
mean value: 0.9953125
|
|
|
|
key: test_recall
|
|
value: [1. 0.85714286 1. 0.85714286 1. 1.
|
|
1. 0.85714286 0.71428571 1. ]
|
|
|
|
mean value: 0.9285714285714286
|
|
|
|
key: train_recall
|
|
value: [1. 1. 0.98412698 1. 1. 1.
|
|
1. 1. 0.96825397 1. ]
|
|
|
|
mean value: 0.9952380952380953
|
|
|
|
key: test_roc_auc
|
|
value: [0.92857143 0.85714286 1. 0.85714286 1. 0.92857143
|
|
1. 0.85714286 0.85714286 1. ]
|
|
|
|
mean value: 0.9285714285714286
|
|
|
|
key: train_roc_auc
|
|
value: [0.99206349 0.99206349 0.99206349 1. 0.99206349 1.
|
|
1. 1. 0.98412698 1. ]
|
|
|
|
mean value: 0.9952380952380953
|
|
|
|
key: test_jcc
|
|
value: [0.875 0.75 1. 0.75 1. 0.875
|
|
1. 0.75 0.71428571 1. ]
|
|
|
|
mean value: 0.8714285714285714
|
|
|
|
key: train_jcc
|
|
value: [0.984375 0.984375 0.98412698 1. 0.984375 1.
|
|
1. 1. 0.96825397 1. ]
|
|
|
|
mean value: 0.9905505952380952
|
|
|
|
MCC on Blind test: 1.0
|
|
|
|
Accuracy on Blind test: 1.0
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.04321051 0.04301333 0.04298568 0.04127693 0.04286027 0.03904319
|
|
0.0431571 0.04193687 0.04288578 0.04299068]
|
|
|
|
mean value: 0.042336034774780276
|
|
|
|
key: score_time
|
|
value: [0.01759887 0.02214241 0.02269411 0.02224588 0.02418971 0.02241874
|
|
0.02038097 0.02414989 0.01431775 0.02205825]
|
|
|
|
mean value: 0.021219658851623534
|
|
|
|
key: test_mcc
|
|
value: [0.1490712 0. 0.74535599 0.31622777 0.57735027 0.4472136
|
|
0.40824829 0.74535599 0.8660254 0.1490712 ]
|
|
|
|
mean value: 0.4403919706954555
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.57142857 0.5 0.85714286 0.64285714 0.78571429 0.71428571
|
|
0.64285714 0.85714286 0.92857143 0.57142857]
|
|
|
|
mean value: 0.7071428571428572
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.625 0.63157895 0.875 0.70588235 0.8 0.75
|
|
0.73684211 0.83333333 0.93333333 0.625 ]
|
|
|
|
mean value: 0.7515970072239422
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.55555556 0.5 0.77777778 0.6 0.75 0.66666667
|
|
0.58333333 1. 0.875 0.55555556]
|
|
|
|
mean value: 0.6863888888888889
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.85714286 1. 0.85714286 0.85714286 0.85714286
|
|
1. 0.71428571 1. 0.71428571]
|
|
|
|
mean value: 0.8571428571428571
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.57142857 0.5 0.85714286 0.64285714 0.78571429 0.71428571
|
|
0.64285714 0.85714286 0.92857143 0.57142857]
|
|
|
|
mean value: 0.7071428571428572
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.45454545 0.46153846 0.77777778 0.54545455 0.66666667 0.6
|
|
0.58333333 0.71428571 0.875 0.45454545]
|
|
|
|
mean value: 0.6133147408147408
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.25483227 0.27963281 0.24360347 0.20475602 0.24511337 0.2791121
|
|
0.24376607 0.28063536 0.28055573 0.24622631]
|
|
|
|
mean value: 0.25582334995269773
|
|
|
|
key: score_time
|
|
value: [0.00938106 0.00896335 0.00910473 0.00897121 0.00927019 0.0091157
|
|
0.00909758 0.00937986 0.00938296 0.0091424 ]
|
|
|
|
mean value: 0.009180903434753418
|
|
|
|
key: test_mcc
|
|
value: [1. 0.71428571 1. 0.71428571 1. 0.71428571
|
|
1. 0.71428571 0.8660254 1. ]
|
|
|
|
mean value: 0.8723168260927296
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.85714286 1. 0.85714286 1. 0.85714286
|
|
1. 0.85714286 0.92857143 1. ]
|
|
|
|
mean value: 0.9357142857142857
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [1. 0.85714286 1. 0.85714286 1. 0.85714286
|
|
1. 0.85714286 0.92307692 1. ]
|
|
|
|
mean value: 0.9351648351648352
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.85714286 1. 0.85714286 1. 0.85714286
|
|
1. 0.85714286 1. 1. ]
|
|
|
|
mean value: 0.9428571428571428
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.85714286 1. 0.85714286 1. 0.85714286
|
|
1. 0.85714286 0.85714286 1. ]
|
|
|
|
mean value: 0.9285714285714286
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.85714286 1. 0.85714286 1. 0.85714286
|
|
1. 0.85714286 0.92857143 1. ]
|
|
|
|
mean value: 0.9357142857142857
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [1. 0.75 1. 0.75 1. 0.75
|
|
1. 0.75 0.85714286 1. ]
|
|
|
|
mean value: 0.8857142857142857
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 1.0
|
|
|
|
Accuracy on Blind test: 1.0
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
|
|
_warn_prf(average, modifier, msg_start, len(result))
|
|
Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.01523948 0.01647639 0.0168469 0.01656938 0.01645827 0.01667857
|
|
0.01640034 0.01653886 0.01657081 0.01692367]
|
|
|
|
mean value: 0.01647026538848877
|
|
|
|
key: score_time
|
|
value: [0.0127914 0.01215816 0.01195741 0.01439118 0.01406884 0.01425385
|
|
0.01452756 0.01448226 0.0144906 0.01466036]
|
|
|
|
mean value: 0.013778162002563477
|
|
|
|
key: test_mcc
|
|
value: [0.40824829 0.63245553 0.63245553 0.52223297 0.31622777 0.31622777
|
|
0.17407766 0.40824829 0.40824829 0.28867513]
|
|
|
|
mean value: 0.41070972259102206
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.64285714 0.78571429 0.78571429 0.71428571 0.64285714 0.64285714
|
|
0.57142857 0.64285714 0.64285714 0.64285714]
|
|
|
|
mean value: 0.6714285714285715
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.73684211 0.72727273 0.82352941 0.77777778 0.54545455 0.70588235
|
|
0.66666667 0.73684211 0.73684211 0.66666667]
|
|
|
|
mean value: 0.712377646433374
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.58333333 1. 0.7 0.63636364 0.75 0.6
|
|
0.54545455 0.58333333 0.58333333 0.625 ]
|
|
|
|
mean value: 0.6606818181818181
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.57142857 1. 1. 0.42857143 0.85714286
|
|
0.85714286 1. 1. 0.71428571]
|
|
|
|
mean value: 0.8428571428571429
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.64285714 0.78571429 0.78571429 0.71428571 0.64285714 0.64285714
|
|
0.57142857 0.64285714 0.64285714 0.64285714]
|
|
|
|
mean value: 0.6714285714285714
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.58333333 0.57142857 0.7 0.63636364 0.375 0.54545455
|
|
0.5 0.58333333 0.58333333 0.5 ]
|
|
|
|
mean value: 0.5578246753246753
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.0
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02925706 0.03722405 0.02771807 0.0325644 0.03255486 0.03256893
|
|
0.02743649 0.03248382 0.03251648 0.03256011]
|
|
|
|
mean value: 0.03168842792510986
|
|
|
|
key: score_time
|
|
value: [0.02246881 0.02184463 0.02256989 0.02331829 0.01608825 0.02025676
|
|
0.02089095 0.02181077 0.0225234 0.02335548]
|
|
|
|
mean value: 0.021512722969055174
|
|
|
|
key: test_mcc
|
|
value: [0.71428571 0.4472136 0.8660254 0.74535599 0.8660254 0.63245553
|
|
0.8660254 0.71428571 0.8660254 0.71428571]
|
|
|
|
mean value: 0.7431983878028462
|
|
|
|
key: train_mcc
|
|
value: [0.96825397 1. 0.96825397 0.96825397 0.96825397 0.96825397
|
|
0.98425098 0.95250095 0.98425098 0.96825397]
|
|
|
|
mean value: 0.9730526730528191
|
|
|
|
key: test_accuracy
|
|
value: [0.85714286 0.71428571 0.92857143 0.85714286 0.92857143 0.78571429
|
|
0.92857143 0.85714286 0.92857143 0.85714286]
|
|
|
|
mean value: 0.8642857142857143
|
|
|
|
key: train_accuracy
|
|
value: [0.98412698 1. 0.98412698 0.98412698 0.98412698 0.98412698
|
|
0.99206349 0.97619048 0.99206349 0.98412698]
|
|
|
|
mean value: 0.9865079365079364
|
|
|
|
key: test_fscore
|
|
value: [0.85714286 0.75 0.93333333 0.875 0.93333333 0.82352941
|
|
0.93333333 0.85714286 0.92307692 0.85714286]
|
|
|
|
mean value: 0.8743034906270201
|
|
|
|
key: train_fscore
|
|
value: [0.98412698 1. 0.98412698 0.98412698 0.98412698 0.98412698
|
|
0.992 0.97637795 0.992 0.98412698]
|
|
|
|
mean value: 0.986513985751781
|
|
|
|
key: test_precision
|
|
value: [0.85714286 0.66666667 0.875 0.77777778 0.875 0.7
|
|
0.875 0.85714286 1. 0.85714286]
|
|
|
|
mean value: 0.8340873015873016
|
|
|
|
key: train_precision
|
|
value: [0.98412698 1. 0.98412698 0.98412698 0.98412698 0.98412698
|
|
1. 0.96875 1. 0.98412698]
|
|
|
|
mean value: 0.9873511904761905
|
|
|
|
key: test_recall
|
|
value: [0.85714286 0.85714286 1. 1. 1. 1.
|
|
1. 0.85714286 0.85714286 0.85714286]
|
|
|
|
mean value: 0.9285714285714286
|
|
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./gid_sl.py:128: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
smnc_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./gid_sl.py:131: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
smnc_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
key: train_recall
|
|
value: [0.98412698 1. 0.98412698 0.98412698 0.98412698 0.98412698
|
|
0.98412698 0.98412698 0.98412698 0.98412698]
|
|
|
|
mean value: 0.9857142857142857
|
|
|
|
key: test_roc_auc
|
|
value: [0.85714286 0.71428571 0.92857143 0.85714286 0.92857143 0.78571429
|
|
0.92857143 0.85714286 0.92857143 0.85714286]
|
|
|
|
mean value: 0.8642857142857143
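
Note on the values above: the test_roc_auc array is identical, fold by fold, to the test_accuracy array. One explanation, which is an inference and not something the log states, is that ROC AUC was computed from hard class predictions on class-balanced folds. In that case AUC reduces to balanced accuracy, which equals plain accuracy when both classes are the same size. The sketch below checks this for hypothetical fold-2 labels (7 positives and 7 negatives are assumed; the helper name auc_from_scores is illustrative).

```python
def auc_from_scores(y_true, y_score):
    """ROC AUC as the probability that a positive outranks a negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical fold: 7 positives (6 predicted positive),
# 7 negatives (3 predicted positive). These counts are assumed.
y_true = [1] * 7 + [0] * 7
y_pred = [1] * 6 + [0] + [1] * 3 + [0] * 4

auc = auc_from_scores(y_true, y_pred)  # hard labels used as scores
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

tpr = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1) / 7
tnr = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0) / 7
balanced_accuracy = (tpr + tnr) / 2

print(round(auc, 8), round(accuracy, 8), round(balanced_accuracy, 8))
```

If ranking quality is what matters, an AUC computed from continuous scores (decision function or class probabilities) would usually differ from accuracy.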
|
|
|
|
key: train_roc_auc
|
|
value: [0.98412698 1. 0.98412698 0.98412698 0.98412698 0.98412698
|
|
0.99206349 0.97619048 0.99206349 0.98412698]
|
|
|
|
mean value: 0.9865079365079366
|
|
|
|
key: test_jcc
|
|
value: [0.75 0.6 0.875 0.77777778 0.875 0.7
|
|
0.875 0.75 0.85714286 0.75 ]
|
|
|
|
mean value: 0.7809920634920635
|
|
|
|
key: train_jcc
|
|
value: [0.96875 1. 0.96875 0.96875 0.96875 0.96875
|
|
0.98412698 0.95384615 0.98412698 0.96875 ]
|
|
|
|
mean value: 0.9734600122100122
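
The per-fold scores printed above can be reproduced from confusion-matrix counts. The sketch below is a plain-Python check and not the script's own code. The counts are assumptions chosen to match the first two test folds (14 samples per fold, 7 per class), and fold_metrics is a hypothetical helper name.

```python
from math import sqrt


def fold_metrics(tp, fp, fn, tn):
    """Return the scores the log reports, from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "fscore": 2 * precision * recall / (precision + recall),
        # Jaccard index of the positive class
        "jcc": tp / (tp + fp + fn),
        # Matthews correlation coefficient
        "mcc": (tp * tn - fp * fn)
        / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }


# Assumed counts for fold 1: one error in each class.
fold1 = fold_metrics(tp=6, fp=1, fn=1, tn=6)
# Assumed counts for fold 2: one missed positive, three false positives.
fold2 = fold_metrics(tp=6, fp=3, fn=1, tn=4)

for name, scores in (("fold1", fold1), ("fold2", fold2)):
    print(name, {k: round(v, 8) for k, v in scores.items()})
```

With these counts, fold 1 gives mcc 0.71428571 and jcc 0.75, and fold 2 gives mcc 0.4472136, fscore 0.75 and jcc 0.6, matching the first two entries of the test_* arrays.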
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.26872206 0.24485159 0.19331622 0.19329 0.1993072 0.18969822
|
|
0.27916956 0.27131557 0.28865838 0.3392396 ]
|
|
|
|
mean value: 0.24675683975219725
|
|
|
|
key: score_time
|
|
value: [0.02235699 0.02243185 0.02210402 0.02205634 0.02375603 0.02349758
|
|
0.02142692 0.02302432 0.02361107 0.0212996 ]
|
|
|
|
mean value: 0.022556471824645995
|
|
|
|
key: test_mcc
|
|
value: [0.71428571 0.4472136 0.8660254 0.74535599 0.8660254 0.63245553
|
|
0.8660254 0.71428571 0.8660254 0.71428571]
|
|
|
|
mean value: 0.7431983878028462
|
|
|
|
key: train_mcc
|
|
value: [0.96825397 1. 0.96825397 0.96825397 0.96825397 0.96825397
|
|
0.98425098 0.95250095 0.98425098 0.96825397]
|
|
|
|
mean value: 0.9730526730528191
|
|
|
|
key: test_accuracy
|
|
value: [0.85714286 0.71428571 0.92857143 0.85714286 0.92857143 0.78571429
|
|
0.92857143 0.85714286 0.92857143 0.85714286]
|
|
|
|
mean value: 0.8642857142857143
|
|
|
|
key: train_accuracy
|
|
value: [0.98412698 1. 0.98412698 0.98412698 0.98412698 0.98412698
|
|
0.99206349 0.97619048 0.99206349 0.98412698]
|
|
|
|
mean value: 0.9865079365079364
|
|
|
|
key: test_fscore
|
|
value: [0.85714286 0.75 0.93333333 0.875 0.93333333 0.82352941
|
|
0.93333333 0.85714286 0.92307692 0.85714286]
|
|
|
|
mean value: 0.8743034906270201
|
|
|
|
key: train_fscore
|
|
value: [0.98412698 1. 0.98412698 0.98412698 0.98412698 0.98412698
|
|
0.992 0.97637795 0.992 0.98412698]
|
|
|
|
mean value: 0.986513985751781
|
|
|
|
key: test_precision
|
|
value: [0.85714286 0.66666667 0.875 0.77777778 0.875 0.7
|
|
0.875 0.85714286 1. 0.85714286]
|
|
|
|
mean value: 0.8340873015873016
|
|
|
|
key: train_precision
|
|
value: [0.98412698 1. 0.98412698 0.98412698 0.98412698 0.98412698
|
|
1. 0.96875 1. 0.98412698]
|
|
|
|
mean value: 0.9873511904761905
|
|
|
|
key: test_recall
|
|
value: [0.85714286 0.85714286 1. 1. 1. 1.
|
|
1. 0.85714286 0.85714286 0.85714286]
|
|
|
|
mean value: 0.9285714285714286
|
|
|
|
key: train_recall
|
|
value: [0.98412698 1. 0.98412698 0.98412698 0.98412698 0.98412698
|
|
0.98412698 0.98412698 0.98412698 0.98412698]
|
|
|
|
mean value: 0.9857142857142857
|
|
|
|
key: test_roc_auc
|
|
value: [0.85714286 0.71428571 0.92857143 0.85714286 0.92857143 0.78571429
|
|
0.92857143 0.85714286 0.92857143 0.85714286]
|
|
|
|
mean value: 0.8642857142857143
|
|
|
|
key: train_roc_auc
|
|
value: [0.98412698 1. 0.98412698 0.98412698 0.98412698 0.98412698
|
|
0.99206349 0.97619048 0.99206349 0.98412698]
|
|
|
|
mean value: 0.9865079365079366
|
|
|
|
key: test_jcc
|
|
value: [0.75 0.6 0.875 0.77777778 0.875 0.7
|
|
0.875 0.75 0.85714286 0.75 ]
|
|
|
|
mean value: 0.7809920634920635
|
|
|
|
key: train_jcc
|
|
value: [0.96875 1. 0.96875 0.96875 0.96875 0.96875
|
|
0.98412698 0.95384615 0.98412698 0.96875 ]
|
|
|
|
mean value: 0.9734600122100122
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02622414 0.0280211 0.02811193 0.0279851 0.02787256 0.0258975
|
|
0.02756739 0.03350663 0.02549291 0.02850413]
|
|
|
|
mean value: 0.027918338775634766
|
|
|
|
key: score_time
|
|
value: [0.01168752 0.01170516 0.01172113 0.01167536 0.01167822 0.01169229
|
|
0.01169801 0.01164389 0.01171398 0.01167274]
|
|
|
|
mean value: 0.011688828468322754
|
|
|
|
key: test_mcc
|
|
value: [0.74535599 0.42857143 0.52223297 0.57735027 0.57735027 0.8660254
|
|
0.63245553 0.4472136 0.57735027 0.8660254 ]
|
|
|
|
mean value: 0.623993113160984
|
|
|
|
key: train_mcc
|
|
value: [0.88900089 0.9369802 0.87345612 0.9047619 0.93650794 0.90659109
|
|
0.95250095 0.9047619 0.88900089 0.9047619 ]
|
|
|
|
mean value: 0.9098323803395101
|
|
|
|
key: test_accuracy
|
|
value: [0.85714286 0.71428571 0.71428571 0.78571429 0.78571429 0.92857143
|
|
0.78571429 0.71428571 0.78571429 0.92857143]
|
|
|
|
mean value: 0.8
|
|
|
|
key: train_accuracy
|
|
value: [0.94444444 0.96825397 0.93650794 0.95238095 0.96825397 0.95238095
|
|
0.97619048 0.95238095 0.94444444 0.95238095]
|
|
|
|
mean value: 0.9547619047619047
|
|
|
|
key: test_fscore
|
|
value: [0.83333333 0.71428571 0.77777778 0.8 0.76923077 0.93333333
|
|
0.82352941 0.66666667 0.8 0.93333333]
|
|
|
|
mean value: 0.8051490339725633
|
|
|
|
key: train_fscore
|
|
value: [0.94488189 0.96875 0.9375 0.95238095 0.96825397 0.95384615
|
|
0.97637795 0.95238095 0.94488189 0.95238095]
|
|
|
|
mean value: 0.9551634711526443
|
|
|
|
key: test_precision
|
|
value: [1. 0.71428571 0.63636364 0.75 0.83333333 0.875
|
|
0.7 0.8 0.75 0.875 ]
|
|
|
|
mean value: 0.7933982683982684
|
|
|
|
key: train_precision
|
|
value: [0.9375 0.95384615 0.92307692 0.95238095 0.96825397 0.92537313
|
|
0.96875 0.95238095 0.9375 0.95238095]
|
|
|
|
mean value: 0.947144303664826
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.71428571 1. 0.85714286 0.71428571 1.
|
|
1. 0.57142857 0.85714286 1. ]
|
|
|
|
mean value: 0.8428571428571429
|
|
|
|
key: train_recall
|
|
value: [0.95238095 0.98412698 0.95238095 0.95238095 0.96825397 0.98412698
|
|
0.98412698 0.95238095 0.95238095 0.95238095]
|
|
|
|
mean value: 0.9634920634920634
|
|
|
|
key: test_roc_auc
|
|
value: [0.85714286 0.71428571 0.71428571 0.78571429 0.78571429 0.92857143
|
|
0.78571429 0.71428571 0.78571429 0.92857143]
|
|
|
|
mean value: 0.8
|
|
|
|
key: train_roc_auc
|
|
value: [0.94444444 0.96825397 0.93650794 0.95238095 0.96825397 0.95238095
|
|
0.97619048 0.95238095 0.94444444 0.95238095]
|
|
|
|
mean value: 0.9547619047619047
|
|
|
|
key: test_jcc
|
|
value: [0.71428571 0.55555556 0.63636364 0.66666667 0.625 0.875
|
|
0.7 0.5 0.66666667 0.875 ]
|
|
|
|
mean value: 0.6814538239538239
|
|
|
|
key: train_jcc
|
|
value: [0.89552239 0.93939394 0.88235294 0.90909091 0.93846154 0.91176471
|
|
0.95384615 0.90909091 0.89552239 0.90909091]
|
|
|
|
mean value: 0.9144136782152585
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
|
|
_warn_prf(average, modifier, msg_start, len(result))
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.68336606 0.79824805 0.65278387 0.66261816 0.76574349 0.71132708
|
|
0.62404537 0.68007779 0.76493001 0.67316413]
|
|
|
|
mean value: 0.7016304016113282
|
|
|
|
key: score_time
|
|
value: [0.01466322 0.01438069 0.01444507 0.01473403 0.01458716 0.01554775
|
|
0.01198745 0.01623559 0.01437926 0.01435518]
|
|
|
|
mean value: 0.014531540870666503
|
|
|
|
key: test_mcc
|
|
value: [0.57735027 0.4472136 0.8660254 0.4472136 0.8660254 0.74535599
|
|
0.63245553 0.74535599 0.8660254 0.71428571]
|
|
|
|
mean value: 0.6907306902862108
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 0.98425098 1. 1. 1.
|
|
1. 0.98425098 0.98425098 0.98425098]
|
|
|
|
mean value: 0.9937003937005905
|
|
|
|
key: test_accuracy
|
|
value: [0.78571429 0.71428571 0.92857143 0.71428571 0.92857143 0.85714286
|
|
0.78571429 0.85714286 0.92857143 0.85714286]
|
|
|
|
mean value: 0.8357142857142857
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 0.99206349 1. 1. 1.
|
|
1. 0.99206349 0.99206349 0.99206349]
|
|
|
|
mean value: 0.9968253968253968
|
|
|
|
key: test_fscore
|
|
value: [0.76923077 0.75 0.93333333 0.75 0.93333333 0.875
|
|
0.82352941 0.875 0.93333333 0.85714286]
|
|
|
|
mean value: 0.8499903038138332
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 0.99212598 1. 1. 1.
|
|
1. 0.99212598 0.99212598 0.99212598]
|
|
|
|
mean value: 0.9968503937007874
|
|
|
|
key: test_precision
|
|
value: [0.83333333 0.66666667 0.875 0.66666667 0.875 0.77777778
|
|
0.7 0.77777778 0.875 0.85714286]
|
|
|
|
mean value: 0.7904365079365079
|
|
|
|
key: train_precision
|
|
value: [1. 1. 0.984375 1. 1. 1. 1. 0.984375
|
|
0.984375 0.984375]
|
|
|
|
mean value: 0.99375
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.85714286 1. 0.85714286 1. 1.
|
|
1. 1. 1. 0.85714286]
|
|
|
|
mean value: 0.9285714285714286
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.78571429 0.71428571 0.92857143 0.71428571 0.92857143 0.85714286
|
|
0.78571429 0.85714286 0.92857143 0.85714286]
|
|
|
|
mean value: 0.8357142857142857
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 0.99206349 1. 1. 1.
|
|
1. 0.99206349 0.99206349 0.99206349]
|
|
|
|
mean value: 0.9968253968253968
|
|
|
|
key: test_jcc
|
|
value: [0.625 0.6 0.875 0.6 0.875 0.77777778
|
|
0.7 0.77777778 0.875 0.75 ]
|
|
|
|
mean value: 0.7455555555555555
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 0.984375 1. 1. 1. 1. 0.984375
|
|
0.984375 0.984375]
|
|
|
|
mean value: 0.99375
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.0120337 0.00913382 0.00880861 0.00886416 0.00864506 0.00857353
|
|
0.00869036 0.00872898 0.00838923 0.00855565]
|
|
|
|
mean value: 0.009042310714721679
|
|
|
|
key: score_time
|
|
value: [0.01197982 0.00897551 0.00884962 0.00944877 0.00865412 0.00858283
|
|
0.00853062 0.00851512 0.00845766 0.00850487]
|
|
|
|
mean value: 0.00904989242553711
|
|
|
|
key: test_mcc
|
|
value: [ 0. -0.17407766 0. 0. 0.4472136 0.17407766
|
|
0.31622777 0. 0.14285714 0.17407766]
|
|
|
|
mean value: 0.10803761603296366
|
|
|
|
key: train_mcc
|
|
value: [0.37188796 0.43759497 0.40192095 0.52407367 0.42943789 0.51910855
|
|
0.41027734 0.59825454 0.52512211 0.40422604]
|
|
|
|
mean value: 0.46219040247551463
|
|
|
|
key: test_accuracy
|
|
value: [0.5 0.42857143 0.5 0.5 0.71428571 0.57142857
|
|
0.64285714 0.5 0.57142857 0.57142857]
|
|
|
|
mean value: 0.55
|
|
|
|
key: train_accuracy
|
|
value: [0.68253968 0.69047619 0.6984127 0.76190476 0.71428571 0.75396825
|
|
0.6984127 0.79365079 0.74603175 0.6984127 ]
|
|
|
|
mean value: 0.7238095238095238
|
|
|
|
key: test_fscore
|
|
value: [0.58823529 0.55555556 0.58823529 0.53333333 0.75 0.66666667
|
|
0.70588235 0. 0.57142857 0.66666667]
|
|
|
|
mean value: 0.5626003734827264
|
|
|
|
key: train_fscore
|
|
value: [0.71014493 0.75159236 0.72058824 0.75806452 0.72307692 0.77697842
|
|
0.73239437 0.77192982 0.78378378 0.72463768]
|
|
|
|
mean value: 0.7453191031692181
|
|
|
|
key: test_precision
|
|
value: [0.5 0.45454545 0.5 0.5 0.66666667 0.54545455
|
|
0.6 0. 0.57142857 0.54545455]
|
|
|
|
mean value: 0.4883549783549783
|
|
|
|
key: train_precision
|
|
value: [0.65333333 0.62765957 0.67123288 0.7704918 0.70149254 0.71052632
|
|
0.65822785 0.8627451 0.68235294 0.66666667]
|
|
|
|
mean value: 0.7004728994878962
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.71428571 0.71428571 0.57142857 0.85714286 0.85714286
|
|
0.85714286 0. 0.57142857 0.85714286]
|
|
|
|
mean value: 0.6714285714285714
|
|
|
|
key: train_recall
|
|
value: [0.77777778 0.93650794 0.77777778 0.74603175 0.74603175 0.85714286
|
|
0.82539683 0.6984127 0.92063492 0.79365079]
|
|
|
|
mean value: 0.807936507936508
|
|
|
|
key: test_roc_auc
|
|
value: [0.5 0.42857143 0.5 0.5 0.71428571 0.57142857
|
|
0.64285714 0.5 0.57142857 0.57142857]
|
|
|
|
mean value: 0.55
|
|
|
|
key: train_roc_auc
|
|
value: [0.68253968 0.69047619 0.6984127 0.76190476 0.71428571 0.75396825
|
|
0.6984127 0.79365079 0.74603175 0.6984127 ]
|
|
|
|
mean value: 0.7238095238095238
|
|
|
|
key: test_jcc
|
|
value: [0.41666667 0.38461538 0.41666667 0.36363636 0.6 0.5
|
|
0.54545455 0. 0.4 0.5 ]
|
|
|
|
mean value: 0.4127039627039627
|
|
|
|
key: train_jcc
|
|
value: [0.5505618 0.60204082 0.56321839 0.61038961 0.56626506 0.63529412
|
|
0.57777778 0.62857143 0.64444444 0.56818182]
|
|
|
|
mean value: 0.594674526213704
|
|
|
|
MCC on Blind test: 0.41
|
|
|
|
Accuracy on Blind test: 0.7
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=167)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', BernoulliNB())])

key: fit_time
value: [0.0091207 0.00891185 0.00888276 0.00885987 0.00969458 0.00874472
 0.00878119 0.0088098 0.00976038 0.00879478]

mean value: 0.009036064147949219

key: score_time
value: [0.00873804 0.00874662 0.00861692 0.00858283 0.00940084 0.00852609
 0.0085516 0.00952411 0.00927234 0.00853014]

mean value: 0.008848953247070312

key: test_mcc
value: [ 0.28867513 -0.1490712 0.4472136 0.31622777 0.4472136 0.8660254
 0.4472136 0.14285714 0.57735027 0.57735027]

mean value: 0.39610555736323716

key: train_mcc
value: [0.70276422 0.68811011 0.65610499 0.72345771 0.65610499 0.72011523
 0.67357531 0.76200076 0.70062273 0.71428571]

mean value: 0.6997141756428337

key: test_accuracy
value: [0.64285714 0.42857143 0.71428571 0.64285714 0.71428571 0.92857143
 0.71428571 0.57142857 0.78571429 0.78571429]

mean value: 0.6928571428571428

key: train_accuracy
value: [0.84920635 0.84126984 0.82539683 0.85714286 0.82539683 0.85714286
 0.83333333 0.88095238 0.84920635 0.85714286]

mean value: 0.8476190476190476

key: test_fscore
value: [0.61538462 0.33333333 0.75 0.70588235 0.66666667 0.93333333
 0.75 0.57142857 0.8 0.8 ]

mean value: 0.6926028873087696

key: train_fscore
value: [0.85714286 0.85074627 0.81355932 0.86764706 0.8358209 0.86567164
 0.84444444 0.88188976 0.85496183 0.85714286]

mean value: 0.8529026941398331

key: test_precision
value: [0.66666667 0.4 0.66666667 0.6 0.8 0.875
 0.66666667 0.57142857 0.75 0.75 ]

mean value: 0.6746428571428571

key: train_precision
value: [0.81428571 0.8028169 0.87272727 0.80821918 0.78873239 0.81690141
 0.79166667 0.875 0.82352941 0.85714286]

mean value: 0.825102180489476

key: test_recall
value: [0.57142857 0.28571429 0.85714286 0.85714286 0.57142857 1.
 0.85714286 0.57142857 0.85714286 0.85714286]

mean value: 0.7285714285714285

key: train_recall
value: [0.9047619 0.9047619 0.76190476 0.93650794 0.88888889 0.92063492
 0.9047619 0.88888889 0.88888889 0.85714286]

mean value: 0.8857142857142857

key: test_roc_auc
value: [0.64285714 0.42857143 0.71428571 0.64285714 0.71428571 0.92857143
 0.71428571 0.57142857 0.78571429 0.78571429]

mean value: 0.6928571428571428

key: train_roc_auc
value: [0.84920635 0.84126984 0.82539683 0.85714286 0.82539683 0.85714286
 0.83333333 0.88095238 0.84920635 0.85714286]

mean value: 0.8476190476190476

key: test_jcc
value: [0.44444444 0.2 0.6 0.54545455 0.5 0.875
 0.6 0.4 0.66666667 0.66666667]

mean value: 0.5498232323232323

key: train_jcc
value: [0.75 0.74025974 0.68571429 0.76623377 0.71794872 0.76315789
 0.73076923 0.78873239 0.74666667 0.75 ]

mean value: 0.7439482696695447

MCC on Blind test: 0.09

Accuracy on Blind test: 0.5

Model_name: K-Nearest Neighbors
Model func: KNeighborsClassifier()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=167)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', KNeighborsClassifier())])

key: fit_time
value: [0.00861692 0.00846934 0.0086894 0.00878334 0.00939131 0.00949144
 0.00942993 0.00903368 0.00895691 0.00950742]

mean value: 0.009036970138549805

key: score_time
value: [0.0095551 0.00964808 0.0103519 0.009969 0.01026988 0.01024032
 0.01040959 0.00944185 0.01325274 0.01023769]

mean value: 0.010337615013122558

key: test_mcc
value: [ 0.4472136 0.14285714 0.71428571 -0.28867513 0.42857143 0.57735027
 0.1490712 0.28867513 0. 0.4472136 ]

mean value: 0.2906562944403813

key: train_mcc
value: [0.6415003 0.68132997 0.60942528 0.65376533 0.54304508 0.63887656
 0.6032506 0.63692976 0.62029917 0.64482588]

mean value: 0.6273247932485556

key: test_accuracy
value: [0.71428571 0.57142857 0.85714286 0.35714286 0.71428571 0.78571429
 0.57142857 0.64285714 0.5 0.71428571]

mean value: 0.6428571428571429

key: train_accuracy
value: [0.81746032 0.83333333 0.8015873 0.82539683 0.76984127 0.81746032
 0.8015873 0.81746032 0.80952381 0.81746032]

mean value: 0.8111111111111111

key: test_fscore
value: [0.75 0.57142857 0.85714286 0.4 0.71428571 0.8
 0.625 0.61538462 0.46153846 0.66666667]

mean value: 0.6461446886446887

key: train_fscore
value: [0.82962963 0.84892086 0.81481481 0.81666667 0.78195489 0.82706767
 0.80314961 0.82442748 0.81538462 0.83211679]

mean value: 0.8194133021732467

key: test_precision
value: [0.66666667 0.57142857 0.85714286 0.375 0.71428571 0.75
 0.55555556 0.66666667 0.5 0.8 ]

mean value: 0.6456746031746031

key: train_precision
value: [0.77777778 0.77631579 0.76388889 0.85964912 0.74285714 0.78571429
 0.796875 0.79411765 0.79104478 0.77027027]

mean value: 0.7858510700967294

key: test_recall
value: [0.85714286 0.57142857 0.85714286 0.42857143 0.71428571 0.85714286
 0.71428571 0.57142857 0.42857143 0.57142857]

mean value: 0.6571428571428571

key: train_recall
value: [0.88888889 0.93650794 0.87301587 0.77777778 0.82539683 0.87301587
 0.80952381 0.85714286 0.84126984 0.9047619 ]

mean value: 0.8587301587301587

key: test_roc_auc
value: [0.71428571 0.57142857 0.85714286 0.35714286 0.71428571 0.78571429
 0.57142857 0.64285714 0.5 0.71428571]

mean value: 0.6428571428571429

key: train_roc_auc
value: [0.81746032 0.83333333 0.8015873 0.82539683 0.76984127 0.81746032
 0.8015873 0.81746032 0.80952381 0.81746032]

mean value: 0.8111111111111111

key: test_jcc
value: [0.6 0.4 0.75 0.25 0.55555556 0.66666667
 0.45454545 0.44444444 0.3 0.5 ]

mean value: 0.4921212121212121

key: train_jcc
value: [0.70886076 0.7375 0.6875 0.69014085 0.64197531 0.70512821
 0.67105263 0.7012987 0.68831169 0.7125 ]

mean value: 0.694426813952361

MCC on Blind test: -0.17

Accuracy on Blind test: 0.4

Model_name: SVM
Model func: SVC(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=167)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SVC(random_state=42))])

key: fit_time
value: [0.0100832 0.00976014 0.00987196 0.01112866 0.0098362 0.00980902
 0.00974154 0.00981092 0.00974679 0.00981164]

mean value: 0.009960007667541505

key: score_time
value: [0.00885224 0.00879025 0.00884819 0.00890374 0.00895476 0.00908732
 0.00925803 0.00901294 0.0088799 0.00916076]

mean value: 0.008974814414978027

key: test_mcc
value: [0.4472136 0.28867513 0.52223297 0.57735027 0.74535599 0.8660254
 0.63245553 0.4472136 0.4472136 0.63245553]

mean value: 0.5606191618503126

key: train_mcc
value: [0.81116045 0.89170166 0.85811633 0.84169408 0.81116045 0.84169408
 0.84511128 0.85985517 0.84169408 0.81322028]

mean value: 0.8415407870156413

key: test_accuracy
value: [0.71428571 0.64285714 0.71428571 0.78571429 0.85714286 0.92857143
 0.78571429 0.71428571 0.71428571 0.78571429]

mean value: 0.7642857142857142

key: train_accuracy
value: [0.9047619 0.94444444 0.92857143 0.92063492 0.9047619 0.92063492
 0.92063492 0.92857143 0.92063492 0.9047619 ]

mean value: 0.9198412698412698

key: test_fscore
value: [0.66666667 0.66666667 0.77777778 0.8 0.83333333 0.93333333
 0.82352941 0.66666667 0.75 0.82352941]

mean value: 0.7741503267973856

key: train_fscore
value: [0.90769231 0.94656489 0.93023256 0.921875 0.90769231 0.921875
 0.92424242 0.93129771 0.921875 0.90909091]

mean value: 0.922243810227733

key: test_precision
value: [0.8 0.625 0.63636364 0.75 1. 0.875
 0.7 0.8 0.66666667 0.7 ]

mean value: 0.7553030303030303

key: train_precision
value: [0.88059701 0.91176471 0.90909091 0.90769231 0.88059701 0.90769231
 0.88405797 0.89705882 0.90769231 0.86956522]

mean value: 0.895580857983614

key: test_recall
value: [0.57142857 0.71428571 1. 0.85714286 0.71428571 1.
 1. 0.57142857 0.85714286 1. ]

mean value: 0.8285714285714285

key: train_recall
value: [0.93650794 0.98412698 0.95238095 0.93650794 0.93650794 0.93650794
 0.96825397 0.96825397 0.93650794 0.95238095]

mean value: 0.9507936507936507

key: test_roc_auc
value: [0.71428571 0.64285714 0.71428571 0.78571429 0.85714286 0.92857143
 0.78571429 0.71428571 0.71428571 0.78571429]

mean value: 0.7642857142857143

key: train_roc_auc
value: [0.9047619 0.94444444 0.92857143 0.92063492 0.9047619 0.92063492
 0.92063492 0.92857143 0.92063492 0.9047619 ]

mean value: 0.9198412698412699

key: test_jcc
value: [0.5 0.5 0.63636364 0.66666667 0.71428571 0.875
 0.7 0.5 0.6 0.7 ]

mean value: 0.6392316017316018

key: train_jcc
value: [0.83098592 0.89855072 0.86956522 0.85507246 0.83098592 0.85507246
 0.85915493 0.87142857 0.85507246 0.83333333]

mean value: 0.8559221998658618

MCC on Blind test: 0.25

Accuracy on Blind test: 0.6

Model_name: MLP
Model func: MLPClassifier(max_iter=500, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=167)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MLPClassifier(max_iter=500, random_state=42))])

key: fit_time
value: [0.51826239 0.60860467 0.53846049 0.5424521 0.57185245 0.70954967
 0.54308391 0.59095526 0.54834795 0.65395856]

mean value: 0.5825527429580688

key: score_time
value: [0.01227689 0.01239133 0.01249623 0.01224232 0.01252151 0.01222205
 0.02427602 0.01235914 0.01278067 0.01255131]

mean value: 0.013611745834350587

key: test_mcc
value: [0.57735027 0.42857143 0.8660254 0.57735027 0.8660254 0.74535599
 0.31622777 0.57735027 0.8660254 0.57735027]

mean value: 0.6397632475200016

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.78571429 0.71428571 0.92857143 0.78571429 0.92857143 0.85714286
 0.64285714 0.78571429 0.92857143 0.78571429]

mean value: 0.8142857142857143

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.76923077 0.71428571 0.93333333 0.8 0.93333333 0.875
 0.70588235 0.76923077 0.93333333 0.8 ]

mean value: 0.823362960568843

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.83333333 0.71428571 0.875 0.75 0.875 0.77777778
 0.6 0.83333333 0.875 0.75 ]

mean value: 0.7883730158730159

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.71428571 0.71428571 1. 0.85714286 1. 1.
 0.85714286 0.71428571 1. 0.85714286]

mean value: 0.8714285714285714

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.78571429 0.71428571 0.92857143 0.78571429 0.92857143 0.85714286
 0.64285714 0.78571429 0.92857143 0.78571429]

mean value: 0.8142857142857144

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.625 0.55555556 0.875 0.66666667 0.875 0.77777778
 0.54545455 0.625 0.875 0.66666667]

mean value: 0.7087121212121212

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.58

Accuracy on Blind test: 0.8

Model_name: Decision Tree
Model func: DecisionTreeClassifier(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=167)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', DecisionTreeClassifier(random_state=42))])

key: fit_time
value: [0.01565766 0.01219702 0.01145101 0.01032972 0.01085424 0.01028109
 0.01057148 0.01089954 0.01067233 0.01110506]

mean value: 0.011401915550231933

key: score_time
value: [0.01228023 0.00890565 0.00876069 0.00860834 0.0085938 0.00857997
 0.0085113 0.00854468 0.00855422 0.00840044]

mean value: 0.008973932266235352

key: test_mcc
value: [0.8660254 0.71428571 1. 0.8660254 1. 0.8660254
 0.8660254 0.8660254 1. 1. ]

mean value: 0.9044412733207908

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.92857143 0.85714286 1. 0.92857143 1. 0.92857143
 0.92857143 0.92857143 1. 1. ]

mean value: 0.95

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.93333333 0.85714286 1. 0.92307692 1. 0.93333333
 0.93333333 0.93333333 1. 1. ]

mean value: 0.9513553113553114

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.875 0.85714286 1. 1. 1. 0.875
 0.875 0.875 1. 1. ]

mean value: 0.9357142857142857

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 0.85714286 1. 0.85714286 1. 1.
 1. 1. 1. 1. ]

mean value: 0.9714285714285714

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.92857143 0.85714286 1. 0.92857143 1. 0.92857143
 0.92857143 0.92857143 1. 1. ]

mean value: 0.9500000000000001

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.875 0.75 1. 0.85714286 1. 0.875
 0.875 0.875 1. 1. ]

mean value: 0.9107142857142857

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.67

Accuracy on Blind test: 0.8

Model_name: Extra Trees
Model func: ExtraTreesClassifier(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=167)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', ExtraTreesClassifier(random_state=42))])

key: fit_time
value: [0.08541059 0.08450389 0.08509231 0.08471632 0.08483028 0.08506131
 0.09252977 0.09162402 0.09251976 0.08942318]

mean value: 0.0875711441040039

key: score_time
value: [0.01690626 0.01708364 0.01702738 0.01695395 0.01703334 0.01681542
 0.01859069 0.01716471 0.01851535 0.01865768]

mean value: 0.017474842071533204

key: test_mcc
value: [0.74535599 0.57735027 0.8660254 0.63245553 0.71428571 0.74535599
 0.28867513 0.63245553 1. 0.8660254 ]

mean value: 0.7067984974706242

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.85714286 0.78571429 0.92857143 0.78571429 0.85714286 0.85714286
 0.64285714 0.78571429 1. 0.92857143]

mean value: 0.8428571428571429

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.83333333 0.8 0.93333333 0.82352941 0.85714286 0.875
 0.66666667 0.72727273 1. 0.92307692]

mean value: 0.8439355252590547

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [1. 0.75 0.875 0.7 0.85714286 0.77777778
 0.625 1. 1. 1. ]

mean value: 0.8584920634920635

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.71428571 0.85714286 1. 1. 0.85714286 1.
 0.71428571 0.57142857 1. 0.85714286]

mean value: 0.8571428571428571

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.85714286 0.78571429 0.92857143 0.78571429 0.85714286 0.85714286
 0.64285714 0.78571429 1. 0.92857143]

mean value: 0.8428571428571429

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.71428571 0.66666667 0.875 0.7 0.75 0.77777778
 0.5 0.57142857 1. 0.85714286]

mean value: 0.7412301587301587

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.8

Accuracy on Blind test: 0.9

Model_name: Extra Tree
Model func: ExtraTreeClassifier(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.00889421 0.00980115 0.00881743 0.00876904 0.00982475 0.00965667
|
|
0.00928593 0.00959969 0.00983977 0.00961208]
|
|
|
|
mean value: 0.00941007137298584
|
|
|
|
key: score_time
|
|
value: [0.00972748 0.00937581 0.00877976 0.00850892 0.00884318 0.00931907
|
|
0.00854588 0.00929189 0.00882339 0.00881791]
|
|
|
|
mean value: 0.009003329277038574
|
|
|
|
key: test_mcc
|
|
value: [0.42857143 0.2773501 1. 0.63245553 0.4472136 0.63245553
|
|
0.14285714 0.71428571 0.71428571 0.57735027]
|
|
|
|
mean value: 0.556682502686955
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.71428571 0.57142857 1. 0.78571429 0.71428571 0.78571429
|
|
0.57142857 0.85714286 0.85714286 0.78571429]
|
|
|
|
mean value: 0.7642857142857142
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.71428571 0.7 1. 0.82352941 0.75 0.82352941
|
|
0.57142857 0.85714286 0.85714286 0.8 ]
|
|
|
|
mean value: 0.7897058823529411
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.71428571 0.53846154 1. 0.7 0.66666667 0.7
|
|
0.57142857 0.85714286 0.85714286 0.75 ]
|
|
|
|
mean value: 0.7355128205128205
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.71428571 1. 1. 1. 0.85714286 1.
|
|
0.57142857 0.85714286 0.85714286 0.85714286]
|
|
|
|
mean value: 0.8714285714285714
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.71428571 0.57142857 1. 0.78571429 0.71428571 0.78571429
|
|
0.57142857 0.85714286 0.85714286 0.78571429]
|
|
|
|
mean value: 0.7642857142857143
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.55555556 0.53846154 1. 0.7 0.6 0.7
|
|
0.4 0.75 0.75 0.66666667]
|
|
|
|
mean value: 0.6660683760683761
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.36
|
|
|
|
Accuracy on Blind test: 0.7
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.14242721 1.15457511 1.12844539 1.08648467 1.08412099 1.09782338
|
|
1.10941672 1.12080956 1.16718316 1.16564441]
|
|
|
|
mean value: 1.1256930589675904
|
|
|
|
key: score_time
|
|
value: [0.0904541 0.09107041 0.08707428 0.08665013 0.08670735 0.09008956
|
|
0.09070063 0.09385657 0.09427905 0.09441352]
|
|
|
|
mean value: 0.09052956104278564
|
|
|
|
key: test_mcc
|
|
value: [0.74535599 0.4472136 1. 0.63245553 0.8660254 0.74535599
|
|
0.63245553 1. 1. 0.8660254 ]
|
|
|
|
mean value: 0.7934887452136047
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.85714286 0.71428571 1. 0.78571429 0.92857143 0.85714286
|
|
0.78571429 1. 1. 0.92857143]
|
|
|
|
mean value: 0.8857142857142857
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.83333333 0.75 1. 0.82352941 0.93333333 0.875
|
|
0.82352941 1. 1. 0.92307692]
|
|
|
|
mean value: 0.8961802413273001
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 0.66666667 1. 0.7 0.875 0.77777778
|
|
0.7 1. 1. 1. ]
|
|
|
|
mean value: 0.8719444444444444
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.85714286 1. 1. 1. 1.
|
|
1. 1. 1. 0.85714286]
|
|
|
|
mean value: 0.9428571428571428
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.85714286 0.71428571 1. 0.78571429 0.92857143 0.85714286
|
|
0.78571429 1. 1. 0.92857143]
|
|
|
|
mean value: 0.8857142857142857
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
|
|
value: [0.71428571 0.6 1. 0.7 0.875 0.77777778
|
|
0.7 1. 1. 0.85714286]
|
|
|
|
mean value: 0.8224206349206349
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000...05', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.8670609 0.91074848 0.91317868 0.92834425 0.89048648 0.88200378
|
|
0.89488196 0.90182328 0.87973428 0.86373258]
|
|
|
|
mean value: 0.8931994676589966
|
|
|
|
key: score_time
|
|
value: [0.22280693 0.19019127 0.17825437 0.22623491 0.11388898 0.20483541
|
|
0.16181397 0.23762107 0.23933649 0.2001493 ]
|
|
|
|
mean value: 0.19751327037811278
|
|
|
|
key: test_mcc
|
|
value: [0.74535599 0.57735027 1. 0.63245553 0.8660254 0.74535599
|
|
0.74535599 0.8660254 0.74535599 0.8660254 ]
|
|
|
|
mean value: 0.7789305982576338
|
|
|
|
key: train_mcc
|
|
value: [0.98425098 1. 0.98425098 0.98425098 0.98425098 0.96825397
|
|
0.98425098 0.96825397 0.96825397 0.98425098]
|
|
|
|
mean value: 0.9810267810270763
|
|
|
|
key: test_accuracy
|
|
value: [0.85714286 0.78571429 1. 0.78571429 0.92857143 0.85714286
|
|
0.85714286 0.92857143 0.85714286 0.92857143]
|
|
|
|
mean value: 0.8785714285714286
|
|
|
|
key: train_accuracy
|
|
value: [0.99206349 1. 0.99206349 0.99206349 0.99206349 0.98412698
|
|
0.99206349 0.98412698 0.98412698 0.99206349]
|
|
|
|
mean value: 0.9904761904761905
|
|
|
|
key: test_fscore
|
|
value: [0.83333333 0.8 1. 0.82352941 0.93333333 0.875
|
|
0.875 0.93333333 0.875 0.92307692]
|
|
|
|
mean value: 0.8871606334841629
|
|
|
|
key: train_fscore
|
|
value: [0.992 1. 0.992 0.992 0.992 0.98412698
|
|
0.992 0.98412698 0.98412698 0.992 ]
|
|
|
|
mean value: 0.9904380952380951
|
|
|
|
key: test_precision
|
|
value: [1. 0.75 1. 0.7 0.875 0.77777778
|
|
0.77777778 0.875 0.77777778 1. ]
|
|
|
|
mean value: 0.8533333333333333
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 0.98412698
|
|
1. 0.98412698 0.98412698 1. ]
|
|
|
|
mean value: 0.9952380952380953
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.85714286 1. 1. 1. 1.
|
|
1. 1. 1. 0.85714286]
|
|
|
|
mean value: 0.9428571428571428
|
|
|
|
key: train_recall
|
|
value: [0.98412698 1. 0.98412698 0.98412698 0.98412698 0.98412698
|
|
0.98412698 0.98412698 0.98412698 0.98412698]
|
|
|
|
mean value: 0.9857142857142857
|
|
|
|
key: test_roc_auc
|
|
value: [0.85714286 0.78571429 1. 0.78571429 0.92857143 0.85714286
|
|
0.85714286 0.92857143 0.85714286 0.92857143]
|
|
|
|
mean value: 0.8785714285714286
|
|
|
|
key: train_roc_auc
|
|
value: [0.99206349 1. 0.99206349 0.99206349 0.99206349 0.98412698
|
|
0.99206349 0.98412698 0.98412698 0.99206349]
|
|
|
|
mean value: 0.9904761904761905
|
|
|
|
key: test_jcc
|
|
value: [0.71428571 0.66666667 1. 0.7 0.875 0.77777778
|
|
0.77777778 0.875 0.77777778 0.85714286]
|
|
|
|
mean value: 0.8021428571428572
|
|
|
|
key: train_jcc
|
|
value: [0.98412698 1. 0.98412698 0.98412698 0.98412698 0.96875
|
|
0.98412698 0.96875 0.96875 0.98412698]
|
|
|
|
mean value: 0.9811011904761905
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.0122056 0.0089159 0.00880027 0.00890183 0.00872207 0.00890827
|
|
0.00943804 0.00870562 0.00884724 0.00920796]
|
|
|
|
mean value: 0.009265279769897461
|
|
|
|
key: score_time
|
|
value: [0.01347136 0.00859499 0.00892377 0.00857043 0.00881839 0.00870991
|
|
0.00861859 0.00867581 0.00872064 0.00905681]
|
|
|
|
mean value: 0.009216070175170898
|
|
|
|
key: test_mcc
|
|
value: [ 0.28867513 -0.1490712 0.4472136 0.31622777 0.4472136 0.8660254
|
|
0.4472136 0.14285714 0.57735027 0.57735027]
|
|
|
|
mean value: 0.39610555736323716
|
|
|
|
key: train_mcc
|
|
value: [0.70276422 0.68811011 0.65610499 0.72345771 0.65610499 0.72011523
|
|
0.67357531 0.76200076 0.70062273 0.71428571]
|
|
|
|
mean value: 0.6997141756428337
|
|
|
|
key: test_accuracy
|
|
value: [0.64285714 0.42857143 0.71428571 0.64285714 0.71428571 0.92857143
|
|
0.71428571 0.57142857 0.78571429 0.78571429]
|
|
|
|
mean value: 0.6928571428571428
|
|
|
|
key: train_accuracy
|
|
value: [0.84920635 0.84126984 0.82539683 0.85714286 0.82539683 0.85714286
|
|
0.83333333 0.88095238 0.84920635 0.85714286]
|
|
|
|
mean value: 0.8476190476190476
|
|
|
|
key: test_fscore
|
|
value: [0.61538462 0.33333333 0.75 0.70588235 0.66666667 0.93333333
|
|
0.75 0.57142857 0.8 0.8 ]
|
|
|
|
mean value: 0.6926028873087696
|
|
|
|
key: train_fscore
|
|
value: [0.85714286 0.85074627 0.81355932 0.86764706 0.8358209 0.86567164
|
|
0.84444444 0.88188976 0.85496183 0.85714286]
|
|
|
|
mean value: 0.8529026941398331
|
|
|
|
key: test_precision
|
|
value: [0.66666667 0.4 0.66666667 0.6 0.8 0.875
|
|
0.66666667 0.57142857 0.75 0.75 ]
|
|
|
|
mean value: 0.6746428571428571
|
|
|
|
key: train_precision
|
|
value: [0.81428571 0.8028169 0.87272727 0.80821918 0.78873239 0.81690141
|
|
0.79166667 0.875 0.82352941 0.85714286]
|
|
|
|
mean value: 0.825102180489476
|
|
|
|
key: test_recall
|
|
value: [0.57142857 0.28571429 0.85714286 0.85714286 0.57142857 1.
|
|
0.85714286 0.57142857 0.85714286 0.85714286]
|
|
|
|
mean value: 0.7285714285714285
|
|
|
|
key: train_recall
|
|
value: [0.9047619 0.9047619 0.76190476 0.93650794 0.88888889 0.92063492
|
|
0.9047619 0.88888889 0.88888889 0.85714286]
|
|
|
|
mean value: 0.8857142857142857
|
|
|
|
key: test_roc_auc
|
|
value: [0.64285714 0.42857143 0.71428571 0.64285714 0.71428571 0.92857143
|
|
0.71428571 0.57142857 0.78571429 0.78571429]
|
|
|
|
mean value: 0.6928571428571428
|
|
|
|
key: train_roc_auc
|
|
value: [0.84920635 0.84126984 0.82539683 0.85714286 0.82539683 0.85714286
|
|
0.83333333 0.88095238 0.84920635 0.85714286]
|
|
|
|
mean value: 0.8476190476190476
|
|
|
|
key: test_jcc
|
|
value: [0.44444444 0.2 0.6 0.54545455 0.5 0.875
|
|
0.6 0.4 0.66666667 0.66666667]
|
|
|
|
mean value: 0.5498232323232323
|
|
|
|
key: train_jcc
|
|
value: [0.75 0.74025974 0.68571429 0.76623377 0.71794872 0.76315789
|
|
0.73076923 0.78873239 0.74666667 0.75 ]
|
|
|
|
mean value: 0.7439482696695447
|
|
|
|
MCC on Blind test: 0.09
|
|
|
|
Accuracy on Blind test: 0.5
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.06620526 0.04390788 0.08739209 0.05115843 0.04113436 0.04162669
|
|
0.05210567 0.04808497 0.05309319 0.05471087]
|
|
|
|
mean value: 0.0539419412612915
|
|
|
|
key: score_time
|
|
value: [0.01065469 0.0107336 0.01117682 0.01046491 0.01068664 0.01112556
|
|
0.01121974 0.01058078 0.0113976 0.01165676]
|
|
|
|
mean value: 0.010969710350036622
|
|
|
|
key: test_mcc
|
|
value: [0.8660254 0.71428571 1. 0.8660254 1. 0.8660254
|
|
0.8660254 0.8660254 1. 1. ]
|
|
|
|
mean value: 0.9044412733207908
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.92857143 0.85714286 1. 0.92857143 1. 0.92857143
|
|
0.92857143 0.92857143 1. 1. ]
|
|
|
|
mean value: 0.95
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.93333333 0.85714286 1. 0.92307692 1. 0.93333333
|
|
0.93333333 0.93333333 1. 1. ]
|
|
|
|
mean value: 0.9513553113553114
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.875 0.85714286 1. 1. 1. 0.875
|
|
0.875 0.875 1. 1. ]
|
|
|
|
mean value: 0.9357142857142857
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.85714286 1. 0.85714286 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9714285714285714
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.92857143 0.85714286 1. 0.92857143 1. 0.92857143
|
|
0.92857143 0.92857143 1. 1. ]
|
|
|
|
mean value: 0.9500000000000001
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.875 0.75 1. 0.85714286 1. 0.875
|
|
0.875 0.875 1. 1. ]
|
|
|
|
mean value: 0.9107142857142857
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 1.0
|
|
|
|
Accuracy on Blind test: 1.0
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
                       n_estimators=1000, n_jobs=10, oob_score=True,
                       random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
              colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
              gamma=0, gpu_id=-1, importance_type=None,
              interaction_constraints='', learning_rate=0.300000012,
              max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
              monotone_constraints='()', n_estimators=100, n_jobs=12,
              num_parallel_tree=1, predictor='auto', random_state=42,
              reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
              tree_method='exact', use_label_encoder=False,
              validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_na_affinity', 'rsa',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=167)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.02775335 0.05391788 0.02138972 0.04810762 0.04875565 0.0398221
0.04783416 0.04795527 0.04840779 0.02121758]
|
|
|
|
mean value: 0.040516114234924315
|
|
|
|
key: score_time
|
|
value: [0.02279043 0.01194072 0.01184464 0.02045107 0.0215559 0.01637101
0.0214467 0.01568127 0.02402854 0.01179504]
|
|
|
|
mean value: 0.01779053211212158
|
|
|
|
key: test_mcc
|
|
value: [ 0.42857143 -0.31622777 0.4472136 0.4472136 0.17407766 0.57735027
0.71428571 0. 0.63245553 0.52223297]
|
|
|
|
mean value: 0.3627172992886314
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.71428571 0.35714286 0.71428571 0.71428571 0.57142857 0.78571429
0.85714286 0.5 0.78571429 0.71428571]
|
|
|
|
mean value: 0.6714285714285714
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.71428571 0.47058824 0.75 0.75 0.66666667 0.76923077
0.85714286 0.53333333 0.82352941 0.77777778]
|
|
|
|
mean value: 0.7112554765495942
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.71428571 0.4 0.66666667 0.66666667 0.54545455 0.83333333
0.85714286 0.5 0.7 0.63636364]
|
|
|
|
mean value: 0.651991341991342
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.57142857 0.85714286 0.85714286 0.85714286 0.71428571
0.85714286 0.57142857 1. 1. ]
|
|
|
|
mean value: 0.7999999999999999
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.71428571 0.35714286 0.71428571 0.71428571 0.57142857 0.78571429
0.85714286 0.5 0.78571429 0.71428571]
|
|
|
|
mean value: 0.6714285714285715
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.55555556 0.30769231 0.6 0.6 0.5 0.625
0.75 0.36363636 0.7 0.63636364]
|
|
|
|
mean value: 0.5638247863247863
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.1
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_na_affinity', 'rsa',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=167)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01432347 0.0088129 0.00857687 0.01004863 0.0096302 0.00874901
0.00938463 0.00930548 0.00913095 0.00847912]
|
|
|
|
mean value: 0.009644126892089844
|
|
|
|
key: score_time
|
|
value: [0.00970125 0.00875878 0.00872755 0.00919008 0.00907755 0.0087111
0.00946212 0.00932574 0.00867939 0.00881839]
|
|
|
|
mean value: 0.00904519557952881
|
|
|
|
key: test_mcc
|
|
value: [ 0.1490712 -0.31622777 0.1490712 0. 0.28867513 0.71428571
0.4472136 0. 0.31622777 0.4472136 ]
|
|
|
|
mean value: 0.2195530436880415
|
|
|
|
key: train_mcc
|
|
value: [0.42943789 0.39762767 0.38100038 0.45137812 0.4612481 0.42943789
0.38332594 0.5095438 0.42943789 0.39762767]
|
|
|
|
mean value: 0.42700653469334915
|
|
|
|
key: test_accuracy
|
|
value: [0.57142857 0.35714286 0.57142857 0.5 0.64285714 0.85714286
0.71428571 0.5 0.64285714 0.71428571]
|
|
|
|
mean value: 0.6071428571428571
|
|
|
|
key: train_accuracy
|
|
value: [0.71428571 0.6984127 0.69047619 0.72222222 0.73015873 0.71428571
0.69047619 0.75396825 0.71428571 0.6984127 ]
|
|
|
|
mean value: 0.7126984126984127
|
|
|
|
key: test_fscore
|
|
value: [0.5 0.47058824 0.625 0.53333333 0.61538462 0.85714286
0.75 0.22222222 0.54545455 0.75 ]
|
|
|
|
mean value: 0.5869125808831691
|
|
|
|
key: train_fscore
|
|
value: [0.72307692 0.70769231 0.69291339 0.74452555 0.73846154 0.72307692
0.70676692 0.74380165 0.72307692 0.70769231]
|
|
|
|
mean value: 0.7211084426534746
|
|
|
|
key: test_precision
|
|
value: [0.6 0.4 0.55555556 0.5 0.66666667 0.85714286
0.66666667 0.5 0.75 0.66666667]
|
|
|
|
mean value: 0.6162698412698413
|
|
|
|
key: train_precision
|
|
value: [0.70149254 0.68656716 0.6875 0.68918919 0.71641791 0.70149254
0.67142857 0.77586207 0.70149254 0.68656716]
|
|
|
|
mean value: 0.7018009680329547
|
|
|
|
key: test_recall
|
|
value: [0.42857143 0.57142857 0.71428571 0.57142857 0.57142857 0.85714286
0.85714286 0.14285714 0.42857143 0.85714286]
|
|
|
|
mean value: 0.6
|
|
|
|
key: train_recall
|
|
value: [0.74603175 0.73015873 0.6984127 0.80952381 0.76190476 0.74603175
0.74603175 0.71428571 0.74603175 0.73015873]
|
|
|
|
mean value: 0.7428571428571429
|
|
|
|
key: test_roc_auc
|
|
value: [0.57142857 0.35714286 0.57142857 0.5 0.64285714 0.85714286
0.71428571 0.5 0.64285714 0.71428571]
|
|
|
|
mean value: 0.6071428571428572
|
|
|
|
key: train_roc_auc
|
|
value: [0.71428571 0.6984127 0.69047619 0.72222222 0.73015873 0.71428571
0.69047619 0.75396825 0.71428571 0.6984127 ]
|
|
|
|
mean value: 0.7126984126984127
|
|
|
|
key: test_jcc
|
|
value: [0.33333333 0.30769231 0.45454545 0.36363636 0.44444444 0.75
0.6 0.125 0.375 0.6 ]
|
|
|
|
mean value: 0.43536519036519034
|
|
|
|
key: train_jcc
|
|
value: [0.56626506 0.54761905 0.53012048 0.59302326 0.58536585 0.56626506
0.54651163 0.59210526 0.56626506 0.54761905]
|
|
|
|
mean value: 0.564115975842606
|
|
|
|
MCC on Blind test: 0.82
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_na_affinity', 'rsa',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=167)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model',
                 PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01048803 0.01484013 0.01470304 0.01385427 0.01441598 0.01442957
0.01361561 0.01406598 0.01326013 0.01453805]
|
|
|
|
mean value: 0.013821077346801759
|
|
|
|
key: score_time
|
|
value: [0.00835991 0.01159859 0.01156473 0.01146817 0.01151705 0.01143265
0.01151991 0.01145792 0.01147008 0.01149678]
|
|
|
|
mean value: 0.011188578605651856
|
|
|
|
key: test_mcc
|
|
value: [0.74535599 0.28867513 0.63245553 0.52223297 0.8660254 0.74535599
0.71428571 0.71428571 0.71428571 0.74535599]
|
|
|
|
mean value: 0.6688314158636953
|
|
|
|
key: train_mcc
|
|
value: [0.9369802 0.98425098 0.96825397 0.73251987 0.9369802 0.86248336
0.95346259 0.9369802 0.84813571 0.79772404]
|
|
|
|
mean value: 0.8957771140682494
|
|
|
|
key: test_accuracy
|
|
value: [0.85714286 0.64285714 0.78571429 0.71428571 0.92857143 0.85714286
0.85714286 0.85714286 0.85714286 0.85714286]
|
|
|
|
mean value: 0.8214285714285714
|
|
|
|
key: train_accuracy
|
|
value: [0.96825397 0.99206349 0.98412698 0.84920635 0.96825397 0.92857143
0.97619048 0.96825397 0.92063492 0.88888889]
|
|
|
|
mean value: 0.9444444444444444
|
|
|
|
key: test_fscore
|
|
value: [0.83333333 0.66666667 0.82352941 0.77777778 0.93333333 0.875
0.85714286 0.85714286 0.85714286 0.875 ]
|
|
|
|
mean value: 0.8356069094304388
|
|
|
|
key: train_fscore
|
|
value: [0.96774194 0.99212598 0.98412698 0.86896552 0.96875 0.93233083
0.97560976 0.96774194 0.91525424 0.9 ]
|
|
|
|
mean value: 0.9472647177041439
|
|
|
|
key: test_precision
|
|
value: [1. 0.625 0.7 0.63636364 0.875 0.77777778
0.85714286 0.85714286 0.85714286 0.77777778]
|
|
|
|
mean value: 0.7963347763347763
|
|
|
|
key: train_precision
|
|
value: [0.98360656 0.984375 0.98412698 0.76829268 0.95384615 0.88571429
1. 0.98360656 0.98181818 0.81818182]
|
|
|
|
mean value: 0.9343568221368351
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.71428571 1. 1. 1. 1.
0.85714286 0.85714286 0.85714286 1. ]
|
|
|
|
mean value: 0.9
|
|
|
|
key: train_recall
|
|
value: [0.95238095 1. 0.98412698 1. 0.98412698 0.98412698
0.95238095 0.95238095 0.85714286 1. ]
|
|
|
|
mean value: 0.9666666666666667
|
|
|
|
key: test_roc_auc
|
|
value: [0.85714286 0.64285714 0.78571429 0.71428571 0.92857143 0.85714286
0.85714286 0.85714286 0.85714286 0.85714286]
|
|
|
|
mean value: 0.8214285714285715
|
|
|
|
key: train_roc_auc
|
|
value: [0.96825397 0.99206349 0.98412698 0.84920635 0.96825397 0.92857143
0.97619048 0.96825397 0.92063492 0.88888889]
|
|
|
|
mean value: 0.9444444444444444
|
|
|
|
key: test_jcc
|
|
value: [0.71428571 0.5 0.7 0.63636364 0.875 0.77777778
0.75 0.75 0.75 0.77777778]
|
|
|
|
mean value: 0.7231204906204907
|
|
|
|
key: train_jcc
|
|
value: [0.9375 0.984375 0.96875 0.76829268 0.93939394 0.87323944
0.95238095 0.9375 0.84375 0.81818182]
|
|
|
|
mean value: 0.9023363829503257
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_na_affinity', 'rsa',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=167)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01333761 0.01313019 0.01358414 0.01327586 0.01317477 0.01350617
0.01306081 0.01340628 0.0130887 0.0129261 ]
|
|
|
|
mean value: 0.013249063491821289
|
|
|
|
key: score_time
|
|
value: [0.01067972 0.01146817 0.01150608 0.01141 0.01147485 0.01149297
0.0115993 0.01143336 0.01155019 0.01152277]
|
|
|
|
mean value: 0.01141374111175537
|
|
|
|
key: test_mcc
|
|
value: [0.74535599 0.4472136 0.57735027 0.4472136 0.8660254 0.40824829
0.52223297 1. 0.31622777 0.8660254 ]
|
|
|
|
mean value: 0.6195893284606143
|
|
|
|
key: train_mcc
|
|
value: [0.90659109 1. 0.92354815 0.69451634 0.95250095 0.43437224
1. 0.96825397 0.81110711 0.9216805 ]
|
|
|
|
mean value: 0.8612570347645649
|
|
|
|
key: test_accuracy
|
|
value: [0.85714286 0.71428571 0.78571429 0.71428571 0.92857143 0.64285714
0.71428571 1. 0.64285714 0.92857143]
|
|
|
|
mean value: 0.7928571428571429
|
|
|
|
key: train_accuracy
|
|
value: [0.95238095 1. 0.96031746 0.82539683 0.97619048 0.65873016
1. 0.98412698 0.8968254 0.96031746]
|
|
|
|
mean value: 0.9214285714285714
|
|
|
|
key: test_fscore
|
|
value: [0.83333333 0.75 0.8 0.66666667 0.92307692 0.44444444
0.77777778 1. 0.54545455 0.93333333]
|
|
|
|
mean value: 0.7674087024087024
|
|
|
|
key: train_fscore
|
|
value: [0.95384615 1. 0.95867769 0.78846154 0.976 0.48192771
1. 0.98412698 0.88495575 0.96124031]
|
|
|
|
mean value: 0.8989236135518371
|
|
|
|
key: test_precision
|
|
value: [1. 0.66666667 0.75 0.8 1. 1.
0.63636364 1. 0.75 0.875 ]
|
|
|
|
mean value: 0.8478030303030303
|
|
|
|
key: train_precision
|
|
value: [0.92537313 1. 1. 1. 0.98387097 1.
1. 0.98412698 1. 0.93939394]
|
|
|
|
mean value: 0.9832765025591217
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.85714286 0.85714286 0.57142857 0.85714286 0.28571429
1. 1. 0.42857143 1. ]
|
|
|
|
mean value: 0.7571428571428571
|
|
|
|
key: train_recall
|
|
value: [0.98412698 1. 0.92063492 0.65079365 0.96825397 0.31746032
1. 0.98412698 0.79365079 0.98412698]
|
|
|
|
mean value: 0.8603174603174603
|
|
|
|
key: test_roc_auc
|
|
value: [0.85714286 0.71428571 0.78571429 0.71428571 0.92857143 0.64285714
0.71428571 1. 0.64285714 0.92857143]
|
|
|
|
mean value: 0.7928571428571429
|
|
|
|
key: train_roc_auc
|
|
value: [0.95238095 1. 0.96031746 0.82539683 0.97619048 0.65873016
1. 0.98412698 0.8968254 0.96031746]
|
|
|
|
mean value: 0.9214285714285715
|
|
|
|
key: test_jcc
|
|
value: [0.71428571 0.6 0.66666667 0.5 0.85714286 0.28571429
0.63636364 1. 0.375 0.875 ]
|
|
|
|
mean value: 0.651017316017316
|
|
|
|
key: train_jcc
|
|
value: [0.91176471 1. 0.92063492 0.65079365 0.953125 0.31746032
1. 0.96875 0.79365079 0.92537313]
|
|
|
|
mean value: 0.8441552522750394
|
|
|
|
MCC on Blind test: 0.41
|
|
|
|
Accuracy on Blind test: 0.7
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_na_affinity', 'rsa',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=167)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.10463905 0.09315062 0.09086013 0.09352469 0.09143782 0.0925982
0.09318709 0.09407496 0.09470487 0.09759974]
|
|
|
|
mean value: 0.0945777177810669
|
|
|
|
key: score_time
|
|
value: [0.01460838 0.015522 0.01450491 0.0146203 0.01513147 0.01525116
0.01474357 0.01484275 0.01491022 0.01478457]
|
|
|
|
mean value: 0.014891934394836426
|
|
|
|
key: test_mcc
|
|
value: [0.8660254 0.57735027 1. 0.8660254 1. 1.
0.8660254 0.8660254 1. 1. ]
|
|
|
|
mean value: 0.9041451884327381
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.92857143 0.78571429 1. 0.92857143 1. 1.
0.92857143 0.92857143 1. 1. ]
|
|
|
|
mean value: 0.95
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.93333333 0.8 1. 0.92307692 1. 1.
0.93333333 0.93333333 1. 1. ]
|
|
|
|
mean value: 0.9523076923076923
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.875 0.75 1. 1. 1. 1. 0.875 0.875 1. 1. ]
|
|
|
|
mean value: 0.9375
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.85714286 1. 0.85714286 1. 1.
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9714285714285714
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.92857143 0.78571429 1. 0.92857143 1. 1.
0.92857143 0.92857143 1. 1. ]
|
|
|
|
mean value: 0.9500000000000001
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.875 0.66666667 1. 0.85714286 1. 1.
0.875 0.875 1. 1. ]
|
|
|
|
mean value: 0.9148809523809524
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 1.0
|
|
|
|
Accuracy on Blind test: 1.0
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('num', MinMaxScaler(),
                                                  Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
       'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
       'mcsm_na_affinity', 'rsa',
       ...
       'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
       'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
      dtype='object', length=167)),
                                                 ('cat', OneHotEncoder(),
                                                  Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
       'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
      dtype='object'))])),
                ('model',
                 BaggingClassifier(n_jobs=10, oob_score=True,
                                   random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03174639 0.03029752 0.04592466 0.03525257 0.04778743 0.04697156
0.03519297 0.03099775 0.03000784 0.03634834]
|
|
|
|
mean value: 0.037052702903747556
|
|
|
|
key: score_time
|
|
value: [0.02208447 0.01594448 0.02029872 0.02367043 0.02881885 0.02223229
0.02290559 0.0230298 0.02333617 0.03644538]
|
|
|
|
mean value: 0.02387661933898926
|
|
|
|
key: test_mcc
|
|
value: [0.8660254 0.71428571 1. 0.8660254 1. 0.8660254
1. 0.8660254 1. 1. ]
|
|
|
|
mean value: 0.917838732942347
|
|
|
|
key: train_mcc
|
|
value: [0.98425098 1. 1. 1. 1. 1.
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9984250984251476
|
|
|
|
key: test_accuracy
|
|
value: [0.92857143 0.85714286 1. 0.92857143 1. 0.92857143
1. 0.92857143 1. 1. ]
|
|
|
|
mean value: 0.9571428571428572
|
|
|
|
key: train_accuracy
|
|
value: [0.99206349 1. 1. 1. 1. 1.
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9992063492063492
|
|
|
|
key: test_fscore
|
|
value: [0.93333333 0.85714286 1. 0.92307692 1. 0.93333333
1. 0.93333333 1. 1. ]
|
|
|
|
mean value: 0.958021978021978
|
|
|
|
key: train_fscore
|
|
value: [0.99212598 1. 1. 1. 1. 1.
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9992125984251968
|
|
|
|
key: test_precision
|
|
value: [0.875 0.85714286 1. 1. 1. 0.875
1. 0.875 1. 1. ]
|
|
|
|
mean value: 0.9482142857142857
|
|
|
|
key: train_precision
|
|
value: [0.984375 1. 1. 1. 1. 1. 1. 1.
1. 1. ]
|
|
|
|
mean value: 0.9984375
|
|
|
|
key: test_recall
|
|
value: [1. 0.85714286 1. 0.85714286 1. 1.
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9714285714285714
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.92857143 0.85714286 1. 0.92857143 1. 0.92857143
1. 0.92857143 1. 1. ]
|
|
|
|
mean value: 0.9571428571428572
|
|
|
|
key: train_roc_auc
|
|
value: [0.99206349 1. 1. 1. 1. 1.
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9992063492063492
|
|
|
|
key: test_jcc
|
|
value: [0.875 0.75 1. 0.85714286 1. 0.875
1. 0.875 1. 1. ]
|
|
|
|
mean value: 0.9232142857142858
|
|
|
|
key: train_jcc
|
|
value: [0.984375 1. 1. 1. 1. 1. 1. 1.
1. 1. ]
|
|
|
|
mean value: 0.9984375
|
|
|
|
MCC on Blind test: 1.0
|
|
|
|
Accuracy on Blind test: 1.0
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01603222 0.018996 0.01812768 0.01927161 0.04491544 0.03093719
|
|
0.01937819 0.02009773 0.04497838 0.04498601]
|
|
|
|
mean value: 0.027772045135498045
|
|
|
|
key: score_time
|
|
value: [0.01200271 0.01204848 0.01198554 0.01204681 0.01694942 0.01251435
|
|
0.01236963 0.01236773 0.02137852 0.02063894]
|
|
|
|
mean value: 0.01443021297454834
|
|
|
|
key: test_mcc
|
|
value: [0.74535599 0. 0.74535599 0.74535599 0.8660254 0.57735027
|
|
0.52223297 0.57735027 0.74535599 0.57735027]
|
|
|
|
mean value: 0.6101733149220129
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.85714286 0.5 0.85714286 0.85714286 0.92857143 0.78571429
|
|
0.71428571 0.78571429 0.85714286 0.78571429]
|
|
|
|
mean value: 0.7928571428571428
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.875 0.46153846 0.875 0.875 0.93333333 0.8
|
|
0.77777778 0.76923077 0.875 0.8 ]
|
|
|
|
mean value: 0.8041880341880342
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.77777778 0.5 0.77777778 0.77777778 0.875 0.75
|
|
0.63636364 0.83333333 0.77777778 0.75 ]
|
|
|
|
mean value: 0.7455808080808081
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.42857143 1. 1. 1. 0.85714286
|
|
1. 0.71428571 1. 0.85714286]
|
|
|
|
mean value: 0.8857142857142857
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.85714286 0.5 0.85714286 0.85714286 0.92857143 0.78571429
|
|
0.71428571 0.78571429 0.85714286 0.78571429]
|
|
|
|
mean value: 0.7928571428571429
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.77777778 0.3 0.77777778 0.77777778 0.875 0.66666667
|
|
0.63636364 0.625 0.77777778 0.66666667]
|
|
|
|
mean value: 0.6880808080808081
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.25
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.23829675 0.24501896 0.24500442 0.21486497 0.25001359 0.24446774
|
|
0.24729133 0.25173736 0.25330758 0.24612641]
|
|
|
|
mean value: 0.2436129093170166
|
|
|
|
key: score_time
|
|
value: [0.01009297 0.00921941 0.00917554 0.00900936 0.00912881 0.00920177
|
|
0.00989652 0.00910735 0.00983119 0.00928211]
|
|
|
|
mean value: 0.009394502639770508
|
|
|
|
key: test_mcc
|
|
value: [0.8660254 0.71428571 1. 0.8660254 1. 0.8660254
|
|
0.8660254 0.8660254 1. 1. ]
|
|
|
|
mean value: 0.9044412733207908
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.92857143 0.85714286 1. 0.92857143 1. 0.92857143
|
|
0.92857143 0.92857143 1. 1. ]
|
|
|
|
mean value: 0.95
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.93333333 0.85714286 1. 0.92307692 1. 0.93333333
|
|
0.93333333 0.93333333 1. 1. ]
|
|
|
|
mean value: 0.9513553113553114
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.875 0.85714286 1. 1. 1. 0.875
|
|
0.875 0.875 1. 1. ]
|
|
|
|
mean value: 0.9357142857142857
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.85714286 1. 0.85714286 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9714285714285714
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.92857143 0.85714286 1. 0.92857143 1. 0.92857143
|
|
0.92857143 0.92857143 1. 1. ]
|
|
|
|
mean value: 0.9500000000000001
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.875 0.75 1. 0.85714286 1. 0.875
|
|
0.875 0.875 1. 1. ]
|
|
|
|
mean value: 0.9107142857142857
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 1.0
|
|
|
|
Accuracy on Blind test: 1.0
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
|
|
_warn_prf(average, modifier, msg_start, len(result))
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.0148778 0.02256608 0.01641893 0.01648545 0.01651692 0.01666284
|
|
0.01663566 0.01669383 0.01656461 0.01718378]
|
|
|
|
mean value: 0.01706058979034424
|
|
|
|
key: score_time
|
|
value: [0.0122633 0.01214314 0.01200151 0.01435137 0.01409698 0.01446939
|
|
0.01430678 0.0145371 0.01454878 0.01470637]
|
|
|
|
mean value: 0.013742470741271972
|
|
|
|
key: test_mcc
|
|
value: [0.74535599 0.52223297 0.74535599 0.74535599 0.74535599 0.74535599
|
|
0.63245553 0.63245553 0.8660254 0.74535599]
|
|
|
|
mean value: 0.7125305390718464
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.85714286 0.71428571 0.85714286 0.85714286 0.85714286 0.85714286
|
|
0.78571429 0.78571429 0.92857143 0.85714286]
|
|
|
|
mean value: 0.8357142857142856
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.83333333 0.6 0.83333333 0.83333333 0.83333333 0.83333333
|
|
0.72727273 0.72727273 0.92307692 0.83333333]
|
|
|
|
mean value: 0.7977622377622378
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.42857143 0.71428571 0.71428571 0.71428571 0.71428571
|
|
0.57142857 0.57142857 0.85714286 0.71428571]
|
|
|
|
mean value: 0.6714285714285714
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.85714286 0.71428571 0.85714286 0.85714286 0.85714286 0.85714286
|
|
0.78571429 0.78571429 0.92857143 0.85714286]
|
|
|
|
mean value: 0.8357142857142857
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.71428571 0.42857143 0.71428571 0.71428571 0.71428571 0.71428571
|
|
0.57142857 0.57142857 0.85714286 0.71428571]
|
|
|
|
mean value: 0.6714285714285714
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.0
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02142644 0.01297259 0.01400614 0.01322484 0.02798653 0.03346753
|
|
0.03264356 0.03279185 0.0330112 0.03324366]
|
|
|
|
mean value: 0.02547743320465088
|
|
|
|
key: score_time
|
|
value: [0.01180696 0.01169729 0.01171327 0.01168156 0.02339411 0.02246428
|
|
0.02254462 0.02227879 0.02086663 0.0208807 ]
|
|
|
|
mean value: 0.017932820320129394
|
|
|
|
key: test_mcc
|
|
value: [0.57735027 0.4472136 0.8660254 0.57735027 0.8660254 0.74535599
|
|
0.4472136 0.42857143 1. 0.8660254 ]
|
|
|
|
mean value: 0.6821131361803842
|
|
|
|
key: train_mcc
|
|
value: [0.96825397 0.98425098 0.95250095 0.96825397 0.96825397 0.95250095
|
|
0.98425098 0.96825397 0.96825397 0.96825397]
|
|
|
|
mean value: 0.9683027683029619
|
|
|
|
key: test_accuracy
|
|
value: [0.78571429 0.71428571 0.92857143 0.78571429 0.92857143 0.85714286
|
|
0.71428571 0.71428571 1. 0.92857143]
|
|
|
|
mean value: 0.8357142857142857
|
|
|
|
key: train_accuracy
|
|
value: [0.98412698 0.99206349 0.97619048 0.98412698 0.98412698 0.97619048
|
|
0.99206349 0.98412698 0.98412698 0.98412698]
|
|
|
|
mean value: 0.9841269841269841
|
|
|
|
key: test_fscore
|
|
value: [0.76923077 0.75 0.93333333 0.8 0.93333333 0.875
|
|
0.75 0.71428571 1. 0.93333333]
|
|
|
|
mean value: 0.8458516483516484
|
|
|
|
key: train_fscore
|
|
value: [0.98412698 0.99212598 0.97637795 0.98412698 0.98412698 0.97637795
|
|
0.992 0.98412698 0.98412698 0.98412698]
|
|
|
|
mean value: 0.9841643794525684
|
|
|
|
key: test_precision
|
|
value: [0.83333333 0.66666667 0.875 0.75 0.875 0.77777778
|
|
0.66666667 0.71428571 1. 0.875 ]
|
|
|
|
mean value: 0.8033730158730159
|
|
|
|
key: train_precision
|
|
value: [0.98412698 0.984375 0.96875 0.98412698 0.98412698 0.96875
|
|
1. 0.98412698 0.98412698 0.98412698]
|
|
|
|
mean value: 0.9826636904761904
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.85714286 1. 0.85714286 1. 1.
|
|
0.85714286 0.71428571 1. 1. ]
|
|
|
|
mean value: 0.9
|
|
|
|
key: train_recall
|
|
value: [0.98412698 1. 0.98412698 0.98412698 0.98412698 0.98412698
|
|
0.98412698 0.98412698 0.98412698 0.98412698]
|
|
|
|
mean value: 0.9857142857142857
|
|
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./gid_sl.py:148: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
ros_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./gid_sl.py:151: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
ros_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
|
|
key: test_roc_auc
|
|
value: [0.78571429 0.71428571 0.92857143 0.78571429 0.92857143 0.85714286
|
|
0.71428571 0.71428571 1. 0.92857143]
|
|
|
|
mean value: 0.8357142857142857
|
|
|
|
key: train_roc_auc
|
|
value: [0.98412698 0.99206349 0.97619048 0.98412698 0.98412698 0.97619048
|
|
0.99206349 0.98412698 0.98412698 0.98412698]
|
|
|
|
mean value: 0.9841269841269842
|
|
|
|
key: test_jcc
|
|
value: [0.625 0.6 0.875 0.66666667 0.875 0.77777778
|
|
0.6 0.55555556 1. 0.875 ]
|
|
|
|
mean value: 0.745
|
|
|
|
key: train_jcc
|
|
value: [0.96875 0.984375 0.95384615 0.96875 0.96875 0.95384615
|
|
0.98412698 0.96875 0.96875 0.96875 ]
|
|
|
|
mean value: 0.9688694291819292
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.10278153 0.20039511 0.21716428 0.104352 0.16163802 0.16399074
|
|
0.20072842 0.2223134 0.3059268 0.212677 ]
|
|
|
|
mean value: 0.18919672966003417
|
|
|
|
key: score_time
|
|
value: [0.02314091 0.02282786 0.02277565 0.01210189 0.01189661 0.02072382
|
|
0.02222943 0.02230954 0.01674795 0.02327585]
|
|
|
|
mean value: 0.019802951812744142
|
|
|
|
key: test_mcc
|
|
value: [0.57735027 0.4472136 0.8660254 0.57735027 0.8660254 0.74535599
|
|
0.4472136 0.42857143 0.74535599 0.8660254 ]
|
|
|
|
mean value: 0.6566487354303772
|
|
|
|
key: train_mcc
|
|
value: [0.96825397 0.98425098 0.95250095 0.96825397 0.96825397 0.95250095
|
|
0.98425098 0.96825397 0.98425098 0.96825397]
|
|
|
|
mean value: 0.9699024699027128
|
|
|
|
key: test_accuracy
|
|
value: [0.78571429 0.71428571 0.92857143 0.78571429 0.92857143 0.85714286
|
|
0.71428571 0.71428571 0.85714286 0.92857143]
|
|
|
|
mean value: 0.8214285714285714
|
|
|
|
key: train_accuracy
|
|
value: [0.98412698 0.99206349 0.97619048 0.98412698 0.98412698 0.97619048
|
|
0.99206349 0.98412698 0.99206349 0.98412698]
|
|
|
|
mean value: 0.9849206349206349
|
|
|
|
key: test_fscore
|
|
value: [0.76923077 0.75 0.93333333 0.8 0.93333333 0.875
|
|
0.75 0.71428571 0.875 0.93333333]
|
|
|
|
mean value: 0.8333516483516483
|
|
|
|
key: train_fscore
|
|
value: [0.98412698 0.99212598 0.97637795 0.98412698 0.98412698 0.97637795
|
|
0.992 0.98412698 0.99212598 0.98412698]
|
|
|
|
mean value: 0.9849642794650668
|
|
|
|
key: test_precision
|
|
value: [0.83333333 0.66666667 0.875 0.75 0.875 0.77777778
|
|
0.66666667 0.71428571 0.77777778 0.875 ]
|
|
|
|
mean value: 0.7811507936507937
|
|
|
|
key: train_precision
|
|
value: [0.98412698 0.984375 0.96875 0.98412698 0.98412698 0.96875
|
|
1. 0.98412698 0.984375 0.98412698]
|
|
|
|
mean value: 0.9826884920634921
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.85714286 1. 0.85714286 1. 1.
|
|
0.85714286 0.71428571 1. 1. ]
|
|
|
|
mean value: 0.9
|
|
|
|
key: train_recall
|
|
value: [0.98412698 1. 0.98412698 0.98412698 0.98412698 0.98412698
|
|
0.98412698 0.98412698 1. 0.98412698]
|
|
|
|
mean value: 0.9873015873015872
|
|
|
|
key: test_roc_auc
|
|
value: [0.78571429 0.71428571 0.92857143 0.78571429 0.92857143 0.85714286
|
|
0.71428571 0.71428571 0.85714286 0.92857143]
|
|
|
|
mean value: 0.8214285714285715
|
|
|
|
key: train_roc_auc
|
|
value: [0.98412698 0.99206349 0.97619048 0.98412698 0.98412698 0.97619048
|
|
0.99206349 0.98412698 0.99206349 0.98412698]
|
|
|
|
mean value: 0.984920634920635
|
|
|
|
key: test_jcc
|
|
value: [0.625 0.6 0.875 0.66666667 0.875 0.77777778
|
|
0.6 0.55555556 0.77777778 0.875 ]
|
|
|
|
mean value: 0.7227777777777777
|
|
|
|
key: train_jcc
|
|
value: [0.96875 0.984375 0.95384615 0.96875 0.96875 0.95384615
|
|
0.98412698 0.96875 0.984375 0.96875 ]
|
|
|
|
mean value: 0.9704319291819292
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Logistic Regression
|
|
Model func: LogisticRegression(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=167)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', LogisticRegression(random_state=42))])

key: fit_time
value: [0.02898908 0.02259541 0.02249479 0.02232766 0.02245951 0.02295685
0.02357912 0.02160358 0.03005314 0.02351069]

mean value: 0.02405698299407959

key: score_time
value: [0.01279664 0.01184416 0.01169562 0.01201034 0.0119729 0.01210237
0.01197553 0.02426434 0.01212287 0.01198077]

mean value: 0.0132765531539917

key: test_mcc
value: [1. 0.77459667 0.25819889 0.57735027 0. 0.77459667
0.77459667 0.5 0.09128709 0.73029674]

mean value: 0.5480923002918986

key: train_mcc
value: [0.94285714 0.91465912 0.91465912 0.91465912 0.91766294 0.94285714
0.97182532 0.94285714 0.91587302 0.88730159]

mean value: 0.9265211645315969

key: test_accuracy
value: [1. 0.875 0.625 0.75 0.5 0.875
0.875 0.75 0.57142857 0.85714286]

mean value: 0.7678571428571428

key: train_accuracy
value: [0.97142857 0.95714286 0.95714286 0.95714286 0.95714286 0.97142857
0.98571429 0.97142857 0.95774648 0.94366197]

mean value: 0.9629979879275654

key: test_fscore
value: [1. 0.85714286 0.57142857 0.8 0.5 0.85714286
0.88888889 0.75 0.4 0.88888889]

mean value: 0.7513492063492063

key: train_fscore
value: [0.97142857 0.95774648 0.95652174 0.95652174 0.95522388 0.97142857
0.98550725 0.97142857 0.95774648 0.94285714]

mean value: 0.9626410420124032

key: test_precision
value: [1. 1. 0.66666667 0.66666667 0.5 1.
0.8 0.75 0.5 0.8 ]

mean value: 0.7683333333333333

key: train_precision
value: [0.97142857 0.94444444 0.97058824 0.97058824 1. 0.97142857
1. 0.97142857 0.97142857 0.94285714]

mean value: 0.9714192343604108

key: test_recall
value: [1. 0.75 0.5 1. 0.5 0.75
1. 0.75 0.33333333 1. ]

mean value: 0.7583333333333333

key: train_recall
value: [0.97142857 0.97142857 0.94285714 0.94285714 0.91428571 0.97142857
0.97142857 0.97142857 0.94444444 0.94285714]

mean value: 0.9544444444444444

key: test_roc_auc
value: [1. 0.875 0.625 0.75 0.5 0.875
0.875 0.75 0.54166667 0.83333333]

mean value: 0.7625

key: train_roc_auc
value: [0.97142857 0.95714286 0.95714286 0.95714286 0.95714286 0.97142857
0.98571429 0.97142857 0.95793651 0.94365079]

mean value: 0.9630158730158731

key: test_jcc
value: [1. 0.75 0.4 0.66666667 0.33333333 0.75
0.8 0.6 0.25 0.8 ]

mean value: 0.635

key: train_jcc
value: [0.94444444 0.91891892 0.91666667 0.91666667 0.91428571 0.94444444
0.97142857 0.94444444 0.91891892 0.89189189]

mean value: 0.9282110682110682

MCC on Blind test: 0.41

Accuracy on Blind test: 0.7

Model_name: Logistic RegressionCV
Model func: LogisticRegressionCV(random_state=42)
Running model pipeline:
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.52006674 0.63721442 0.58264375 0.52532816 0.55562091 0.57994127
|
|
0.5116756 0.49916434 0.53129792 0.70804286]
|
|
|
|
mean value: 0.5650995969772339
|
|
|
|
key: score_time
|
|
value: [0.01221442 0.01496315 0.01473856 0.01227617 0.01471972 0.01287651
|
|
0.01455784 0.01221538 0.01225281 0.01236057]
|
|
|
|
mean value: 0.013317513465881347
|
|
|
|
key: test_mcc
|
|
value: [1. 0.5 0.25819889 0.25819889 0.25819889 0.77459667
|
|
0.77459667 0.5 0.41666667 0.73029674]
|
|
|
|
mean value: 0.5470753417731338
|
|
|
|
key: train_mcc
|
|
value: [0.94285714 1. 1. 0.94285714 1. 0.94285714
|
|
1. 0.94285714 0.97220047 1. ]
|
|
|
|
mean value: 0.9743629036957027
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.75 0.625 0.625 0.625 0.875
|
|
0.875 0.75 0.71428571 0.85714286]
|
|
|
|
mean value: 0.7696428571428572
|
|
|
|
key: train_accuracy
|
|
value: [0.97142857 1. 1. 0.97142857 1. 0.97142857
|
|
1. 0.97142857 0.98591549 1. ]
|
|
|
|
mean value: 0.9871629778672032
|
|
|
|
key: test_fscore
|
|
value: [1. 0.75 0.57142857 0.66666667 0.66666667 0.85714286
|
|
0.88888889 0.75 0.66666667 0.88888889]
|
|
|
|
mean value: 0.7706349206349206
|
|
|
|
key: train_fscore
|
|
value: [0.97142857 1. 1. 0.97142857 1. 0.97142857
|
|
1. 0.97142857 0.98630137 1. ]
|
|
|
|
mean value: 0.98720156555773
|
|
|
|
key: test_precision
|
|
value: [1. 0.75 0.66666667 0.6 0.6 1.
|
|
0.8 0.75 0.66666667 0.8 ]
|
|
|
|
mean value: 0.7633333333333333
|
|
|
|
key: train_precision
|
|
value: [0.97142857 1. 1. 0.97142857 1. 0.97142857
|
|
1. 0.97142857 0.97297297 1. ]
|
|
|
|
mean value: 0.9858687258687259
|
|
|
|
key: test_recall
|
|
value: [1. 0.75 0.5 0.75 0.75 0.75
|
|
1. 0.75 0.66666667 1. ]
|
|
|
|
mean value: 0.7916666666666666
|
|
|
|
key: train_recall
|
|
value: [0.97142857 1. 1. 0.97142857 1. 0.97142857
|
|
1. 0.97142857 1. 1. ]
|
|
|
|
mean value: 0.9885714285714285
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.75 0.625 0.625 0.625 0.875
|
|
0.875 0.75 0.70833333 0.83333333]
|
|
|
|
mean value: 0.7666666666666667
|
|
|
|
key: train_roc_auc
|
|
value: [0.97142857 1. 1. 0.97142857 1. 0.97142857
|
|
1. 0.97142857 0.98571429 1. ]
|
|
|
|
mean value: 0.9871428571428572
|
|
|
|
key: test_jcc
|
|
value: [1. 0.6 0.4 0.5 0.5 0.75 0.8 0.6 0.5 0.8 ]
|
|
|
|
mean value: 0.645
|
|
|
|
key: train_jcc
|
|
value: [0.94444444 1. 1. 0.94444444 1. 0.94444444
|
|
1. 0.94444444 0.97297297 1. ]
|
|
|
|
mean value: 0.9750750750750751
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01222157 0.01113677 0.00878263 0.00857139 0.00830865 0.00838494
|
|
0.00841308 0.00853992 0.00893879 0.00886178]
|
|
|
|
mean value: 0.009215950965881348
|
|
|
|
key: score_time
|
|
value: [0.01200318 0.00914693 0.00892782 0.00853491 0.00864744 0.00857234
|
|
0.00861883 0.00864673 0.00854182 0.00876498]
|
|
|
|
mean value: 0.009040498733520507
|
|
|
|
key: test_mcc
|
|
value: [ 0.25819889 0.25819889 0.25819889 0. -0.37796447 0.77459667
|
|
-0.25819889 0. -0.09128709 0.73029674]
|
|
|
|
mean value: 0.15520396261492722
|
|
|
|
key: train_mcc
|
|
value: [0.62882815 0.48154341 0.57166195 0.69954392 0.60395717 0.80829038
|
|
0.46188022 0.73370909 0.71961897 0.63383658]
|
|
|
|
mean value: 0.6342869837555468
|
|
|
|
key: test_accuracy
|
|
value: [0.625 0.625 0.625 0.5 0.375 0.875
|
|
0.375 0.5 0.42857143 0.85714286]
|
|
|
|
mean value: 0.5785714285714285
|
|
|
|
key: train_accuracy
|
|
value: [0.81428571 0.72857143 0.78571429 0.82857143 0.8 0.9
|
|
0.72857143 0.85714286 0.85915493 0.81690141]
|
|
|
|
mean value: 0.8118913480885311
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.66666667 0.57142857 0.5 0.54545455 0.88888889
|
|
0.44444444 0.33333333 0.5 0.88888889]
|
|
|
|
mean value: 0.6005772005772005
|
|
|
|
key: train_fscore
|
|
value: [0.8115942 0.7654321 0.78873239 0.79310345 0.78787879 0.89230769
|
|
0.74666667 0.83870968 0.85714286 0.8115942 ]
|
|
|
|
mean value: 0.8093162028619951
|
|
|
|
key: test_precision
|
|
value: [0.6 0.6 0.66666667 0.5 0.42857143 0.8
|
|
0.4 0.5 0.4 0.8 ]
|
|
|
|
mean value: 0.5695238095238095
|
|
|
|
key: train_precision
|
|
value: [0.82352941 0.67391304 0.77777778 1. 0.83870968 0.96666667
|
|
0.7 0.96296296 0.88235294 0.82352941]
|
|
|
|
mean value: 0.8449441893010905
|
|
|
|
key: test_recall
|
|
value: [0.75 0.75 0.5 0.5 0.75 1.
|
|
0.5 0.25 0.66666667 1. ]
|
|
|
|
mean value: 0.6666666666666666
|
|
|
|
key: train_recall
|
|
value: [0.8 0.88571429 0.8 0.65714286 0.74285714 0.82857143
|
|
0.8 0.74285714 0.83333333 0.8 ]
|
|
|
|
mean value: 0.7890476190476191
|
|
|
|
key: test_roc_auc
|
|
value: [0.625 0.625 0.625 0.5 0.375 0.875
|
|
0.375 0.5 0.45833333 0.83333333]
|
|
|
|
mean value: 0.5791666666666667
|
|
|
|
key: train_roc_auc
|
|
value: [0.81428571 0.72857143 0.78571429 0.82857143 0.8 0.9
|
|
0.72857143 0.85714286 0.85952381 0.81666667]
|
|
|
|
mean value: 0.8119047619047619
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.5 0.4 0.33333333 0.375 0.8
|
|
0.28571429 0.2 0.33333333 0.8 ]
|
|
|
|
mean value: 0.4527380952380953
|
|
|
|
key: train_jcc
|
|
value: [0.68292683 0.62 0.65116279 0.65714286 0.65 0.80555556
|
|
0.59574468 0.72222222 0.75 0.68292683]
|
|
|
|
mean value: 0.6817681765005958
|
|
|
|
MCC on Blind test: 0.82
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.0088594 0.00871515 0.00853276 0.00873947 0.00854588 0.00868845
|
|
0.00872922 0.00873923 0.00860667 0.00853539]
|
|
|
|
mean value: 0.008669161796569824
|
|
|
|
key: score_time
|
|
value: [0.00854993 0.00863361 0.00863743 0.00856376 0.00851822 0.00865054
|
|
0.00859189 0.00869846 0.00846481 0.00855732]
|
|
|
|
mean value: 0.008586597442626954
|
|
|
|
key: test_mcc
|
|
value: [ 0. 0. 0. 0.37796447 0.25819889 0.5
|
|
0. 0. -0.16666667 0.75 ]
|
|
|
|
mean value: 0.17194966960897218
|
|
|
|
key: train_mcc
|
|
value: [0.6350853 0.69282032 0.6614769 0.71545476 0.69282032 0.78301997
|
|
0.63089327 0.69282032 0.67079854 0.60683101]
|
|
|
|
mean value: 0.6782020711253509
|
|
|
|
key: test_accuracy
|
|
value: [0.5 0.5 0.5 0.625 0.625 0.75
|
|
0.5 0.5 0.42857143 0.85714286]
|
|
|
|
mean value: 0.5785714285714285
|
|
|
|
key: train_accuracy
|
|
value: [0.81428571 0.84285714 0.82857143 0.85714286 0.84285714 0.88571429
|
|
0.81428571 0.84285714 0.83098592 0.8028169 ]
|
|
|
|
mean value: 0.8362374245472837
|
|
|
|
key: test_fscore
|
|
value: [0.6 0.6 0.33333333 0.72727273 0.66666667 0.75
|
|
0.6 0.5 0.33333333 0.85714286]
|
|
|
|
mean value: 0.5967748917748917
|
|
|
|
key: train_fscore
|
|
value: [0.82666667 0.85333333 0.83783784 0.86111111 0.85333333 0.89473684
|
|
0.82191781 0.85333333 0.84615385 0.80555556]
|
|
|
|
mean value: 0.8453979667649458
|
|
|
|
key: test_precision
|
|
value: [0.5 0.5 0.5 0.57142857 0.6 0.75
|
|
0.5 0.5 0.33333333 1. ]
|
|
|
|
mean value: 0.5754761904761905
|
|
|
|
key: train_precision
|
|
value: [0.775 0.8 0.79487179 0.83783784 0.8 0.82926829
|
|
0.78947368 0.8 0.78571429 0.78378378]
|
|
|
|
mean value: 0.7995949679101155
|
|
|
|
key: test_recall
|
|
value: [0.75 0.75 0.25 1. 0.75 0.75
|
|
0.75 0.5 0.33333333 0.75 ]
|
|
|
|
mean value: 0.6583333333333333
|
|
|
|
key: train_recall
|
|
value: [0.88571429 0.91428571 0.88571429 0.88571429 0.91428571 0.97142857
|
|
0.85714286 0.91428571 0.91666667 0.82857143]
|
|
|
|
mean value: 0.8973809523809524
|
|
|
|
key: test_roc_auc
|
|
value: [0.5 0.5 0.5 0.625 0.625 0.75
|
|
0.5 0.5 0.41666667 0.875 ]
|
|
|
|
mean value: 0.5791666666666666
|
|
|
|
key: train_roc_auc
|
|
value: [0.81428571 0.84285714 0.82857143 0.85714286 0.84285714 0.88571429
|
|
0.81428571 0.84285714 0.8297619 0.8031746 ]
|
|
|
|
mean value: 0.8361507936507937
|
|
|
|
key: test_jcc
|
|
value: [0.42857143 0.42857143 0.2 0.57142857 0.5 0.6
|
|
0.42857143 0.33333333 0.2 0.75 ]
|
|
|
|
mean value: 0.444047619047619
|
|
|
|
key: train_jcc
|
|
value: [0.70454545 0.74418605 0.72093023 0.75609756 0.74418605 0.80952381
|
|
0.69767442 0.74418605 0.73333333 0.6744186 ]
|
|
|
|
mean value: 0.7329081553727044
|
|
|
|
MCC on Blind test: 0.09
|
|
|
|
Accuracy on Blind test: 0.5
|
|
|
|
Model_name: K-Nearest Neighbors
|
|
Model func: KNeighborsClassifier()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', KNeighborsClassifier())])
|
|
|
|
key: fit_time
|
|
value: [0.00916004 0.009094 0.0092001 0.00905657 0.00823712 0.00933886
|
|
0.00938821 0.00961804 0.00989199 0.00914526]
|
|
|
|
mean value: 0.009213018417358398
|
|
|
|
key: score_time
|
|
value: [0.00972176 0.00999951 0.0101099 0.00988507 0.00962853 0.00964975
|
|
0.01003718 0.00978231 0.00975752 0.00999331]
|
|
|
|
mean value: 0.009856486320495605
|
|
|
|
key: test_mcc
|
|
value: [0.25819889 0.25819889 0.25819889 0.25819889 0.25819889 0.5
|
|
0.25819889 0.37796447 0.75 0.41666667]
|
|
|
|
mean value: 0.359382447815886
|
|
|
|
key: train_mcc
|
|
value: [0.43139798 0.57166195 0.54374562 0.45883147 0.54374562 0.48891771
|
|
0.6 0.48650924 0.49285714 0.54972312]
|
|
|
|
mean value: 0.516738984896947
|
|
|
|
key: test_accuracy
|
|
value: [0.625 0.625 0.625 0.625 0.625 0.75
|
|
0.625 0.625 0.85714286 0.71428571]
|
|
|
|
mean value: 0.6696428571428571
|
|
|
|
key: train_accuracy
|
|
value: [0.71428571 0.78571429 0.77142857 0.72857143 0.77142857 0.74285714
|
|
0.8 0.74285714 0.74647887 0.77464789]
|
|
|
|
mean value: 0.7578269617706237
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.66666667 0.57142857 0.57142857 0.57142857 0.75
|
|
0.57142857 0.4 0.85714286 0.75 ]
|
|
|
|
mean value: 0.6376190476190476
|
|
|
|
key: train_fscore
|
|
value: [0.6969697 0.7826087 0.77777778 0.71641791 0.76470588 0.75675676
|
|
0.8 0.73529412 0.75 0.76470588]
|
|
|
|
mean value: 0.7545236719957108
|
|
|
|
key: test_precision
|
|
value: [0.6 0.6 0.66666667 0.66666667 0.66666667 0.75
|
|
0.66666667 1. 0.75 0.75 ]
|
|
|
|
mean value: 0.7116666666666667
|
|
|
|
key: train_precision
|
|
value: [0.74193548 0.79411765 0.75675676 0.75 0.78787879 0.71794872
|
|
0.8 0.75757576 0.75 0.78787879]
|
|
|
|
mean value: 0.7644091938968599
|
|
|
|
key: test_recall
|
|
value: [0.75 0.75 0.5 0.5 0.5 0.75 0.5 0.25 1. 0.75]
|
|
|
|
mean value: 0.625
|
|
|
|
key: train_recall
|
|
value: [0.65714286 0.77142857 0.8 0.68571429 0.74285714 0.8
|
|
0.8 0.71428571 0.75 0.74285714]
|
|
|
|
mean value: 0.7464285714285714
|
|
|
|
key: test_roc_auc
|
|
value: [0.625 0.625 0.625 0.625 0.625 0.75
|
|
0.625 0.625 0.875 0.70833333]
|
|
|
|
mean value: 0.6708333333333334
|
|
|
|
key: train_roc_auc
|
|
value: [0.71428571 0.78571429 0.77142857 0.72857143 0.77142857 0.74285714
|
|
0.8 0.74285714 0.74642857 0.77420635]
|
|
|
|
mean value: 0.7577777777777778
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.5 0.4 0.4 0.4 0.6 0.4 0.25 0.75 0.6 ]
|
|
|
|
mean value: 0.48
|
|
|
|
key: train_jcc
|
|
value: [0.53488372 0.64285714 0.63636364 0.55813953 0.61904762 0.60869565
|
|
0.66666667 0.58139535 0.6 0.61904762]
|
|
|
|
mean value: 0.606709694080776
|
|
|
|
MCC on Blind test: 0.17
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: SVM
|
|
Model func: SVC(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SVC(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01008749 0.00999761 0.00995326 0.00994325 0.00952315 0.00990009
|
|
0.00955081 0.00994372 0.00989914 0.00985479]
|
|
|
|
mean value: 0.009865331649780273
|
|
|
|
key: score_time
|
|
value: [0.0110445 0.00936389 0.00931382 0.0096035 0.00935316 0.00944185
|
|
0.00971365 0.00931597 0.00903106 0.00951076]
|
|
|
|
mean value: 0.009569215774536132
|
|
|
|
key: test_mcc
|
|
value: [ 1. 0.5 0. 0.5 -0.25819889 0.37796447
|
|
0.5 0.25819889 0.09128709 0.73029674]
|
|
|
|
mean value: 0.3699548309266976
|
|
|
|
key: train_mcc
|
|
value: [0.82857143 0.80032673 0.82992752 0.82992752 0.860309 0.8871639
|
|
0.77651637 0.74316054 0.83214239 0.75346834]
|
|
|
|
mean value: 0.8141513733500897
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.75 0.5 0.75 0.375 0.625
|
|
0.75 0.625 0.57142857 0.85714286]
|
|
|
|
mean value: 0.6803571428571429
|
|
|
|
key: train_accuracy
|
|
value: [0.91428571 0.9 0.91428571 0.91428571 0.92857143 0.94285714
|
|
0.88571429 0.87142857 0.91549296 0.87323944]
|
|
|
|
mean value: 0.9060160965794768
|
|
|
|
key: test_fscore
|
|
value: [1. 0.75 0.33333333 0.75 0.28571429 0.4
|
|
0.75 0.57142857 0.4 0.88888889]
|
|
|
|
mean value: 0.612936507936508
|
|
|
|
key: train_fscore
|
|
value: [0.91428571 0.89855072 0.91176471 0.91176471 0.92537313 0.94117647
|
|
0.89189189 0.86956522 0.91891892 0.86153846]
|
|
|
|
mean value: 0.9044829945345272
|
|
|
|
key: test_precision
|
|
value: [1. 0.75 0.5 0.75 0.33333333 1.
|
|
0.75 0.66666667 0.5 0.8 ]
|
|
|
|
mean value: 0.705
|
|
|
|
key: train_precision
|
|
value: [0.91428571 0.91176471 0.93939394 0.93939394 0.96875 0.96969697
|
|
0.84615385 0.88235294 0.89473684 0.93333333]
|
|
|
|
mean value: 0.9199862231421829
|
|
|
|
key: test_recall
|
|
value: [1. 0.75 0.25 0.75 0.25 0.25
|
|
0.75 0.5 0.33333333 1. ]
|
|
|
|
mean value: 0.5833333333333334
|
|
|
|
key: train_recall
|
|
value: [0.91428571 0.88571429 0.88571429 0.88571429 0.88571429 0.91428571
|
|
0.94285714 0.85714286 0.94444444 0.8 ]
|
|
|
|
mean value: 0.8915873015873016
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.75 0.5 0.75 0.375 0.625
|
|
0.75 0.625 0.54166667 0.83333333]
|
|
|
|
mean value: 0.675
|
|
|
|
key: train_roc_auc
|
|
value: [0.91428571 0.9 0.91428571 0.91428571 0.92857143 0.94285714
|
|
0.88571429 0.87142857 0.91507937 0.87222222]
|
|
|
|
mean value: 0.9058730158730158
|
|
|
|
key: test_jcc
|
|
value: [1. 0.6 0.2 0.6 0.16666667 0.25
|
|
0.6 0.4 0.25 0.8 ]
|
|
|
|
mean value: 0.4866666666666667
|
|
|
|
key: train_jcc
|
|
value: [0.84210526 0.81578947 0.83783784 0.83783784 0.86111111 0.88888889
|
|
0.80487805 0.76923077 0.85 0.75675676]
|
|
|
|
mean value: 0.8264435987285794
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: MLP
|
|
Model func: MLPClassifier(max_iter=500, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MLPClassifier(max_iter=500, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.42219353 0.38538766 0.37766552 0.38039589 0.32598734 0.59394097
|
|
0.33555698 0.37964678 0.39589477 0.39897919]
|
|
|
|
mean value: 0.39956486225128174
|
|
|
|
key: score_time
|
|
value: [0.01248217 0.01235509 0.01225305 0.01246142 0.01255774 0.01329589
|
|
0.0128777 0.01322794 0.0129385 0.01259947]
|
|
|
|
mean value: 0.012704896926879882
|
|
|
|
key: test_mcc
|
|
value: [ 0.5 0.77459667 0.5 0.5 -0.25819889 0.
|
|
0.5 0.5 0.16666667 0.73029674]
|
|
|
|
mean value: 0.39133611895012105
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.75 0.875 0.75 0.75 0.375 0.5
|
|
0.75 0.75 0.57142857 0.85714286]
|
|
|
|
mean value: 0.6928571428571428
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.75 0.85714286 0.75 0.75 0.28571429 0.5
|
|
0.75 0.75 0.57142857 0.88888889]
|
|
|
|
mean value: 0.6853174603174603
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.75 1. 0.75 0.75 0.33333333 0.5
|
|
0.75 0.75 0.5 0.8 ]
|
|
|
|
mean value: 0.6883333333333334
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.75 0.75 0.75 0.75 0.25 0.5
|
|
0.75 0.75 0.66666667 1. ]
|
|
|
|
mean value: 0.6916666666666667
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.75 0.875 0.75 0.75 0.375 0.5
|
|
0.75 0.75 0.58333333 0.83333333]
|
|
|
|
mean value: 0.6916666666666667
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.6 0.75 0.6 0.6 0.16666667 0.33333333
|
|
0.6 0.6 0.4 0.8 ]
|
|
|
|
mean value: 0.545
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.0
|
|
|
|
Accuracy on Blind test: 0.5
|
|
|
|
Model_name: Decision Tree
|
|
Model func: DecisionTreeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', DecisionTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01383638 0.01134634 0.01063728 0.01032615 0.0097909 0.01033807
|
|
0.01113963 0.01021552 0.01052022 0.01057673]
|
|
|
|
mean value: 0.010872721672058105
|
|
|
|
key: score_time
|
|
value: [0.01207805 0.00988507 0.00966096 0.00939751 0.00937295 0.00936341
|
|
0.0089221 0.00934076 0.00945401 0.00952339]
|
|
|
|
mean value: 0.009699821472167969
|
|
|
|
key: test_mcc
|
|
value: [0.77459667 1. 0.77459667 1. 0.77459667 1.
|
|
0.77459667 1. 1. 1. ]
|
|
|
|
mean value: 0.9098386676965934
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.875 1. 0.875 1. 0.875 1. 0.875 1. 1. 1. ]
|
|
|
|
mean value: 0.95
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 1. 0.85714286 1. 0.85714286 1.
|
|
0.85714286 1. 1. 1. ]
|
|
|
|
mean value: 0.946031746031746
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.8 1. 1. 1. 1. 1. 1. 1. 1. 1. ]
|
|
|
|
mean value: 0.98
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 0.75 1. 0.75 1. 0.75 1. 1. 1. ]
|
|
|
|
mean value: 0.925
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 1. 0.875 1. 0.875 1. 0.875 1. 1. 1. ]
|
|
|
|
mean value: 0.95
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.8 1. 0.75 1. 0.75 1. 0.75 1. 1. 1. ]
|
|
|
|
mean value: 0.905
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 1.0
|
|
|
|
Accuracy on Blind test: 1.0
|
|
|
|
Model_name: Extra Trees
|
|
Model func: ExtraTreesClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreesClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.09162545 0.0875473 0.08813334 0.08958077 0.0837934 0.08881235
|
|
0.08904314 0.08884525 0.0853529 0.08809972]
|
|
|
|
mean value: 0.08808336257934571
|
|
|
|
key: score_time
|
|
value: [0.01862025 0.01851726 0.01855969 0.0186305 0.01841116 0.01861596
|
|
0.01849103 0.01809096 0.01862431 0.01818585]
|
|
|
|
mean value: 0.018474698066711426
|
|
|
|
key: test_mcc
|
|
value: [ 0.77459667 1. 0.57735027 0.77459667 0. 0.
|
|
0.77459667 0.25819889 -0.16666667 0.41666667]
|
|
|
|
mean value: 0.4409339166661237
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.875 1. 0.75 0.875 0.5 0.5
|
|
0.875 0.625 0.42857143 0.71428571]
|
|
|
|
mean value: 0.7142857142857143
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 1. 0.66666667 0.88888889 0.5 0.5
|
|
0.88888889 0.57142857 0.33333333 0.75 ]
|
|
|
|
mean value: 0.6988095238095239
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.8 1. 1. 0.8 0.5 0.5
|
|
0.8 0.66666667 0.33333333 0.75 ]
|
|
|
|
mean value: 0.715
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 0.5 1. 0.5 0.5
|
|
1. 0.5 0.33333333 0.75 ]
|
|
|
|
mean value: 0.7083333333333334
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 1. 0.75 0.875 0.5 0.5
|
|
0.875 0.625 0.41666667 0.70833333]
|
|
|
|
mean value: 0.7125
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.8 1. 0.5 0.8 0.33333333 0.33333333
|
|
0.8 0.4 0.2 0.6 ]
|
|
|
|
mean value: 0.5766666666666667
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.41
|
|
|
|
Accuracy on Blind test: 0.7
|
|
|
|
Model_name: Extra Tree
|
|
Model func: ExtraTreeClassifier(random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', ExtraTreeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0095346 0.00932455 0.00948405 0.00941014 0.00933146 0.00964332
|
|
0.00960994 0.00947642 0.0096693 0.00891423]
|
|
|
|
mean value: 0.009439802169799805
|
|
|
|
key: score_time
|
|
value: [0.00931406 0.0092082 0.00925589 0.00923014 0.00934744 0.00957775
|
|
0.00921679 0.00915599 0.00987196 0.00928354]
|
|
|
|
mean value: 0.00934617519378662
|
|
|
|
key: test_mcc
|
|
value: [ 0.25819889 0. 0. 0.57735027 0. -0.25819889
|
|
-0.25819889 0.25819889 0.41666667 0.75 ]
|
|
|
|
mean value: 0.17440169358562926
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.625 0.5 0.5 0.75 0.5 0.375
|
|
0.375 0.625 0.71428571 0.85714286]
|
|
|
|
mean value: 0.5821428571428572
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.6 0.5 0.8 0.5 0.44444444
|
|
0.28571429 0.57142857 0.66666667 0.85714286]
|
|
|
|
mean value: 0.5892063492063492
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.6 0.5 0.5 0.66666667 0.5 0.4
|
|
0.33333333 0.66666667 0.66666667 1. ]
|
|
|
|
mean value: 0.5833333333333334
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.75 0.75 0.5 1. 0.5 0.5
|
|
0.25 0.5 0.66666667 0.75 ]
|
|
|
|
mean value: 0.6166666666666667
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.625 0.5 0.5 0.75 0.5 0.375
|
|
0.375 0.625 0.70833333 0.875 ]
|
|
|
|
mean value: 0.5833333333333334
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.42857143 0.33333333 0.66666667 0.33333333 0.28571429
|
|
0.16666667 0.4 0.5 0.75 ]
|
|
|
|
mean value: 0.43642857142857144
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.0
|
|
|
|
Accuracy on Blind test: 0.5
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.06088781 1.07806635 1.07123256 1.10185742 1.08816147 1.07904243
|
|
1.12222743 1.07182336 1.07926655 1.04409885]
|
|
|
|
mean value: 1.0796664237976075
|
|
|
|
key: score_time
|
|
value: [0.09509015 0.09501624 0.09373927 0.09523463 0.09523296 0.09541678
|
|
0.09467912 0.09589577 0.09356499 0.09200788]
|
|
|
|
mean value: 0.09458777904510499
|
|
|
|
key: test_mcc
|
|
value: [0.77459667 0.77459667 0.77459667 0.77459667 0. 0.
|
|
0.5 0.25819889 0.09128709 0.73029674]
|
|
|
|
mean value: 0.46781694029708437
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.875 0.875 0.875 0.875 0.5 0.5
|
|
0.75 0.625 0.57142857 0.85714286]
|
|
|
|
mean value: 0.7303571428571428
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 0.88888889 0.85714286 0.88888889 0.5 0.5
|
|
0.75 0.57142857 0.4 0.88888889]
|
|
|
|
mean value: 0.7134126984126985
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.8 0.8 1. 0.8 0.5 0.5
|
|
0.75 0.66666667 0.5 0.8 ]
|
|
|
|
mean value: 0.7116666666666667
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 0.75 1. 0.5 0.5
|
|
0.75 0.5 0.33333333 1. ]
|
|
|
|
mean value: 0.7333333333333333
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 0.875 0.875 0.875 0.5 0.5
|
|
0.75 0.625 0.54166667 0.83333333]
|
|
|
|
mean value: 0.725
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.8 0.8 0.75 0.8 0.33333333 0.33333333
|
|
0.6 0.4 0.25 0.8 ]
|
|
|
|
mean value: 0.5866666666666667
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000...05', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.79908037 0.86214375 0.85928059 0.828789 0.82958341 0.90800166
|
|
0.87632132 0.92371035 0.8799963 0.87374616]
|
|
|
|
mean value: 0.8640652894973755
|
|
|
|
key: score_time
|
|
value: [0.21686745 0.18948936 0.22953916 0.2316184 0.22374058 0.22881055
|
|
0.22816801 0.13683963 0.22541785 0.20880628]
|
|
|
|
mean value: 0.21192972660064696
|
|
|
|
key: test_mcc
|
|
value: [ 0.77459667 0.77459667 0.77459667 0.77459667 -0.25819889 0.77459667
|
|
0.5 0.25819889 -0.16666667 0.73029674]
|
|
|
|
mean value: 0.49366134228809716
|
|
|
|
key: train_mcc
|
|
value: [0.97182532 0.97182532 0.97182532 1. 0.97182532 1.
|
|
0.97182532 1. 0.97222222 0.97220047]
|
|
|
|
mean value: 0.9803549266788427
|
|
|
|
key: test_accuracy
|
|
value: [0.875 0.875 0.875 0.875 0.375 0.875
|
|
0.75 0.625 0.42857143 0.85714286]
|
|
|
|
mean value: 0.7410714285714286
|
|
|
|
key: train_accuracy
|
|
value: [0.98571429 0.98571429 0.98571429 1. 0.98571429 1.
|
|
0.98571429 1. 0.98591549 0.98591549]
|
|
|
|
mean value: 0.9900402414486922
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 0.88888889 0.85714286 0.88888889 0.28571429 0.85714286
|
|
0.75 0.57142857 0.33333333 0.88888889]
|
|
|
|
mean value: 0.721031746031746
|
|
|
|
key: train_fscore
|
|
value: [0.98550725 0.98550725 0.98550725 1. 0.98550725 1.
|
|
0.98550725 1. 0.98591549 0.98550725]
|
|
|
|
mean value: 0.9898958971218615
|
|
|
|
key: test_precision
|
|
value: [0.8 0.8 1. 0.8 0.33333333 1.
|
|
0.75 0.66666667 0.33333333 0.8 ]
|
|
|
|
mean value: 0.7283333333333334
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 0.75 1. 0.25 0.75
|
|
0.75 0.5 0.33333333 1. ]
|
|
|
|
mean value: 0.7333333333333333
|
|
|
|
key: train_recall
|
|
value: [0.97142857 0.97142857 0.97142857 1. 0.97142857 1.
|
|
0.97142857 1. 0.97222222 0.97142857]
|
|
|
|
mean value: 0.9800793650793651
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 0.875 0.875 0.875 0.375 0.875
|
|
0.75 0.625 0.41666667 0.83333333]
|
|
|
|
mean value: 0.7375
|
|
|
|
key: train_roc_auc
|
|
value: [0.98571429 0.98571429 0.98571429 1. 0.98571429 1.
|
|
0.98571429 1. 0.98611111 0.98571429]
|
|
|
|
mean value: 0.9900396825396826
|
|
|
|
key: test_jcc
|
|
value: [0.8 0.8 0.75 0.8 0.16666667 0.75
|
|
0.6 0.4 0.2 0.8 ]
|
|
|
|
mean value: 0.6066666666666667
|
|
|
|
key: train_jcc
|
|
value: [0.97142857 0.97142857 0.97142857 1. 0.97142857 1.
|
|
0.97142857 1. 0.97222222 0.97142857]
|
|
|
|
mean value: 0.9800793650793651
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.02411747 0.00868654 0.00863671 0.00892663 0.00904179 0.00923443
|
|
0.00946021 0.00934601 0.00884771 0.0092268 ]
|
|
|
|
mean value: 0.010552430152893066
|
|
|
|
key: score_time
|
|
value: [0.01154542 0.00867844 0.0087924 0.00879693 0.00910354 0.00886774
|
|
0.00883126 0.00899911 0.00942969 0.00934529]
|
|
|
|
mean value: 0.009238982200622558
|
|
|
|
key: test_mcc
|
|
value: [ 0. 0. 0. 0.37796447 0.25819889 0.5
|
|
0. 0. -0.16666667 0.75 ]
|
|
|
|
mean value: 0.17194966960897218
|
|
|
|
key: train_mcc
|
|
value: [0.6350853 0.69282032 0.6614769 0.71545476 0.69282032 0.78301997
|
|
0.63089327 0.69282032 0.67079854 0.60683101]
|
|
|
|
mean value: 0.6782020711253509
|
|
|
|
key: test_accuracy
|
|
value: [0.5 0.5 0.5 0.625 0.625 0.75
|
|
0.5 0.5 0.42857143 0.85714286]
|
|
|
|
mean value: 0.5785714285714285
|
|
|
|
key: train_accuracy
|
|
value: [0.81428571 0.84285714 0.82857143 0.85714286 0.84285714 0.88571429
|
|
0.81428571 0.84285714 0.83098592 0.8028169 ]
|
|
|
|
mean value: 0.8362374245472837
|
|
|
|
key: test_fscore
|
|
value: [0.6 0.6 0.33333333 0.72727273 0.66666667 0.75
|
|
0.6 0.5 0.33333333 0.85714286]
|
|
|
|
mean value: 0.5967748917748917
|
|
|
|
key: train_fscore
|
|
value: [0.82666667 0.85333333 0.83783784 0.86111111 0.85333333 0.89473684
|
|
0.82191781 0.85333333 0.84615385 0.80555556]
|
|
|
|
mean value: 0.8453979667649458
|
|
|
|
key: test_precision
|
|
value: [0.5 0.5 0.5 0.57142857 0.6 0.75
|
|
0.5 0.5 0.33333333 1. ]
|
|
|
|
mean value: 0.5754761904761905
|
|
|
|
key: train_precision
|
|
value: [0.775 0.8 0.79487179 0.83783784 0.8 0.82926829
|
|
0.78947368 0.8 0.78571429 0.78378378]
|
|
|
|
mean value: 0.7995949679101155
|
|
|
|
key: test_recall
|
|
value: [0.75 0.75 0.25 1. 0.75 0.75
|
|
0.75 0.5 0.33333333 0.75 ]
|
|
|
|
mean value: 0.6583333333333333
|
|
|
|
key: train_recall
|
|
value: [0.88571429 0.91428571 0.88571429 0.88571429 0.91428571 0.97142857
|
|
0.85714286 0.91428571 0.91666667 0.82857143]
|
|
|
|
mean value: 0.8973809523809524
|
|
|
|
key: test_roc_auc
|
|
value: [0.5 0.5 0.5 0.625 0.625 0.75
|
|
0.5 0.5 0.41666667 0.875 ]
|
|
|
|
mean value: 0.5791666666666666
|
|
|
|
key: train_roc_auc
|
|
value: [0.81428571 0.84285714 0.82857143 0.85714286 0.84285714 0.88571429
|
|
0.81428571 0.84285714 0.8297619 0.8031746 ]
|
|
|
|
mean value: 0.8361507936507937
|
|
|
|
key: test_jcc
|
|
value: [0.42857143 0.42857143 0.2 0.57142857 0.5 0.6
|
|
0.42857143 0.33333333 0.2 0.75 ]
|
|
|
|
mean value: 0.444047619047619
|
|
|
|
key: train_jcc
|
|
value: [0.70454545 0.74418605 0.72093023 0.75609756 0.74418605 0.80952381
|
|
0.69767442 0.74418605 0.73333333 0.6744186 ]
|
|
|
|
mean value: 0.7329081553727044
|
|
|
|
MCC on Blind test: 0.09
|
|
|
|
Accuracy on Blind test: 0.5
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.04690886 0.03529429 0.03520536 0.03485751 0.05215406 0.03321004
|
|
0.03399205 0.2213099 0.03269053 0.03631282]
|
|
|
|
mean value: 0.05619354248046875
|
|
|
|
key: score_time
|
|
value: [0.01023316 0.0105567 0.01112413 0.0102911 0.01223016 0.01042199
|
|
0.01048231 0.01189733 0.01079249 0.01114678]
|
|
|
|
mean value: 0.01091761589050293
|
|
|
|
key: test_mcc
|
|
value: [0.77459667 1. 0.77459667 1. 0.77459667 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9323790007724451
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.875 1. 0.875 1. 0.875 1. 1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9625
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 1. 0.85714286 1. 0.85714286 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9603174603174603
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.8 1. 1. 1. 1. 1. 1. 1. 1. 1. ]
|
|
|
|
mean value: 0.98
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 0.75 1. 0.75 1. 1. 1. 1. 1. ]
|
|
|
|
mean value: 0.95
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 1. 0.875 1. 0.875 1. 1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9625
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.8 1. 0.75 1. 0.75 1. 1. 1. 1. 1. ]
|
|
|
|
mean value: 0.93
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 1.0
|
|
|
|
Accuracy on Blind test: 1.0
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
|
|
_warn_prf(average, modifier, msg_start, len(result))
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.03269625 0.03927255 0.04035974 0.03941774 0.0393281 0.03936481
|
|
0.03942204 0.03904486 0.03839588 0.04210567]
|
|
|
|
mean value: 0.03894076347351074
|
|
|
|
key: score_time
|
|
value: [0.02378345 0.01394677 0.02182436 0.02240849 0.02304435 0.02370811
|
|
0.01481962 0.01573515 0.02450657 0.02406669]
|
|
|
|
mean value: 0.0207843542098999
|
|
|
|
key: test_mcc
|
|
value: [0.5 0.5 0.37796447 1. 0.77459667 0.57735027
|
|
0.5 0.25819889 0. 0.73029674]
|
|
|
|
mean value: 0.5218407044527719
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.75 0.75 0.625 1. 0.875 0.75
|
|
0.75 0.625 0.57142857 0.85714286]
|
|
|
|
mean value: 0.7553571428571428
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.75 0.75 0.4 1. 0.88888889 0.8
|
|
0.75 0.57142857 0. 0.88888889]
|
|
|
|
mean value: 0.6799206349206349
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.75 0.75 1. 1. 0.8 0.66666667
|
|
0.75 0.66666667 0. 0.8 ]
|
|
|
|
mean value: 0.7183333333333334
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.75 0.75 0.25 1. 1. 1. 0.75 0.5 0. 1. ]
|
|
|
|
mean value: 0.7
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.75 0.75 0.625 1. 0.875 0.75
|
|
0.75 0.625 0.5 0.83333333]
|
|
|
|
mean value: 0.7458333333333333
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.6 0.6 0.25 1. 0.8 0.66666667
|
|
0.6 0.4 0. 0.8 ]
|
|
|
|
mean value: 0.5716666666666667
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Multinomial
|
|
Model func: MultinomialNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', MultinomialNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01487851 0.00880241 0.00839996 0.00833797 0.00825596 0.00826812
|
|
0.00854874 0.00864625 0.00920296 0.00889826]
|
|
|
|
mean value: 0.00922391414642334
|
|
|
|
key: score_time
|
|
value: [0.0090704 0.0087707 0.00860238 0.00835609 0.00839686 0.00832748
|
|
0.00927544 0.00877452 0.00908804 0.0086596 ]
|
|
|
|
mean value: 0.008732151985168458
|
|
|
|
key: test_mcc
|
|
value: [ 0.5 0.5 0.5 0. 0. 0.57735027
|
|
0. -0.25819889 -0.54772256 0.73029674]
|
|
|
|
mean value: 0.200172556527752
|
|
|
|
key: train_mcc
|
|
value: [0.57166195 0.51449576 0.54374562 0.51961524 0.45732956 0.5161854
|
|
0.38152873 0.45883147 0.54920635 0.54972312]
|
|
|
|
mean value: 0.5062323194353353
|
|
|
|
key: test_accuracy
|
|
value: [0.75 0.75 0.75 0.5 0.5 0.75
|
|
0.5 0.375 0.28571429 0.85714286]
|
|
|
|
mean value: 0.6017857142857143
|
|
|
|
key: train_accuracy
|
|
value: [0.78571429 0.75714286 0.77142857 0.75714286 0.72857143 0.75714286
|
|
0.68571429 0.72857143 0.77464789 0.77464789]
|
|
|
|
mean value: 0.7520724346076458
|
|
|
|
key: test_fscore
|
|
value: [0.75 0.75 0.75 0.5 0.5 0.66666667
|
|
0.5 0.28571429 0. 0.88888889]
|
|
|
|
mean value: 0.5591269841269841
|
|
|
|
key: train_fscore
|
|
value: [0.78873239 0.76056338 0.76470588 0.73846154 0.72463768 0.74626866
|
|
0.71794872 0.73972603 0.77777778 0.76470588]
|
|
|
|
mean value: 0.7523527938814902
|
|
|
|
key: test_precision
|
|
value: [0.75 0.75 0.75 0.5 0.5 1.
|
|
0.5 0.33333333 0. 0.8 ]
|
|
|
|
mean value: 0.5883333333333334
|
|
|
|
key: train_precision
|
|
value: [0.77777778 0.75 0.78787879 0.8 0.73529412 0.78125
|
|
0.65116279 0.71052632 0.77777778 0.78787879]
|
|
|
|
mean value: 0.7559546355447339
|
|
|
|
key: test_recall
|
|
value: [0.75 0.75 0.75 0.5 0.5 0.5 0.5 0.25 0. 1. ]
|
|
|
|
mean value: 0.55
|
|
|
|
key: train_recall
|
|
value: [0.8 0.77142857 0.74285714 0.68571429 0.71428571 0.71428571
|
|
0.8 0.77142857 0.77777778 0.74285714]
|
|
|
|
mean value: 0.7520634920634921
|
|
|
|
key: test_roc_auc
|
|
value: [0.75 0.75 0.75 0.5 0.5 0.75
|
|
0.5 0.375 0.25 0.83333333]
|
|
|
|
mean value: 0.5958333333333333
|
|
|
|
key: train_roc_auc
|
|
value: [0.78571429 0.75714286 0.77142857 0.75714286 0.72857143 0.75714286
|
|
0.68571429 0.72857143 0.77460317 0.77420635]
|
|
|
|
mean value: 0.7520238095238095
|
|
|
|
key: test_jcc
|
|
value: [0.6 0.6 0.6 0.33333333 0.33333333 0.5
|
|
0.33333333 0.16666667 0. 0.8 ]
|
|
|
|
mean value: 0.42666666666666664
|
|
|
|
key: train_jcc
|
|
value: [0.65116279 0.61363636 0.61904762 0.58536585 0.56818182 0.5952381
|
|
0.56 0.58695652 0.63636364 0.61904762]
|
|
|
|
mean value: 0.6035000317610493
|
|
|
|
MCC on Blind test: 1.0
|
|
|
|
Accuracy on Blind test: 1.0
|
|
|
|
Model_name: Passive Aggresive
|
|
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
PassiveAggressiveClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0105958 0.01289964 0.0131743 0.0132277 0.01292658 0.01252127
|
|
0.01212096 0.01317453 0.01325178 0.01340818]
|
|
|
|
mean value: 0.012730073928833009
|
|
|
|
key: score_time
|
|
value: [0.00837302 0.01156449 0.01140594 0.01190567 0.01152253 0.01143527
|
|
0.01140642 0.01145077 0.01135659 0.01158524]
|
|
|
|
mean value: 0.011200594902038574
|
|
|
|
key: test_mcc
|
|
value: [0.77459667 0.77459667 0. 0.57735027 0.25819889 0.25819889
|
|
0.77459667 0.5 0.09128709 0.73029674]
|
|
|
|
mean value: 0.4739121892666147
|
|
|
|
key: train_mcc
|
|
value: [0.97182532 0.97182532 0.94285714 0.91766294 0.91766294 0.91766294
|
|
0.94440028 0.97182532 0.97222222 0.94365079]
|
|
|
|
mean value: 0.9471595194202586
|
|
|
|
key: test_accuracy
|
|
value: [0.875 0.875 0.5 0.75 0.625 0.625
|
|
0.875 0.75 0.57142857 0.85714286]
|
|
|
|
mean value: 0.7303571428571428
|
|
|
|
key: train_accuracy
|
|
value: [0.98571429 0.98571429 0.97142857 0.95714286 0.95714286 0.95714286
|
|
0.97142857 0.98571429 0.98591549 0.97183099]
|
|
|
|
mean value: 0.9729175050301812
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 0.85714286 0.33333333 0.8 0.57142857 0.66666667
|
|
0.88888889 0.75 0.4 0.88888889]
|
|
|
|
mean value: 0.7045238095238096
|
|
|
|
key: train_fscore
|
|
value: [0.98591549 0.98550725 0.97142857 0.95890411 0.95522388 0.95890411
|
|
0.97058824 0.98550725 0.98591549 0.97142857]
|
|
|
|
mean value: 0.9729322956595473
|
|
|
|
key: test_precision
|
|
value: [0.8 1. 0.5 0.66666667 0.66666667 0.6
|
|
0.8 0.75 0.5 0.8 ]
|
|
|
|
mean value: 0.7083333333333334
|
|
|
|
key: train_precision
|
|
value: [0.97222222 1. 0.97142857 0.92105263 1. 0.92105263
|
|
1. 1. 1. 0.97142857]
|
|
|
|
mean value: 0.975718462823726
|
|
|
|
key: test_recall
|
|
value: [1. 0.75 0.25 1. 0.5 0.75
|
|
1. 0.75 0.33333333 1. ]
|
|
|
|
mean value: 0.7333333333333333
|
|
|
|
key: train_recall
|
|
value: [1. 0.97142857 0.97142857 1. 0.91428571 1.
|
|
0.94285714 0.97142857 0.97222222 0.97142857]
|
|
|
|
mean value: 0.9715079365079365
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 0.875 0.5 0.75 0.625 0.625
|
|
0.875 0.75 0.54166667 0.83333333]
|
|
|
|
mean value: 0.725
|
|
|
|
key: train_roc_auc
|
|
value: [0.98571429 0.98571429 0.97142857 0.95714286 0.95714286 0.95714286
|
|
0.97142857 0.98571429 0.98611111 0.9718254 ]
|
|
|
|
mean value: 0.972936507936508
|
|
|
|
key: test_jcc
|
|
value: [0.8 0.75 0.2 0.66666667 0.4 0.5
|
|
0.8 0.6 0.25 0.8 ]
|
|
|
|
mean value: 0.5766666666666667
|
|
|
|
key: train_jcc
|
|
value: [0.97222222 0.97142857 0.94444444 0.92105263 0.91428571 0.92105263
|
|
0.94285714 0.97142857 0.97222222 0.94444444]
|
|
|
|
mean value: 0.9475438596491228
|
|
|
|
MCC on Blind test: 0.41
|
|
|
|
Accuracy on Blind test: 0.7
|
|
|
|
Model_name: Stochastic GDescent
|
|
Model func: SGDClassifier(n_jobs=10, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', SGDClassifier(n_jobs=10, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01302481 0.01203656 0.01209736 0.01269269 0.01250601 0.01253843
|
|
0.01243162 0.01224613 0.01304078 0.01207447]
|
|
|
|
mean value: 0.012468886375427247
|
|
|
|
key: score_time
|
|
value: [0.01084328 0.01141644 0.0114181 0.01162481 0.01151371 0.01179433
|
|
0.01152134 0.01150966 0.01145768 0.01138949]
|
|
|
|
mean value: 0.011448884010314941
|
|
|
|
key: test_mcc
|
|
value: [1. 0.25819889 0.57735027 0.57735027 0.25819889 0.
|
|
0.77459667 0.25819889 1. 0.73029674]
|
|
|
|
mean value: 0.5434190620202439
|
|
|
|
key: train_mcc
|
|
value: [1. 0.79240582 0.47756693 0.81649658 1. 0.74535599
|
|
0.8660254 0.8340361 1. 0.89282857]
|
|
|
|
mean value: 0.8424715394204912
|
|
|
|
key: test_accuracy
|
|
value: [1. 0.625 0.75 0.75 0.625 0.5
|
|
0.875 0.625 1. 0.85714286]
|
|
|
|
mean value: 0.7607142857142857
|
|
|
|
key: train_accuracy
|
|
value: [1. 0.88571429 0.68571429 0.9 1. 0.85714286
|
|
0.92857143 0.91428571 1. 0.94366197]
|
|
|
|
mean value: 0.9115090543259557
|
|
|
|
key: test_fscore
|
|
value: [1. 0.66666667 0.8 0.8 0.66666667 0.6
|
|
0.88888889 0.57142857 1. 0.88888889]
|
|
|
|
mean value: 0.7882539682539682
|
|
|
|
key: train_fscore
|
|
value: [1. 0.8974359 0.76086957 0.90909091 1. 0.875
|
|
0.93333333 0.90909091 1. 0.93939394]
|
|
|
|
mean value: 0.922421455356238
|
|
|
|
key: test_precision
|
|
value: [1. 0.6 0.66666667 0.66666667 0.6 0.5
|
|
0.8 0.66666667 1. 0.8 ]
|
|
|
|
mean value: 0.73
|
|
|
|
key: train_precision
|
|
value: [1. 0.81395349 0.61403509 0.83333333 1. 0.77777778
|
|
0.875 0.96774194 1. 1. ]
|
|
|
|
mean value: 0.8881841622686374
|
|
|
|
key: test_recall
|
|
value: [1. 0.75 1. 1. 0.75 0.75 1. 0.5 1. 1. ]
|
|
|
|
mean value: 0.875
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1.
|
|
1. 0.85714286 1. 0.88571429]
|
|
|
|
mean value: 0.9742857142857143
|
|
|
|
key: test_roc_auc
|
|
value: [1. 0.625 0.75 0.75 0.625 0.5
|
|
0.875 0.625 1. 0.83333333]
|
|
|
|
mean value: 0.7583333333333333
|
|
|
|
key: train_roc_auc
|
|
value: [1. 0.88571429 0.68571429 0.9 1. 0.85714286
|
|
0.92857143 0.91428571 1. 0.94285714]
|
|
|
|
mean value: 0.9114285714285715
|
|
|
|
key: test_jcc
|
|
value: [1. 0.5 0.66666667 0.66666667 0.5 0.42857143
|
|
0.8 0.4 1. 0.8 ]
|
|
|
|
mean value: 0.6761904761904762
|
|
|
|
key: train_jcc
|
|
value: [1. 0.81395349 0.61403509 0.83333333 1. 0.77777778
|
|
0.875 0.83333333 1. 0.88571429]
|
|
|
|
mean value: 0.8633147306250122
|
|
|
|
MCC on Blind test: 0.61
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: AdaBoost Classifier
|
|
Model func: AdaBoostClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', AdaBoostClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.09036541 0.08348179 0.08240795 0.08388019 0.08449244 0.07931352
|
|
0.08360267 0.08131385 0.08257937 0.08250093]
|
|
|
|
mean value: 0.08339381217956543
|
|
|
|
key: score_time
|
|
value: [0.01578093 0.01575756 0.01484179 0.01599073 0.01581931 0.01575685
|
|
0.01468801 0.01461697 0.01501131 0.0157845 ]
|
|
|
|
mean value: 0.015404796600341797
|
|
|
|
key: test_mcc
|
|
value: [0.77459667 0.77459667 0.77459667 1. 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9323790007724451
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.875 0.875 0.875 1. 1. 1. 1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9625
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 0.88888889 0.85714286 1. 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9634920634920635
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.8 0.8 1. 1. 1. 1. 1. 1. 1. 1. ]
|
|
|
|
mean value: 0.96
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 0.75 1. 1. 1. 1. 1. 1. 1. ]
|
|
|
|
mean value: 0.975
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 0.875 0.875 1. 1. 1. 1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9625
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.8 0.8 0.75 1. 1. 1. 1. 1. 1. 1. ]
|
|
|
|
mean value: 0.935
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.82
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Bagging Classifier
|
|
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
BaggingClassifier(n_jobs=10, oob_score=True,
|
|
random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.0357008 0.02489686 0.04498053 0.03115368 0.03102827 0.04373121
|
|
0.04515457 0.0286448 0.04884458 0.0265274 ]
|
|
|
|
mean value: 0.03606626987457275
|
|
|
|
key: score_time
|
|
value: [0.01645994 0.01881194 0.02027988 0.02257514 0.02798772 0.0293231
|
|
0.02213931 0.02669263 0.02155805 0.02561426]
|
|
|
|
mean value: 0.02314419746398926
|
|
|
|
key: test_mcc
|
|
value: [0.77459667 1. 0.77459667 1. 0.77459667 1.
|
|
1. 0.77459667 1. 1. ]
|
|
|
|
mean value: 0.9098386676965934
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.875 1. 0.875 1. 0.875 1. 1. 0.875 1. 1. ]
|
|
|
|
mean value: 0.95
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 1. 0.85714286 1. 0.85714286 1.
|
|
1. 0.85714286 1. 1. ]
|
|
|
|
mean value: 0.946031746031746
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.8 1. 1. 1. 1. 1. 1. 1. 1. 1. ]
|
|
|
|
mean value: 0.98
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 0.75 1. 0.75 1. 1. 0.75 1. 1. ]
|
|
|
|
mean value: 0.925
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 1. 0.875 1. 0.875 1. 1. 0.875 1. 1. ]
|
|
|
|
mean value: 0.95
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.8 1. 0.75 1. 0.75 1. 1. 0.75 1. 1. ]
|
|
|
|
mean value: 0.905
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 1.0
|
|
|
|
Accuracy on Blind test: 1.0
|
|
|
|
Model_name: Gaussian Process
|
|
Model func: GaussianProcessClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianProcessClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.01824498 0.01467395 0.01475954 0.01525283 0.01523209 0.01748681
|
|
0.01516032 0.01657128 0.01515055 0.01515698]
|
|
|
|
mean value: 0.015768933296203613
|
|
|
|
key: score_time
|
|
value: [0.01170707 0.01118374 0.01194334 0.01181722 0.01187658 0.01180911
|
|
0.01189971 0.01253724 0.01190686 0.01177239]
|
|
|
|
mean value: 0.01184532642364502
|
|
|
|
key: test_mcc
|
|
value: [ 0. 0. 0.57735027 0.57735027 0.25819889 0.
|
|
0.25819889 0.57735027 -0.54772256 -0.09128709]
|
|
|
|
mean value: 0.1609438936640506
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.5 0.5 0.75 0.75 0.625 0.5
|
|
0.625 0.75 0.28571429 0.42857143]
|
|
|
|
mean value: 0.5714285714285714
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.6 0.5 0.66666667 0.8 0.66666667 0.5
|
|
0.57142857 0.66666667 0. 0.33333333]
|
|
|
|
mean value: 0.5304761904761904
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.5 0.5 1. 0.66666667 0.6 0.5
|
|
0.66666667 1. 0. 0.5 ]
|
|
|
|
mean value: 0.5933333333333333
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.75 0.5 0.5 1. 0.75 0.5 0.5 0.5 0. 0.25]
|
|
|
|
mean value: 0.525
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.5 0.5 0.75 0.75 0.625 0.5
|
|
0.625 0.75 0.25 0.45833333]
|
|
|
|
mean value: 0.5708333333333333
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.42857143 0.33333333 0.5 0.66666667 0.5 0.33333333
|
|
0.4 0.5 0. 0.2 ]
|
|
|
|
mean value: 0.3861904761904762
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.41
|
|
|
|
Accuracy on Blind test: 0.7
|
|
|
|
Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.14340568 0.11175632 0.1094327 0.10973072 0.11174297 0.11086988
|
|
0.12684155 0.10878468 0.10972738 0.10989308]
|
|
|
|
mean value: 0.11521849632263184
|
|
|
|
key: score_time
|
|
value: [0.00917768 0.00896311 0.00931168 0.00914812 0.0098567 0.00918627
|
|
0.00914383 0.00896716 0.00933671 0.00893211]
|
|
|
|
mean value: 0.009202337265014649
|
|
|
|
key: test_mcc
|
|
value: [0.77459667 1. 0.77459667 1. 0.77459667 1.
|
|
0.77459667 1. 0.75 1. ]
|
|
|
|
mean value: 0.8848386676965934
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.875 1. 0.875 1. 0.875 1.
|
|
0.875 1. 0.85714286 1. ]
|
|
|
|
mean value: 0.9357142857142857
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 1. 0.85714286 1. 0.85714286 1.
|
|
0.85714286 1. 0.85714286 1. ]
|
|
|
|
mean value: 0.9317460317460318
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.8 1. 1. 1. 1. 1. 1. 1. 0.75 1. ]
|
|
|
|
mean value: 0.955
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 1. 0.75 1. 0.75 1. 0.75 1. 1. 1. ]
|
|
|
|
mean value: 0.925
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 1. 0.875 1. 0.875 1. 0.875 1. 0.875 1. ]
|
|
|
|
mean value: 0.9375
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.8 1. 0.75 1. 0.75 1. 0.75 1. 0.75 1. ]
|
|
|
|
mean value: 0.88
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 1.0
|
|
|
|
Accuracy on Blind test: 1.0
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
warnings.warn("Variables are collinear")
|
|
key: fit_time
|
|
value: [0.00942969 0.01349497 0.01381683 0.01315737 0.01314712 0.01347995
|
|
0.01326299 0.01326776 0.01324487 0.01427436]
|
|
|
|
mean value: 0.013057589530944824
|
|
|
|
key: score_time
|
|
value: [0.00895739 0.01176214 0.01184082 0.0119319 0.01179528 0.01169872
|
|
0.01167274 0.01186228 0.01510572 0.01546717]
|
|
|
|
mean value: 0.012209415435791016
|
|
|
|
key: test_mcc
|
|
value: [ 0. 0. 0.25819889 -0.57735027 0. 0.25819889
|
|
0. 0. -0.41666667 -0.41666667]
|
|
|
|
mean value: -0.0894285823028637
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.5 0.5 0.625 0.25 0.5 0.625
|
|
0.5 0.5 0.28571429 0.28571429]
|
|
|
|
mean value: 0.45714285714285713
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.5 0.33333333 0.57142857 0.4 0.5 0.66666667
|
|
0.5 0.33333333 0.28571429 0.28571429]
|
|
|
|
mean value: 0.43761904761904763
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.5 0.5 0.66666667 0.33333333 0.5 0.6
|
|
0.5 0.5 0.25 0.33333333]
|
|
|
|
mean value: 0.4683333333333333
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.5 0.25 0.5 0.5 0.5 0.75
|
|
0.5 0.25 0.33333333 0.25 ]
|
|
|
|
mean value: 0.43333333333333335
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.5 0.5 0.625 0.25 0.5 0.625
|
|
0.5 0.5 0.29166667 0.29166667]
|
|
|
|
mean value: 0.4583333333333333
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.33333333 0.2 0.4 0.25 0.33333333 0.5
|
|
0.33333333 0.2 0.16666667 0.16666667]
|
|
|
|
mean value: 0.28833333333333333
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.0
|
|
|
|
Accuracy on Blind test: 0.5
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.02254224 0.01239038 0.01327562 0.02693105 0.01684093 0.01240849
|
|
0.01239514 0.03621268 0.03258634 0.03197312]
|
|
|
|
mean value: 0.02175559997558594
|
|
|
|
key: score_time
|
|
value: [0.01179028 0.01146269 0.01154327 0.01174164 0.01170254 0.01150298
|
|
0.01150703 0.0206039 0.02236962 0.01990366]
|
|
|
|
mean value: 0.014412760734558105
|
|
|
|
key: test_mcc
|
|
value: [0.77459667 0.5 0.25819889 1. 0.25819889 0.5
|
|
0.5 0.5 0.75 0.73029674]
|
|
|
|
mean value: 0.5771291192076027
|
|
|
|
key: train_mcc
|
|
value: [1. 0.97182532 0.97182532 0.97182532 1. 0.97182532
|
|
1. 0.97182532 1. 0.97220047]
|
|
|
|
mean value: 0.9831327044566206
|
|
|
|
key: test_accuracy
|
|
value: [0.875 0.75 0.625 1. 0.625 0.75
|
|
0.75 0.75 0.85714286 0.85714286]
|
|
|
|
mean value: 0.7839285714285714
|
|
|
|
key: train_accuracy
|
|
value: [1. 0.98571429 0.98571429 0.98571429 1. 0.98571429
|
|
1. 0.98571429 1. 0.98591549]
|
|
|
|
mean value: 0.9914486921529175
|
|
|
|
key: test_fscore
|
|
value: [0.88888889 0.75 0.57142857 1. 0.66666667 0.75
|
|
0.75 0.75 0.85714286 0.88888889]
|
|
|
|
mean value: 0.7873015873015873
|
|
|
|
key: train_fscore
|
|
value: [1. 0.98550725 0.98550725 0.98550725 1. 0.98550725
|
|
1. 0.98550725 1. 0.98550725]
|
|
|
|
mean value: 0.9913043478260869
|
|
|
|
key: test_precision
|
|
value: [0.8 0.75 0.66666667 1. 0.6 0.75
|
|
0.75 0.75 0.75 0.8 ]
|
|
|
|
mean value: 0.7616666666666667
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.75 0.5 1. 0.75 0.75 0.75 0.75 1. 1. ]
|
|
|
|
mean value: 0.825
|
|
|
|
key: train_recall
|
|
value: [1. 0.97142857 0.97142857 0.97142857 1. 0.97142857
|
|
1. 0.97142857 1. 0.97142857]
|
|
|
|
mean value: 0.9828571428571429
|
|
|
|
key: test_roc_auc
|
|
value: [0.875 0.75 0.625 1. 0.625 0.75
|
|
0.75 0.75 0.875 0.83333333]
|
|
|
|
mean value: 0.7833333333333333
|
|
|
|
key: train_roc_auc
|
|
value: [1. 0.98571429 0.98571429 0.98571429 1. 0.98571429
|
|
1. 0.98571429 1. 0.98571429]
|
|
|
|
mean value: 0.9914285714285714
|
|
|
|
key: test_jcc
|
|
value: [0.8 0.6 0.4 1. 0.5 0.6 0.6 0.6 0.75 0.8 ]
|
|
|
|
mean value: 0.665
|
|
|
|
key: train_jcc
|
|
value: [1. 0.97142857 0.97142857 0.97142857 1. 0.97142857
|
|
1. 0.97142857 1. 0.97142857]
|
|
|
|
mean value: 0.9828571428571429
|
|
|
|
MCC on Blind test: 0.58

Accuracy on Blind test: 0.8

Model_name: Ridge ClassifierCV
Model func: RidgeClassifierCV(cv=10)
/home/tanu/git/LSHTM_analysis/scripts/ml/./gid_sl.py:168: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
rus_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
/home/tanu/git/LSHTM_analysis/scripts/ml/./gid_sl.py:171: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
rus_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
gamma=0, gpu_id=-1, importance_type=None,
interaction_constraints='', learning_rate=0.300000012,
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
monotone_constraints='()', n_estimators=100, n_jobs=12,
num_parallel_tree=1, predictor='auto', random_state=42,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
tree_method='exact', use_label_encoder=False,
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=167)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', RidgeClassifierCV(cv=10))])

key: fit_time
value: [0.11865163 0.11227584 0.10895681 0.11447453 0.1230824 0.10879159
0.10806894 0.12800884 0.14051032 0.13082147]

mean value: 0.11936423778533936

key: score_time
value: [0.0119226 0.01184297 0.01192427 0.01190925 0.01179338 0.01192069
0.01186323 0.01195812 0.01183772 0.01185822]

mean value: 0.011883044242858886

key: test_mcc
value: [0.57735027 0.5 0.25819889 0.57735027 0.25819889 0.5
0.5 0.5 1. 1. ]

mean value: 0.5671098317873574

key: train_mcc
value: [1. 0.97182532 0.97182532 1. 1. 0.97182532
1. 1. 1. 1. ]

mean value: 0.9915475947422651

key: test_accuracy
value: [0.75 0.75 0.625 0.75 0.625 0.75 0.75 0.75 1. 1. ]

mean value: 0.775

key: train_accuracy
value: [1. 0.98571429 0.98571429 1. 1. 0.98571429
1. 1. 1. 1. ]

mean value: 0.9957142857142858

key: test_fscore
value: [0.8 0.75 0.57142857 0.66666667 0.66666667 0.75
0.75 0.75 1. 1. ]

mean value: 0.7704761904761904

key: train_fscore
value: [1. 0.98550725 0.98550725 1. 1. 0.98550725
1. 1. 1. 1. ]

mean value: 0.9956521739130435

key: test_precision
value: [0.66666667 0.75 0.66666667 1. 0.6 0.75
0.75 0.75 1. 1. ]

mean value: 0.7933333333333333

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 0.75 0.5 0.5 0.75 0.75 0.75 0.75 1. 1. ]

mean value: 0.775

key: train_recall
value: [1. 0.97142857 0.97142857 1. 1. 0.97142857
1. 1. 1. 1. ]

mean value: 0.9914285714285714

key: test_roc_auc
value: [0.75 0.75 0.625 0.75 0.625 0.75 0.75 0.75 1. 1. ]

mean value: 0.775

key: train_roc_auc
value: [1. 0.98571429 0.98571429 1. 1. 0.98571429
1. 1. 1. 1. ]

mean value: 0.9957142857142858

key: test_jcc
value: [0.66666667 0.6 0.4 0.5 0.5 0.6
0.6 0.6 1. 1. ]

mean value: 0.6466666666666666

key: train_jcc
value: [1. 0.97142857 0.97142857 1. 1. 0.97142857
1. 1. 1. 1. ]

mean value: 0.9914285714285714

MCC on Blind test: 0.67

Accuracy on Blind test: 0.8

Model_name: Logistic Regression
Model func: LogisticRegression(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/linear_model/_logistic.py:444: ConvergenceWarning: lbfgs failed to converge (status=1):
|
|
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
|
|
|
|
Increase the number of iterations (max_iter) or scale the data as shown in:
|
|
https://scikit-learn.org/stable/modules/preprocessing.html
|
|
Please also refer to the documentation for alternative solver options:
|
|
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
|
|
n_iter_i = _check_optimize_result(
|
|
Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegression(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03346944 0.02624464 0.0318985 0.04968739 0.02914023 0.0289185
|
|
0.02685785 0.02582574 0.04446888 0.03540468]
|
|
|
|
mean value: 0.033191585540771486
|
|
|
|
key: score_time
|
|
value: [0.01189733 0.01162148 0.01215982 0.01187849 0.0119226 0.01188231
|
|
0.01174784 0.01169324 0.01198626 0.01169777]
|
|
|
|
mean value: 0.011848711967468261
|
|
|
|
key: test_mcc
|
|
value: [0.42857143 0.57735027 0.63245553 0.8660254 0.57735027 0.74535599
|
|
0.74535599 0.57735027 0.8660254 0.63245553]
|
|
|
|
mean value: 0.6648296092776395
|
|
|
|
key: train_mcc
|
|
value: [0.92075092 0.9047619 0.87345612 0.90521816 0.93650794 0.87345612
|
|
0.88900089 0.88900089 0.93650794 0.95250095]
|
|
|
|
mean value: 0.9081161839191141
|
|
|
|
key: test_accuracy
|
|
value: [0.71428571 0.78571429 0.78571429 0.92857143 0.78571429 0.85714286
|
|
0.85714286 0.78571429 0.92857143 0.78571429]
|
|
|
|
mean value: 0.8214285714285714
|
|
|
|
key: train_accuracy
|
|
value: [0.96031746 0.95238095 0.93650794 0.95238095 0.96825397 0.93650794
|
|
0.94444444 0.94444444 0.96825397 0.97619048]
|
|
|
|
mean value: 0.9539682539682539
|
|
|
|
key: test_fscore
|
|
value: [0.71428571 0.76923077 0.82352941 0.92307692 0.8 0.875
|
|
0.875 0.8 0.92307692 0.82352941]
|
|
|
|
mean value: 0.8326729153199741
|
|
|
|
key: train_fscore
|
|
value: [0.96 0.95238095 0.9375 0.9516129 0.96825397 0.9375
|
|
0.94488189 0.94488189 0.96825397 0.97637795]
|
|
|
|
mean value: 0.954164352439816
|
|
|
|
key: test_precision
|
|
value: [0.71428571 0.83333333 0.7 1. 0.75 0.77777778
|
|
0.77777778 0.75 1. 0.7 ]
|
|
|
|
mean value: 0.8003174603174603
|
|
|
|
key: train_precision
|
|
value: [0.96774194 0.95238095 0.92307692 0.96721311 0.96825397 0.92307692
|
|
0.9375 0.9375 0.96825397 0.96875 ]
|
|
|
|
mean value: 0.9513747785280704
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.71428571 1. 0.85714286 0.85714286 1.
|
|
1. 0.85714286 0.85714286 1. ]
|
|
|
|
mean value: 0.8857142857142857
|
|
|
|
key: train_recall
|
|
value: [0.95238095 0.95238095 0.95238095 0.93650794 0.96825397 0.95238095
|
|
0.95238095 0.95238095 0.96825397 0.98412698]
|
|
|
|
mean value: 0.9571428571428571
|
|
|
|
key: test_roc_auc
|
|
value: [0.71428571 0.78571429 0.78571429 0.92857143 0.78571429 0.85714286
|
|
0.85714286 0.78571429 0.92857143 0.78571429]
|
|
|
|
mean value: 0.8214285714285715
|
|
|
|
key: train_roc_auc
|
|
value: [0.96031746 0.95238095 0.93650794 0.95238095 0.96825397 0.93650794
|
|
0.94444444 0.94444444 0.96825397 0.97619048]
|
|
|
|
mean value: 0.9539682539682539
|
|
|
|
key: test_jcc
|
|
value: [0.55555556 0.625 0.7 0.85714286 0.66666667 0.77777778
|
|
0.77777778 0.66666667 0.85714286 0.7 ]
|
|
|
|
mean value: 0.7183730158730158
|
|
|
|
key: train_jcc
|
|
value: [0.92307692 0.90909091 0.88235294 0.90769231 0.93846154 0.88235294
|
|
0.89552239 0.89552239 0.93846154 0.95384615]
|
|
|
|
mean value: 0.9126380029101715
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Logistic RegressionCV
|
|
Model func: LogisticRegressionCV(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LogisticRegressionCV(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.70358634 0.80011916 0.68121314 0.6453855 0.65406132 0.78625679
|
|
0.69044399 0.66302991 0.7690618 0.63810492]
|
|
|
|
mean value: 0.7031262874603271
|
|
|
|
key: score_time
|
|
value: [0.01948142 0.01645517 0.0154233 0.01446366 0.0120368 0.0143435
|
|
0.0146184 0.01605535 0.01506495 0.0145781 ]
|
|
|
|
mean value: 0.015252065658569337
|
|
|
|
key: test_mcc
|
|
value: [0.1490712 0.57735027 0.8660254 1. 0.63245553 0.8660254
|
|
0.8660254 0.74535599 1. 0.52223297]
|
|
|
|
mean value: 0.7224542171443628
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 0.98425098 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9984250984251476
|
|
|
|
key: test_accuracy
|
|
value: [0.57142857 0.78571429 0.92857143 1. 0.78571429 0.92857143
|
|
0.92857143 0.85714286 1. 0.71428571]
|
|
|
|
mean value: 0.85
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 0.99206349 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9992063492063492
|
|
|
|
key: test_fscore
|
|
value: [0.625 0.8 0.93333333 1. 0.82352941 0.93333333
|
|
0.93333333 0.875 1. 0.77777778]
|
|
|
|
mean value: 0.8701307189542484
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 0.99212598 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9992125984251968
|
|
|
|
key: test_precision
|
|
value: [0.55555556 0.75 0.875 1. 0.7 0.875
|
|
0.875 0.77777778 1. 0.63636364]
|
|
|
|
mean value: 0.804469696969697
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 0.984375 1. 1. 1. 1.
|
|
1. 1. ]
|
|
|
|
mean value: 0.9984375
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.85714286 1. 1. 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9571428571428572
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.57142857 0.78571429 0.92857143 1. 0.78571429 0.92857143
|
|
0.92857143 0.85714286 1. 0.71428571]
|
|
|
|
mean value: 0.8500000000000001
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 0.99206349 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9992063492063492
|
|
|
|
key: test_jcc
|
|
value: [0.45454545 0.66666667 0.875 1. 0.7 0.875
|
|
0.875 0.77777778 1. 0.63636364]
|
|
|
|
mean value: 0.7860353535353535
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 0.984375 1. 1. 1. 1.
|
|
1. 1. ]
|
|
|
|
mean value: 0.9984375
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Gaussian NB
|
|
Model func: GaussianNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GaussianNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01240516 0.01079416 0.00873613 0.00856209 0.00837946 0.00852203
|
|
0.00998378 0.00867295 0.00848079 0.00854564]
|
|
|
|
mean value: 0.009308218955993652
|
|
|
|
key: score_time
|
|
value: [0.01174188 0.00898147 0.0090487 0.00852609 0.00863147 0.00845432
|
|
0.00860214 0.00853848 0.0084095 0.00898838]
|
|
|
|
mean value: 0.008992242813110351
|
|
|
|
key: test_mcc
|
|
value: [ 0.17407766 -0.17407766 0.1490712 0.14285714 0.57735027 0.
|
|
0.28867513 0. 0.1490712 0.31622777]
|
|
|
|
mean value: 0.16232527096583912
|
|
|
|
key: train_mcc
|
|
value: [0.57498891 0.4512753 0.40192095 0.38960673 0.49838198 0.46890221
|
|
0.59209474 0.42943789 0.44494921 0.42177569]
|
|
|
|
mean value: 0.46733336027843136
|
|
|
|
key: test_accuracy
|
|
value: [0.57142857 0.42857143 0.57142857 0.57142857 0.78571429 0.5
|
|
0.64285714 0.5 0.57142857 0.64285714]
|
|
|
|
mean value: 0.5785714285714285
|
|
|
|
key: train_accuracy
|
|
value: [0.78571429 0.6984127 0.6984127 0.68253968 0.74603175 0.73015873
|
|
0.79365079 0.71428571 0.72222222 0.70634921]
|
|
|
|
mean value: 0.7277777777777777
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.55555556 0.625 0.57142857 0.8 0.63157895
|
|
0.61538462 0.46153846 0.5 0.70588235]
|
|
|
|
mean value: 0.6133035170883467
|
|
|
|
key: train_fscore
|
|
value: [0.79699248 0.75641026 0.72058824 0.72972973 0.76470588 0.75362319
|
|
0.77966102 0.72307692 0.71544715 0.73381295]
|
|
|
|
mean value: 0.7474047817533758
|
|
|
|
key: test_precision
|
|
value: [0.54545455 0.45454545 0.55555556 0.57142857 0.75 0.5
|
|
0.66666667 0.5 0.6 0.6 ]
|
|
|
|
mean value: 0.5743650793650793
|
|
|
|
key: train_precision
|
|
value: [0.75714286 0.6344086 0.67123288 0.63529412 0.71232877 0.69333333
|
|
0.83636364 0.70149254 0.73333333 0.67105263]
|
|
|
|
mean value: 0.7045982692698753
|
|
|
|
key: test_recall
|
|
value: [0.85714286 0.71428571 0.71428571 0.57142857 0.85714286 0.85714286
|
|
0.57142857 0.42857143 0.42857143 0.85714286]
|
|
|
|
mean value: 0.6857142857142857
|
|
|
|
key: train_recall
|
|
value: [0.84126984 0.93650794 0.77777778 0.85714286 0.82539683 0.82539683
|
|
0.73015873 0.74603175 0.6984127 0.80952381]
|
|
|
|
mean value: 0.8047619047619048
|
|
|
|
key: test_roc_auc
|
|
value: [0.57142857 0.42857143 0.57142857 0.57142857 0.78571429 0.5
|
|
0.64285714 0.5 0.57142857 0.64285714]
|
|
|
|
mean value: 0.5785714285714286
|
|
|
|
key: train_roc_auc
|
|
value: [0.78571429 0.6984127 0.6984127 0.68253968 0.74603175 0.73015873
|
|
0.79365079 0.71428571 0.72222222 0.70634921]
|
|
|
|
mean value: 0.7277777777777779
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.38461538 0.45454545 0.4 0.66666667 0.46153846
|
|
0.44444444 0.3 0.33333333 0.54545455]
|
|
|
|
mean value: 0.44905982905982905
|
|
|
|
key: train_jcc
|
|
value: [0.6625 0.60824742 0.56321839 0.57446809 0.61904762 0.60465116
|
|
0.63888889 0.56626506 0.55696203 0.57954545]
|
|
|
|
mean value: 0.5973794109421473
|
|
|
|
MCC on Blind test: 0.41
|
|
|
|
Accuracy on Blind test: 0.7
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.00943851 0.00883722 0.00869012 0.00863743 0.0090723 0.00896025
|
|
0.00871539 0.00867486 0.00866628 0.00882387]
|
|
|
|
mean value: 0.00885162353515625
|
|
|
|
key: score_time
|
|
value: [0.00867724 0.00851727 0.00857329 0.00897312 0.00852656 0.00869584
|
|
0.00854874 0.00855994 0.00854278 0.00855541]
|
|
|
|
mean value: 0.008617019653320313
|
|
|
|
key: test_mcc
|
|
value: [ 0.28867513 -0.1490712 0.4472136 0.42857143 0. 0.31622777
|
|
0.28867513 0.28867513 0.71428571 0.57735027]
|
|
|
|
mean value: 0.3200602978848017
|
|
|
|
key: train_mcc
|
|
value: [0.66877624 0.71572981 0.68811011 0.67357531 0.71464592 0.68811011
|
|
0.63564173 0.78412547 0.70276422 0.70276422]
mean value: 0.6974243139367098

key: test_accuracy
value: [0.64285714 0.42857143 0.71428571 0.71428571 0.5 0.64285714
0.64285714 0.64285714 0.85714286 0.78571429]
mean value: 0.6571428571428571

key: train_accuracy
value: [0.83333333 0.85714286 0.84126984 0.83333333 0.85714286 0.84126984
0.81746032 0.88888889 0.84920635 0.84920635]
mean value: 0.8468253968253968

key: test_fscore
value: [0.66666667 0.33333333 0.75 0.71428571 0.53333333 0.70588235
0.66666667 0.61538462 0.85714286 0.8 ]
mean value: 0.6642695539754363

key: train_fscore
value: [0.83969466 0.86153846 0.85074627 0.84444444 0.859375 0.85074627
0.82170543 0.89552239 0.85714286 0.85714286]
mean value: 0.8538058628486893

key: test_precision
value: [0.625 0.4 0.66666667 0.71428571 0.5 0.6
0.625 0.66666667 0.85714286 0.75 ]
mean value: 0.6404761904761904

key: train_precision
value: [0.80882353 0.8358209 0.8028169 0.79166667 0.84615385 0.8028169
0.8030303 0.84507042 0.81428571 0.81428571]
mean value: 0.816477089470851

key: test_recall
value: [0.71428571 0.28571429 0.85714286 0.71428571 0.57142857 0.85714286
0.71428571 0.57142857 0.85714286 0.85714286]
mean value: 0.7

key: train_recall
value: [0.87301587 0.88888889 0.9047619 0.9047619 0.87301587 0.9047619
0.84126984 0.95238095 0.9047619 0.9047619 ]
mean value: 0.8952380952380953

key: test_roc_auc
value: [0.64285714 0.42857143 0.71428571 0.71428571 0.5 0.64285714
0.64285714 0.64285714 0.85714286 0.78571429]
mean value: 0.6571428571428571

key: train_roc_auc
value: [0.83333333 0.85714286 0.84126984 0.83333333 0.85714286 0.84126984
0.81746032 0.88888889 0.84920635 0.84920635]
mean value: 0.8468253968253968

key: test_jcc
value: [0.5 0.2 0.6 0.55555556 0.36363636 0.54545455
0.5 0.44444444 0.75 0.66666667]
mean value: 0.5125757575757576

key: train_jcc
value: [0.72368421 0.75675676 0.74025974 0.73076923 0.75342466 0.74025974
0.69736842 0.81081081 0.75 0.75 ]
mean value: 0.7453333567969473

MCC on Blind test: 0.09

Accuracy on Blind test: 0.5
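
Note on test_fscore vs test_jcc: for a binary problem the Jaccard index follows from the F1 score as J = F1 / (2 - F1), since F1 = 2TP / (2TP + FP + FN) and J = TP / (TP + FP + FN). The sketch below checks that relation against the ten per-fold values logged above. It assumes that "fscore" in this log is the binary F1 score and "jcc" the binary Jaccard score; the helper name jaccard_from_f1 is made up for illustration and is not part of the logged script.

```python
# Per-fold values copied from the test_fscore and test_jcc arrays logged above.
test_fscore = [0.66666667, 0.33333333, 0.75, 0.71428571, 0.53333333,
               0.70588235, 0.66666667, 0.61538462, 0.85714286, 0.8]
test_jcc = [0.5, 0.2, 0.6, 0.55555556, 0.36363636,
            0.54545455, 0.5, 0.44444444, 0.75, 0.66666667]


def jaccard_from_f1(f1):
    """Binary Jaccard index implied by a binary F1 score: J = F1 / (2 - F1)."""
    return f1 / (2.0 - f1)


# Largest disagreement between the derived and the logged Jaccard values;
# anything at the level of print rounding supports the assumed relation.
derived = [jaccard_from_f1(f) for f in test_fscore]
max_abs_diff = max(abs(d - j) for d, j in zip(derived, test_jcc))
print(max_abs_diff)
```

If the relation holds, the two keys carry the same ranking information, so one of them is enough when comparing models.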

Model_name: K-Nearest Neighbors
Model func: KNeighborsClassifier()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
n_estimators=1000, n_jobs=10, oob_score=True,
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
colsample_bynode=None, colsample_bytree=None,
enable_categorical=False, gamma=None, gpu_id=None,
importance_type=None, interaction_constraints=None,
learning_rate=None, max_delta_step=None, max_depth=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
n_estimators=100, n_jobs=None, num_parallel_tree=None,
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
scale_pos_weight=None, subsample=None, tree_method=None,
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=167)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', KNeighborsClassifier())])

key: fit_time
value: [0.00975204 0.00937724 0.00941324 0.00952172 0.00948811 0.00852609
0.00859451 0.00985909 0.00878239 0.00956988]
mean value: 0.009288430213928223

key: score_time
value: [0.0101881 0.01011419 0.01017499 0.01032948 0.01011992 0.00958657
0.00983787 0.01024365 0.00955081 0.01083493]
mean value: 0.01009805202484131

key: test_mcc
value: [0.17407766 0.63245553 0.52223297 0. 0.31622777 0.4472136
0.57735027 0.57735027 0.1490712 0.14285714]
mean value: 0.3538836397109643

key: train_mcc
value: [0.66742381 0.60721047 0.60693274 0.58759776 0.57323678 0.64482588
0.53975054 0.62187434 0.65112184 0.63887656]
mean value: 0.6138850709215125

key: test_accuracy
value: [0.57142857 0.78571429 0.71428571 0.5 0.64285714 0.71428571
0.78571429 0.78571429 0.57142857 0.57142857]
mean value: 0.6642857142857143

key: train_accuracy
value: [0.83333333 0.79365079 0.8015873 0.79365079 0.78571429 0.81746032
0.76984127 0.80952381 0.82539683 0.81746032]
mean value: 0.8047619047619048

key: test_fscore
value: [0.66666667 0.72727273 0.77777778 0.46153846 0.70588235 0.75
0.76923077 0.76923077 0.5 0.57142857]
mean value: 0.669902809608692

key: train_fscore
value: [0.8372093 0.81690141 0.81203008 0.79032258 0.79389313 0.83211679
0.77165354 0.81818182 0.828125 0.82706767]
mean value: 0.8127501315363415

key: test_precision
value: [0.54545455 1. 0.63636364 0.5 0.6 0.66666667
0.83333333 0.83333333 0.6 0.57142857]
mean value: 0.6786580086580086

key: train_precision
value: [0.81818182 0.73417722 0.77142857 0.80327869 0.76470588 0.77027027
0.765625 0.7826087 0.81538462 0.78571429]
mean value: 0.781137504269914

key: test_recall
value: [0.85714286 0.57142857 1. 0.42857143 0.85714286 0.85714286
0.71428571 0.71428571 0.42857143 0.57142857]
mean value: 0.7

key: train_recall
value: [0.85714286 0.92063492 0.85714286 0.77777778 0.82539683 0.9047619
0.77777778 0.85714286 0.84126984 0.87301587]
mean value: 0.8492063492063492

key: test_roc_auc
value: [0.57142857 0.78571429 0.71428571 0.5 0.64285714 0.71428571
0.78571429 0.78571429 0.57142857 0.57142857]
mean value: 0.6642857142857143

key: train_roc_auc
value: [0.83333333 0.79365079 0.8015873 0.79365079 0.78571429 0.81746032
0.76984127 0.80952381 0.82539683 0.81746032]
mean value: 0.8047619047619048

key: test_jcc
value: [0.5 0.57142857 0.63636364 0.3 0.54545455 0.6
0.625 0.625 0.33333333 0.4 ]
mean value: 0.5136580086580087

key: train_jcc
value: [0.72 0.69047619 0.6835443 0.65333333 0.65822785 0.7125
0.62820513 0.69230769 0.70666667 0.70512821]
mean value: 0.685038936801595

MCC on Blind test: -0.17

Accuracy on Blind test: 0.4
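
Note on test_roc_auc vs test_accuracy: the two arrays above are identical fold for fold. One explanation that would be consistent with this (an inference from the numbers, not something the log states) is that ROC AUC was computed from hard 0/1 predictions rather than from probabilities or decision scores. In that case AUC reduces to balanced accuracy, which equals plain accuracy when each test fold holds equally many positives and negatives. The sketch below illustrates this on a hypothetical 14-sample fold (7 positives, 7 negatives; TP=4, FN=3, FP=0, TN=7), chosen to match the logged fold with accuracy 0.78571429, precision 1.0 and recall 0.57142857. The fold composition and the function name auc_from_scores are assumptions for illustration.

```python
def auc_from_scores(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tied pair counts as half a win, which is the usual convention.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical balanced fold: 7 positives followed by 7 negatives.
y_true = [1] * 7 + [0] * 7
# Hard predictions with TP=4, FN=3 on the positives and TN=7, FP=0 on the negatives.
y_pred = [1] * 4 + [0] * 3 + [0] * 7

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
auc = auc_from_scores(y_true, y_pred)  # hard labels used as scores
print(accuracy, auc)
```

If a threshold-free AUC is wanted, the scores passed in would need to be class probabilities or decision-function values instead of predicted labels.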

Model_name: SVM
Model func: SVC(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=167)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', SVC(random_state=42))])

key: fit_time
value: [0.01123381 0.01120996 0.01115704 0.01081371 0.01120543 0.01074576
0.01096678 0.01101112 0.0098691 0.00978637]
mean value: 0.010799908638000488

key: score_time
value: [0.01048398 0.00968146 0.00980759 0.00937915 0.00958252 0.00935674
0.00955558 0.00967646 0.00892568 0.00904989]
mean value: 0.009549903869628906

key: test_mcc
value: [0.14285714 0.57735027 0.52223297 0.57735027 0.28867513 0.52223297
0.8660254 0.28867513 1. 0.74535599]
mean value: 0.5530755282444576

key: train_mcc
value: [0.85725086 0.84511128 0.84297067 0.81116045 0.82633424 0.84297067
0.84126984 0.82800868 0.88900089 0.84297067]
mean value: 0.8427048241806472

key: test_accuracy
value: [0.57142857 0.78571429 0.71428571 0.78571429 0.64285714 0.71428571
0.92857143 0.64285714 1. 0.85714286]
mean value: 0.7642857142857142

key: train_accuracy
value: [0.92857143 0.92063492 0.92063492 0.9047619 0.91269841 0.92063492
0.92063492 0.91269841 0.94444444 0.92063492]
mean value: 0.9206349206349206

key: test_fscore
value: [0.57142857 0.76923077 0.77777778 0.8 0.66666667 0.77777778
0.93333333 0.61538462 1. 0.875 ]
mean value: 0.7786599511599511

key: train_fscore
value: [0.92913386 0.92424242 0.92307692 0.90769231 0.91472868 0.92307692
0.92063492 0.91603053 0.94488189 0.92307692]
mean value: 0.9226575386353605

key: test_precision
value: [0.57142857 0.83333333 0.63636364 0.75 0.625 0.63636364
0.875 0.66666667 1. 0.77777778]
mean value: 0.7371933621933622

key: train_precision
value: [0.921875 0.88405797 0.89552239 0.88059701 0.89393939 0.89552239
0.92063492 0.88235294 0.9375 0.89552239]
mean value: 0.9007524405869756

key: test_recall
value: [0.57142857 0.71428571 1. 0.85714286 0.71428571 1.
1. 0.57142857 1. 1. ]
mean value: 0.8428571428571429

key: train_recall
value: [0.93650794 0.96825397 0.95238095 0.93650794 0.93650794 0.95238095
0.92063492 0.95238095 0.95238095 0.95238095]
mean value: 0.946031746031746

key: test_roc_auc
value: [0.57142857 0.78571429 0.71428571 0.78571429 0.64285714 0.71428571
0.92857143 0.64285714 1. 0.85714286]
mean value: 0.7642857142857143

key: train_roc_auc
value: [0.92857143 0.92063492 0.92063492 0.9047619 0.91269841 0.92063492
0.92063492 0.91269841 0.94444444 0.92063492]
mean value: 0.9206349206349206

key: test_jcc
value: [0.4 0.625 0.63636364 0.66666667 0.5 0.63636364
0.875 0.44444444 1. 0.77777778]
mean value: 0.6561616161616162

key: train_jcc
value: [0.86764706 0.85915493 0.85714286 0.83098592 0.84285714 0.85714286
0.85294118 0.84507042 0.89552239 0.85714286]
mean value: 0.8565607605245167

MCC on Blind test: 0.25

Accuracy on Blind test: 0.6
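
Note on reading test_mcc: the Matthews correlation coefficient is computed from the four confusion-matrix counts as (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The sketch below reproduces the seventh SVM fold logged above (test_mcc 0.8660254, test_accuracy 0.92857143, test_precision 0.875, test_recall 1.0). The counts TP=7, FP=1, FN=0, TN=6 are an inference: they assume a 14-sample test fold with 7 positives and 7 negatives, which is consistent with those four values but is not printed in the log. The function name mcc is illustrative.

```python
import math


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns 0.0 when any marginal is empty (zero denominator).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Counts inferred for the seventh SVM test fold (assumption: 7 positives, 7 negatives).
tp, fp, fn, tn = 7, 1, 0, 6

fold_mcc = mcc(tp, fp, fn, tn)
fold_accuracy = (tp + tn) / (tp + fp + fn + tn)
fold_precision = tp / (tp + fp)
fold_recall = tp / (tp + fn)
print(fold_mcc, fold_accuracy, fold_precision, fold_recall)
```

With only 14 samples per fold, one extra misclassification moves MCC by a large step, which helps explain the spread of per-fold values in the arrays above.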

Model_name: MLP
Model func: MLPClassifier(max_iter=500, random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=167)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', MLPClassifier(max_iter=500, random_state=42))])

key: fit_time
value: [0.49464512 0.49820232 0.5874176 0.53173232 0.50007153 0.56735539
0.59726977 0.5393312 0.50509214 0.54775929]
mean value: 0.5368876695632935

key: score_time
value: [0.01225948 0.0286274 0.012254 0.01226807 0.01234031 0.01249647
0.01239157 0.01230764 0.01273727 0.01596975]
mean value: 0.014365196228027344

key: test_mcc
value: [0.1490712 0.1490712 0.63245553 1. 0.74535599 0.8660254
0.74535599 0.8660254 1. 0.4472136 ]
mean value: 0.6600574317102343

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

key: test_accuracy
value: [0.57142857 0.57142857 0.78571429 1. 0.85714286 0.92857143
0.85714286 0.92857143 1. 0.71428571]
mean value: 0.8214285714285714

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

key: test_fscore
value: [0.625 0.5 0.82352941 1. 0.875 0.93333333
0.875 0.92307692 1. 0.75 ]
mean value: 0.8304939668174962

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

key: test_precision
value: [0.55555556 0.6 0.7 1. 0.77777778 0.875
0.77777778 1. 1. 0.66666667]
mean value: 0.7952777777777778

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

key: test_recall
value: [0.71428571 0.42857143 1. 1. 1. 1.
1. 0.85714286 1. 0.85714286]
mean value: 0.8857142857142857

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

key: test_roc_auc
value: [0.57142857 0.57142857 0.78571429 1. 0.85714286 0.92857143
0.85714286 0.92857143 1. 0.71428571]
mean value: 0.8214285714285714

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

key: test_jcc
value: [0.45454545 0.33333333 0.7 1. 0.77777778 0.875
0.77777778 0.85714286 1. 0.6 ]
mean value: 0.73755772005772

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

MCC on Blind test: 0.58

Accuracy on Blind test: 0.8
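
Note on the train/test gap: for this model every train_* metric is 1.0 while the test folds vary widely, a pattern that usually suggests the model fits the training folds perfectly and generalises less well. The sketch below recomputes the logged test_mcc mean from the per-fold values and adds two summaries the log does not print: the fold-to-fold standard deviation and the gap to the training mean. The variable names are illustrative, and the reading of the gap as overfitting is an interpretation, not a logged result.

```python
from statistics import mean, stdev

# Per-fold values copied from the MLP test_mcc and train_mcc arrays logged above.
test_mcc = [0.1490712, 0.1490712, 0.63245553, 1.0, 0.74535599,
            0.8660254, 0.74535599, 0.8660254, 1.0, 0.4472136]
train_mcc = [1.0] * 10

test_mean = mean(test_mcc)            # should reproduce the logged "mean value"
test_spread = stdev(test_mcc)         # sample standard deviation across the 10 folds
gap = mean(train_mcc) - test_mean     # train minus test; larger means a wider gap

print(test_mean, test_spread, gap)
```

Reporting the spread alongside the mean makes it easier to judge whether differences between models in this log exceed fold-to-fold noise.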

Model_name: Decision Tree
Model func: DecisionTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=167)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', DecisionTreeClassifier(random_state=42))])

key: fit_time
value: [0.01468539 0.01246285 0.01255035 0.01075816 0.01069736 0.01113153
0.01100087 0.01145768 0.01119852 0.01153445]
mean value: 0.01174771785736084

key: score_time
value: [0.01175427 0.00928426 0.00931358 0.00870562 0.00871897 0.00880766
0.00885773 0.00903249 0.00884533 0.00887561]
mean value: 0.009219551086425781

key: test_mcc
value: [0.63245553 0.8660254 1. 0.8660254 0.8660254 0.63245553
1. 0.8660254 1. 0.8660254 ]
mean value: 0.8595038082989546

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

key: test_accuracy
value: [0.78571429 0.92857143 1. 0.92857143 0.92857143 0.78571429
1. 0.92857143 1. 0.92857143]
mean value: 0.9214285714285715

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

key: test_fscore
value: [0.82352941 0.92307692 1. 0.92307692 0.93333333 0.82352941
1. 0.92307692 1. 0.93333333]
mean value: 0.9282956259426848

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

key: test_precision
value: [0.7 1. 1. 1. 0.875 0.7 1. 1. 1. 0.875]
mean value: 0.915

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

key: test_recall
value: [1. 0.85714286 1. 0.85714286 1. 1.
1. 0.85714286 1. 1. ]
mean value: 0.9571428571428571

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

key: test_roc_auc
value: [0.78571429 0.92857143 1. 0.92857143 0.92857143 0.78571429
1. 0.92857143 1. 0.92857143]
mean value: 0.9214285714285715

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

key: test_jcc
value: [0.7 0.85714286 1. 0.85714286 0.875 0.7
1. 0.85714286 1. 0.875 ]
mean value: 0.8721428571428571

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

MCC on Blind test: 0.67

Accuracy on Blind test: 0.8

Model_name: Extra Trees
Model func: ExtraTreesClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=167)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', ExtraTreesClassifier(random_state=42))])

key: fit_time
value: [0.0882473 0.08731794 0.08704042 0.08775353 0.08557534 0.08603501
0.08768535 0.08798957 0.08857989 0.08624673]
mean value: 0.08724710941314698

key: score_time
value: [0.01715994 0.01705217 0.01723671 0.01793122 0.01705194 0.01739645
0.01781774 0.01779103 0.0172832 0.01782751]
mean value: 0.01745479106903076

key: test_mcc
value: [0.42857143 0.71428571 0.8660254 1. 0.4472136 0.8660254
0.74535599 0.74535599 1. 0.4472136 ]
mean value: 0.7260047126425796

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

key: test_accuracy
value: [0.71428571 0.85714286 0.92857143 1. 0.71428571 0.92857143
0.85714286 0.85714286 1. 0.71428571]
mean value: 0.8571428571428571

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

key: test_fscore
value: [0.71428571 0.85714286 0.93333333 1. 0.75 0.93333333
0.83333333 0.83333333 1. 0.75 ]
mean value: 0.8604761904761905

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

key: test_precision
value: [0.71428571 0.85714286 0.875 1. 0.66666667 0.875
1. 1. 1. 0.66666667]
mean value: 0.8654761904761905

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

key: test_recall
value: [0.71428571 0.85714286 1. 1. 0.85714286 1.
0.71428571 0.71428571 1. 0.85714286]
mean value: 0.8714285714285714

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

key: test_roc_auc
value: [0.71428571 0.85714286 0.92857143 1. 0.71428571 0.92857143
0.85714286 0.85714286 1. 0.71428571]
mean value: 0.8571428571428572

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

key: test_jcc
value: [0.55555556 0.75 0.875 1. 0.6 0.875
0.71428571 0.71428571 1. 0.6 ]
mean value: 0.7684126984126984

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

MCC on Blind test: 0.8

Accuracy on Blind test: 0.9

Model_name: Extra Tree
Model func: ExtraTreeClassifier(random_state=42)
Running model pipeline: Pipeline(steps=[('prep',
ColumnTransformer(remainder='passthrough',
transformers=[('num', MinMaxScaler(),
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
'mcsm_na_affinity', 'rsa',
...
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
dtype='object', length=167)),
('cat', OneHotEncoder(),
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
dtype='object'))])),
('model', ExtraTreeClassifier(random_state=42))])

key: fit_time
value: [0.00889707 0.00916862 0.00925422 0.00874209 0.00883031 0.00954938
0.00880361 0.0087378 0.00891757 0.00876474]
mean value: 0.008966541290283203

key: score_time
value: [0.00904775 0.00855803 0.00931263 0.00873899 0.00855589 0.00873303
0.00867724 0.00862074 0.00868821 0.008636 ]
mean value: 0.00875685214996338

key: test_mcc
value: [0.42857143 0.57735027 0.42857143 0.63245553 0.31622777 0.8660254
0.57735027 0.57735027 0.74535599 0.63245553]
mean value: 0.5781713891080292

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

key: test_accuracy
value: [0.71428571 0.78571429 0.71428571 0.78571429 0.64285714 0.92857143
0.78571429 0.78571429 0.85714286 0.78571429]
mean value: 0.7785714285714286

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

key: test_fscore
value: [0.71428571 0.8 0.71428571 0.82352941 0.70588235 0.93333333
0.8 0.76923077 0.875 0.82352941]
mean value: 0.7959076707606119

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

key: test_precision
value: [0.71428571 0.75 0.71428571 0.7 0.6 0.875
0.75 0.83333333 0.77777778 0.7 ]
mean value: 0.741468253968254

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

key: test_recall
value: [0.71428571 0.85714286 0.71428571 1. 0.85714286 1.
0.85714286 0.71428571 1. 1. ]
mean value: 0.8714285714285714

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

key: test_roc_auc
value: [0.71428571 0.78571429 0.71428571 0.78571429 0.64285714 0.92857143
0.78571429 0.78571429 0.85714286 0.78571429]
mean value: 0.7785714285714286

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

key: test_jcc
value: [0.55555556 0.66666667 0.55555556 0.7 0.54545455 0.875
0.66666667 0.625 0.77777778 0.7 ]
mean value: 0.6667676767676768

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
mean value: 1.0

MCC on Blind test: 0.36

Accuracy on Blind test: 0.7
|
|
|
|
Model_name: Random Forest
|
|
Model func: RandomForestClassifier(n_estimators=1000, random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(n_estimators=1000, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [1.10285902 1.08632183 1.11166525 1.13373566 1.12290573 1.09281826
|
|
1.16202903 1.11271644 1.12481833 1.09064579]
|
|
|
|
mean value: 1.1140515327453613
|
|
|
|
key: score_time
|
|
value: [0.08765078 0.09237862 0.09297252 0.15036464 0.08707857 0.08621287
|
|
0.08813596 0.09109473 0.09038901 0.08686709]
|
|
|
|
mean value: 0.09531447887420655
|
|
|
|
key: test_mcc
|
|
value: [0.42857143 0.71428571 1. 0.8660254 0.74535599 0.74535599
|
|
1. 1. 1. 0.4472136 ]
|
|
|
|
mean value: 0.7946808127141399
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.71428571 0.85714286 1. 0.92857143 0.85714286 0.85714286
|
|
1. 1. 1. 0.71428571]
|
|
|
|
mean value: 0.8928571428571428
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.71428571 0.85714286 1. 0.92307692 0.875 0.875
|
|
1. 1. 1. 0.75 ]
|
|
|
|
mean value: 0.8994505494505495
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.71428571 0.85714286 1. 1. 0.77777778 0.77777778
|
|
1. 1. 1. 0.66666667]
|
|
|
|
mean value: 0.8793650793650793
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.85714286 1. 0.85714286 1. 1.
|
|
1. 1. 1. 0.85714286]
|
|
|
|
mean value: 0.9285714285714286
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.71428571 0.85714286 1. 0.92857143 0.85714286 0.85714286
|
|
1. 1. 1. 0.71428571]
|
|
|
|
mean value: 0.8928571428571429
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.55555556 0.75 1. 0.85714286 0.77777778 0.77777778
|
|
1. 1. 1. 0.6 ]
|
|
|
|
mean value: 0.8318253968253968
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Random Forest2
|
|
Model func: RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: /home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:427: FutureWarning: `max_features='auto'` has been deprecated in 1.1 and will be removed in 1.3. To keep the past behaviour, explicitly set `max_features='sqrt'` or remove this parameter as it is also the default value for RandomForestClassifiers and ExtraTreesClassifiers.
|
|
warn(
|
|
Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000...05', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model',
|
|
RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10,
|
|
oob_score=True, random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.82384944 0.92019367 0.84741855 0.8575809 0.95540094 0.89151525
|
|
0.84008265 0.88904452 0.86206412 0.88456702]
|
|
|
|
mean value: 0.8771717071533203
|
|
|
|
key: score_time
|
|
value: [0.19765759 0.22397804 0.18344784 0.11897373 0.20639133 0.18086195
|
|
0.21533513 0.23305535 0.18315172 0.21732092]
|
|
|
|
mean value: 0.19601736068725586
|
|
|
|
key: test_mcc
|
|
value: [0.28867513 0.71428571 0.8660254 0.8660254 0.74535599 0.4472136
|
|
1. 1. 1. 0.57735027]
|
|
|
|
mean value: 0.7504931513638918
|
|
|
|
key: train_mcc
|
|
value: [0.98425098 1. 0.98425098 0.96825397 0.98425098 0.93650794
|
|
0.98425098 0.98425098 0.98425098 0.98425098]
|
|
|
|
mean value: 0.9794518794522239
|
|
|
|
key: test_accuracy
|
|
value: [0.64285714 0.85714286 0.92857143 0.92857143 0.85714286 0.71428571
|
|
1. 1. 1. 0.78571429]
|
|
|
|
mean value: 0.8714285714285714
|
|
|
|
key: train_accuracy
|
|
value: [0.99206349 1. 0.99206349 0.98412698 0.99206349 0.96825397
|
|
0.99206349 0.99206349 0.99206349 0.99206349]
|
|
|
|
mean value: 0.9896825396825397
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.85714286 0.93333333 0.93333333 0.875 0.75
|
|
1. 1. 1. 0.8 ]
|
|
|
|
mean value: 0.881547619047619
|
|
|
|
key: train_fscore
|
|
value: [0.992 1. 0.992 0.98412698 0.992 0.96825397
|
|
0.992 0.992 0.992 0.992 ]
|
|
|
|
mean value: 0.9896380952380952
|
|
|
|
key: test_precision
|
|
value: [0.625 0.85714286 0.875 0.875 0.77777778 0.66666667
|
|
1. 1. 1. 0.75 ]
|
|
|
|
mean value: 0.8426587301587302
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 0.98412698 1. 0.96825397
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9952380952380953
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.85714286 1. 1. 1. 0.85714286
|
|
1. 1. 1. 0.85714286]
|
|
|
|
mean value: 0.9285714285714286
|
|
|
|
key: train_recall
|
|
value: [0.98412698 1. 0.98412698 0.98412698 0.98412698 0.96825397
|
|
0.98412698 0.98412698 0.98412698 0.98412698]
|
|
|
|
mean value: 0.9841269841269841
|
|
|
|
key: test_roc_auc
|
|
value: [0.64285714 0.85714286 0.92857143 0.92857143 0.85714286 0.71428571
|
|
1. 1. 1. 0.78571429]
|
|
|
|
mean value: 0.8714285714285714
|
|
|
|
key: train_roc_auc
|
|
value: [0.99206349 1. 0.99206349 0.98412698 0.99206349 0.96825397
|
|
0.99206349 0.99206349 0.99206349 0.99206349]
|
|
|
|
mean value: 0.9896825396825397
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.75 0.875 0.875 0.77777778 0.6
|
|
1. 1. 1. 0.66666667]
|
|
|
|
mean value: 0.8044444444444444
|
|
|
|
key: train_jcc
|
|
value: [0.98412698 1. 0.98412698 0.96875 0.98412698 0.93846154
|
|
0.98412698 0.98412698 0.98412698 0.98412698]
|
|
|
|
mean value: 0.9796100427350427
|
|
|
|
MCC on Blind test: 0.8
|
|
|
|
Accuracy on Blind test: 0.9
|
|
|
|
Model_name: Naive Bayes
|
|
Model func: BernoulliNB()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', BernoulliNB())])
|
|
|
|
key: fit_time
|
|
value: [0.01110482 0.00873327 0.00868821 0.00868058 0.00864291 0.00879645
|
|
0.00917721 0.00874853 0.00868726 0.0091517 ]
|
|
|
|
mean value: 0.009041094779968261
|
|
|
|
key: score_time
|
|
value: [0.00973678 0.00864267 0.00871468 0.00847268 0.00851536 0.00854301
|
|
0.00959444 0.00855827 0.00854039 0.0086143 ]
|
|
|
|
mean value: 0.008793258666992187
|
|
|
|
key: test_mcc
|
|
value: [ 0.28867513 -0.1490712 0.4472136 0.42857143 0. 0.31622777
|
|
0.28867513 0.28867513 0.71428571 0.57735027]
|
|
|
|
mean value: 0.3200602978848017
|
|
|
|
key: train_mcc
|
|
value: [0.66877624 0.71572981 0.68811011 0.67357531 0.71464592 0.68811011
|
|
0.63564173 0.78412547 0.70276422 0.70276422]
|
|
|
|
mean value: 0.6974243139367098
|
|
|
|
key: test_accuracy
|
|
value: [0.64285714 0.42857143 0.71428571 0.71428571 0.5 0.64285714
|
|
0.64285714 0.64285714 0.85714286 0.78571429]
|
|
|
|
mean value: 0.6571428571428571
|
|
|
|
key: train_accuracy
|
|
value: [0.83333333 0.85714286 0.84126984 0.83333333 0.85714286 0.84126984
|
|
0.81746032 0.88888889 0.84920635 0.84920635]
|
|
|
|
mean value: 0.8468253968253968
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.33333333 0.75 0.71428571 0.53333333 0.70588235
|
|
0.66666667 0.61538462 0.85714286 0.8 ]
|
|
|
|
mean value: 0.6642695539754363
|
|
|
|
key: train_fscore
|
|
value: [0.83969466 0.86153846 0.85074627 0.84444444 0.859375 0.85074627
|
|
0.82170543 0.89552239 0.85714286 0.85714286]
|
|
|
|
mean value: 0.8538058628486893
|
|
|
|
key: test_precision
|
|
value: [0.625 0.4 0.66666667 0.71428571 0.5 0.6
|
|
0.625 0.66666667 0.85714286 0.75 ]
|
|
|
|
mean value: 0.6404761904761904
|
|
|
|
key: train_precision
|
|
value: [0.80882353 0.8358209 0.8028169 0.79166667 0.84615385 0.8028169
|
|
0.8030303 0.84507042 0.81428571 0.81428571]
|
|
|
|
mean value: 0.816477089470851
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.28571429 0.85714286 0.71428571 0.57142857 0.85714286
|
|
0.71428571 0.57142857 0.85714286 0.85714286]
|
|
|
|
mean value: 0.7
|
|
|
|
key: train_recall
|
|
value: [0.87301587 0.88888889 0.9047619 0.9047619 0.87301587 0.9047619
|
|
0.84126984 0.95238095 0.9047619 0.9047619 ]
|
|
|
|
mean value: 0.8952380952380953
|
|
|
|
key: test_roc_auc
|
|
value: [0.64285714 0.42857143 0.71428571 0.71428571 0.5 0.64285714
|
|
0.64285714 0.64285714 0.85714286 0.78571429]
|
|
|
|
mean value: 0.6571428571428571
|
|
|
|
key: train_roc_auc
|
|
value: [0.83333333 0.85714286 0.84126984 0.83333333 0.85714286 0.84126984
|
|
0.81746032 0.88888889 0.84920635 0.84920635]
|
|
|
|
mean value: 0.8468253968253968
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.2 0.6 0.55555556 0.36363636 0.54545455
|
|
0.5 0.44444444 0.75 0.66666667]
|
|
|
|
mean value: 0.5125757575757576
|
|
|
|
key: train_jcc
|
|
value: [0.72368421 0.75675676 0.74025974 0.73076923 0.75342466 0.74025974
|
|
0.69736842 0.81081081 0.75 0.75 ]
|
|
|
|
mean value: 0.7453333567969473
|
|
|
|
MCC on Blind test: 0.09
|
|
|
|
Accuracy on Blind test: 0.5
|
|
|
|
Model_name: XGBoost
|
|
Model func: XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=None, booster=None, colsample_bylevel=None,
|
|
colsample_bynode=None, colsample_bytree=None,
|
|
enable_categorical=False, gamma=None, gpu_id=None,
|
|
importance_type=None, interaction_constraints=None,
|
|
learning_rate=None, max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan, monotone_constraints=None,
|
|
n_estimators=100, n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None, reg_lambda=None,
|
|
scale_pos_weight=None, subsample=None, tree_method=None,
|
|
use_label_encoder=False, validate_parameters=None, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000...
|
|
interaction_constraints=None, learning_rate=None,
|
|
max_delta_step=None, max_depth=None,
|
|
min_child_weight=None, missing=nan,
|
|
monotone_constraints=None, n_estimators=100,
|
|
n_jobs=None, num_parallel_tree=None,
|
|
predictor=None, random_state=42, reg_alpha=None,
|
|
reg_lambda=None, scale_pos_weight=None,
|
|
subsample=None, tree_method=None,
|
|
use_label_encoder=False,
|
|
validate_parameters=None, verbosity=0))])
|
|
|
|
key: fit_time
|
|
value: [0.05601382 0.04525709 0.04223561 0.04140973 0.04675841 0.04774737
|
|
0.05823922 0.04861522 0.07092834 0.04147816]
|
|
|
|
mean value: 0.049868297576904294
|
|
|
|
key: score_time
|
|
value: [0.01060367 0.01014566 0.01036358 0.01017857 0.01060271 0.01053643
|
|
0.01062751 0.01058984 0.01028037 0.0101583 ]
|
|
|
|
mean value: 0.010408663749694824
|
|
|
|
key: test_mcc
|
|
value: [0.63245553 0.8660254 1. 0.8660254 0.8660254 1.
|
|
1. 1. 1. 0.8660254 ]
|
|
|
|
mean value: 0.9096557147171431
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.78571429 0.92857143 1. 0.92857143 0.92857143 1.
|
|
1. 1. 1. 0.92857143]
|
|
|
|
mean value: 0.95
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.82352941 0.92307692 1. 0.92307692 0.93333333 1.
|
|
1. 1. 1. 0.93333333]
|
|
|
|
mean value: 0.9536349924585219
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.7 1. 1. 1. 0.875 1. 1. 1. 1. 0.875]
|
|
|
|
mean value: 0.945
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.85714286 1. 0.85714286 1. 1.
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9714285714285714
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.78571429 0.92857143 1. 0.92857143 0.92857143 1.
|
|
1. 1. 1. 0.92857143]
|
|
|
|
mean value: 0.95
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.7 0.85714286 1. 0.85714286 0.875 1.
|
|
1. 1. 1. 0.875 ]
|
|
|
|
mean value: 0.9164285714285714
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 1.0
|
|
|
|
Accuracy on Blind test: 1.0
|
|
|
|
Model_name: LDA
|
|
Model func: LinearDiscriminantAnalysis()
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', LinearDiscriminantAnalysis())])

key: fit_time
value: [0.02006388 0.02124453 0.02162552 0.03351927 0.02162671 0.02885532
 0.0461483 0.0459609 0.0458951 0.04586172]

mean value: 0.033080124855041505

key: score_time
value: [0.01186085 0.01175904 0.01173973 0.01177478 0.01200271 0.01466799
 0.01861906 0.02242827 0.02122092 0.02015448]

mean value: 0.015622782707214355

key: test_mcc
value: [0. 0.4472136 0.57735027 0.71428571 0.52223297 0.28867513
 0.4472136 0.71428571 0.8660254 0.31622777]

mean value: 0.4893510161024153

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.5 0.71428571 0.78571429 0.85714286 0.71428571 0.64285714
 0.71428571 0.85714286 0.92857143 0.64285714]

mean value: 0.7357142857142858

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.58823529 0.75 0.8 0.85714286 0.77777778 0.66666667
 0.75 0.85714286 0.93333333 0.70588235]

mean value: 0.7686181139122316

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.5 0.66666667 0.75 0.85714286 0.63636364 0.625
 0.66666667 0.85714286 0.875 0.6 ]

mean value: 0.7033982683982684

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [0.71428571 0.85714286 0.85714286 0.85714286 1. 0.71428571
 0.85714286 0.85714286 1. 0.85714286]

mean value: 0.8571428571428571

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.5 0.71428571 0.78571429 0.85714286 0.71428571 0.64285714
 0.71428571 0.85714286 0.92857143 0.64285714]

mean value: 0.7357142857142858

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.41666667 0.6 0.66666667 0.75 0.63636364 0.5
 0.6 0.75 0.875 0.54545455]

mean value: 0.6340151515151515

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.1

Accuracy on Blind test: 0.6

Model_name: Multinomial
Model func: MultinomialNB()
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
    n_estimators=1000, n_jobs=10, oob_score=True,
    random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
    colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
    gamma=0, gpu_id=-1, importance_type=None,
    interaction_constraints='', learning_rate=0.300000012,
    max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
    monotone_constraints='()', n_estimators=100, n_jobs=12,
    num_parallel_tree=1, predictor='auto', random_state=42,
    reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
    tree_method='exact', use_label_encoder=False,
    validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
    ColumnTransformer(remainder='passthrough',
    transformers=[('num', MinMaxScaler(),
    Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
    'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
    'mcsm_na_affinity', 'rsa',
    ...
    'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
    'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
    dtype='object', length=167)),
    ('cat', OneHotEncoder(),
    Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
    'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
    dtype='object'))])),
    ('model', MultinomialNB())])

key: fit_time
value: [0.01531363 0.008986 0.00917101 0.00857091 0.00846648 0.00874424
 0.00847507 0.01050377 0.00918341 0.00854468]

mean value: 0.009595918655395507

key: score_time
value: [0.00934434 0.00968599 0.00924802 0.00841284 0.00842881 0.00845051
 0.00878501 0.0092361 0.00860786 0.00842166]

mean value: 0.008862113952636719

key: test_mcc
value: [0.1490712 0.14285714 0.1490712 0.28867513 0.4472136 0.17407766
 0.57735027 0. 0.1490712 0.31622777]

mean value: 0.23936151596140331

key: train_mcc
value: [0.50851338 0.33605377 0.38138504 0.36976727 0.47625048 0.38100038
 0.43052839 0.4612481 0.41400434 0.38332594]

mean value: 0.4142077086835709

key: test_accuracy
value: [0.57142857 0.57142857 0.57142857 0.64285714 0.71428571 0.57142857
 0.78571429 0.5 0.57142857 0.64285714]

mean value: 0.6142857142857143

key: train_accuracy
value: [0.75396825 0.66666667 0.69047619 0.68253968 0.73809524 0.69047619
 0.71428571 0.73015873 0.70634921 0.69047619]

mean value: 0.7063492063492064

key: test_fscore
value: [0.5 0.57142857 0.625 0.61538462 0.66666667 0.66666667
 0.8 0.22222222 0.5 0.70588235]

mean value: 0.5873251095309918

key: train_fscore
value: [0.75968992 0.68656716 0.69767442 0.70588235 0.74015748 0.688
 0.72727273 0.72131148 0.71755725 0.70676692]

mean value: 0.7150879710404706

key: test_precision
value: [0.6 0.57142857 0.55555556 0.66666667 0.8 0.54545455
 0.75 0.5 0.6 0.6 ]

mean value: 0.6189105339105339

key: train_precision
value: [0.74242424 0.64788732 0.68181818 0.65753425 0.734375 0.69354839
 0.69565217 0.74576271 0.69117647 0.67142857]

mean value: 0.696160730965246

key: test_recall
value: [0.42857143 0.57142857 0.71428571 0.57142857 0.57142857 0.85714286
 0.85714286 0.14285714 0.42857143 0.85714286]

mean value: 0.6

key: train_recall
value: [0.77777778 0.73015873 0.71428571 0.76190476 0.74603175 0.68253968
 0.76190476 0.6984127 0.74603175 0.74603175]

mean value: 0.7365079365079366

key: test_roc_auc
value: [0.57142857 0.57142857 0.57142857 0.64285714 0.71428571 0.57142857
 0.78571429 0.5 0.57142857 0.64285714]

mean value: 0.6142857142857143

key: train_roc_auc
value: [0.75396825 0.66666667 0.69047619 0.68253968 0.73809524 0.69047619
 0.71428571 0.73015873 0.70634921 0.69047619]

mean value: 0.7063492063492064

key: test_jcc
value: [0.33333333 0.4 0.45454545 0.44444444 0.5 0.5
 0.66666667 0.125 0.33333333 0.54545455]

mean value: 0.43027777777777776

key: train_jcc
value: [0.6125 0.52272727 0.53571429 0.54545455 0.5875 0.52439024
 0.57142857 0.56410256 0.55952381 0.54651163]

mean value: 0.5569852920760465

MCC on Blind test: 0.82

Accuracy on Blind test: 0.9

Model_name: Passive Aggresive
Model func: PassiveAggressiveClassifier(n_jobs=10, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
    n_estimators=1000, n_jobs=10, oob_score=True,
    random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
    colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
    gamma=0, gpu_id=-1, importance_type=None,
    interaction_constraints='', learning_rate=0.300000012,
    max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
    monotone_constraints='()', n_estimators=100, n_jobs=12,
    num_parallel_tree=1, predictor='auto', random_state=42,
    reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
    tree_method='exact', use_label_encoder=False,
    validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
    ColumnTransformer(remainder='passthrough',
    transformers=[('num', MinMaxScaler(),
    Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
    'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
    'mcsm_na_affinity', 'rsa',
    ...
    'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
    'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
    dtype='object', length=167)),
    ('cat', OneHotEncoder(),
    Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
    'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
    dtype='object'))])),
    ('model',
    PassiveAggressiveClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01009989 0.01338983 0.01457953 0.01464343 0.01477599 0.0141077
 0.0144155 0.01425958 0.01422143 0.01673651]

mean value: 0.014122939109802246

key: score_time
value: [0.00842857 0.01145411 0.011446 0.01153588 0.01144266 0.01150966
 0.01145935 0.01150632 0.01145983 0.01159811]

mean value: 0.011184048652648926

key: test_mcc
value: [0.42857143 0.57735027 0.74535599 0.8660254 0.8660254 0.74535599
 0.8660254 0.57735027 1. 0.63245553]

mean value: 0.7304515695337532

key: train_mcc
value: [0.95250095 0.98425098 0.95346259 0.96825397 0.98425098 0.90659109
 0.96874225 0.96825397 0.96825397 0.98425098]

mean value: 0.9638811739095032

key: test_accuracy
value: [0.71428571 0.78571429 0.85714286 0.92857143 0.92857143 0.85714286
 0.92857143 0.78571429 1. 0.78571429]

mean value: 0.8571428571428571

key: train_accuracy
value: [0.97619048 0.99206349 0.97619048 0.98412698 0.99206349 0.95238095
 0.98412698 0.98412698 0.98412698 0.99206349]

mean value: 0.9817460317460317

key: test_fscore
value: [0.71428571 0.76923077 0.875 0.92307692 0.93333333 0.875
 0.93333333 0.8 1. 0.82352941]

mean value: 0.864678948502478

key: train_fscore
value: [0.97637795 0.99212598 0.97674419 0.98412698 0.992 0.95384615
 0.984375 0.98412698 0.98412698 0.992 ]

mean value: 0.9819850229281492

key: test_precision
value: [0.71428571 0.83333333 0.77777778 1. 0.875 0.77777778
 0.875 0.75 1. 0.7 ]

mean value: 0.8303174603174603

key: train_precision
value: [0.96875 0.984375 0.95454545 0.98412698 1. 0.92537313
 0.96923077 0.98412698 0.98412698 1. ]

mean value: 0.9754655310485534

key: test_recall
value: [0.71428571 0.71428571 1. 0.85714286 1. 1.
 1. 0.85714286 1. 1. ]

mean value: 0.9142857142857143

key: train_recall
value: [0.98412698 1. 1. 0.98412698 0.98412698 0.98412698
 1. 0.98412698 0.98412698 0.98412698]

mean value: 0.9888888888888888

key: test_roc_auc
value: [0.71428571 0.78571429 0.85714286 0.92857143 0.92857143 0.85714286
 0.92857143 0.78571429 1. 0.78571429]

mean value: 0.8571428571428572

key: train_roc_auc
value: [0.97619048 0.99206349 0.97619048 0.98412698 0.99206349 0.95238095
 0.98412698 0.98412698 0.98412698 0.99206349]

mean value: 0.9817460317460318

key: test_jcc
value: [0.55555556 0.625 0.77777778 0.85714286 0.875 0.77777778
 0.875 0.66666667 1. 0.7 ]

mean value: 0.7709920634920635

key: train_jcc
value: [0.95384615 0.984375 0.95454545 0.96875 0.98412698 0.91176471
 0.96923077 0.96875 0.96875 0.98412698]

mean value: 0.9648266051758698

MCC on Blind test: 0.58

Accuracy on Blind test: 0.8

Model_name: Stochastic GDescent
Model func: SGDClassifier(n_jobs=10, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
    n_estimators=1000, n_jobs=10, oob_score=True,
    random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
    colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
    gamma=0, gpu_id=-1, importance_type=None,
    interaction_constraints='', learning_rate=0.300000012,
    max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
    monotone_constraints='()', n_estimators=100, n_jobs=12,
    num_parallel_tree=1, predictor='auto', random_state=42,
    reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
    tree_method='exact', use_label_encoder=False,
    validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
    ColumnTransformer(remainder='passthrough',
    transformers=[('num', MinMaxScaler(),
    Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
    'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
    'mcsm_na_affinity', 'rsa',
    ...
    'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
    'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
    dtype='object', length=167)),
    ('cat', OneHotEncoder(),
    Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
    'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
    dtype='object'))])),
    ('model', SGDClassifier(n_jobs=10, random_state=42))])

key: fit_time
value: [0.01382709 0.01273179 0.0128994 0.01273489 0.01316762 0.01284289
 0.01315284 0.01348805 0.01302576 0.01242471]

mean value: 0.01302950382232666

key: score_time
value: [0.01025319 0.01141953 0.01149654 0.01144791 0.01151443 0.01140189
 0.01157451 0.01143384 0.01143146 0.0115211 ]

mean value: 0.01134943962097168

key: test_mcc
value: [-0.17407766 0.71428571 0.52223297 0.8660254 0.71428571 0.74535599
 0.8660254 0.71428571 0.52223297 0.2773501 ]

mean value: 0.5768002320817054

key: train_mcc
value: [0.78446454 0.98425098 0.69451634 0.96825397 1. 0.95250095
 0.89442719 0.88014083 0.71977239 0.68199434]

mean value: 0.8560321533026929

key: test_accuracy
value: [0.42857143 0.85714286 0.71428571 0.92857143 0.85714286 0.85714286
 0.92857143 0.85714286 0.71428571 0.57142857]

mean value: 0.7714285714285715

key: train_accuracy
value: [0.88095238 0.99206349 0.82539683 0.98412698 1. 0.97619048
 0.94444444 0.93650794 0.84126984 0.81746032]

mean value: 0.9198412698412698

key: test_fscore
value: [0.55555556 0.85714286 0.77777778 0.92307692 0.85714286 0.875
 0.93333333 0.85714286 0.6 0.7 ]

mean value: 0.7936172161172161

key: train_fscore
value: [0.89361702 0.99212598 0.85135135 0.98412698 1. 0.97637795
 0.94736842 0.93220339 0.81132075 0.84563758]

mean value: 0.9234129443255544

key: test_precision
value: [0.45454545 0.85714286 0.63636364 1. 0.85714286 0.77777778
 0.875 0.85714286 1. 0.53846154]

mean value: 0.7853576978576978

key: train_precision
value: [0.80769231 0.984375 0.74117647 0.98412698 1. 0.96875
 0.9 1. 1. 0.73255814]

mean value: 0.9118678901942411

key: test_recall
value: [0.71428571 0.85714286 1. 0.85714286 0.85714286 1.
 1. 0.85714286 0.42857143 1. ]

mean value: 0.8571428571428571

key: train_recall
value: [1. 1. 1. 0.98412698 1. 0.98412698
 1. 0.87301587 0.68253968 1. ]

mean value: 0.9523809523809523

key: test_roc_auc
value: [0.42857143 0.85714286 0.71428571 0.92857143 0.85714286 0.85714286
 0.92857143 0.85714286 0.71428571 0.57142857]

mean value: 0.7714285714285715

key: train_roc_auc
value: [0.88095238 0.99206349 0.82539683 0.98412698 1. 0.97619048
 0.94444444 0.93650794 0.84126984 0.81746032]

mean value: 0.9198412698412698

key: test_jcc
value: [0.38461538 0.75 0.63636364 0.85714286 0.75 0.77777778
 0.875 0.75 0.42857143 0.53846154]

mean value: 0.6747932622932623

key: train_jcc
value: [0.80769231 0.984375 0.74117647 0.96875 1. 0.95384615
 0.9 0.87301587 0.68253968 0.73255814]

mean value: 0.8643953627217136

MCC on Blind test: 0.41

Accuracy on Blind test: 0.6

Model_name: AdaBoost Classifier
Model func: AdaBoostClassifier(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
    n_estimators=1000, n_jobs=10, oob_score=True,
    random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
    colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
    gamma=0, gpu_id=-1, importance_type=None,
    interaction_constraints='', learning_rate=0.300000012,
    max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
    monotone_constraints='()', n_estimators=100, n_jobs=12,
    num_parallel_tree=1, predictor='auto', random_state=42,
    reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
    tree_method='exact', use_label_encoder=False,
    validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
    ColumnTransformer(remainder='passthrough',
    transformers=[('num', MinMaxScaler(),
    Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
    'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
    'mcsm_na_affinity', 'rsa',
    ...
    'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
    'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
    dtype='object', length=167)),
    ('cat', OneHotEncoder(),
    Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
    'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
    dtype='object'))])),
    ('model', AdaBoostClassifier(random_state=42))])

key: fit_time
value: [0.10028696 0.08939171 0.09007287 0.08964419 0.08995485 0.09116292
 0.08989072 0.09051132 0.08987451 0.09023666]

mean value: 0.09110267162322998

key: score_time
value: [0.01456475 0.01453018 0.01480341 0.01456952 0.01471496 0.01463842
 0.01459575 0.01482821 0.01475573 0.01459885]

mean value: 0.014659976959228516

key: test_mcc
value: [0.63245553 0.71428571 1. 0.8660254 0.74535599 0.8660254
 1. 1. 1. 0.8660254 ]

mean value: 0.8690173450172636

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.78571429 0.85714286 1. 0.92857143 0.85714286 0.92857143
 1. 1. 1. 0.92857143]

mean value: 0.9285714285714286

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.82352941 0.85714286 1. 0.92307692 0.875 0.93333333
 1. 1. 1. 0.93333333]

mean value: 0.9345415858651153

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.7 0.85714286 1. 1. 0.77777778 0.875
 1. 1. 1. 0.875 ]

mean value: 0.9084920634920635

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 0.85714286 1. 0.85714286 1. 1.
 1. 1. 1. 1. ]

mean value: 0.9714285714285714

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.78571429 0.85714286 1. 0.92857143 0.85714286 0.92857143
 1. 1. 1. 0.92857143]

mean value: 0.9285714285714286

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.7 0.75 1. 0.85714286 0.77777778 0.875
 1. 1. 1. 0.875 ]

mean value: 0.8834920634920634

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 1.0

Accuracy on Blind test: 1.0

Model_name: Bagging Classifier
Model func: BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
    n_estimators=1000, n_jobs=10, oob_score=True,
    random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
    colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
    gamma=0, gpu_id=-1, importance_type=None,
    interaction_constraints='', learning_rate=0.300000012,
    max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
    monotone_constraints='()', n_estimators=100, n_jobs=12,
    num_parallel_tree=1, predictor='auto', random_state=42,
    reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
    tree_method='exact', use_label_encoder=False,
    validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
    ColumnTransformer(remainder='passthrough',
    transformers=[('num', MinMaxScaler(),
    Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
    'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
    'mcsm_na_affinity', 'rsa',
    ...
    'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
    'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
    dtype='object', length=167)),
    ('cat', OneHotEncoder(),
    Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
    'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
    dtype='object'))])),
    ('model',
    BaggingClassifier(n_jobs=10, oob_score=True,
    random_state=42))])

key: fit_time
value: [0.03448319 0.02736044 0.02808571 0.05007601 0.04261303 0.04503179
 0.0342381 0.04089355 0.04537654 0.02863741]

mean value: 0.0376795768737793

key: score_time
value: [0.01859689 0.02061486 0.02387714 0.03524613 0.02392983 0.01751757
 0.021662 0.03886604 0.02194452 0.02438903]

mean value: 0.02466440200805664

key: test_mcc
value: [0.63245553 0.8660254 1. 0.8660254 0.8660254 0.63245553
 1. 1. 1. 0.8660254 ]

mean value: 0.8729012679205107

key: train_mcc
value: [1. 0.98425098 1. 1. 0.98425098 1.
 1. 0.98425098 1. 1. ]

mean value: 0.9952752952754429

key: test_accuracy
value: [0.78571429 0.92857143 1. 0.92857143 0.92857143 0.78571429
 1. 1. 1. 0.92857143]

mean value: 0.9285714285714286

key: train_accuracy
value: [1. 0.99206349 1. 1. 0.99206349 1.
 1. 0.99206349 1. 1. ]

mean value: 0.9976190476190476

key: test_fscore
value: [0.82352941 0.92307692 1. 0.92307692 0.93333333 0.82352941
 1. 1. 1. 0.93333333]

mean value: 0.9359879336349924

key: train_fscore
value: [1. 0.99212598 1. 1. 0.99212598 1.
 1. 0.992 1. 1. ]

mean value: 0.9976251968503937

key: test_precision
value: [0.7 1. 1. 1. 0.875 0.7 1. 1. 1. 0.875]

mean value: 0.915

key: train_precision
value: [1. 0.984375 1. 1. 0.984375 1. 1. 1.
 1. 1. ]

mean value: 0.996875

key: test_recall
value: [1. 0.85714286 1. 0.85714286 1. 1.
 1. 1. 1. 1. ]

mean value: 0.9714285714285714

key: train_recall
value: [1. 1. 1. 1. 1. 1.
 1. 0.98412698 1. 1. ]

mean value: 0.9984126984126984

key: test_roc_auc
value: [0.78571429 0.92857143 1. 0.92857143 0.92857143 0.78571429
 1. 1. 1. 0.92857143]

mean value: 0.9285714285714286

key: train_roc_auc
value: [1. 0.99206349 1. 1. 0.99206349 1.
 1. 0.99206349 1. 1. ]

mean value: 0.9976190476190476

key: test_jcc
value: [0.7 0.85714286 1. 0.85714286 0.875 0.7
 1. 1. 1. 0.875 ]

mean value: 0.8864285714285715

key: train_jcc
value: [1. 0.984375 1. 1. 0.984375 1.
 1. 0.98412698 1. 1. ]

mean value: 0.9952876984126984

MCC on Blind test: 1.0

Accuracy on Blind test: 1.0

Model_name: Gaussian Process
Model func: GaussianProcessClassifier(random_state=42)
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
    n_estimators=1000, n_jobs=10, oob_score=True,
    random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
    colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
    gamma=0, gpu_id=-1, importance_type=None,
    interaction_constraints='', learning_rate=0.300000012,
    max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
    monotone_constraints='()', n_estimators=100, n_jobs=12,
    num_parallel_tree=1, predictor='auto', random_state=42,
    reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
    tree_method='exact', use_label_encoder=False,
    validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
Running model pipeline: Pipeline(steps=[('prep',
    ColumnTransformer(remainder='passthrough',
    transformers=[('num', MinMaxScaler(),
    Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
    'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
    'mcsm_na_affinity', 'rsa',
    ...
    'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
    'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
    dtype='object', length=167)),
    ('cat', OneHotEncoder(),
    Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
    'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
    dtype='object'))])),
    ('model', GaussianProcessClassifier(random_state=42))])

key: fit_time
value: [0.03126049 0.05000067 0.05051756 0.046978 0.04926348 0.0447371
 0.04575491 0.05056047 0.05645704 0.04601097]

mean value: 0.04715406894683838

key: score_time
value: [0.02221918 0.02074003 0.02268863 0.0243206 0.01368237 0.02017021
 0.02368927 0.02083921 0.02071238 0.02281642]

mean value: 0.021187829971313476

key: test_mcc
value: [0.52223297 0.31622777 0.63245553 0.63245553 0.63245553 0.4472136
 0.8660254 0.74535599 1. 0.57735027]

mean value: 0.6371772590958912

key: train_mcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_accuracy
value: [0.71428571 0.64285714 0.78571429 0.78571429 0.78571429 0.71428571
 0.92857143 0.85714286 1. 0.78571429]

mean value: 0.8

key: train_accuracy
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_fscore
value: [0.77777778 0.54545455 0.82352941 0.82352941 0.82352941 0.75
 0.93333333 0.83333333 1. 0.8 ]

mean value: 0.8110487225193107

key: train_fscore
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_precision
value: [0.63636364 0.75 0.7 0.7 0.7 0.66666667
 0.875 1. 1. 0.75 ]

mean value: 0.7778030303030303

key: train_precision
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_recall
value: [1. 0.42857143 1. 1. 1. 0.85714286
 1. 0.71428571 1. 0.85714286]

mean value: 0.8857142857142857

key: train_recall
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_roc_auc
value: [0.71428571 0.64285714 0.78571429 0.78571429 0.78571429 0.71428571
 0.92857143 0.85714286 1. 0.78571429]

mean value: 0.8

key: train_roc_auc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

key: test_jcc
value: [0.63636364 0.375 0.7 0.7 0.7 0.6
 0.875 0.71428571 1. 0.66666667]

mean value: 0.6967316017316018

key: train_jcc
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

mean value: 1.0

MCC on Blind test: 0.25

Accuracy on Blind test: 0.6

Model_name: Gradient Boosting
|
|
Model func: GradientBoostingClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', GradientBoostingClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.22272778 0.25291204 0.25486708 0.21611547 0.24843001 0.24562812
|
|
0.24883819 0.25277472 0.25142169 0.24883938]
|
|
|
|
mean value: 0.2442554473876953
|
|
|
|
key: score_time
|
|
value: [0.00956607 0.00925088 0.00964093 0.00936627 0.00943518 0.00998449
|
|
0.00920153 0.00911522 0.01006389 0.00919318]
|
|
|
|
mean value: 0.00948176383972168
|
|
|
|
key: test_mcc
|
|
value: [0.63245553 0.8660254 1. 0.8660254 1. 0.8660254
|
|
1. 1. 1. 0.8660254 ]
|
|
|
|
mean value: 0.9096557147171431
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.78571429 0.92857143 1. 0.92857143 1. 0.92857143
|
|
1. 1. 1. 0.92857143]
|
|
|
|
mean value: 0.95
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.82352941 0.92307692 1. 0.92307692 1. 0.92307692
|
|
1. 1. 1. 0.93333333]
|
|
|
|
mean value: 0.9526093514328808
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [0.7 1. 1. 1. 1. 1. 1. 1. 1. 0.875]
|
|
|
|
mean value: 0.9575
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [1. 0.85714286 1. 0.85714286 1. 0.85714286
|
|
1. 1. 1. 1. ]
|
|
|
|
mean value: 0.9571428571428571
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.78571429 0.92857143 1. 0.92857143 1. 0.92857143
|
|
1. 1. 1. 0.92857143]
|
|
|
|
mean value: 0.95
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.7 0.85714286 1. 0.85714286 1. 0.85714286
|
|
1. 1. 1. 0.875 ]
|
|
|
|
mean value: 0.9146428571428571
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 1.0
|
|
|
|
Accuracy on Blind test: 1.0
|
|
|
|
Model_name: QDA
|
|
Model func: QuadraticDiscriminantAnalysis()
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
  warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
  warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
  warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
  warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
  warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
  warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
  warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
  warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
  warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
  warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/discriminant_analysis.py:887: UserWarning: Variables are collinear
|
|
  warnings.warn("Variables are collinear")
|
|
/home/tanu/anaconda3/envs/UQ/lib/python3.9/site-packages/sklearn/metrics/_classification.py:1327: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
|
|
  _warn_prf(average, modifier, msg_start, len(result))
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', QuadraticDiscriminantAnalysis())])
|
|
|
|
key: fit_time
|
|
value: [0.01456022 0.01583648 0.01625037 0.01607895 0.01611233 0.01637053
|
|
0.01687026 0.0162425 0.01670003 0.01632261]
|
|
|
|
mean value: 0.016134428977966308
|
|
|
|
key: score_time
|
|
value: [0.01166463 0.01188278 0.01198721 0.01194906 0.01391625 0.01442289
|
|
0.01482201 0.01400399 0.01370358 0.01407051]
|
|
|
|
mean value: 0.013242292404174804
|
|
|
|
key: test_mcc
|
|
value: [0.74535599 0.52223297 0.74535599 0.74535599 0.74535599 0.74535599
|
|
0.63245553 0.63245553 0.8660254 0.74535599]
|
|
|
|
mean value: 0.7125305390718464
|
|
|
|
key: train_mcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_accuracy
|
|
value: [0.85714286 0.71428571 0.85714286 0.85714286 0.85714286 0.85714286
|
|
0.78571429 0.78571429 0.92857143 0.85714286]
|
|
|
|
mean value: 0.8357142857142856
|
|
|
|
key: train_accuracy
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_fscore
|
|
value: [0.83333333 0.6 0.83333333 0.83333333 0.83333333 0.83333333
|
|
0.72727273 0.72727273 0.92307692 0.83333333]
|
|
|
|
mean value: 0.7977622377622378
|
|
|
|
key: train_fscore
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: train_precision
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.42857143 0.71428571 0.71428571 0.71428571 0.71428571
|
|
0.57142857 0.57142857 0.85714286 0.71428571]
|
|
|
|
mean value: 0.6714285714285714
|
|
|
|
key: train_recall
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_roc_auc
|
|
value: [0.85714286 0.71428571 0.85714286 0.85714286 0.85714286 0.85714286
|
|
0.78571429 0.78571429 0.92857143 0.85714286]
|
|
|
|
mean value: 0.8357142857142857
|
|
|
|
key: train_roc_auc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
key: test_jcc
|
|
value: [0.71428571 0.42857143 0.71428571 0.71428571 0.71428571 0.71428571
|
|
0.57142857 0.57142857 0.85714286 0.71428571]
|
|
|
|
mean value: 0.6714285714285714
|
|
|
|
key: train_jcc
|
|
value: [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
|
|
|
|
mean value: 1.0
|
|
|
|
MCC on Blind test: 0.0
|
|
|
|
Accuracy on Blind test: 0.6
|
|
|
|
Model_name: Ridge Classifier
|
|
Model func: RidgeClassifier(random_state=42)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifier(random_state=42))])
|
|
|
|
key: fit_time
|
|
value: [0.03757238 0.03940368 0.0314343 0.03359485 0.03294992 0.03301167
|
|
0.03303957 0.03291583 0.0330925 0.03312182]
|
|
|
|
mean value: 0.03401365280151367
|
|
|
|
key: score_time
|
|
value: [0.02323055 0.02016068 0.01181483 0.02312255 0.02275276 0.02220821
|
|
0.02225208 0.02244258 0.02231526 0.02217078]
|
|
|
|
mean value: 0.021247029304504395
|
|
|
|
key: test_mcc
|
|
value: [0.28867513 0.71428571 0.8660254 0.8660254 0.74535599 0.8660254
|
|
0.74535599 0.71428571 1. 0.63245553]
|
|
|
|
mean value: 0.7438490291553094
|
|
|
|
key: train_mcc
|
|
value: [0.96825397 0.98425098 0.95250095 0.96825397 0.98425098 0.95250095
|
|
0.96825397 0.95250095 0.96825397 0.96825397]
|
|
|
|
mean value: 0.966727466727708
|
|
|
|
key: test_accuracy
|
|
value: [0.64285714 0.85714286 0.92857143 0.92857143 0.85714286 0.92857143
|
|
0.85714286 0.85714286 1. 0.78571429]
|
|
|
|
mean value: 0.8642857142857143
|
|
|
|
key: train_accuracy
|
|
value: [0.98412698 0.99206349 0.97619048 0.98412698 0.99206349 0.97619048
|
|
0.98412698 0.97619048 0.98412698 0.98412698]
|
|
|
|
mean value: 0.9833333333333333
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.85714286 0.93333333 0.92307692 0.875 0.93333333
|
|
0.875 0.85714286 1. 0.82352941]
|
|
|
|
mean value: 0.8744225382460676
|
|
|
|
key: train_fscore
|
|
value: [0.98412698 0.99212598 0.97637795 0.98412698 0.992 0.97637795
|
|
0.98412698 0.97637795 0.98412698 0.98412698]
|
|
|
|
mean value: 0.9833894763154605
|
|
|
|
key: test_precision
|
|
value: [0.625 0.85714286 0.875 1. 0.77777778 0.875
|
|
0.77777778 0.85714286 1. 0.7 ]
|
|
|
|
mean value: 0.834484126984127
|
|
|
|
key: train_precision
|
|
value: [0.98412698 0.984375 0.96875 0.98412698 1. 0.96875
|
|
0.98412698 0.96875 0.98412698 0.98412698]
|
|
|
|
mean value: 0.981125992063492
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.85714286 1. 0.85714286 1. 1.
|
|
1. 0.85714286 1. 1. ]
|
|
|
|
mean value: 0.9285714285714286
|
|
|
|
key: train_recall
|
|
value: [0.98412698 1. 0.98412698 0.98412698 0.98412698 0.98412698
|
|
0.98412698 0.98412698 0.98412698 0.98412698]
|
|
|
|
mean value: 0.9857142857142857
|
|
|
|
key: test_roc_auc
|
|
value: [0.64285714 0.85714286 0.92857143 0.92857143 0.85714286 0.92857143
|
|
0.85714286 0.85714286 1. 0.78571429]
|
|
|
|
mean value: 0.8642857142857143
|
|
|
|
key: train_roc_auc
|
|
value: [0.98412698 0.99206349 0.97619048 0.98412698 0.99206349 0.97619048
|
|
0.98412698 0.97619048 0.98412698 0.98412698]
|
|
|
|
mean value: 0.9833333333333334
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.75 0.875 0.85714286 0.77777778 0.875
|
|
0.77777778 0.75 1. 0.7 ]
|
|
|
|
mean value: 0.7862698412698412
|
|
|
|
key: train_jcc
|
|
value: [0.96875 0.984375 0.95384615 0.96875 0.98412698 0.95384615
|
|
0.96875 0.95384615 0.96875 0.96875 ]
|
|
|
|
mean value: 0.9673790445665446
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.8
|
|
|
|
Model_name: Ridge ClassifierCV
|
|
Model func: RidgeClassifierCV(cv=10)
|
|
List of models: [('Logistic Regression', LogisticRegression(random_state=42)), ('Logistic RegressionCV', LogisticRegressionCV(random_state=42)), ('Gaussian NB', GaussianNB()), ('Naive Bayes', BernoulliNB()), ('K-Nearest Neighbors', KNeighborsClassifier()), ('SVM', SVC(random_state=42)), ('MLP', MLPClassifier(max_iter=500, random_state=42)), ('Decision Tree', DecisionTreeClassifier(random_state=42)), ('Extra Trees', ExtraTreesClassifier(random_state=42)), ('Extra Tree', ExtraTreeClassifier(random_state=42)), ('Random Forest', RandomForestClassifier(n_estimators=1000, random_state=42)), ('Random Forest2', RandomForestClassifier(max_features='auto', min_samples_leaf=5,
|
|
n_estimators=1000, n_jobs=10, oob_score=True,
|
|
random_state=42)), ('Naive Bayes', BernoulliNB()), ('XGBoost', XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
|
|
colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
|
|
gamma=0, gpu_id=-1, importance_type=None,
|
|
interaction_constraints='', learning_rate=0.300000012,
|
|
max_delta_step=0, max_depth=6, min_child_weight=1, missing=nan,
|
|
monotone_constraints='()', n_estimators=100, n_jobs=12,
|
|
num_parallel_tree=1, predictor='auto', random_state=42,
|
|
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
|
|
tree_method='exact', use_label_encoder=False,
|
|
validate_parameters=1, verbosity=0)), ('LDA', LinearDiscriminantAnalysis()), ('Multinomial', MultinomialNB()), ('Passive Aggresive', PassiveAggressiveClassifier(n_jobs=10, random_state=42)), ('Stochastic GDescent', SGDClassifier(n_jobs=10, random_state=42)), ('AdaBoost Classifier', AdaBoostClassifier(random_state=42)), ('Bagging Classifier', BaggingClassifier(n_jobs=10, oob_score=True, random_state=42)), ('Gaussian Process', GaussianProcessClassifier(random_state=42)), ('Gradient Boosting', GradientBoostingClassifier(random_state=42)), ('QDA', QuadraticDiscriminantAnalysis()), ('Ridge Classifier', RidgeClassifier(random_state=42)), ('Ridge ClassifierCV', RidgeClassifierCV(cv=10))]
|
|
Running model pipeline: Pipeline(steps=[('prep',
|
|
ColumnTransformer(remainder='passthrough',
|
|
transformers=[('num', MinMaxScaler(),
|
|
Index(['ligand_distance', 'ligand_affinity_change', 'duet_stability_change',
|
|
'ddg_foldx', 'deepddg', 'ddg_dynamut2', 'mmcsm_lig', 'contacts',
|
|
'mcsm_na_affinity', 'rsa',
|
|
...
|
|
'VENM980101', 'VOGG950101', 'WEIL970101', 'WEIL970102', 'ZHAC000101',
|
|
'ZHAC000102', 'ZHAC000103', 'ZHAC000104', 'ZHAC000105', 'ZHAC000106'],
|
|
dtype='object', length=167)),
|
|
('cat', OneHotEncoder(),
|
|
Index(['ss_class', 'aa_prop_change', 'electrostatics_change',
|
|
'polarity_change', 'water_change', 'drtype_mode_labels', 'active_site'],
|
|
dtype='object'))])),
|
|
('model', RidgeClassifierCV(cv=10))])
|
|
|
|
key: fit_time
|
|
value: [0.23806024 0.24101663 0.23600936 0.27989411 0.26349449 0.21895146
|
|
0.12900424 0.16086125 0.30366659 0.30462027]
|
|
|
|
mean value: 0.23755786418914795
|
|
|
|
key: score_time
|
|
value: [0.0202539 0.02047062 0.02365375 0.02144051 0.021348 0.0231626
|
|
0.01186275 0.02951169 0.02209163 0.016994 ]
|
|
|
|
mean value: 0.021078944206237793
|
|
|
|
key: test_mcc
|
|
value: [0.28867513 0.71428571 0.8660254 0.8660254 0.74535599 0.8660254
|
|
0.74535599 0.71428571 1. 0.63245553]
|
|
|
|
mean value: 0.7438490291553094
|
|
|
|
key: train_mcc
|
|
value: [0.96825397 0.98425098 0.95250095 0.96825397 0.98425098 0.95250095
|
|
0.96825397 0.95250095 0.96825397 0.96825397]
|
|
|
|
mean value: 0.966727466727708
|
|
|
|
key: test_accuracy
|
|
value: [0.64285714 0.85714286 0.92857143 0.92857143 0.85714286 0.92857143
|
|
0.85714286 0.85714286 1. 0.78571429]
|
|
|
|
mean value: 0.8642857142857143
|
|
|
|
key: train_accuracy
|
|
value: [0.98412698 0.99206349 0.97619048 0.98412698 0.99206349 0.97619048
|
|
0.98412698 0.97619048 0.98412698 0.98412698]
|
|
|
|
mean value: 0.9833333333333333
|
|
|
|
key: test_fscore
|
|
value: [0.66666667 0.85714286 0.93333333 0.92307692 0.875 0.93333333
|
|
0.875 0.85714286 1. 0.82352941]
|
|
|
|
mean value: 0.8744225382460676
|
|
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./gid_sl.py:188: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
  rouC_CT.sort_values(by = ['test_mcc'], ascending = False, inplace = True)
|
|
/home/tanu/git/LSHTM_analysis/scripts/ml/./gid_sl.py:191: SettingWithCopyWarning:
|
|
A value is trying to be set on a copy of a slice from a DataFrame
|
|
|
|
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
|
|
  rouC_BT.sort_values(by = ['bts_mcc'], ascending = False, inplace = True)
|
|
key: train_fscore
|
|
value: [0.98412698 0.99212598 0.97637795 0.98412698 0.992 0.97637795
|
|
0.98412698 0.97637795 0.98412698 0.98412698]
|
|
|
|
mean value: 0.9833894763154605
|
|
|
|
key: test_precision
|
|
value: [0.625 0.85714286 0.875 1. 0.77777778 0.875
|
|
0.77777778 0.85714286 1. 0.7 ]
|
|
|
|
mean value: 0.834484126984127
|
|
|
|
key: train_precision
|
|
value: [0.98412698 0.984375 0.96875 0.98412698 1. 0.96875
|
|
0.98412698 0.96875 0.98412698 0.98412698]
|
|
|
|
mean value: 0.981125992063492
|
|
|
|
key: test_recall
|
|
value: [0.71428571 0.85714286 1. 0.85714286 1. 1.
|
|
1. 0.85714286 1. 1. ]
|
|
|
|
mean value: 0.9285714285714286
|
|
|
|
key: train_recall
|
|
value: [0.98412698 1. 0.98412698 0.98412698 0.98412698 0.98412698
|
|
0.98412698 0.98412698 0.98412698 0.98412698]
|
|
|
|
mean value: 0.9857142857142857
|
|
|
|
key: test_roc_auc
|
|
value: [0.64285714 0.85714286 0.92857143 0.92857143 0.85714286 0.92857143
|
|
0.85714286 0.85714286 1. 0.78571429]
|
|
|
|
mean value: 0.8642857142857143
|
|
|
|
key: train_roc_auc
|
|
value: [0.98412698 0.99206349 0.97619048 0.98412698 0.99206349 0.97619048
|
|
0.98412698 0.97619048 0.98412698 0.98412698]
|
|
|
|
mean value: 0.9833333333333334
|
|
|
|
key: test_jcc
|
|
value: [0.5 0.75 0.875 0.85714286 0.77777778 0.875
|
|
0.77777778 0.75 1. 0.7 ]
|
|
|
|
mean value: 0.7862698412698412
|
|
|
|
key: train_jcc
|
|
value: [0.96875 0.984375 0.95384615 0.96875 0.98412698 0.95384615
|
|
0.96875 0.95384615 0.96875 0.96875 ]
|
|
|
|
mean value: 0.9673790445665446
|
|
|
|
MCC on Blind test: 0.58
|
|
|
|
Accuracy on Blind test: 0.8
|